CN111984625A - Database load characteristic processing method, device, medium and electronic equipment - Google Patents

Publication number
CN111984625A
Authority
CN
China
Prior art keywords
statement
transaction
session
type
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010853809.4A
Other languages
Chinese (zh)
Other versions
CN111984625B (en
Inventor
尹强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingbase Information Technologies Co Ltd
Original Assignee
Beijing Kingbase Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingbase Information Technologies Co Ltd filed Critical Beijing Kingbase Information Technologies Co Ltd
Priority to CN202010853809.4A priority Critical patent/CN111984625B/en
Publication of CN111984625A publication Critical patent/CN111984625A/en
Application granted granted Critical
Publication of CN111984625B publication Critical patent/CN111984625B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a database load characteristic processing method, a database load characteristic processing device, a database load characteristic processing medium and electronic equipment. The method comprises the following steps: acquiring and recording statement information, transaction identification and session identification related to each SQL statement when the execution of each SQL statement is completed; determining statement type identification of each SQL statement according to statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; determining a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier; establishing a structured data relation model based on all the determined session type identifications, transaction type identifications and statement type identifications; the structured data relationship model is used for storing session characteristic information, transaction characteristic information and statement characteristic information of each session type. The scheme disclosed by the invention can accurately and comprehensively depict and reflect the load characteristic condition of the database.

Description

Database load characteristic processing method, device, medium and electronic equipment
Technical Field
The disclosed embodiments relate to the field of database technologies, and in particular, to a database load characteristic processing method, a database load characteristic processing apparatus, a computer-readable storage medium and an electronic device for implementing the database load characteristic processing method.
Background
Databases are an important component of information systems, the task of which is to store and manage data. The performance of the database will directly affect the scalability of the service and the user experience. Therefore, users want the database to work in an optimal mode for a long time, and how to evaluate and improve the performance of the database becomes an important topic.
At present, in the related art, a third-party tool can be used for monitoring and analyzing the load performance of the database, sorting and analyzing the load characteristic data of the database, and finally obtaining a database performance data report, so that a database administrator can better configure the database and provide basis and reference for the database administrator to work in an efficient mode.
However, database load tends to become more complex over time: changes in functionality, growth in access volume and even the addition of applications all increase its complexity. This makes it difficult to obtain a complete picture of the load characteristics of the database, which poses great challenges for the management, operation, maintenance and possible reconfiguration of the database. No related art has so far been found that addresses this problem.
Disclosure of Invention
In order to solve the technical problem or at least partially solve the technical problem, embodiments of the present disclosure provide a database load characteristic processing method, a database load characteristic processing apparatus, and a computer-readable storage medium and an electronic device implementing the database load characteristic processing method.
In a first aspect, an embodiment of the present disclosure provides a database load characteristic processing method, including:
acquiring and recording statement information, a transaction identifier and a session identifier related to each SQL statement when the execution of each SQL statement is completed; the transaction identifier represents the transaction to which each SQL statement belongs, and the session identifier represents the session to which that transaction belongs;
determining statement type identification of each SQL statement according to statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; determining a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier;
establishing a structured data relation model based on all the determined session type identifications, transaction type identifications and statement type identifications; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information and statement characteristic information attributed to the transaction characteristic information.
In some embodiments of the present disclosure, the collecting and recording statement information, transaction identifiers, and session identifiers related to each SQL statement when the execution of each SQL statement is completed includes:
when the execution of each SQL statement is completed, obtaining statement information, transaction identification and session identification related to each SQL statement through a log system;
wherein the statement information at least comprises one or more of statement content, statement execution time consumption and table information related to the statement.
In some embodiments of the present disclosure, the determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement includes:
parameterizing the statement content of each SQL statement, performing hash calculation on the parameterized content, and using the obtained hash value as the statement type identifier.
In some embodiments of the present disclosure, the parameterizing the statement content of each SQL statement includes:
replacing the constant part of the statement content of each SQL statement with a preset character.
In some embodiments of the present disclosure, the determining a corresponding transaction type identifier according to a statement type identifier of an SQL statement belonging to the same transaction identifier includes:
traversing the transaction identifier related to each SQL statement to determine the SQL statements belonging to the same transaction identifier;
and obtaining statement type identifications of SQL statements belonging to the same transaction identification, performing hash calculation on the obtained statement type identifications, and taking the obtained hash values as the transaction type identifications.
In some embodiments of the present disclosure, further comprising:
when a plurality of SQL statements belong to the same transaction identifier, acquiring the plurality of statement type identifiers corresponding to those SQL statements;
removing repeated statement type identifiers from the plurality of statement type identifiers;
and performing hash calculation based on the statement type identifiers remaining after the removal, and taking the obtained hash value as the transaction type identifier.
In some embodiments of the present disclosure, the determining, according to the transaction type identifier belonging to the same session identifier, a corresponding session type identifier includes:
traversing the session identification and the transaction identification related to each SQL statement to acquire M transaction type identifications belonging to the same session identification; m is a natural number greater than or equal to 2;
removing repeated transaction type identifications in the M transaction type identifications to which the same session identification belongs to obtain the rest N transaction type identifications;
and performing hash calculation based on the rest N transaction type identifications, and taking the obtained hash value as the session type identification.
In some embodiments of the present disclosure, further comprising:
comparing, one by one, the repetition proportion of the remaining N transaction type identifiers belonging to two adjacent session identifiers, and determining that the sessions represented by the two session identifiers are the same session when the repetition proportion is greater than a preset proportion threshold; the preset proportion threshold is greater than 80%;
removing repeated transaction type identifiers from the 2N remaining transaction type identifiers belonging to the two session identifiers taken together, to obtain the remaining P transaction type identifiers;
and performing hash calculation based on the remaining P transaction type identifiers, and taking the obtained hash value as the session type identifier of the same session.
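As a non-limiting illustration, the comparison and merging of two adjacent sessions might be sketched in Python as follows. The text does not define how the repetition proportion is computed; the sketch assumes it is the number of shared transaction type identifiers divided by the size of the larger list. The function names, the 0.85 threshold (one example of a value greater than 80%), the sorted order, the "|" separator and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def repetition_ratio(ids_a, ids_b):
    """Share of transaction type IDs common to both sessions.

    Assumption: shared IDs divided by the size of the larger list.
    """
    if not ids_a or not ids_b:
        return 0.0
    shared = set(ids_a) & set(ids_b)
    return len(shared) / max(len(ids_a), len(ids_b))


def merged_session_type_id(ids_a, ids_b, threshold=0.85):
    """Return a common session type ID, or None if the sessions differ.

    ids_a and ids_b are the deduplicated (remaining N) transaction type
    IDs of two adjacent sessions. The threshold value is hypothetical.
    """
    if repetition_ratio(ids_a, ids_b) <= threshold:
        return None
    # Deduplicate the combined IDs (the remaining P identifiers).
    # Sorting is an assumption; it makes the result order-independent.
    remaining = sorted(set(ids_a) | set(ids_b))
    joined = "|".join(remaining)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

With this definition, two sessions that share nine of ten transaction type identifiers have a repetition proportion of 0.9 and receive one common session type identifier, while sessions sharing only half of their identifiers are kept apart.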
In some embodiments of the present disclosure, further comprising:
removing repeated statement type identifications in statement type identifications of all SQL statements to obtain remaining statement type identifications;
obtaining table information related to the SQL statement based on the residual statement type identification, and obtaining data characteristic information based on the table information;
the data characteristic information at least comprises one or more of table name, table capacity, attribute number, page number, tuple number and statistical information.
In some embodiments of the present disclosure, the structured data relationship model further comprises data characteristic information attributed to the statement characteristic information; the session characteristic information comprises one or more of the number of sessions, session time consumption, number of transaction types and total number of transactions; the transaction characteristic information comprises one or more of the number of transaction executions, transaction execution time consumption and the sequence of executed statements belonging to the transaction; the statement characteristic information comprises one or more of the number of statement executions, statement execution time consumption and statement content.
In some embodiments of the present disclosure, before determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement, the method further includes:
writing the obtained statement information, the transaction identifier and the session identifier related to each SQL statement into a log file in a preset file format;
importing the log file into a database in the form of an external table, and converting the external table into a table of the engine of the database;
adding three new attribute columns to the converted table, for recording and updating the determined session type identifier, transaction type identifier and statement type identifier.
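As a non-limiting illustration, the import step might be sketched in Python as follows. A database with external-table support would map the log file directly; as a stand-in, the sketch reads a CSV-formatted log with Python's `csv` module and loads it into an in-memory SQLite table. The table name `load_log`, the log columns and the sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical log content in a preset (CSV) file format.
LOG_CSV = """session_id,transaction_id,statement,elapsed_ms
1,1,select * from t where id > 5,3
1,1,update t set v = 1 where id = 7,5
"""


def load_log(conn, log_text):
    """Load the log into a regular table and add the three type-ID columns."""
    conn.execute(
        "CREATE TABLE load_log ("
        "session_id INTEGER, transaction_id INTEGER, "
        "statement TEXT, elapsed_ms INTEGER)"
    )
    rows = [
        (int(r["session_id"]), int(r["transaction_id"]),
         r["statement"], int(r["elapsed_ms"]))
        for r in csv.DictReader(io.StringIO(log_text))
    ]
    conn.executemany("INSERT INTO load_log VALUES (?, ?, ?, ?)", rows)
    # Three new attribute columns, filled in later once the type
    # identifiers have been determined.
    for column in ("session_type_id", "transaction_type_id", "statement_type_id"):
        conn.execute(f"ALTER TABLE load_log ADD COLUMN {column} TEXT")
    return len(rows)
```

The new columns are NULL immediately after the import; a later pass would update them with the determined identifiers.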
In some embodiments of the present disclosure, further comprising:
acquiring a session type identifier of a session to be analyzed;
and querying and acquiring, based on the session type identifier of the session to be analyzed and the structured data relationship model, one or more of the session characteristic information, transaction characteristic information, statement characteristic information and data characteristic information of the session to be analyzed.
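As a non-limiting illustration, the structured data relationship model and the query by session type identifier might be sketched in Python as a nested dictionary. The record field names, the selected features (session count, transaction executions, statement executions and elapsed time) and the function names are assumptions made for the example; the sketch also assumes that session and transaction numbers are unique across the whole log.

```python
def build_model(records):
    """Build a session type -> transaction type -> statement type hierarchy.

    Each record is a dict describing one executed SQL statement, already
    labelled with its three type identifiers (hypothetical field names).
    """
    model = {}
    seen_sessions = set()
    seen_transactions = set()
    for rec in records:
        session = model.setdefault(
            rec["session_type"], {"session_count": 0, "transactions": {}}
        )
        if rec["session_id"] not in seen_sessions:
            seen_sessions.add(rec["session_id"])
            session["session_count"] += 1
        txn = session["transactions"].setdefault(
            rec["transaction_type"], {"executions": 0, "statements": {}}
        )
        if rec["transaction_id"] not in seen_transactions:
            seen_transactions.add(rec["transaction_id"])
            txn["executions"] += 1
        stmt = txn["statements"].setdefault(
            rec["statement_type"], {"executions": 0, "elapsed_ms": 0}
        )
        stmt["executions"] += 1
        stmt["elapsed_ms"] += rec["elapsed_ms"]
    return model


def query_session_type(model, session_type):
    """Return the characteristic information stored for one session type."""
    return model.get(session_type)
```

Because transaction information is nested under its session type and statement information under its transaction type, a single lookup by session type identifier returns all three levels at once.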
In a second aspect, an embodiment of the present disclosure provides a database load characteristic processing apparatus, including:
the data acquisition module is used for acquiring and recording statement information, a transaction identifier and a session identifier related to each SQL statement when the execution of each SQL statement is completed; the transaction identifier represents the transaction to which each SQL statement belongs, and the session identifier represents the session to which that transaction belongs;
the data preprocessing module is used for determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; determining a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier;
the model establishing module is used for establishing a structured data relation model based on all the determined session type identifications, transaction type identifications and statement type identifications; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information and statement characteristic information attributed to the transaction characteristic information.
In a third aspect, the present disclosure provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the database load characteristic processing method according to any one of the foregoing embodiments.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to execute the steps of the database load characteristic processing method according to any one of the above embodiments by executing the executable instructions.
Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages:
in the embodiment of the disclosure, the statement information, the transaction identifier and the session identifier related to each SQL statement are collected and recorded when the execution of each SQL statement is completed, and then the statement type identifier of each SQL statement is determined according to the statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; determining a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier; finally, based on all the determined session type identifications, transaction type identifications and statement type identifications, a structured data relation model is established; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information and statement characteristic information attributed to the transaction characteristic information. Therefore, the embodiment focuses on information such as session types, transaction types and statement types reflecting database operation characteristics, and a structured data relation model reflecting database load characteristic information is established based on the information, so that the load characteristic condition of the database can be accurately and comprehensively depicted and reflected, subsequent load characteristic data analysis is facilitated, the database is better configured by a database administrator, and accurate and comprehensive reference is provided for the database administrator to work in a high-efficiency mode.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a flow chart of a database load characteristic processing method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a structured data relationship model of database load characteristics according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a database load characteristic processing method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a database load characteristic processing apparatus according to an embodiment of the disclosure;
FIG. 5 is a schematic diagram of an electronic device for implementing a database load characteristic processing method according to an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
Fig. 1 is a flowchart of a database load characteristic processing method according to an embodiment of the present disclosure, where the database load characteristic processing method may include the following steps:
Step S101: collecting and recording statement information, a transaction identifier and a session identifier related to each SQL statement when the execution of each SQL statement is completed. The transaction identifier represents the transaction to which each SQL statement belongs, and the session identifier represents the session to which that transaction belongs.
Step S102: determining statement type identification of each SQL statement according to statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; and determining the corresponding session type identifier according to the transaction type identifier belonging to the same session identifier.
Step S103: establishing a structured data relation model based on all the determined session type identifications, transaction type identifications and statement type identifications; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information and statement characteristic information attributed to the transaction characteristic information.
According to the method for processing the load characteristics of the database, disclosed by the embodiment of the disclosure, information such as session types, transaction types, statement types and the like reflecting the operation characteristics of the database is concerned, and a structured data relation model reflecting the load characteristic information of the database is established based on the information, so that the load characteristic condition of the database can be accurately and comprehensively described and reflected, and therefore, subsequent load characteristic data analysis is facilitated, the database is better configured for a database administrator, and accurate and comprehensive reference is provided for the database administrator to work in an efficient mode.
In some embodiments of the present disclosure, in step S101, statement information, a transaction identifier and a session identifier related to each SQL statement are collected and recorded when the execution of each SQL statement is completed. The transaction identifier represents the transaction to which each SQL statement belongs, and the session identifier represents the session to which that transaction belongs.
Illustratively, the statement information may include, but is not limited to, one or more of statement content, statement execution time and table information related to the statement. The statement content may be, for example, content relating to data operations such as query, delete and update. The statement execution time is the length of time from the start to the end of the execution of one SQL statement. The table information related to the statement may be a table name, but is not limited thereto. A statement may involve one or more tables, depending on the statement content, that is, on the data operation (query, delete, update, etc.) it performs. In addition, as shown in Table 1 below, a session generally contains one or more transactions attributed to it; for example, session 1 includes transaction 1, transaction 2 and transaction 3. A transaction in turn includes one or more SQL statements attributed to it; for example, transaction 1 includes SQL statement 1 and SQL statement 2. The transaction identifier may be a transaction number indicating the transaction to which an SQL statement belongs; for example, SQL statement 1 and SQL statement 2 belong to transaction 1. The session identifier may be a session number indicating the session to which the transaction containing an SQL statement belongs; for example, transaction 1, transaction 2 and transaction 3 belong to session 1, and transaction 4, transaction 5 and transaction 6 belong to session 2.
TABLE 1
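[Table 1 is an image in the original publication and is not reproduced here. It shows which SQL statements are attributed to each transaction and which transactions are attributed to each session.]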
In this embodiment, when the execution of each SQL statement is completed, the related statement information, the transaction number, and the session number of each SQL statement may be collected and recorded.
Specifically, as an example, when the execution of each SQL statement is completed, the statement information, the transaction identifier (such as a transaction number) and the session identifier (such as a session number) related to that SQL statement may be acquired through a log system. Because the statement information, transaction number and session number of each SQL statement can be obtained directly through the database log system, the processing efficiency is high.
Optionally, in some embodiments of the present disclosure, the determining, in step S102, of the statement type identifier of each SQL statement according to the statement information related to each SQL statement may specifically include: parameterizing the statement content of each SQL statement, performing hash calculation on the parameterized content, and using the obtained hash value as the statement type identifier.
Illustratively, the statement type identifier, such as a statement type ID, may represent the type of an SQL statement, such as a query statement type, a delete statement type or an update statement type. Different values of the statement type ID, such as unique numeric codes, may represent different statement types; for example, the correspondence between statement type ID values and SQL statement types may be configured in advance, but this is not limiting. Using the hash value obtained through hash calculation as the statement type ID can improve the efficiency of data acquisition and querying during subsequent feature extraction and analysis.
In some embodiments of the present disclosure, optionally, the parameterization of the statement content of each SQL statement in step S102 may specifically include: replacing the constant part of the statement content of each SQL statement with a preset character.
Specifically, as an example, when parameterizing the statement content, the constant part of the statement content may be replaced with a preset character such as "?", but the preset character is not limited thereto. For example, the statement content "where id > 5" is replaced with "where id > ?". A hash calculation is then performed on the parameterized content, for example on the character string of that content, and the calculated hash value is used as the statement type identifier, that is, the statement type ID. It is understood that the specific hash calculation may refer to the prior art and is not described again here or below.
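As a non-limiting illustration, the parameterization and hash calculation might be sketched in Python as follows. The regular expression (which treats single-quoted strings and standalone numbers as the constant part), the function names and the choice of SHA-256 are assumptions made for the example; the text leaves the specific hash calculation to the prior art.

```python
import hashlib
import re

# Assumption: constants are single-quoted string literals or standalone
# numeric literals. Digits inside identifiers such as "t1" are not matched.
_CONSTANT = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql, placeholder="?"):
    """Replace the constant part of the statement content with a preset character."""
    return _CONSTANT.sub(placeholder, sql)


def statement_type_id(sql):
    """Hash the parameterized content; the digest serves as the statement type ID."""
    parameterized = parameterize(sql)
    return hashlib.sha256(parameterized.encode("utf-8")).hexdigest()
```

Under these assumptions, "select * from t where id > 5" and "select * from t where id > 42" are parameterized to the same string and therefore receive the same statement type ID.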
Optionally, in some embodiments of the present disclosure, in step S102, determining a corresponding transaction type identifier according to a statement type identifier of an SQL statement belonging to the same transaction identifier may specifically include the following steps:
Traversing the transaction identifier related to each SQL statement to determine the SQL statements belonging to the same transaction identifier.
Obtaining the statement type identifiers of the SQL statements belonging to the same transaction identifier, performing hash calculation on the obtained statement type identifiers, and taking the obtained hash value as the transaction type identifier.
Specifically, the transaction type identifier, such as a transaction type ID, indicates the type of a transaction, and different transaction type ID values may indicate different types of transactions. When a session includes multiple transactions, the types of those transactions need not all be the same. For example, in session 1, transaction 1 and transaction 2 may have the same transaction type while transaction 3 has a type different from both; this is only an example, and the present embodiment is not limited thereto. Given the attribution relationship described above, for example as shown in Table 1, in this embodiment the transaction number associated with each SQL statement may be traversed to determine the SQL statements belonging to the same transaction number. For example, if the transaction numbers of SQL statement 1 and SQL statement 2 are both transaction 1, that is, the transaction numbers are the same, these SQL statements are determined to belong to the same transaction 1. Likewise, if the transaction numbers of SQL statement 4, SQL statement 5 and SQL statement 6 are all transaction 3, these SQL statements are determined to belong to the same transaction 3.
Then, the statement type identifiers of the SQL statements belonging to the same transaction number may be obtained; for example, statement type ID4, statement type ID5 and statement type ID6, corresponding respectively to SQL statement 4, SQL statement 5 and SQL statement 6 of transaction 3, are obtained. A hash calculation can then be performed on statement type ID4, statement type ID5 and statement type ID6, that is, on the statement type IDs of the whole statement sequence (SQL statement 4, SQL statement 5 and SQL statement 6), and the obtained hash value is used as the transaction type identifier of transaction 3, such as its transaction type ID.
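As a non-limiting illustration, the traversal and hash calculation for transactions might be sketched in Python as follows. The record layout (pairs of transaction number and statement type ID, in execution order), the "|" separator and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def hash_ids(ids):
    """Hash an ordered sequence of type IDs into a single identifier."""
    joined = "|".join(ids)  # the separator is an assumption
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def transaction_type_ids(records):
    """Map each transaction number to its transaction type ID.

    records: iterable of (transaction_number, statement_type_id) pairs,
    one per executed SQL statement, in execution order.
    """
    by_transaction = defaultdict(list)
    for transaction_number, statement_type in records:
        by_transaction[transaction_number].append(statement_type)
    return {
        number: hash_ids(statement_types)
        for number, statement_types in by_transaction.items()
    }
```

Two transactions that execute the same sequence of statement types receive the same transaction type ID under this sketch, whatever constants their statements contained.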
Further optionally, on the basis of the above embodiments, in some embodiments of the present disclosure, the method may further include the following steps:
When a plurality of SQL statements belong to the same transaction identifier, acquiring the plurality of statement type identifiers corresponding to those SQL statements.
Removing duplicate statement type identifiers from the plurality of statement type identifiers.
Performing hash calculation based on the statement type identifiers remaining after the removal, and taking the obtained hash value as the transaction type identifier.
Specifically, some transactions may repeat certain operations an indefinite number of times, so the statement type IDs included in the transaction may be deduplicated before the calculation. Continuing the above example, when three SQL statements belong to transaction 3, namely SQL statement 4, SQL statement 5 and SQL statement 6, the corresponding statement type ID4, statement type ID5 and statement type ID6 are acquired. In some cases these may contain duplicates; for example, the values of statement type ID4 and statement type ID5 may be the same, which means that SQL statement 4 and SQL statement 5 are statements of the same type, for example both of the delete statement type. In that case, either one of the duplicate statement type ID4 and statement type ID5 is removed, hash calculation is performed based on the remaining identifiers, for example statement type ID4 and statement type ID6, and the obtained hash value is used as the transaction type identifier of transaction 3, such as its transaction type ID. This reduces the amount of data to be processed and improves the overall data processing efficiency during database load characteristic processing.
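As a non-limiting illustration, the deduplication before hashing might be sketched in Python as follows. Keeping the first occurrence of each statement type ID and preserving order is an assumption, as are the "|" separator and the choice of SHA-256; the text only requires that duplicates be removed.

```python
import hashlib


def dedup_keep_order(ids):
    """Remove repeated IDs, keeping the first occurrence of each."""
    return list(dict.fromkeys(ids))


def transaction_type_id_dedup(statement_type_ids):
    """Hash the statement type IDs that remain after deduplication."""
    remaining = dedup_keep_order(statement_type_ids)
    joined = "|".join(remaining)  # the separator is an assumption
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

With deduplication, a transaction that runs the same delete statement type twice and one that runs it once, each followed by the same update, map to the same transaction type ID.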
Optionally, on the basis of the foregoing embodiments, in some embodiments of the present disclosure, determining a corresponding session type identifier according to a transaction type identifier belonging to the same session identifier in step S102 may specifically include the following steps:
traversing the session identifier and the transaction identifier related to each SQL statement to acquire M transaction type identifiers belonging to the same session identifier, where M is a natural number greater than or equal to 2;
removing duplicate transaction type identifiers from the M transaction type identifiers belonging to the same session identifier to obtain the remaining N transaction type identifiers; and
performing hash calculation based on the remaining N transaction type identifiers, and taking the obtained hash value as the session type identifier.
Specifically, referring to table 1 above, a session number and a transaction number associated with each SQL statement are traversed to obtain M transaction type identifiers, such as transaction type IDs, belonging to the same session number. For example, the transaction type IDs attributed to each of the three transactions (e.g., transaction 4, transaction 5, and transaction 6) to which session 2 belongs are obtained. Meanwhile, the transaction type IDs of the 3 transactions (e.g., transaction 1, transaction 2, and transaction 3) to which the session 1 belongs may also be obtained.
Then, the duplicate transaction type IDs among the M transaction type IDs belonging to the same session number may be removed to obtain the remaining N transaction type IDs. For example, among the transaction type IDs of the three transactions belonging to session 2 (transaction 4, transaction 5, and transaction 6), two may be the same, such as those of transaction 4 and transaction 5, and the duplicate is removed. Hash calculation may then be performed based on the transaction type IDs of the remaining transactions, for example transaction 5 and transaction 6, and the obtained hash value is used as the session type identifier (session type ID) of session 2. The session type IDs of other sessions, such as session 1, are determined in the same manner, which is not described herein again.
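The traversal and grouping by session can be sketched as below. The record layout (session number, transaction number, transaction type ID), the function name `session_type_ids` and the SHA-256 hash over sorted identifiers are assumptions made for illustration, not details given in the description.

```python
import hashlib
from collections import defaultdict


def session_type_ids(rows):
    """rows: iterable of (session_id, transaction_id, transaction_type_id)."""
    per_session = defaultdict(set)
    for session_id, _txn_id, txn_type_id in rows:
        # A set keeps each transaction type ID once per session (M -> N).
        per_session[session_id].add(txn_type_id)

    result = {}
    for session_id, txn_types in per_session.items():
        joined = ",".join(sorted(txn_types))  # sorted: assumed ordering
        result[session_id] = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return result


# Session 2 owns transactions 4, 5 and 6; transactions 4 and 5 share a type.
rows = [
    (2, 4, "txn_type_x"),
    (2, 5, "txn_type_x"),
    (2, 6, "txn_type_y"),
    (1, 1, "txn_type_x"),
    (1, 2, "txn_type_y"),
    (1, 3, "txn_type_z"),
]
types_by_session = session_type_ids(rows)
```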
Further optionally, on the basis of the above embodiments, in some embodiments of the present disclosure, the method may further include the following steps:
comparing, one by one, the repetition proportion of the remaining N transaction type identifiers belonging to two adjacent session identifiers, and determining that the sessions represented by the two session identifiers are the same session when the repetition proportion is greater than a preset proportion threshold, where the preset proportion threshold is more than 80%;
removing duplicate transaction type identifiers from the remaining 2N transaction type identifiers belonging to the two session identifiers to obtain the remaining P transaction type identifiers; and
performing hash calculation based on the remaining P transaction type identifiers, and taking the obtained hash value as the session type identifier of the same session.
Specifically, reducing the amount of data to be processed improves the overall efficiency of database load characteristic processing. In this embodiment, when two sessions are of the same type, they may be merged before the hash calculation is performed. Continuing with the above example, the repetition proportion of the remaining N transaction type IDs belonging to two adjacent session numbers, that is, the similarity of their transaction type IDs, is compared one by one, and when the repetition proportion is greater than 80%, the sessions represented by the two session numbers are determined to be the same session. For example, assume that after deduplication session 2 has 10 remaining transaction type IDs and session 1 has 11. If comparison shows that 9 of the 10 transaction type IDs belonging to session 2 are the same as 9 of the 11 transaction type IDs belonging to session 1, that is, 90% of the transaction type IDs of session 2 are repeated, the session types of session 1 and session 2 may be considered the same. In that case, the 9 duplicate transaction type IDs among the 21 transaction type IDs belonging to session 1 and session 2 are removed to obtain the remaining 12 transaction type IDs, hash calculation is performed based on these 12 transaction type IDs, and the obtained hash value is used as the session type ID of the same session.
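The merge decision in this example can be sketched as follows. The description does not define how the repetition proportion is computed; this sketch assumes it is the number of shared transaction type IDs divided by the size of the smaller set, which gives 9/10 = 90% for the example above. The names `repetition_ratio` and `merge_if_same_type` and the SHA-256 hash are likewise assumptions.

```python
import hashlib

THRESHOLD = 0.8  # preset proportion threshold taken from the text


def repetition_ratio(a, b):
    """Shared IDs over the size of the smaller set (assumed definition)."""
    return len(a & b) / min(len(a), len(b))


def merge_if_same_type(a, b, threshold=THRESHOLD):
    """Return (merged IDs, session type ID), or None if the sessions differ."""
    if not a or not b or repetition_ratio(a, b) <= threshold:
        return None
    merged = a | b  # the union drops the duplicate transaction type IDs
    digest = hashlib.sha256(",".join(sorted(merged)).encode("utf-8")).hexdigest()
    return merged, digest


# Session 2: 10 deduplicated IDs; session 1: 11 IDs, 9 of them shared.
session2 = {f"t{i}" for i in range(10)}
session1 = {f"t{i}" for i in range(1, 10)} | {"u1", "u2"}
merged_ids, merged_session_type = merge_if_same_type(session1, session2)
```

Here the 21 identifiers of the two sessions collapse to 12 distinct ones before hashing, matching the arithmetic in the example.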
Optionally, on the basis of any one of the above embodiments, in some embodiments of the present disclosure, the method may further include the following steps:
removing duplicate statement type identifiers from the statement type identifiers of all SQL statements to obtain the remaining statement type identifiers; and
acquiring, based on the remaining statement type identifiers, table information related to the corresponding SQL statements, and acquiring data characteristic information based on the table information.
The data characteristic information may include, but is not limited to, one or more of a table name, a table capacity, a number of attributes, a number of pages, a number of tuples, and statistical information.
Specifically, the statement type IDs of all SQL statements 1 to 10 belonging to session 1 and session 2, as determined in the above embodiment, may contain duplicates, that is, two or more SQL statements may be of the same type. In this case, the duplicate statement type IDs among the statement type IDs of the 10 SQL statements may be removed to obtain the remaining statement type IDs. Table information related to the corresponding SQL statements may then be obtained based on the remaining statement type IDs, and data characteristic information, such as table name, table capacity, number of attributes, number of pages, number of tuples, and statistical information, may be obtained based on the table information.
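One way to picture this step is the sketch below, which deduplicates statement type IDs while collecting the tables each type touches, and then looks up per-table features. The `catalog` dictionary is a hypothetical stand-in for the database's system tables, and the function names and feature keys are illustrative assumptions.

```python
def tables_by_statement_type(statements):
    """statements: iterable of (statement_type_id, [table_name, ...])."""
    tables = {}
    for stmt_type_id, table_names in statements:
        # A repeated statement type ID maps onto the same entry,
        # so each type is kept once.
        tables.setdefault(stmt_type_id, set()).update(table_names)
    return tables


def data_features(tables, catalog):
    """Look up features for every table referenced by any statement type."""
    names = set().union(*tables.values())
    return {name: catalog[name] for name in sorted(names) if name in catalog}


statements = [
    ("stmt_type_a", ["orders"]),
    ("stmt_type_a", ["orders"]),           # duplicate statement type
    ("stmt_type_b", ["orders", "users"]),  # one statement, two tables
]
catalog = {  # hypothetical values standing in for system-table lookups
    "orders": {"attributes": 5, "pages": 12, "tuples": 1000},
    "users": {"attributes": 3, "pages": 2, "tuples": 150},
}
tables = tables_by_statement_type(statements)
features = data_features(tables, catalog)
```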
Further optionally, after the data feature information is obtained, in some embodiments of the present disclosure the structured data relationship model may further include data feature information attributed to the statement feature information. As shown in FIG. 2, the structured data relationship model may be used to store session feature information for each session type (session type 1, session type 2 … session type N), transaction feature information attributed to the session feature information, statement feature information attributed to the transaction feature information, and data feature information attributed to the statement feature information. The feature information of each level in the structured data relationship model is stored with the corresponding session type ID, transaction type ID and statement type ID as index keys, so that the associated feature information of each session type (e.g., session type 1, session type 2 … session type N) can be stored. A structured data relationship model built with the added data feature dimension can depict the load characteristics of the database accurately and comprehensively, which facilitates subsequent load characteristic data analysis and provides a database administrator with a more accurate and comprehensive reference for configuring the database so that it works efficiently.
In some embodiments of the present disclosure, the session characteristic information may include, but is not limited to, one or more of a number of sessions, a session elapsed time, a number of transaction types, and a total number of transactions. The transaction characteristic information may include, but is not limited to, one or more of the number of times the transaction is executed, the time consumed by the transaction, and the sequence of executed statements belonging to the transaction. The statement characteristic information may include, but is not limited to, one or more of the number of times the statement is executed, the time consumed by the statement, and the content of the statement. Some information of an upper level can be obtained by summarizing the related information of the level below it.
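The three-level structure and the idea that upper-level values are summaries of lower-level values can be sketched with nested dictionaries keyed by type ID. The class and field names below are illustrative assumptions; the description does not prescribe a concrete data layout.

```python
from dataclasses import dataclass, field


@dataclass
class StatementFeatures:
    content: str
    executions: int = 0
    elapsed_ms: float = 0.0


@dataclass
class TransactionFeatures:
    executions: int = 0
    # statement type ID -> StatementFeatures
    statements: dict = field(default_factory=dict)

    @property
    def elapsed_ms(self):
        # Transaction time is summarized from the statement level.
        return sum(s.elapsed_ms for s in self.statements.values())


@dataclass
class SessionFeatures:
    session_count: int = 0
    # transaction type ID -> TransactionFeatures
    transactions: dict = field(default_factory=dict)

    @property
    def transaction_type_count(self):
        return len(self.transactions)

    @property
    def total_transactions(self):
        return sum(t.executions for t in self.transactions.values())

    @property
    def elapsed_ms(self):
        return sum(t.elapsed_ms for t in self.transactions.values())


# Model indexed by session type ID, then transaction and statement type ID.
model = {
    "S1": SessionFeatures(session_count=2, transactions={
        "T1": TransactionFeatures(executions=2, statements={
            "Q1": StatementFeatures("select * from t where id > ?", 2, 4.5),
            "Q2": StatementFeatures("delete from t where id = ?", 2, 6.5),
        }),
        "T2": TransactionFeatures(executions=1, statements={
            "Q1": StatementFeatures("select * from t where id > ?", 1, 1.0),
        }),
    }),
}
```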
On the basis of any of the foregoing embodiments, in some embodiments of the present disclosure, in order to facilitate processing load characteristic data, before determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement, the following steps are further included:
writing the obtained statement information, transaction identifier and session identifier related to each SQL statement into a log file as records in a preset file format;
importing the log file into the database in the form of an external table, and converting the external table into a table of the database's own engine; and
adding to the converted table three new attribute columns for recording the determined session type identifier, transaction type identifier and statement type identifier.
Specifically, in this embodiment the minimum collection granularity is the statement level, so the granularity of the originally collected information should not be coarser than the statement level. One piece of original information, such as the statement information, the transaction number and the session number, may therefore be recorded for each SQL statement. Because the data to be collected occupies a large amount of space, the original information is recorded in a log file rather than in memory. In plain terms, a log system records the session number, the transaction number, the statement content, the time consumed by statement execution and the table information related to the statement into a log file in a preset file format. The preset file format may specifically be: the attributes are separated by, for example, the character string "| | |", and the tables related to a statement are separated by, for example, the character string "@ @ @ @ @ @ @". This facilitates subsequent data preprocessing and improves processing efficiency.
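A record in such a delimited format could be written and read back roughly as shown below. The exact delimiter strings are not recoverable with certainty from the text, so `ATTR_SEP = "|||"` and `TABLE_SEP = "@@@"` are assumptions, as are the field order and the function names.

```python
ATTR_SEP = "|||"   # assumed attribute delimiter
TABLE_SEP = "@@@"  # assumed delimiter between tables of one statement


def format_record(session_id, txn_id, sql, elapsed_ms, tables):
    """Serialize one per-statement record as a single log line."""
    fields = [str(session_id), str(txn_id), sql, str(elapsed_ms),
              TABLE_SEP.join(tables)]
    return ATTR_SEP.join(fields)


def parse_record(line):
    """Parse a log line; fails if the SQL text itself contains ATTR_SEP."""
    session_id, txn_id, sql, elapsed_ms, tables = line.rstrip("\n").split(ATTR_SEP)
    return {
        "session_id": int(session_id),
        "txn_id": int(txn_id),
        "sql": sql,
        "elapsed_ms": float(elapsed_ms),
        "tables": tables.split(TABLE_SEP) if tables else [],
    }


line = format_record(1, 3, "select * from orders o join users u on o.uid = u.id",
                     1.5, ["orders", "users"])
record = parse_record(line)
```

A multi-character delimiter is less likely to collide with SQL text than a single character, which may be why the format uses repeated symbols; a production format would still need an escaping rule.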
In this embodiment, the log file is introduced into the database in the form of an external table, and the external table is converted into a table of an engine of the database itself, so that the database performs subsequent data processing.
Finally, three new attribute columns for recording the determined session type identifier, transaction type identifier and statement type identifier are added to the converted table. The session type ID, transaction type ID and statement type ID determined in the above embodiments can then be written into the table to construct the final structured data relationship model.
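The column-adding step can be illustrated with an in-memory SQLite database standing in for the target database. SQLite, the column names and the sample rows are assumptions for illustration only; the external-table import is not modelled here, and the table name sys_workload_queries is borrowed from the specific embodiment in this description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the table converted from the external log table.
conn.execute(
    "CREATE TABLE sys_workload_queries ("
    "session_id INTEGER, txn_id INTEGER, sql_text TEXT, elapsed_ms REAL)"
)
conn.executemany(
    "INSERT INTO sys_workload_queries VALUES (?, ?, ?, ?)",
    [(1, 1, "select * from t where id > 5", 2.0),
     (1, 1, "delete from t where id = 7", 3.0)],
)

# Add the three attribute columns that will hold the computed type IDs.
for column in ("session_type_id", "txn_type_id", "stmt_type_id"):
    conn.execute(f"ALTER TABLE sys_workload_queries ADD COLUMN {column} TEXT")

# Later preprocessing steps fill the new columns in place.
conn.execute(
    "UPDATE sys_workload_queries SET stmt_type_id = ? "
    "WHERE sql_text LIKE 'select%'",
    ("stmt_type_a",),
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(sys_workload_queries)")]
filled = conn.execute(
    "SELECT stmt_type_id FROM sys_workload_queries ORDER BY elapsed_ms"
).fetchall()
```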
On the basis of the above embodiments, some embodiments of the present disclosure may further include the following steps:
acquiring the session type identifier of a session to be analyzed; and
querying, based on the session type identifier of the session to be analyzed and the structured data relationship model, one or more of the session characteristic information, transaction characteristic information, statement characteristic information and data characteristic information to which the session to be analyzed belongs.
Specifically, for example, the session type ID of the session to be analyzed is obtained, and the session feature information, transaction feature information, statement feature information and data feature information to which the session belongs are obtained by querying the structured data relationship model with that session type ID. More comprehensive and accurate database load characteristic information can thus be extracted from the structured data relationship model, which facilitates subsequent load characteristic data analysis and provides a database administrator with a more accurate and comprehensive reference for configuring the database so that it works efficiently.
The technical solution of the embodiment of the present disclosure is described below with reference to a specific embodiment shown in fig. 3. The specific embodiment mainly comprises the following three processing flows:
1) A data acquisition flow:
Step one: the load characteristic collection flow is started through the function workload_capture_start.
Step two: the upper-layer application sends a data request to the database.
Step three: the database responds to the data request, and each time an SQL statement is executed, the related session number, transaction number, statement content, statement execution time, table information related to the statement and the like are recorded into a log file through the log system. The attributes are separated by the character string "| | |", and the sub-attributes of the table information related to the statement are separated by the character string "@ @ @".
Step four: the acquisition flow is ended through the function workload_capture_stop.
2) A preprocessing flow:
Step one: the log file is imported into the database in the form of the external table sys_log.
Step two: the external table is dumped into a table of the database's own engine, three attribute columns (session type ID, transaction type ID and statement type ID) are added to the table, and the table is named sys_workload_queries.
Step three: for the record of each SQL statement in the history table sys_workload_queries, the statement content is parameterized. Specifically, the constant part of the statement content is replaced with "?"; for example, "where id >5" is replaced with "where id >?". HASH calculation is then performed on the parameterized content, and the obtained value is updated into the table sys_workload_queries as the statement type ID.
Step four: for every transaction number in the history table sys_workload_queries, the statement type IDs of the statements belonging to the same transaction number are deduplicated and combined, HASH calculation is performed on the result, and the obtained HASH value is updated into the table sys_workload_queries as the transaction type ID.
Step five: for the session numbers in the history table sys_workload_queries, whether two sessions are the same session is judged by comparing, one by one, whether the repetition proportion of the deduplicated transaction type IDs contained in the two sessions is greater than 80%; if so, the transaction type IDs of the two sessions are deduplicated and combined and then subjected to HASH calculation, and the obtained HASH value is updated into the table as the session type ID.
Step six: a table sys_workload_tables is created, whose column names include a table name, a table ID and a statement type ID. The statement type IDs in the table sys_workload_queries are deduplicated before being recorded.
Step seven: the deduplicated statement type IDs from the table sys_workload_queries are recorded into the table sys_workload_tables. If one SQL statement involves a plurality of tables, it is split into a plurality of records, and the table IDs of those tables are looked up in a system table and updated into sys_workload_tables.
Step eight: the preprocessing flow ends.
It can be understood that after the preprocessing flow is finished, the structured data relationship model is also established.
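Step three of the preprocessing flow (parameterization followed by hashing) can be approximated with regular expressions, as in the sketch below. A real implementation would more likely work on the parsed statement; the regex patterns, the function names and the SHA-256 hash are assumptions used only to illustrate the idea.

```python
import hashlib
import re

# String literals such as 'bob' or 'it''s' (doubled quote = escaped quote).
_STRING = re.compile(r"'(?:[^']|'')*'")
# Standalone numeric literals; digits inside identifiers (t1) are untouched.
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace constant parts of a statement with '?'."""
    sql = _STRING.sub("?", sql)  # strings first, so digits inside them vanish
    return _NUMBER.sub("?", sql)


def statement_type_id(sql):
    """Hash the parameterized text to obtain a statement type ID."""
    return hashlib.sha256(parameterize(sql).encode("utf-8")).hexdigest()


example = parameterize("select * from t1 where id >5")
same_type = (
    statement_type_id("select * from t1 where id >5")
    == statement_type_id("select * from t1 where id >7")
)
```

Statements that differ only in their constants therefore receive the same statement type ID, which is what allows them to be grouped in the later flows.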
3) A feature extraction flow:
Step one: the feature extraction flow is entered through the function workload_analyze_report.
Step two: with one or more session type IDs as the grouping condition, the number of sessions, the session elapsed time, the number of transaction types and the number of transactions are aggregated to obtain a session-level information result.
Step three: for each session, with the session type ID as the filtering condition, the records are grouped by transaction type ID, and the execution count, execution time, executed statement sequence and the like are aggregated to obtain a transaction-level information result.
Step four: for each session, with the session type ID as the filtering condition, the records are grouped by statement type ID, and the execution count and execution time are aggregated to obtain a statement-level information result.
Step five: for each session, with the session type ID as the filtering condition, all tables related to the session are obtained by joining sys_workload_queries and sys_workload_tables on the statement type ID, and data characteristic information such as table name, table capacity, number of attributes, number of pages, number of tuples and statistical information is then obtained by querying a system table.
Step six: a feature report is formed based on all the extracted load feature information, and the feature extraction flow ends.
In this embodiment, the session type IDs are traversed, transaction-level and statement-level information is gathered for each session type ID under the condition of the transaction type ID and the statement type ID, all tables related to the session are found by using the statement type ID corresponding to the session type ID, and the data characteristic information is obtained by searching the system table.
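The session-level and statement-level aggregations of the feature extraction flow can be sketched with grouped queries. The example below uses an in-memory SQLite database as a stand-in; the column names and sample rows are hypothetical, and only the table name sys_workload_queries comes from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sys_workload_queries (session_id INTEGER, txn_id INTEGER, "
    "session_type_id TEXT, txn_type_id TEXT, stmt_type_id TEXT, elapsed_ms REAL)"
)
conn.executemany(
    "INSERT INTO sys_workload_queries VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, "S1", "T1", "Q1", 2.0), (1, 1, "S1", "T1", "Q2", 3.0),
     (1, 2, "S1", "T2", "Q1", 1.0), (2, 3, "S1", "T1", "Q1", 2.5),
     (2, 3, "S1", "T1", "Q2", 3.5), (3, 4, "S2", "T3", "Q3", 4.0)],
)

# Session level: group by session type ID.
session_level = conn.execute(
    "SELECT session_type_id, COUNT(DISTINCT session_id), SUM(elapsed_ms), "
    "COUNT(DISTINCT txn_type_id), COUNT(DISTINCT txn_id) "
    "FROM sys_workload_queries GROUP BY session_type_id ORDER BY session_type_id"
).fetchall()

# Statement level: filter by one session type ID, group by statement type ID.
statement_level = conn.execute(
    "SELECT stmt_type_id, COUNT(*), SUM(elapsed_ms) FROM sys_workload_queries "
    "WHERE session_type_id = ? GROUP BY stmt_type_id ORDER BY stmt_type_id",
    ("S1",),
).fetchall()
```

The transaction-level result follows the same pattern with the transaction type ID as the grouping column.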
It should be noted that although the various steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that these steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc. Additionally, it will also be readily appreciated that the steps may be performed synchronously or asynchronously, e.g., among multiple modules/processes/threads.
Based on the same concept, an embodiment of the present disclosure provides a database load characteristic processing apparatus. As shown in FIG. 4, the database load characteristic processing apparatus 40 may include a data acquisition module 401, a data preprocessing module 402 and a model building module 403. The data acquisition module 401 is configured to acquire and record statement information, a transaction identifier and a session identifier related to each SQL statement when the execution of each SQL statement is completed; the transaction identifier represents the transaction to which each SQL statement belongs, and the session identifier represents the session to which that transaction belongs. The data preprocessing module 402 is configured to determine a statement type identifier of each SQL statement according to the statement information related to each SQL statement; determine a corresponding transaction type identifier according to the statement type identifiers of the SQL statements belonging to the same transaction identifier; and determine a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier. The model building module 403 is configured to build a structured data relationship model based on all the determined session type identifiers, transaction type identifiers and statement type identifiers; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information, and statement characteristic information attributed to the transaction characteristic information.
The database load characteristic processing apparatus of the embodiment of the present disclosure focuses on information that reflects the operating characteristics of the database, such as session types, transaction types and statement types, and establishes, based on this information, a structured data relationship model reflecting the database load characteristic information. The model can depict the load characteristics of the database accurately and comprehensively, which facilitates subsequent load characteristic data analysis and provides a database administrator with an accurate and comprehensive reference for configuring the database so that it works efficiently.
In some embodiments of the present disclosure, the data collection module 401 collects and records statement information, a transaction identifier, and a session identifier related to each SQL statement when the execution of each SQL statement is completed, and specifically may include: when the execution of each SQL statement is completed, obtaining statement information, transaction identification and session identification related to each SQL statement through a log system; wherein the statement information at least comprises one or more of statement content, statement execution time consumption and table information related to the statement.
In some embodiments of the present disclosure, the data preprocessing module 402 determines the statement type identifier of each SQL statement according to the statement information related to each SQL statement, which may specifically include: parameterizing the statement content of each SQL statement, performing hash calculation on the parameterized content, and using the obtained hash value as the statement type identifier.
In some embodiments of the present disclosure, the data preprocessing module 402 parameterizes the statement content of each SQL statement, which may specifically include: replacing the constant part of the statement content of each SQL statement with a preset character.
In some embodiments of the present disclosure, the data preprocessing module 402 determines a corresponding transaction type identifier according to a statement type identifier of an SQL statement belonging to the same transaction identifier, which may specifically include: traversing the transaction identifier related to each SQL statement to determine the SQL statements belonging to the same transaction identifier; and obtaining statement type identifications of SQL statements belonging to the same transaction identification, carrying out hash calculation on the statement type identifications, and taking obtained hash values as the transaction type identifications.
In some embodiments of the present disclosure, the data preprocessing module 402 is further configured to: when a plurality of SQL statements belong to the same transaction identifier, acquire the plurality of statement type identifiers corresponding to those SQL statements; remove duplicate statement type identifiers from the plurality of statement type identifiers; and perform hash calculation based on the remaining statement type identifiers, taking the obtained hash value as the transaction type identifier.
In some embodiments of the present disclosure, the data preprocessing module 402 determines, according to the transaction type identifier belonging to the same session identifier, a corresponding session type identifier, which may specifically include: traversing the session identification and the transaction identification related to each SQL statement to acquire M transaction type identifications belonging to the same session identification; m is a natural number greater than or equal to 2; removing repeated transaction type identifications in the M transaction type identifications to which the same session identification belongs to obtain the rest N transaction type identifications; and performing hash calculation based on the rest N transaction type identifications, and taking the obtained hash value as the session type identification.
In some embodiments of the present disclosure, the data preprocessing module 402 is further configured to compare repetition ratios of the remaining N transaction type identifiers to which the two adjacent session identifiers belong one by one, and when the repetition ratio is greater than a preset ratio threshold, determine that the sessions represented by the two session identifiers are the same session; the preset proportion threshold is more than 80%; removing repeated transaction type identifiers in the remaining 2N transaction type identifiers to which the two session identifiers belong to obtain remaining P transaction type identifiers; and performing hash calculation based on the remaining P transaction type identifications, and taking the obtained hash value as the session type identification of the same session.
In some embodiments of the present disclosure, the apparatus may further include a data feature obtaining module, configured to remove duplicate statement type identifiers from the statement type identifiers of all SQL statements to obtain the remaining statement type identifiers; and to acquire, based on the remaining statement type identifiers, table information related to the corresponding SQL statements, and acquire data characteristic information based on the table information. The data characteristic information includes one or more of a table name, a table capacity, a number of attributes, a number of pages, a number of tuples and statistical information.
Optionally, in some embodiments of the present disclosure, the structured data relationship model may further include data feature information attributed to the statement feature information. The session characteristic information may include, but is not limited to, one or more of a number of sessions, a session elapsed time, a number of transaction types, and a total number of transactions; the transaction characteristic information may include, but is not limited to, one or more of the number of times the transaction is executed, the time consumed by the transaction, and the sequence of executed statements belonging to the transaction; the statement characteristic information may include, but is not limited to, one or more of the number of times the statement is executed, the time consumed by the statement, and the content of the statement.
In some embodiments of the present disclosure, the apparatus may further include an information conversion processing module, configured to, before the statement type identifier of each SQL statement is determined according to the statement information related to each SQL statement, write the obtained statement information, transaction identifier and session identifier related to each SQL statement into a log file as records in a preset file format; import the log file into the database in the form of an external table, and convert the external table into a table of the database's own engine; and add to the converted table three new attribute columns for recording the determined session type identifier, transaction type identifier and statement type identifier.
In some embodiments of the present disclosure, a feature extraction module may further be included, configured to obtain a session type identifier of a session to be analyzed; and inquiring and acquiring one or more of session characteristic information, transaction characteristic information, statement characteristic information and data characteristic information which the session to be analyzed belongs to based on the session type identification of the session to be analyzed and the structured data relationship model.
For the apparatus in the above embodiments, the specific manner in which each module performs its operations, and the corresponding technical effects, have been described in detail in the embodiments of the method and are not described in detail here.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units. The components shown as modules or units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present disclosure. One of ordinary skill in the art can understand and implement it without inventive effort.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the database load feature processing method according to any one of the embodiments.
By way of example, and not limitation, such readable storage media can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Embodiments of the present disclosure also provide an electronic device, such as a database server, including a processor and a memory, where the memory is used to store executable instructions of the processor. The processor is configured to perform, by executing the executable instructions, the steps of the database load characteristic processing method in any of the above embodiments.
An electronic device 600 according to this embodiment of the invention is described below with reference to FIG. 5. The electronic device 600 shown in FIG. 5 is only an example and should not impose any limitation on the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 5, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
The storage unit 620 stores program code that is executable by the processing unit 610 and causes the processing unit 610 to perform the steps according to the various exemplary embodiments of the present invention described in the database load characteristic processing method section above. For example, the processing unit 610 may perform the steps of the method shown in fig. 1.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and which includes several instructions that enable a computing device (which may be a personal computer, a server, a network device, etc.) to execute the above-mentioned database load characteristic processing method according to the embodiments of the present disclosure.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A database load characteristic processing method is characterized by comprising the following steps:
acquiring and recording statement information, transaction identification and session identification related to each SQL statement when the execution of each SQL statement is completed; the transaction identifier represents a transaction to which each SQL statement belongs, and the session identifier represents a session to which the transaction to which each SQL statement belongs;
determining statement type identification of each SQL statement according to statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; determining a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier;
establishing a structured data relation model based on all the determined session type identifications, transaction type identifications and statement type identifications; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information and statement characteristic information attributed to the transaction characteristic information.
2. The database load characteristic processing method according to claim 1, wherein the acquiring and recording of statement information, a transaction identifier and a session identifier related to each SQL statement when the execution of each SQL statement is completed comprises:
when the execution of each SQL statement is completed, obtaining statement information, transaction identification and session identification related to each SQL statement through a log system;
wherein the statement information at least comprises one or more of statement content, statement execution time consumption and table information related to the statement.
3. The database load characteristic processing method according to claim 2, wherein the determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement comprises:
parameterizing the statement content of each SQL statement, performing hash calculation on the parameterized content, and using the obtained hash value as the statement type identifier.
4. The database load characteristic processing method according to claim 3, wherein the parameterizing of the statement content of each SQL statement comprises:
replacing the constant part of the statement content of each SQL statement with preset characters.
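By way of illustration only, the parameterization and hashing of claims 3 and 4 might be sketched as follows. The placeholder character `?`, the regular expressions used to find constants, the whitespace normalization, and the choice of SHA-256 are all assumptions made for this sketch; the claims only require preset characters and a hash calculation.

```python
import hashlib
import re

# Hypothetical preset character that stands in for every constant.
PLACEHOLDER = "?"

# Quoted string literals ('' is an escaped quote) and standalone numeric literals.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMERIC_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def parameterize(sql: str) -> str:
    """Replace the constant parts of a statement with the preset placeholder."""
    sql = _STRING_LITERAL.sub(PLACEHOLDER, sql)
    sql = _NUMERIC_LITERAL.sub(PLACEHOLDER, sql)
    # Collapse whitespace so that formatting differences do not change the hash.
    return " ".join(sql.split())


def statement_type_id(sql: str) -> str:
    """Hash the parameterized text; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(parameterize(sql).encode("utf-8")).hexdigest()
```

Under these assumptions, two statements that differ only in their constants map to the same statement type identifier, while identifiers such as `t1` are left untouched because the digit is not a standalone token.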
5. The database load characteristic processing method according to claim 1, wherein determining the corresponding transaction type identifier according to the statement type identifiers of the SQL statements belonging to the same transaction identifier comprises:
traversing the transaction identifier related to each SQL statement to determine the SQL statements belonging to the same transaction identifier;
and obtaining statement type identifications of SQL statements belonging to the same transaction identification, performing hash calculation on the obtained statement type identifications, and taking the obtained hash values as the transaction type identifications.
6. The database load characteristic processing method of claim 5, further comprising:
when a plurality of SQL statements belonging to the same transaction identifier exist, acquiring a plurality of statement type identifiers corresponding to the plurality of SQL statements belonging to the same transaction identifier;
removing repeated statement type identifiers from the plurality of statement type identifiers;
and performing hash calculation based on the statement type identifiers remaining after the removal, and taking the obtained hash value as the transaction type identifier.
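By way of illustration only, the grouping, de-duplication and hashing of claims 5 and 6 might be sketched as follows. The record layout, the `|` separator, the use of SHA-256, and the decision to keep identifiers in first-seen order are assumptions of this sketch; the claims do not fix an ordering.

```python
import hashlib
from collections import defaultdict


def transaction_type_ids(records):
    """Map each transaction identifier to a transaction type identifier.

    records: iterable of (transaction_id, statement_type_id) pairs in
    execution order (a hypothetical layout for this sketch).
    """
    by_txn = defaultdict(list)
    for txn_id, stmt_type in records:
        # Drop repeated statement type identifiers, keeping first-seen order.
        if stmt_type not in by_txn[txn_id]:
            by_txn[txn_id].append(stmt_type)
    return {
        txn_id: hashlib.sha256("|".join(types).encode("utf-8")).hexdigest()
        for txn_id, types in by_txn.items()
    }
```

With this choice, two transactions that run the same statement types in the same first-seen order receive the same transaction type identifier regardless of how often each statement repeats.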
7. The database load characteristic processing method according to claim 1, wherein determining the corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier comprises:
traversing the session identifier and the transaction identifier related to each SQL statement to acquire M transaction type identifiers belonging to the same session identifier, where M is a natural number greater than or equal to 2;
removing repeated transaction type identifiers from the M transaction type identifiers belonging to the same session identifier to obtain the remaining N transaction type identifiers;
and performing hash calculation based on the remaining N transaction type identifiers, and taking the obtained hash value as the session type identifier.
8. The database load characteristic processing method of claim 7, further comprising:
comparing, one pair at a time, the repetition proportion between the remaining N transaction type identifiers of two adjacent session identifiers, and determining that the sessions represented by the two session identifiers are the same session when the repetition proportion is greater than a preset proportion threshold, the preset proportion threshold being greater than 80%;
removing repeated transaction type identifiers from the 2N remaining transaction type identifiers belonging to the two session identifiers to obtain P remaining transaction type identifiers;
and performing hash calculation based on the remaining P transaction type identifications, and taking the obtained hash value as the session type identification of the same session.
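By way of illustration only, the session-level hashing of claim 7 and the merging of adjacent sessions in claim 8 might be sketched as follows. The claims do not define how the repetition proportion is computed, so this sketch assumes it is the number of shared transaction type identifiers divided by the size of the larger set. The default threshold of 0.85, the sorted `|`-joined hashing, and the comparison of each session against the already merged group (rather than only its immediate neighbor) are likewise assumptions.

```python
import hashlib


def hash_ids(type_ids):
    """Hash a de-duplicated, sorted collection of type identifiers."""
    joined = "|".join(sorted(set(type_ids)))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def repetition_ratio(a, b):
    """Assumed definition: shared identifiers divided by the larger set size."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / max(len(a), len(b))


def session_type_ids(sessions, threshold=0.85):
    """sessions: list of (session_id, [transaction_type_id, ...]) in adjacent order."""
    groups = []  # each entry: (member session ids, merged set of transaction types)
    for session_id, txn_types in sessions:
        current = set(txn_types)
        if groups and repetition_ratio(groups[-1][1], current) > threshold:
            # Treated as the same session type: merge and de-duplicate.
            groups[-1][0].append(session_id)
            groups[-1][1].update(current)
        else:
            groups.append(([session_id], current))
    return {sid: hash_ids(types) for members, types in groups for sid in members}
```

Sorting before hashing makes the session type identifier independent of the order in which transactions appeared, which is a design choice of this sketch rather than a requirement stated in the claims.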
9. The database load characteristic processing method of claim 1, further comprising:
removing repeated statement type identifications in statement type identifications of all SQL statements to obtain remaining statement type identifications;
obtaining table information related to the SQL statement based on the residual statement type identification, and obtaining data characteristic information based on the table information;
the data characteristic information at least comprises one or more of table name, table capacity, attribute number, page number, tuple number and statistical information.
10. The database load characteristic processing method according to claim 9, wherein the structured data relationship model further includes data characteristic information attributed to the statement characteristic information; the session characteristic information comprises one or more of session number, session time consumption, transaction type number and transaction total number; the transaction characteristic information comprises one or more of transaction execution times, transaction execution time consumption and the executed statement sequence belonging to the transaction; the statement characteristic information comprises one or more of statement execution times, statement execution time consumption and statement content.
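By way of illustration only, the statement characteristic information named in claim 10 (execution times, execution time consumption, statement content) might be accumulated per statement type as follows. The `StatementFeature` class, its field names, and the record layout are hypothetical; transaction and session characteristic information could be aggregated in the same manner one level up.

```python
from dataclasses import dataclass


@dataclass
class StatementFeature:
    """Hypothetical container for the statement characteristic information."""
    content: str            # parameterized statement content
    executions: int = 0     # statement execution times
    total_ms: float = 0.0   # statement execution time consumption


def statement_features(records):
    """records: iterable of (statement_type_id, parameterized_content, elapsed_ms)."""
    features = {}
    for type_id, content, elapsed_ms in records:
        feature = features.setdefault(type_id, StatementFeature(content))
        feature.executions += 1
        feature.total_ms += elapsed_ms
    return features
```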
11. The database load characteristic processing method according to any one of claims 1 to 10, further comprising, before determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement:
writing the obtained statement information, the transaction identifier and the session identifier related to each SQL statement into a log file in a preset file format;
introducing the log file into a database in the form of an external table, and converting the external table into a table of an engine of the database;
adding to the converted table three new attribute columns for recording the determined session type identifier, transaction type identifier and statement type identifier.
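By way of illustration only, the preprocessing of claim 11 might be sketched as follows. SQLite from the Python standard library stands in for the database, and because SQLite has no external-table mechanism, the external-table step is replaced here by a plain CSV load into an ordinary table; this is a substitution made for the sketch, not the claimed mechanism. The CSV layout, the table name `sql_log` and the column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical log format: one CSV row per completed SQL statement.
LOG_TEXT = """session_id,transaction_id,statement,elapsed_ms
s1,t1,SELECT * FROM a WHERE id = 1,3
s1,t1,UPDATE a SET v = 2 WHERE id = 1,5
"""


def load_log(conn, log_file):
    """Load the log into a regular table and add the three type-identifier columns."""
    conn.execute(
        "CREATE TABLE sql_log (session_id TEXT, transaction_id TEXT, "
        "statement TEXT, elapsed_ms REAL)"
    )
    conn.executemany(
        "INSERT INTO sql_log VALUES "
        "(:session_id, :transaction_id, :statement, :elapsed_ms)",
        csv.DictReader(log_file),
    )
    # New attribute columns, left NULL until the type identifiers are determined.
    for column in ("session_type_id", "transaction_type_id", "statement_type_id"):
        conn.execute(f"ALTER TABLE sql_log ADD COLUMN {column} TEXT")


conn = sqlite3.connect(":memory:")
load_log(conn, io.StringIO(LOG_TEXT))
```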
12. The database load characteristic processing method of claim 10, further comprising:
acquiring a session type identifier of a session to be analyzed;
and querying and acquiring, based on the session type identifier of the session to be analyzed and the structured data relationship model, one or more of the session characteristic information, transaction characteristic information, statement characteristic information and data characteristic information belonging to the session to be analyzed.
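By way of illustration only, the lookup of claim 12 might be sketched as a join across three relational tables keyed by the type identifiers. The schema, the table and column names, and the sample rows are hypothetical, and SQLite is used only because it ships with the Python standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session_type (session_type_id TEXT PRIMARY KEY, session_count INTEGER);
CREATE TABLE transaction_type (
    transaction_type_id TEXT, session_type_id TEXT, executions INTEGER);
CREATE TABLE statement_type (
    statement_type_id TEXT, transaction_type_id TEXT, content TEXT, executions INTEGER);
INSERT INTO session_type VALUES ('S1', 4);
INSERT INTO transaction_type VALUES ('T1', 'S1', 10), ('T2', 'S1', 2);
INSERT INTO statement_type VALUES
    ('Q1', 'T1', 'SELECT * FROM a WHERE id = ?', 10),
    ('Q2', 'T2', 'UPDATE a SET v = ? WHERE id = ?', 2);
""")


def describe_session(conn, session_type_id):
    """Return session, transaction and statement features for one session type."""
    return conn.execute(
        """
        SELECT s.session_count, t.transaction_type_id, t.executions, q.content
        FROM session_type AS s
        JOIN transaction_type AS t ON t.session_type_id = s.session_type_id
        JOIN statement_type AS q ON q.transaction_type_id = t.transaction_type_id
        WHERE s.session_type_id = ?
        ORDER BY t.transaction_type_id, q.statement_type_id
        """,
        (session_type_id,),
    ).fetchall()
```

Because each level stores the identifier of the level above it, a single session type identifier is enough to reach the transactions and statements attributed to that session type.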
13. A database load characteristic processing device, characterized by comprising:
the data acquisition module is used for acquiring and recording statement information, transaction identification and session identification related to each SQL statement when the execution of each SQL statement is finished; the transaction identifier represents a transaction to which each SQL statement belongs, and the session identifier represents a session to which the transaction to which each SQL statement belongs;
the data preprocessing module is used for determining the statement type identifier of each SQL statement according to the statement information related to each SQL statement; determining corresponding transaction type identification according to the statement type identification of the SQL statement belonging to the same transaction identification; determining a corresponding session type identifier according to the transaction type identifiers belonging to the same session identifier;
the model establishing module is used for establishing a structured data relation model based on all the determined session type identifications, transaction type identifications and statement type identifications; the structured data relationship model is used for storing session characteristic information of each session type, transaction characteristic information attributed to the session characteristic information and statement characteristic information attributed to the transaction characteristic information.
14. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the database load characteristic processing method according to any one of claims 1 to 12.
15. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the steps of the database load characteristic processing method of any of claims 1 to 12 via execution of the executable instructions.
CN202010853809.4A 2020-08-24 2020-08-24 Database load characteristic processing method and device, medium and electronic equipment Active CN111984625B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010853809.4A CN111984625B (en) 2020-08-24 2020-08-24 Database load characteristic processing method and device, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN111984625A true CN111984625A (en) 2020-11-24
CN111984625B CN111984625B (en) 2023-09-15

Family

ID=73443971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010853809.4A Active CN111984625B (en) 2020-08-24 2020-08-24 Database load characteristic processing method and device, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111984625B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115914334A (en) * 2022-12-05 2023-04-04 中国工商银行股份有限公司 Method, device, equipment and medium for processing access session of database
CN116774970A (en) * 2023-08-22 2023-09-19 北京启源问天量子科技有限公司 ID generation method and device based on quantum random numbers

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140337283A1 (en) * 2013-05-09 2014-11-13 International Business Machines Corporation Comparing database performance without benchmark workloads
CN107480009A (en) * 2017-08-18 2017-12-15 北京中电普华信息技术有限公司 A kind of transaction recovery method and device
WO2018113534A1 (en) * 2016-12-20 2018-06-28 阿里巴巴集团控股有限公司 Database deadlock processing method and apparatus, and database system
CN111221869A (en) * 2018-11-27 2020-06-02 北京京东振世信息技术有限公司 Method and device for tracking database transaction time and analyzing database lock

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王伟平;王子卿;: "Oracle用户SQL会话还原方法研究", 计算机工程与应用, no. 12 *

Also Published As

Publication number Publication date
CN111984625B (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN109684352B (en) Data analysis system, data analysis method, storage medium, and electronic device
JP6697392B2 (en) Transparent discovery of semi-structured data schema
US7480643B2 (en) System and method for migrating databases
WO2017019879A1 (en) Multi-query optimization
US20080140627A1 (en) Method and apparatus for aggregating database runtime information and analyzing application performance
JP2010524060A (en) Data merging in distributed computing
JP2002244898A (en) Database managing program and database system
CN111400338A (en) SQ L optimization method, device, storage medium and computer equipment
CN111459698A (en) Database cluster fault self-healing method and device
CN109213826B (en) Data processing method and device
CN112162983A (en) Database index suggestion processing method, device, medium and electronic equipment
Cheng et al. Efficient event correlation over distributed systems
CN110795614A (en) Index automatic optimization method and device
CN110716950A (en) Method, device and equipment for establishing aperture system and computer storage medium
CN114817243A (en) Method, device and equipment for establishing database joint index and storage medium
CN111984625B (en) Database load characteristic processing method and device, medium and electronic equipment
CN106919566A (en) A kind of query statistic method and system based on mass data
CN106776704B (en) Statistical information collection method and device
CN115168389A (en) Request processing method and device
Jiadi et al. Research on Data Center Operation and Maintenance Management Based on Big Data
CN115098029A (en) Data processing method and device
CN110837508A (en) Method, device and equipment for establishing aperture system and computer storage medium
CN106776772B (en) Data retrieval method and device
JP2017010376A (en) Mart-less verification support system and mart-less verification support method
KR101638048B1 (en) Sql query processing method using mapreduce

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant