CN109800240B - SQL sentence classifying method, device, computer equipment and storage medium - Google Patents

SQL sentence classifying method, device, computer equipment and storage medium

Info

Publication number
CN109800240B
Authority
CN
China
Prior art keywords
sql
sql statement
data
classified
base table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811523456.0A
Other languages
Chinese (zh)
Other versions
CN109800240A (en)
Inventor
高中正
金海锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811523456.0A
Publication of CN109800240A
Application granted
Publication of CN109800240B
Current legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the field of data processing technologies, and in particular to a method and related devices for classifying SQL statements. The method includes: after SQL statement data are obtained, generating an SQL statement data set to be classified and establishing an SQL statement data base table to be classified; filtering out index items irrelevant to classification to obtain a corrected SQL statement data base table to be classified; extracting the SQL statement ID identifier and the object name corresponding to each SQL statement in the data set, and importing them into the corrected base table to obtain SQL statement building rows; acquiring the row ID identifier in each SQL statement building row, and splicing object names to obtain a full object name; and performing a similarity comparison between the SQL statement ID identifier and the full object name, and classifying the SQL statement data according to the comparison result to form an SQL statement classification table. By generating the full object names of SQL statements through a database view, the method and devices classify the SQL statements accurately.

Description

SQL sentence classifying method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and apparatus for classifying SQL statements, a computer device, and a storage medium.
Background
Capturing and optimizing Structured Query Language (SQL) statements is a major function that a database auditing platform needs to implement. In a database, each SQL statement has a unique SQL_ID, but SQL statements with different SQL_IDs may perform the same function. When the auditing platform finds a problem SQL statement, that statement needs to be optimized, and other SQL statements with the same function are likely to have the same problem. If all SQL statements with the same function are classified together, they can be found directly from the category of the problem SQL statement during optimization and then optimized in a unified way, which greatly improves the efficiency of SQL optimization and the stability of the database.
At present, SQL statements are mostly classified with a text classification algorithm, which requires extracting the complete text of each SQL statement and requires the writing style of that text to be uniform. However, two SQL statements with the same function may follow inconsistent writing rules, so the SQL statements cannot be classified quickly.
Disclosure of Invention
In view of this, it is necessary to provide an SQL statement classifying method, apparatus, computer device and storage medium for the problem that SQL statements cannot be classified quickly because of inconsistent writing rules of the SQL statements.
An SQL sentence classifying method comprises the following steps:
after SQL sentence data are obtained, an SQL sentence data set to be categorized is generated, and an SQL sentence data base table to be categorized is established according to the SQL sentence data set to be categorized;
filtering index items which are irrelevant to classification in the SQL sentence data base table to be classified to obtain a corrected SQL sentence data base table to be classified;
extracting SQL sentence ID identifiers and object names corresponding to the SQL sentences in the SQL sentence data set from a database view, and importing the SQL sentence ID identifiers and the object names into the corrected SQL sentence data base table to be classified to obtain an SQL sentence building line;
acquiring a row ID (identity) in the SQL statement building row, and splicing different object names corresponding to the same row ID to obtain a full object name;
and performing similarity comparison on the SQL statement ID mark and the full object name, and classifying the SQL statement data according to a comparison result to form an SQL statement classification table.
In one possible embodiment, the generating the SQL statement data set to be categorized after the obtaining the SQL statement data, and building the SQL statement data base table to be categorized according to the SQL statement data set to be categorized, includes:
extracting SQL sentence data from a database, packaging the SQL sentence data to generate an SQL sentence data set, and endowing the SQL data set with a time mark to form an SQL data set to be classified with the time mark;
extracting an object name field from the SQL data group to be classified with the time mark;
and according to the initial letter of the object name field and the time mark, sequentially arranging the SQL statement data in the SQL data group to be classified to form the SQL statement data base table to be classified.
In one possible embodiment, the filtering the index items irrelevant to classification in the to-be-classified SQL statement data base table to obtain a modified to-be-classified SQL statement data base table includes:
extracting index identifiers and text language segments contained in the SQL sentence data base table to be classified;
comparing the index identifier with the text language segment with a preset index rule, and if characters which are not matched with the index rule exist in the index identifier or the text language segment, clearing the index identifier or the text language segment from the SQL sentence data base table to be classified to obtain a corrected SQL sentence data base table to be classified.
In one possible embodiment, the extracting, from the database view, the SQL statement ID identifier and the object name corresponding to each SQL statement in the SQL statement data set, and importing the SQL statement ID identifier and the object name into the modified to-be-categorized SQL statement data base table to obtain an SQL statement building row includes:
extracting SQL statement ID marks and object names corresponding to each SQL statement in the SQL statement data set from the database view;
extracting the SQL sentence ID mark and index characters contained in the object name according to the object type of the database view;
and filtering the index characters, and importing the filtered object names and the SQL sentence ID identifications into the corrected SQL language data base table to be classified to obtain an SQL sentence building row.
In one possible embodiment, the obtaining the row ID identifier in the SQL statement building row, and concatenating different object names corresponding to the same row ID identifier to obtain a full object name includes:
acquiring a binary number corresponding to a row ID in the SQL statement establishment row, and extracting all SQL establishment rows containing the binary number from the corrected SQL language segment base table to be classified by taking the binary number as a query object;
extracting object names in all SQL building lines, and sorting the object names according to initial letters to form an object name sequence;
and splicing object names with the same initial letters in the object name sequence to form the full object name.
In one possible embodiment, the performing similarity comparison between the SQL statement ID identifier and the full object name, classifying the SQL statement data according to a comparison result, and forming an SQL statement classification table includes:
calculating the similarity between the SQL sentence ID identifier and the full object name by applying a character string similarity function;
acquiring a preset similarity threshold, classifying the full object name and the SQL sentence data into one type if the similarity is larger than the similarity threshold, otherwise, not classifying the full object name and the SQL sentence data into one type;
and giving the classified SQL statement data a type identifier, and importing the type identifier and the SQL statement data into the corrected SQL statement data base table to form an SQL statement classification table.
In one possible embodiment, the extracting the SQL statement data from the database, packaging the SQL statement data to generate an SQL statement data set, and time-stamping the SQL statement data set to form a time-stamped SQL data set to be categorized, includes:
acquiring a preset frequency threshold and the frequency generated by the SQL sentence data, and comparing the frequency generated by the SQL sentence data with the frequency threshold;
generating a timing task when the frequency of the SQL sentence data generation is greater than the frequency threshold value, and waiting for the generation of new SQL sentence data when the frequency of the SQL sentence data generation is less than the frequency threshold value;
triggering the timing task, and packaging all SQL sentence data from the last time of triggering the timing task to the time of triggering the timing task;
and acquiring a trigger time node of the timing task, marking the packaged SQL language data by taking the time node as a mark, and obtaining the SQL data set to be classified with the time mark.
An SQL sentence classifying device comprises the following modules:
the base table establishing module is used for generating an SQL statement data set to be classified after acquiring SQL statement data, and establishing an SQL statement data base table to be classified according to the SQL statement data set to be classified;
the base table filtering module is used for filtering index items which are irrelevant to classification in the SQL sentence data base table to be classified to obtain a corrected SQL sentence data base table to be classified;
the building line generation module is used for extracting SQL statement ID identifiers and object names corresponding to the SQL statements in the SQL statement data set from a database view, and importing the SQL statement ID identifiers and the object names into the corrected SQL statement data base table to be classified to obtain an SQL statement building line;
the full object name generation module is used for acquiring row ID identifiers in the SQL statement building row and splicing different object names corresponding to the same row ID identifiers to obtain full object names;
the classification table establishing module is used for carrying out similarity comparison on the SQL statement ID mark and the full object name, and classifying the SQL statement data according to a comparison result to form an SQL statement classification table.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the SQL statement classification method described above.
A storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the SQL statement classification method described above.
The SQL statement classification method, device, computer equipment and storage medium described above involve the following steps: after SQL statement data are obtained, an SQL statement data set to be classified is generated, and an SQL statement data base table to be classified is established according to that data set; index items irrelevant to classification are filtered out of the base table to obtain a corrected SQL statement data base table to be classified; the SQL statement ID identifier and the object name corresponding to each SQL statement in the data set are extracted from a database view and imported into the corrected base table to obtain SQL statement building rows; the row ID identifier in each SQL statement building row is acquired, and the different object names corresponding to the same row ID identifier are spliced to obtain a full object name; and a similarity comparison is performed between the SQL statement ID identifier and the full object name, and the SQL statement data are classified according to the comparison result to form an SQL statement classification table. Because the SQL statements are classified only after their full object names have been generated through the database view, this technical scheme solves the problem that SQL statements cannot be classified quickly when their writing rules are inconsistent.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application.
FIG. 1 is a general flow diagram of a method of classifying SQL statements in one embodiment of the application;
FIG. 2 is a diagram illustrating a base table filtering process in an SQL statement categorization method in one embodiment of the application;
FIG. 3 is a schematic diagram of a build line generation process in an SQL statement categorization method in one embodiment of the application;
FIG. 4 is a block diagram of an SQL statement categorizing device in one embodiment of the application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Fig. 1 is an overall flowchart of an SQL statement classification method in one embodiment of the present application, as shown in fig. 1, and the SQL statement classification method includes the following steps:
S1, acquiring SQL statement data, generating an SQL statement data set to be classified, and establishing an SQL statement data base table to be classified according to the SQL statement data set to be classified;
Specifically, SQL statements fall into three main classes according to the function they implement: Data Manipulation Language (DML), Data Definition Language (DDL), and Data Control Language (DCL). SQL statements are generally stored in the database as stream data. When they are acquired, they can be read sequentially according to their storage positions in the database, and several SQL statements are then packed into one SQL statement data group according to that position information.
S2, filtering index items which are irrelevant to classification in the SQL sentence data base table to be classified to obtain a corrected SQL sentence data base table to be classified;
Specifically, according to the statement rules of the SQL statements to be classified, index items that do not conform to the statement rules are extracted from the SQL statement data base table to be classified and removed, yielding a corrected SQL statement data base table to be classified.
S3, extracting SQL statement ID marks and object names corresponding to the SQL statements in the SQL statement data set from a database view, and importing the SQL statement ID marks and the object names into the corrected SQL statement data base table to be classified to obtain an SQL statement building line;
In particular, a database view is a table derived from one or more tables or views. A view is a virtual table: the data corresponding to the view is not actually stored, only the definition of the view is stored in the database, and when the data of the view is operated on, the system operates on the base tables associated with the view according to that definition. For example, the database views in an Oracle database include virtual tables such as dba_hist_sql_text and dba_hist_sql_plan.
S4, acquiring row ID identifiers in the SQL statement building row, and splicing different object names corresponding to the same row ID identifiers to obtain a full object name;
Specifically, the row ID identifier may be the first three characters of the first SQL statement in the SQL statement building row. Each SQL statement building row corresponds to one object name, and these object names are unique per row, that is, different building rows correspond to different object names.
And S5, performing similarity comparison on the SQL statement ID mark and the full object name, and classifying the SQL statement data according to a comparison result to form an SQL statement classification table.
The SQL statement ID marks are extracted from dba_hist_sql_text and dba_hist_sql_plan.
In this embodiment, by creating the SQL statement classification table, the SQL statements of different writing rules can be classified effectively.
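The notion of a database view used in step S3 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table name sql_plan, the view name sql_objects and the sample rows are hypothetical, chosen only for illustration. The view stores no rows of its own, so a row added to the base table afterwards is visible through the view.

```python
import sqlite3

# In-memory SQLite database standing in for the audited database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sql_plan (sql_id TEXT, object_name TEXT);
    INSERT INTO sql_plan VALUES
        ('a1', 'ORDERS'), ('a1', 'CUSTOMERS'), ('b2', 'ORDERS');
    -- Only this definition is stored; the view holds no data itself.
    CREATE VIEW sql_objects AS
        SELECT DISTINCT sql_id, object_name FROM sql_plan;
""")


def objects_for(sql_id):
    """Return the object names the view reports for one SQL statement ID."""
    rows = conn.execute(
        "SELECT object_name FROM sql_objects WHERE sql_id = ? ORDER BY object_name",
        (sql_id,),
    )
    return [name for (name,) in rows]


# A later insert into the base table is visible through the view.
conn.execute("INSERT INTO sql_plan VALUES ('b2', 'ITEMS')")
```

Calling objects_for("b2") after the last insert returns both ITEMS and ORDERS, because the query against the view is evaluated on the underlying base table.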
In one embodiment, the step S1 of obtaining the SQL statement data to generate an SQL statement data set to be categorized, and creating the SQL statement data base table to be categorized according to the SQL statement data set to be categorized includes:
extracting SQL sentence data from a database, packaging the SQL sentence data to generate an SQL sentence data set, and endowing the SQL data set with a time mark to form an SQL data set to be classified with the time mark;
Specifically, when the SQL data set is given a time mark, the extraction time of its SQL statements can be used as the mark. Identical SQL statements can be deleted during packaging, so that all SQL statements in the SQL data set are distinct.
Extracting an object name field from the SQL data group to be classified with the time mark;
and according to the initial letter of the object name field and the time mark, sequentially arranging the SQL statement data in the SQL data group to be classified to form the SQL statement data base table to be classified.
Specifically, the first row of the SQL statement data base table to be classified is an SQL statement whose initial letter is "A" and whose time mark is "1", the second row is an SQL statement whose initial letter is "A" and whose time mark is "2", and so on.
In this embodiment, by establishing the SQL base table to be categorized, the SQL sentence data structure can be conveniently searched.
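The construction of the base table in this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the BaseRow record, the function names and the integer time marks are assumptions. Identical statements are dropped while a group is packaged, and rows are then ordered by the initial letter of the object name and by the time mark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseRow:
    object_name: str
    time_mark: int
    sql_text: str


def build_base_table(statements, time_mark):
    """Package one group of (object_name, sql_text) pairs under a time mark."""
    seen = set()
    rows = []
    for object_name, sql_text in statements:
        if sql_text in seen:  # delete identical statements during packaging
            continue
        seen.add(sql_text)
        rows.append(BaseRow(object_name, time_mark, sql_text))
    return rows


def merge_base_tables(*groups):
    """Order all rows by initial letter of the object name, then by time mark."""
    rows = [row for group in groups for row in group]
    return sorted(rows, key=lambda r: (r.object_name[:1].upper(), r.time_mark))
```

With two groups marked 1 and 2, rows whose object names start with "A" come first, ordered by time mark, which matches the row layout described above.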
Fig. 2 is a schematic diagram of a base table filtering process in an SQL statement classification method in an embodiment of the present application, as shown in the drawing, in S2, filtering index items irrelevant to classification in the SQL statement data base table to obtain a corrected SQL statement data base table to be classified, which includes:
S201, extracting index identifiers and text language segments contained in the SQL statement data base table to be classified;
Specifically, an index identification information table is obtained from the database, and each row of data in the SQL statement data base table to be classified is compared with the index information in that table. If they match, the index identifier is extracted; if they do not match, nothing is extracted.
S202, comparing the index identifier and the text language segment with a preset index rule, and if characters which are not matched with the index rule exist in the index identifier or the text language segment, clearing the index identifier or the text language segment from the SQL sentence data base table to be classified to obtain a corrected SQL sentence data base table to be classified.
Specifically, the preset index rules mainly include character number limitation, field selection establishment, single-field elimination and the like. For example, the index rule selected is that the number of characters is limited to 3 characters, and then index identifications of those more than 3 characters are cleared when the comparison is made.
In the embodiment, the query efficiency of the SQL statement data base table is improved by clearing the invalid index.
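The character-count rule given as an example above can be sketched as follows. The row layout (a dict with an optional "index_id" key) and the function name are hypothetical; the sketch only shows one preset rule, a limit of three characters, and clears identifiers that exceed it while keeping the rest of the row.

```python
def filter_index_items(rows, max_chars=3):
    """Clear index identifiers longer than max_chars; other fields are kept.

    rows: list of dicts that may carry an 'index_id' key (assumed layout).
    Returns a new list and leaves the input rows unmodified.
    """
    kept = []
    for row in rows:
        index_id = row.get("index_id")
        if index_id is not None and len(index_id) > max_chars:
            # The identifier violates the rule, so drop it from a copy of the row.
            row = {k: v for k, v in row.items() if k != "index_id"}
        kept.append(row)
    return kept
```

Other preset rules mentioned in the text, such as field selection, would need their own checks and are not modeled here.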
Fig. 3 is a schematic diagram of a process of creating an establishing line in an SQL statement classification method in an embodiment of the present application, where as shown in the drawing, S3, extracting, from a database view, an SQL statement ID identifier and an object name corresponding to each SQL statement in the SQL statement data set, and importing the SQL statement ID identifier and the object name into the modified to-be-classified SQL statement data base table to obtain an SQL statement establishing line, where the method includes:
S301, extracting SQL statement ID identifiers and object names corresponding to the SQL statements in the SQL statement data set from a database view;
Specifically, the text of an SQL statement is acquired, an object name is extracted from it, and the value assigned to the object name is obtained. According to that value, the corresponding ID identifier is obtained from the base table, and the SQL statement ID identifier and the object name are imported into the corrected SQL statement data base table to be classified to obtain an SQL statement building row.
S302, extracting the SQL sentence ID identifier and index characters contained in the object name according to the object type of the database view;
Specifically, the object type is obtained, and the index characters contained in the SQL statement ID identifier and the object name are extracted according to the index type corresponding to the object type.
S303, filtering the index character, and importing the filtered object name and the SQL sentence ID mark into the corrected SQL language data base table to be classified to obtain an SQL sentence building row.
Specifically, an index recognition tool is applied to recognize the index characters and then clear the index characters, the filtered SQL sentences are checked, whether the SQL sentences contain object names irrelevant to classification or not is detected, and if yes, the index characters are cleared.
According to the embodiment, the index characters are effectively cleared, so that SQL sentence data in the SQL sentence building row are more convenient to search.
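One possible reading of steps S301 to S303 is sketched below. It assumes that the rows read from the database view carry an SQL statement ID, an object name and an object type, and it interprets "filtering index characters" as skipping entries whose object type denotes an index; both points are assumptions, as are the PlanRow and BuildRow names. Row IDs are assigned by position, starting at 1.

```python
from typing import NamedTuple


class PlanRow(NamedTuple):
    """One row read from the database view (hypothetical column selection)."""
    sql_id: str
    object_name: str
    object_type: str


class BuildRow(NamedTuple):
    """One SQL statement building row in the corrected base table."""
    row_id: int
    sql_id: str
    object_name: str


def make_build_rows(plan_rows):
    """Drop index objects and duplicates, then number the remaining rows."""
    build_rows = []
    seen = set()
    for row in plan_rows:
        if row.object_type.upper().startswith("INDEX"):
            continue  # index entries are treated as irrelevant to classification
        key = (row.sql_id, row.object_name)
        if key in seen:
            continue
        seen.add(key)
        build_rows.append(BuildRow(len(build_rows) + 1, row.sql_id, row.object_name))
    return build_rows
```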
In one embodiment, the step S4 of obtaining the row ID identifier in the SQL statement building row, and concatenating different object names corresponding to the same row ID identifier to obtain a full object name includes:
acquiring a binary number corresponding to a row ID in the SQL statement establishment row, and extracting all SQL establishment rows containing the binary number from the corrected SQL language segment base table to be classified by taking the binary number as a query object;
Specifically, each building row has a unique row ID identifier, which may be assigned according to the position of the building row. For example, the row ID identifier of the first row is "1", and its value after binary conversion is still "1".
Extracting object names in all SQL building lines, and sorting the object names according to initial letters to form an object name sequence;
and splicing object names with the same initial letters in the object name sequence to form the full object name.
According to the embodiment, the SQL sentence classification is more convenient and quick by generating the full object name.
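The splicing step can be sketched as a group, sort and join operation. The sketch assumes that building rows are available as (key, object name) pairs, where the key is whatever identifier the rows of one statement share; the separator and the function name are likewise assumptions.

```python
from collections import defaultdict


def full_object_names(build_rows, sep=","):
    """Splice the object names that share a key into one full object name.

    build_rows: iterable of (key, object_name) pairs (assumed layout).
    Names are de-duplicated and sorted so the result does not depend on
    the order in which the rows were read.
    """
    names_by_key = defaultdict(set)
    for key, object_name in build_rows:
        names_by_key[key].add(object_name)
    return {key: sep.join(sorted(names)) for key, names in names_by_key.items()}
```

Because the names are sorted before joining, two statements that reference the same objects in a different order yield the same full object name, which is what makes the later similarity comparison insensitive to writing style.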
In one embodiment, the step S5 of comparing the similarity between the SQL statement ID identifier and the full object name, classifying the SQL statement data according to the comparison result to form an SQL statement classification table includes:
calculating the similarity between the SQL sentence ID identifier and the full object name by applying a character string similarity function;
The string similarity function may be the utl_match string similarity function of the Oracle database, or a string similarity function based on a shortest path algorithm.
Acquiring a preset similarity threshold, classifying the full object name and the SQL sentence data into one type if the similarity is larger than the similarity threshold, otherwise, not classifying the full object name and the SQL sentence data into one type;
The preset similarity threshold is derived from statistics on historical data. In computing those statistics, historical data closer to the current moment is given a larger weight, and historical data farther from the current moment is given a smaller weight.
And giving the classified SQL statement data a type identifier, and importing the type identifier and the SQL statement data into the corrected SQL statement data base table to form an SQL statement classification table.
The SQL sentence data can be distinguished by using English abbreviations as type identifiers.
In this embodiment, the SQL sentence classification table is built after similarity calculation, so that the SQL sentence is effectively classified.
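The thresholded classification can be sketched with a plain edit-distance similarity. This is an illustration under stated assumptions rather than the method of the patent: the similarity is a normalized Levenshtein distance implemented in Python instead of a database function, the sketch compares full object names with one another, the threshold of 0.8 is arbitrary, and the type identifiers "T1", "T2" and so on are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the two strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def classify(full_names, threshold=0.8):
    """Map each sql_id to a type identifier.

    full_names: dict of sql_id -> full object name (assumed layout).
    A statement joins the first class whose representative name is more
    similar than the threshold; otherwise it starts a new class.
    """
    representatives = []  # list of (type_id, full_name)
    labels = {}
    for sql_id, name in full_names.items():
        for type_id, rep in representatives:
            if similarity(name, rep) > threshold:
                labels[sql_id] = type_id
                break
        else:
            type_id = f"T{len(representatives) + 1}"
            representatives.append((type_id, name))
            labels[sql_id] = type_id
    return labels
```

The resulting mapping from SQL statement ID to type identifier plays the role of the classification table; comparing against a single representative per class keeps the sketch short but makes the outcome depend on input order.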
In one embodiment, the step S101 of extracting the SQL statement data from the database, packaging the SQL statement data to generate an SQL statement data set, and time-stamping the SQL statement data set to form a time-stamped SQL data set to be categorized, includes:
acquiring a preset frequency threshold and the frequency generated by the SQL sentence data, and comparing the frequency generated by the SQL sentence data with the frequency threshold;
Specifically, the preset frequency threshold is derived from statistics on historical data. To obtain the frequency at which SQL statement data is generated, two statistical time intervals can first be set, the amount of SQL statement data generated in each interval is recorded, and the average of the two amounts is then used as the generation frequency.
Generating a timing task when the frequency of the SQL sentence data generation is greater than the frequency threshold value, and waiting for the generation of new SQL sentence data when the frequency of the SQL sentence data generation is less than the frequency threshold value;
triggering the timing task, and packaging all SQL sentence data from the last time of triggering the timing task to the time of triggering the timing task;
and acquiring a trigger time node of the timing task, marking the packaged SQL language data by taking the time node as a mark, and obtaining the SQL data set to be classified with the time mark.
According to the method, the source of the SQL statement can be effectively tracked by effectively time marking the SQL data set, so that the SQL statement type composition of the data source is analyzed after the SQL statement is classified.
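The frequency check and the timed packaging can be sketched as follows. The function names, the numeric timestamps and the dictionary returned for a package are assumptions made for illustration; scheduling itself is left out, and the sketch only shows which statements one trigger of the timing task would collect.

```python
from statistics import mean


def generation_frequency(count_interval_1, count_interval_2):
    """Average amount of SQL statement data generated over two intervals."""
    return mean([count_interval_1, count_interval_2])


def should_schedule(frequency, threshold):
    """A timing task is generated only when the frequency exceeds the threshold."""
    return frequency > threshold


def package_since_last_trigger(events, last_trigger, now):
    """Package statements generated after the last trigger, up to this one.

    events: list of (timestamp, sql_text) pairs (assumed layout).
    The trigger time is used as the time mark of the package.
    """
    batch = [sql for ts, sql in events if last_trigger < ts <= now]
    return {"time_mark": now, "statements": batch}
```

The text does not say what happens when the frequency equals the threshold; this sketch treats that case as "wait".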
In one embodiment, an SQL sentence classifying device is provided, as shown in fig. 4, including the following modules:
the base table establishing module 41 is configured to generate an SQL statement data set to be categorized after acquiring SQL statement data, and establish an SQL statement data base table to be categorized according to the SQL statement data set to be categorized;
the base table filtering module 42 is configured to filter index items which are irrelevant to classification in the to-be-classified SQL sentence data base table to obtain a corrected to-be-classified SQL sentence data base table;
the building line generating module 43 is configured to extract, from the database view, an SQL statement ID identifier and an object name corresponding to each SQL statement in the SQL statement data set, and import the SQL statement ID identifier and the object name into the modified to-be-categorized SQL statement data base table to obtain an SQL statement building line;
the full object name generating module 44 is configured to obtain a row ID identifier in the SQL statement building row, and splice different object names corresponding to the same row ID identifier to obtain a full object name;
the classification table establishing module 45 is configured to compare the similarity between the SQL statement ID identifier and the full object name, and classify the SQL statement data according to the comparison result to form an SQL statement classification table.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the SQL statement classification method in the embodiments described above.
A storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the SQL statement classification method described in the embodiments above. The storage medium may be a non-volatile storage medium.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program to instruct related hardware, the program may be stored in a computer readable storage medium, and the storage medium may include: read Only Memory (ROM), random access Memory (RAM, random Access Memory), magnetic or optical disk, and the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that involves no contradiction should be considered to fall within the scope of this description.
The above embodiments represent only some exemplary embodiments of the present application. Although they are described specifically and in detail, they are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (9)

1. A method for classifying SQL statements, comprising:
after SQL statement data are obtained, generating an SQL statement data set to be classified, and establishing an SQL statement data base table to be classified according to the SQL statement data set to be classified;
filtering out index items that are irrelevant to classification from the SQL statement data base table to be classified, to obtain a corrected SQL statement data base table to be classified;
extracting, from a database view, the SQL statement ID identifier and the object name corresponding to each SQL statement in the SQL statement data set, and importing the SQL statement ID identifier and the object name into the corrected SQL statement data base table to be classified to obtain an SQL statement building row;
acquiring the row ID identifier in the SQL statement building row, and splicing the different object names corresponding to the same row ID identifier to obtain a full object name;
performing a similarity comparison between the SQL statement ID identifier and the full object name, and classifying the SQL statement data according to the comparison result to form an SQL statement classification table;
wherein performing the similarity comparison between the SQL statement ID identifier and the full object name, and classifying the SQL statement data according to the comparison result to form the SQL statement classification table, comprises:
calculating the similarity between the SQL statement ID identifier and the full object name by applying a character string similarity function;
acquiring a preset similarity threshold; if the similarity is greater than the similarity threshold, classifying the full object name and the SQL statement data into one type, and otherwise not classifying them into one type; wherein the similarity threshold is calculated from the generation times and weights of historical data, and the later the generation time of the historical data, the larger the corresponding weight;
and assigning a type identifier to the classified SQL statement data, and importing the type identifier and the SQL statement data into the corrected SQL statement data base table to form the SQL statement classification table.
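The similarity step of claim 1 can be illustrated with a minimal Python sketch. The claim names neither the character string similarity function nor the exact weighting formula, so everything specific here is an assumption made for illustration: `difflib.SequenceMatcher` stands in for the similarity function, the weight of each historical record is simply its rank by generation time, and the names `weighted_threshold`, `similarity`, `classify`, and the `"UNCLASSIFIED"` label are hypothetical.

```python
from difflib import SequenceMatcher


def weighted_threshold(history):
    """Derive a threshold from (generation_time, value) history records.

    Assumption: the weight of a record is its 1-based rank when the
    records are sorted by generation time, so later records weigh more.
    """
    ordered = sorted(history, key=lambda item: item[0])
    weights = range(1, len(ordered) + 1)
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, ordered))
    return weighted_sum / sum(weights)


def similarity(sql_id, full_object_name):
    """String similarity in [0, 1]; SequenceMatcher is a stand-in choice."""
    return SequenceMatcher(None, sql_id, full_object_name).ratio()


def classify(rows, threshold):
    """rows: iterable of (sql_id, full_object_name, sql_text) tuples.

    A row whose similarity exceeds the threshold is typed by its full
    object name; any other row gets the hypothetical label UNCLASSIFIED.
    """
    table = []
    for sql_id, full_name, sql_text in rows:
        score = similarity(sql_id, full_name)
        type_id = full_name if score > threshold else "UNCLASSIFIED"
        table.append({"type": type_id, "sql_id": sql_id, "sql": sql_text})
    return table
```

Using the full object name itself as the type identifier is only one possible choice; the claim requires that a type identifier be assigned but does not say how it is formed.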
2. The method for classifying SQL statements according to claim 1, wherein generating the SQL statement data set to be classified after the SQL statement data are obtained, and establishing the SQL statement data base table to be classified according to the SQL statement data set to be classified, comprises:
extracting SQL statement data from a database, packaging the SQL statement data to generate an SQL statement data set, and assigning a time mark to the SQL statement data set to form a time-marked SQL data set to be classified;
extracting an object name field from the time-marked SQL data set to be classified;
and arranging the SQL statement data in the SQL data set to be classified in order according to the initial letter of the object name field and the time mark, to form the SQL statement data base table to be classified.
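The ordering described in claim 2 can be sketched as a two-part sort key. This is an illustrative sketch only: the input shape (a list of time-marked groups), the ISO-formatted string time marks, and the function name `build_base_table` are assumptions, not details given in the claim.

```python
def build_base_table(groups):
    """Flatten time-marked SQL data sets into an ordered base table.

    groups: list of (time_mark, statements) pairs, where statements is a
    list of (object_name, sql_text) tuples. Time marks only need to be
    mutually comparable; ISO-8601 strings are assumed here.
    """
    rows = [
        {"object_name": name, "sql": sql, "time_mark": mark}
        for mark, statements in groups
        for name, sql in statements
    ]
    # Sort by the initial letter of the object name, then by time mark.
    return sorted(
        rows,
        key=lambda row: (row["object_name"][:1].upper(), row["time_mark"]),
    )
```

For example, two groups marked `"2018-12-13T10:00"` and `"2018-12-13T09:00"` are merged so that all object names starting with the same letter sit together, earliest time mark first.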
3. The method for classifying SQL statements according to claim 1, wherein filtering out the index items that are irrelevant to classification from the SQL statement data base table to be classified, to obtain the corrected SQL statement data base table to be classified, comprises:
extracting the index identifiers and text segments contained in the SQL statement data base table to be classified;
comparing the index identifiers and the text segments with a preset index rule, and, if an index identifier or a text segment contains characters that do not match the index rule, clearing that index identifier or text segment from the SQL statement data base table to be classified, to obtain the corrected SQL statement data base table to be classified.
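The rule check in claim 3 can be sketched with a regular expression. The claim does not disclose the preset index rule, so the pattern below (letters, digits, underscores, and spaces) is purely a placeholder, and the field names `index_id` and `text_segment` are hypothetical.

```python
import re

# Placeholder for the "preset index rule"; the real rule is not disclosed.
INDEX_RULE = re.compile(r"[A-Za-z0-9_ ]+")


def filter_base_table(rows, rule=INDEX_RULE):
    """Clear index identifiers or text segments that violate the rule.

    rows: list of dicts with optional 'index_id' and 'text_segment' keys.
    A value containing any character outside the rule is set to None;
    the input rows are left unmodified.
    """
    corrected = []
    for row in rows:
        cleaned = dict(row)
        for field in ("index_id", "text_segment"):
            value = cleaned.get(field)
            if value is not None and rule.fullmatch(value) is None:
                cleaned[field] = None
        corrected.append(cleaned)
    return corrected
```

`fullmatch` is used rather than `search` so that a single non-conforming character anywhere in the value is enough to clear it, which is how the claim's "characters which are not matched" condition is read here.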
4. The method for classifying SQL statements according to claim 1, wherein extracting, from the database view, the SQL statement ID identifier and the object name corresponding to each SQL statement in the SQL statement data set, and importing the SQL statement ID identifier and the object name into the corrected SQL statement data base table to be classified to obtain the SQL statement building row, comprises:
extracting, from the database view, the SQL statement ID identifier and the object name corresponding to each SQL statement in the SQL statement data set;
extracting the index characters contained in the SQL statement ID identifier and the object name according to the object type of the database view;
and filtering out the index characters, and importing the filtered object name and SQL statement ID identifier into the corrected SQL statement data base table to be classified to obtain the SQL statement building row.
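A rough sketch of the filtering in claim 4 follows. The claim does not define which characters count as index characters for a given object type, so the `INDEX_CHARS` mapping, its patterns, and the tuple layout of the view rows are all hypothetical; for simplicity, only the object name is filtered here.

```python
import re

# Hypothetical mapping from object type to the "index characters" to strip.
INDEX_CHARS = {
    "TABLE": re.compile(r'["`\[\]]'),
    "INDEX": re.compile(r'["`\[\]]|_IDX\d*$'),
}


def build_rows(view_rows):
    """view_rows: iterable of (sql_id, object_name, object_type) tuples.

    Returns building rows in which the object name has had the index
    characters for its object type removed. Unknown object types are
    passed through unchanged.
    """
    built = []
    for sql_id, object_name, object_type in view_rows:
        pattern = INDEX_CHARS.get(object_type)
        if pattern is not None:
            object_name = pattern.sub("", object_name)
        built.append({"sql_id": sql_id, "object_name": object_name})
    return built
```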
5. The method for classifying SQL statements according to claim 1, wherein acquiring the row ID identifier in the SQL statement building row, and splicing the different object names corresponding to the same row ID identifier to obtain the full object name, comprises:
acquiring the binary number corresponding to the row ID identifier in the SQL statement building row, and, using the binary number as the query object, extracting all SQL statement building rows containing the binary number from the corrected SQL statement data base table to be classified;
extracting the object names in all of the extracted SQL statement building rows, and sorting the object names by initial letter to form an object name sequence;
and splicing the object names that have the same initial letter in the object name sequence to form the full object name.
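The splicing in claim 5 can be sketched as a filter, a stable sort, and a group-by. Several details are assumptions made for illustration: "containing the binary number" is read as an exact match on the binary form of the row ID, the `"."` separator is arbitrary, and the function name `full_object_names` is hypothetical.

```python
from itertools import groupby


def full_object_names(building_rows, row_id):
    """Splice object names that share a row ID and an initial letter.

    building_rows: list of dicts with integer 'row_id' and 'object_name'.
    Returns a dict mapping each initial letter to the spliced name.
    """
    key = format(row_id, "b")  # binary form of the row ID, used as the query key
    names = [
        row["object_name"]
        for row in building_rows
        if format(row["row_id"], "b") == key
    ]

    def initial(name):
        return name[:1].upper()

    # list.sort is stable, so names sharing an initial keep their input order.
    names.sort(key=initial)
    return {
        letter: ".".join(group)
        for letter, group in groupby(names, key=initial)
    }
```

With rows for row ID 5 named `sales`, `orders`, and `schema1`, the sketch yields `orders` under `O` and `sales.schema1` under `S`.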
6. The method for classifying SQL statements according to claim 2, wherein extracting the SQL statement data from the database, packaging the SQL statement data to generate the SQL statement data set, and assigning the time mark to the SQL statement data set to form the time-marked SQL data set to be classified, comprises:
acquiring a preset frequency threshold and the frequency at which the SQL statement data are generated, and comparing that frequency with the frequency threshold;
generating a timing task when the frequency at which the SQL statement data are generated is greater than the frequency threshold, and waiting for new SQL statement data to be generated when the frequency is less than the frequency threshold;
triggering the timing task, and packaging all SQL statement data generated between the previous triggering of the timing task and the current triggering;
and acquiring the trigger time node of the timing task, and marking the packaged SQL statement data with the time node to obtain the time-marked SQL data set to be classified.
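The packaging logic of claim 6 can be sketched as a small stateful class. This is a simplified sketch under stated assumptions: the timing task is modelled as a one-shot flag rather than a real scheduler, the caller supplies the measured frequency and the trigger time, and the names `BatchPackager`, `observe`, and `trigger` are hypothetical.

```python
class BatchPackager:
    """Collect SQL statements and package them when a timing task fires."""

    def __init__(self, frequency_threshold):
        self.frequency_threshold = frequency_threshold
        self.pending = []           # statements seen since the last trigger
        self.task_scheduled = False

    def observe(self, statement, frequency):
        """Record a statement together with the current generation frequency."""
        self.pending.append(statement)
        if frequency > self.frequency_threshold:
            # Frequency is high enough: generate the timing task.
            self.task_scheduled = True
        # Otherwise keep waiting for new statements.

    def trigger(self, now):
        """Fire the timing task, if any, and return a time-marked data set."""
        if not self.task_scheduled:
            return None
        data_set = {"time_mark": now, "statements": self.pending}
        self.pending = []
        self.task_scheduled = False
        return data_set
```

In a real deployment the trigger would come from a scheduler and `now` would be the scheduler's trigger time; both are passed in explicitly here so the behaviour is easy to follow.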
7. An SQL statement classification apparatus, comprising:
a base table establishing module, configured to generate an SQL statement data set to be classified after SQL statement data are obtained, and to establish an SQL statement data base table to be classified according to the SQL statement data set to be classified;
a base table filtering module, configured to filter out index items that are irrelevant to classification from the SQL statement data base table to be classified, to obtain a corrected SQL statement data base table to be classified;
a building row generating module, configured to extract, from a database view, the SQL statement ID identifier and the object name corresponding to each SQL statement in the SQL statement data set, and to import the SQL statement ID identifier and the object name into the corrected SQL statement data base table to be classified to obtain an SQL statement building row;
a full object name generating module, configured to acquire the row ID identifier in the SQL statement building row, and to splice the different object names corresponding to the same row ID identifier to obtain a full object name;
a classification table establishing module, configured to perform a similarity comparison between the SQL statement ID identifier and the full object name, and to classify the SQL statement data according to the comparison result to form an SQL statement classification table;
wherein performing the similarity comparison between the SQL statement ID identifier and the full object name, and classifying the SQL statement data according to the comparison result to form the SQL statement classification table, comprises:
calculating the similarity between the SQL statement ID identifier and the full object name by applying a character string similarity function;
acquiring a preset similarity threshold; if the similarity is greater than the similarity threshold, classifying the full object name and the SQL statement data into one type, and otherwise not classifying them into one type; wherein the similarity threshold is calculated from the generation times and weights of historical data, and the later the generation time of the historical data, the larger the corresponding weight;
and assigning a type identifier to the classified SQL statement data, and importing the type identifier and the SQL statement data into the corrected SQL statement data base table to form the SQL statement classification table.
8. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the SQL statement classification method of any one of claims 1 to 6.
9. A storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the SQL statement classification method of any one of claims 1 to 6.
CN201811523456.0A 2018-12-13 2018-12-13 SQL sentence classifying method, device, computer equipment and storage medium Active CN109800240B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811523456.0A CN109800240B (en) 2018-12-13 2018-12-13 SQL sentence classifying method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811523456.0A CN109800240B (en) 2018-12-13 2018-12-13 SQL sentence classifying method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109800240A CN109800240A (en) 2019-05-24
CN109800240B true CN109800240B (en) 2024-03-22

Family

ID=66556654

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811523456.0A Active CN109800240B (en) 2018-12-13 2018-12-13 SQL sentence classifying method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109800240B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112354189B (en) * 2020-11-23 2022-05-20 腾讯科技(深圳)有限公司 Game data object matching method, device, equipment and storage medium
CN112966101B (en) * 2021-02-07 2024-06-18 白腊梅 Statement clustering method, transaction clustering method, statement clustering device and transaction clustering device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102902778A (en) * 2012-09-28 2013-01-30 用友软件股份有限公司 Query sentence optimization device and query sentence optimization method
CN102945256A (en) * 2012-10-18 2013-02-27 福建省海峡信息技术有限公司 Method and device for merging and classifying massive SQL (Structured Query Language) sentences
CN105279276A (en) * 2015-11-11 2016-01-27 浪潮(北京)电子信息产业有限公司 Database index optimization system
CN105635046A (en) * 2014-10-28 2016-06-01 北京启明星辰信息安全技术有限公司 Database command line filtering and audit blocking method and device
CN105912594A (en) * 2016-04-05 2016-08-31 深圳市深信服电子科技有限公司 SQL sentence processing method and system
CN106611044A (en) * 2016-12-02 2017-05-03 星环信息科技(上海)有限公司 SQL optimization method and device
CN108984675A (en) * 2018-07-02 2018-12-11 北京百度网讯科技有限公司 Data query method and apparatus based on evaluation

Also Published As

Publication number Publication date
CN109800240A (en) 2019-05-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant