CN105320679A - Data table index set generation method and device - Google Patents

Data table index set generation method and device Download PDF

Info

Publication number
CN105320679A
CN105320679A CN201410331106.XA CN201410331106A CN105320679A CN 105320679 A CN105320679 A CN 105320679A CN 201410331106 A CN201410331106 A CN 201410331106A CN 105320679 A CN105320679 A CN 105320679A
Authority
CN
China
Prior art keywords
field
tables
data
analyzed
sql
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410331106.XA
Other languages
Chinese (zh)
Other versions
CN105320679B (en
Inventor
王凡
谭正
钟全龙
陈乐君
程金波
车华翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Chongqing Co Ltd
Original Assignee
China Mobile Group Chongqing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Chongqing Co Ltd filed Critical China Mobile Group Chongqing Co Ltd
Priority to CN201410331106.XA priority Critical patent/CN105320679B/en
Publication of CN105320679A publication Critical patent/CN105320679A/en
Application granted granted Critical
Publication of CN105320679B publication Critical patent/CN105320679B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a data table index set generation method. The method comprises the steps that statistical information of a data table to be analyzed in a database is scanned, and a field selective information table is updated according to the statistical information; SQL information used when an application program operates is collected from the database; the SQL information is analyzed to obtain field combination frequency of the SQL information to fields and/or combined fields of the data table to be analyzed, and the field combination frequency is updated into a combined field frequency table; the fields and/or the combined fields in the data table to be analyzed are optimized according to the combined field frequency table and the field selective information table to obtain the optimal data table index set corresponding to the data table to be analyzed. The invention further discloses a data table index set generation device.

Description

The symphysis of a kind of tables of data indexed set becomes method and device
Technical field
The present invention relates to data performance optimisation technique field, particularly relate to the symphysis of a kind of tables of data indexed set and become method and device.
Background technology
Existing Index Design appraisal procedure, mainly according to the selectivity of the index field of deviser's design, or the coarse evaluation that performance evaluation when application program is run on the computer systems is carried out.As: the several fields of certain deviser on certain table establish index, if based on the database engine product of cost optimization device, then can after correlation table have had a certain amount of data, the instruction his-and-hers watches of database are adopted to carry out table analysis, thus obtain the unique value number (being equivalent to the quantity of field) of relative index field, unique value number is compared with total line number of relative index field, just can judge the selectivity quality of index field; Or, by Structured Query Language (SQL) (SQL, StructuredQueryLanguage) performance analysis tool analyzes the behavior pattern of SQL statement when database runs of application-specific accessing database, judge the whether reasonable of Index Design, here, mainly the cost by index accesses table that key element calculates as database optimizer is judged.
But the rational method of existing judgement Index Design still Shortcomings, mainly comprises:
1, by the method that index field selectivity judges, some special scenes of None-identified.As, the field that selectivity is good establishes index, but real-life program is in operation, less this field of using is inquired about, and by field selectivity, this kind of situation judges that index is rational on the surface, but actual conditions are really not so.Further, the index set up belongs to redundancy, unwanted data, and the existence of this index can increase the cpu resource consumption of related application when doing data manipulation, reduces the performance of application program, waste storage resources.
2, carried out the situation of analysis and evaluation by the performance data of acquisition applications routine access database SQL, the design rationality of some index to application-specific can be assessed.As, use certain indexes applications program efficiency very slow, a newly-built index also adopts rear application program efficiency to accelerate.But the index that this mode just obtains according to the application program run in a period of time, does not consider other off-duty application programs.Therefore, this mode cannot assess in a system that whole application program is to the index set of index accesses, and the method adopted in above-mentioned 1st also faces same problem.
Summary of the invention
In view of this, the embodiment of the present invention is expected to provide the symphysis of a kind of tables of data indexed set to become method and device, at least can solve the defect that the index efficiency that exists in existing index establishing method is low.
The technical scheme of the embodiment of the present invention is achieved in that
Embodiments provide the symphysis of a kind of tables of data indexed set and become method, described method comprises:
Scan tables of data to be analyzed statistical information in a database, according to described statistical information, field selective information table is upgraded;
The SQL information used when acquisition applications program is run from database;
Described SQL information is resolved, obtains the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, the described field combination frequency is updated in combined field frequency table;
According to described combined field frequency table and described field selective information table, the field in described tables of data to be analyzed and/or combined field are optimized, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
Preferably, describedly according to described statistical information, renewal is carried out to field selective information table and comprises:
The field selection rate of each field in described tables of data to be analyzed is calculated according to described statistical information;
Described field selection rate is inserted described field selective information table, realizes the renewal to described field selective information table.
Preferably, described field selection rate is calculated by p (i)=n (i)/N;
Wherein, p (i) is the field selection rate of i-th field in tables of data to be analyzed; N (i) is the unique value number of i-th field in tables of data to be analyzed; N is tables of data record to be analyzed sum.
Preferably, described SQL information comprises the execution number of times of SQL statement, SQL executive plan and SQL statement.
Preferably, described SQL information is resolved, obtains the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, the described field combination frequency is updated to combined field frequency table and comprises:
Search field and SQL executive plan is parsed from described SQL information;
The execution number of times of described SQL statement and/or the correlation inquiry execution number of times of SQL statement is obtained according to described search field and SQL executive plan;
The correlation inquiry of the execution number of times of described SQL statement and/or SQL statement is performed the field combination frequency that number of times is converted into tables of data to be analyzed, and is updated in combined field frequency table.
Preferably, to be describedly optimized the field in described tables of data to be analyzed and/or combined field according to described combined field frequency table and described field selective information table, the tables of data index set obtaining the optimum of corresponding described tables of data to be analyzed comprises:
According to the selection rate of field in described field selective information table to the field in described combined field frequency table and/or combined field selectively rate order from high to low sort;
The field identical to the beginning field after sequence and/or combined field merge;
Field after being combined by setup parameter and/or combined field are screened, and set up index to the field after screening and/or combined field, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
The embodiment of the present invention additionally provides a kind of tables of data indexed set symphysis apparatus for converting, and described device comprises:
Field selection rate initialization unit, for scanning tables of data to be analyzed statistical information in a database, upgrades field selective information table according to described statistical information;
SQL information acquisition unit, the SQL information used when running for acquisition applications program from database;
SQL performs information analysis unit, for resolving described SQL information, obtaining the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, being updated in combined field frequency table by the described field combination frequency;
Tables of data indexed set closes generation unit, for being optimized the field in described tables of data to be analyzed and/or combined field according to described combined field frequency table and described field selective information table, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
Preferably, described field selection rate initialization unit comprises:
Field selection rate computing module, for calculating the field selection rate of each field in described tables of data to be analyzed according to described statistical information;
Field selective information table update module, for described field selection rate is inserted described field selective information table, realizes the renewal to described field selective information table.
Preferably, described field selection rate is calculated by p (i)=n (i)/N;
Wherein, p (i) is the field selection rate of i-th field in tables of data to be analyzed; N (i) is the unique value number of i-th field in tables of data to be analyzed; N is record sum in tables of data to be analyzed.
Preferably, described SQL execution information analysis unit comprises:
Parsing module, for parsing search field and SQL executive plan from described SQL information;
Correlation inquiry performs number of times module, and the correlation inquiry for the execution number of times and/or SQL statement that obtain described SQL statement according to described search field and SQL executive plan performs number of times;
Combined field frequency table update module, for the correlation inquiry of the execution number of times of described SQL statement and/or SQL statement is performed the field combination frequency that number of times is converted into tables of data to be analyzed, and is updated in combined field frequency table.
Preferably, described tables of data indexed set conjunction generation unit comprises:
Field is rate order module selectively, for according to the selection rate of field in described field selective information table to the field in described combined field frequency table and/or combined field selectively rate order from high to low sort;
Field merge cells, for merging the identical field of beginning field after sequence and/or combined field;
Tables of data indexed set closes generation module, screens for the field after being combined by setup parameter and/or combined field, sets up index to the field after screening and/or combined field, obtains the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
The tables of data indexed set symphysis that the embodiment of the present invention provides becomes method and device, upgrading, can improve the precision of each field selection rate in field selective information table by obtaining tables of data statistical information to be analyzed change to field selective information table; Resolved by the described SQL information used the application program run, obtain the correlation inquiry of described SQL information to the inquiry times of described tables of data to be analyzed and/or SQL statement and perform number information, and be updated to combined field frequency table; By field selective information table and combined field frequency table multiple field combination sorted and merge, tables of data index being set up to the combined field after screening, the tables of data index set of the optimum in current setting cycle can be obtained; So, just can remove invalid field and poor efficiency field, correspondingly, also been removed invalid index and poor efficiency index, improve the overall performance of application system in current setting cycle, saved storage resources; Further, the Access status that the application program in access tables of data to be analyzed reaches optimum can be made.
Accompanying drawing explanation
Fig. 1 is the realization flow schematic diagram that the symphysis of the embodiment of the present invention 1 tables of data indexed set becomes method;
Fig. 2 is the composition structural representation of the embodiment of the present invention 2 tables of data indexed set symphysis apparatus for converting;
Fig. 3 is the logical organization schematic diagram that the symphysis of the embodiment of the present invention 3 tables of data indexed set becomes method;
Fig. 4 is the schematic flow sheet of field selection rate initialization unit functional realiey in the embodiment of the present invention 3;
Fig. 5 is the schematic flow sheet of SQL information acquisition unit functional realiey in the embodiment of the present invention 3;
Fig. 6 is the schematic flow sheet that in the embodiment of the present invention 3, SQL performs the realization of information analysis Elementary Function;
Fig. 7 is the schematic flow sheet that in the embodiment of the present invention 3, the symphysis of tables of data indexed set becomes Elementary Function to realize.
Embodiment
Below in conjunction with Figure of description and specific embodiment technical scheme of the present invention done and further elaborate.
Embodiment 1
In order to solve the technical matters that prior art exists, present embodiments provide the symphysis of a kind of tables of data indexed set and become method, as shown in Figure 1, the method comprises the following steps:
S101: scan tables of data to be analyzed statistical information in a database, upgrades field selective information table according to described statistical information;
Here, can the scan period be pre-set, treat analytical data statistical information in a database in the start time of each scan period and scan;
Wherein, described tables of data to be analyzed is existing form in database, the field information that all application programs comprising whole setting cycle are used, that is: tables of data to be analyzed can reflect the historical information of field; Described field selective information table is used for selectively rate order from high to low and deposits field.
When the field information in tables of data to be analyzed changes, the statistical information of corresponding tables of data to be analyzed also can change; Here, statistical information comprises the change information of tables of data to be analyzed, and tables of data to be analyzed comprises various field, so, when finding that statistical information has renewal, the field selection rate treating analytical table recalculates, and realizes the renewal to field selective information table; Here, described field selective information table is preserved in a database.
S102: the SQL information used when acquisition applications program is run from database;
Here, the SQL information collected can be kept in SQL Information Monitoring table, and further, described SQL Information Monitoring table is preserved in a database, and described SQL Information Monitoring table includes the whole SQL information in current setting cycle.
S103: resolve described SQL information, obtains the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, is updated in combined field frequency table by the described field combination frequency;
Here, described combined field frequency table is preserved in a database, for depositing the field of described SQL information to described tables of data to be analyzed and/or the inquiry times information of combined field.
S104: be optimized the field in described tables of data to be analyzed and/or combined field according to described combined field frequency table and described field selective information table, obtains the tables of data index set of the optimum of corresponding described tables of data to be analyzed;
Here, the analysis by treating analytical data obtains described field selective information table; By described SQL Information Monitoring table with describedly treat analytical data binding analysis, obtain described combined field frequency table;
Concrete, described optimization comprises: inquire about the field in described field selective information table and described combined field frequency table, obtain the field that selection rate is high, and the field high to selection rate merges and screen, and sets up tables of data index to the field after screening.
The embodiment of the present invention is upgraded field selective information table by tables of data to be analyzed, can improve the precision of each field selection rate in field selective information table; Resolved by the described SQL information used the application program run, obtain the inquiry times information of described SQL information to described tables of data to be analyzed, the selection rate of field in current setting cycle can be reacted; Determined the field selection rate of field in tables of data to be analyzed and/or combined field by field selective information table and combined field frequency table, and according to field selection rate, multiple field is comprehensively analyzed, invalid field and poor efficiency field can be removed; Correspondingly, also been removed invalid index and poor efficiency index, improve the overall performance of application system, saved storage resources; The tables of data index of the optimum finally obtained, can make the Access status that the application program of accessing in tables of data to be analyzed reaches optimum.
Concrete, step S101 upgrades field selective information table according to described statistical information, comprising:
S1011: the field selection rate calculating each field in described tables of data to be analyzed according to described statistical information;
Here, described field selection rate is calculated by p (i)=n (i)/N;
Wherein, p (i) is the field selection rate of i-th field in tables of data to be analyzed; N (i) is the unique value number of i-th field in tables of data to be analyzed.The unique value number of field refers to: the quantity of different value in certain field of a tables of data, and as a field of A table, A table has 100 records, but all record of a field of A table all only has two kinds of values, then the value of n (i) is 2; N is tables of data record to be analyzed sum.
S1012: described field selection rate is inserted described field selective information table, realizes the renewal to described field selective information table.
Then, in step S102, can obtain by database interface the SQL information run and be saved in SQL Information Monitoring table;
Wherein, described SQL Information Monitoring table comprises the execution number of times of SQL statement, SQL executive plan and SQL statement.
Afterwards, in step S103, described SQL information is resolved, obtains the inquiry times information of described SQL information to described tables of data to be analyzed, by described inquiry times information updating in combined field frequency table, comprising:
Search field and SQL executive plan is parsed from described SQL information; The execution number of times of described SQL statement and/or the correlation inquiry execution number of times of SQL statement is obtained according to described search field and SQL executive plan; The correlation inquiry of the execution number of times of described SQL statement and/or SQL statement is performed the field combination frequency that number of times is converted into tables of data to be analyzed, and is updated in combined field frequency table.Wherein, described correlation inquiry performs number of times and refers to when inquiring about multiple tables of data (if tables of data is nested circulation association etc.), determines the inquiry times of field according to the SQL executive plan of tables of data connection.
The detailed process of step 103 comprises:
S1031: read one in the described SQL information described SQL statement do not analyzed, parses the search field of described SQL statement in described tables of data to be analyzed from described SQL statement;
S1032: whether relevantly with described tables of data to be analyzed inquire about described SQL statement according to described search field, if so, then enter step S1033; Otherwise enter step S1039;
S1033: judge described search field, if described search field is for singly showing inquiry, then enters step S1037; Otherwise described search field is the correlation inquiry of multilist, enters step S1034;
If so, and described SQL executive plan corresponding to described SQL statement is not nested circulation S1034: whether inquire about described tables of data to be analyzed is table-drive, then enters step S1037; Otherwise enter step S1035;
S1035: the row source number parsing contingency table from described SQL executive plan, enters step S1036;
S1036: judge whether described SQL executive plan is nested circulation, if so, then enters step S1038; Otherwise enter step S1037;
S1037: the execution number of times obtaining the described SQL statement in setting cycle, upgrades the information of described combined field frequency table, enters step S1039 in conjunction with described search field;
S1038: the execution number of times obtaining the described SQL statement in setting cycle, the row source number execution number of times of described SQL statement being multiplied by described contingency table obtains the correlation inquiry execution number of times of SQL statement, upgrade the information of described combined field frequency table in conjunction with described search field, enter step S1039;
S1039: the whole described SQL statement judging whether the described SQL Information Monitoring table read in setting cycle, if so, then terminates present treatment flow process; Otherwise return step S1031.
Finally, in step S104, according to described combined field frequency table and described field selective information table, the field in described tables of data to be analyzed and/or combined field are analyzed, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed, comprising:
S1041: according to the selection rate of field in described field selective information table, to the field in described combined field frequency table and/or combined field selectively rate order from high to low sort;
S1042: the field identical to the beginning field after sequence and/or combined field merge;
S1043: the field after being combined by setup parameter and/or combined field are screened, sets up index to the field after screening and/or combined field, obtains the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
Step S104 obtains the high field of selection rate, and the field high to selection rate merges and screen, and sets up tables of data index, like this, just eliminate invalid field and poor efficiency field to the field after screening; Correspondingly, also been removed invalid index and poor efficiency index, improve the overall performance of application system, saved storage resources; After obtaining optimum tables of data index set, application program carries out data access by the field in optimum tables of data index.According to description above, the tables of data index set of optimum is now the index after removing invalid and poor efficiency field, so, the Access status that application program reaches optimum can be made.
Embodiment 2
The present embodiment and embodiment 1 belong to same inventive concept.Present embodiments provide a kind of tables of data indexed set symphysis apparatus for converting, as shown in Figure 2, this device comprises the structure drawing of device of the present embodiment:
Field selection rate initialization unit 201, for scanning tables of data to be analyzed statistical information in a database, upgrades field selective information table according to described statistical information;
SQL information acquisition unit 202, the SQL information used when running for acquisition applications program from database;
Here, the SQL information collected can be kept in SQL Information Monitoring table; Described SQL Information Monitoring table comprises the execution number of times of SQL statement, SQL executive plan and SQL statement;
SQL performs information analysis unit 203, for resolving described SQL information, obtaining the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, being updated in combined field frequency table by the described field combination frequency;
Tables of data indexed set closes generation unit 204, for being optimized the field in described tables of data to be analyzed and/or combined field according to described combined field frequency table and described field selective information table, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
Wherein, described field selection rate initialization unit 201 comprises: field selection rate computing module 2011 and field selective information table update module 2012; Wherein,
Field selection rate computing module 2011, for calculating the field selection rate of each field in described tables of data to be analyzed according to described statistical information;
Described field selection rate is calculated by p (i)=n (i)/N;
Wherein, p (i) is the field selection rate of i-th field in tables of data to be analyzed; N (i) is the unique value number of i-th field in tables of data to be analyzed; N is record sum in tables of data to be analyzed.
Field selective information table update module 2012, for described field selection rate is inserted described field selective information table, realizes the renewal to described field selective information table.
Described SQL performs information analysis unit 203 and comprises: parsing module, correlation inquiry perform number of times module and combined field frequency table update module; Wherein,
Parsing module, for parsing search field and SQL executive plan from described SQL information;
Correlation inquiry performs number of times module, and the correlation inquiry for the execution number of times and/or SQL statement that obtain described SQL statement according to described search field and SQL executive plan performs number of times;
Combined field frequency table update module, for the correlation inquiry of the execution number of times of described SQL statement and/or SQL statement is performed the field combination frequency that number of times is converted into tables of data to be analyzed, and is updated in combined field frequency table.
Concrete, described parsing module comprises:
SQL statement parsing module 2031, for reading a described SQL statement do not analyzed in described SQL Information Monitoring table, parses the search field of described SQL statement in described tables of data to be analyzed from described SQL statement;
Described correlation inquiry performs number of times module and comprises: correlativity judge module 2032, search field judge module 2033, table-drive and cycle criterion module 2034, row source number parsing module 2035 and nested cycle criterion module 2036; Wherein,
Whether correlativity judge module 2032 is relevant with described tables of data to be analyzed for inquiring about described SQL statement;
Search field judge module 2033, for judging described search field;
Whether table-drive and cycle criterion module 2034 are table-drive for inquiring about described tables of data to be analyzed, and whether described SQL executive plan corresponding to described SQL statement is nested circulation;
Row source number parsing module 2035, for parsing the row source number of contingency table from described SQL executive plan;
Nested cycle criterion module 2036, for judging whether described SQL executive plan is nested circulation;
Described combined field frequency table update module comprises: the first combined field frequency table update module 2037, second combined field frequency table update module 2038; Wherein,
First combined field frequency table update module 2037, for obtaining the execution number of times of the described SQL statement in setting cycle, upgrades the information of described combined field frequency table in conjunction with described search field;
Second combined field frequency table update module 2038, for obtaining the execution number of times of the described SQL statement in setting cycle, the row source number execution number of times of described SQL statement being multiplied by described contingency table obtains the correlation inquiry execution number of times of SQL statement, upgrades the information of described combined field frequency table in conjunction with described search field;
In addition, described combined field frequency table update module can also comprise SQL statement enquiry module 2039, for judging whether the whole described SQL statement of the described SQL Information Monitoring table read in setting cycle.
Described tables of data indexed set closes generation unit 204 and comprises: field selectively rate order module 2041, field merge cells 2042 and tables of data indexed set closes generation module 2043; Wherein,
Field is rate order module 2041 selectively, for according to the selection rate of field in described field selective information table to the field in described combined field frequency table and/or combined field selectively rate order from high to low sort;
Field merge cells 2042, for merging the identical field of beginning field after sequence and/or combined field;
Tables of data indexed set closes generation module 2043, screen for the field after being combined by setup parameter and/or combined field, index is set up to the field after screening and/or combined field, obtains the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
The related content described in the annexation of each unit and module and embodiment 1 is corresponding, repeats no more herein.
Embodiment 3
The present invention is described in detail by an actual scene for the present embodiment.
The embodiment of the present invention adopts a kind of new technical thought to assess data table index design rationality and correct, by a set of timer according to the design of new thinking, reach the object of the optimum index set that the extensive application routine access tables of data run in identification complication system needs, and undertaken one by the Index Design present situation of this kind of recognition methods application programs and assess and correct.
As shown in Figure 3, its basic ideas are the building-block of logic of the present embodiment:
1.: which tables of data to be analyzed the inquiry of field selection rate initialization unit has need to identify index optimal set;
2., 3.: whether field selection rate initialization unit inquires about these tables of data to be analyzed by database interface has statistical information, as then the field selective information initialization of tables of data to be analyzed will be carried out according to default rule without statistical information; Upgrade if any statistical information or statistical information, then carry out the field selective information initialization of tables of data to be analyzed according to the statistical information collected;
4.: SQL information acquisition unit gathers all SQL statement information in lane database operation in setting cycle by database interface and is stored in by specific format in SQL Information Monitoring table;
5., 6.: SQL performs the information that information analysis unit resolves needs from SQL Information Monitoring table, the mainly execution number of times of field and/or combined field and correspondence thereof, stored in combined field frequency table after processing process.
7., 8. and 9.: tables of data index generation unit is analyzed information in combined field frequency table, and in conjunction with the field selection rate in field selective information table, various field and/or combined field are merged, and by the parameter that user sets, filter out the tables of data index set of the optimum of qualified correspondence tables of data to be analyzed.
Corresponding, tables of data indexed set symphysis apparatus for converting comprises:
Field selection rate initialization unit 201: this unit periodic scanning tables of data to be analyzed statistical information alteration in a database, to increase or reason tables of data statistical information to be analyzed upgrades as found that there is tables of data to be analyzed, then calculate according to the new statistical information collected and corresponding content more in newer field selective information table:
The field unique value number/table record sum of field selection rate=table to be analyzed.
Field selection rate value (0 ~ 1), more represents selectivity close to 1 better.
Data list structure to be analyzed is as shown in table 1.
Field name Field type Chinese describes Describe in detail Remarks
ID NUMBER(14) Major key
TABNAME VARchar2(200) Table name claims
Table 1
The field selective information list structure obtained is as shown in table 2.
Table 2
SQL information acquisition unit 202: this unit gathers all SQL statement information run at lane database from database, comprise the execution number of times of the SQL statement content in certain period, SQL executive plan, SQL statement, and by relevant information stored in SQL Information Monitoring table.
SQL Information Monitoring list structure is as shown in table 3.
Field name Field type Chinese describes Describe in detail Remarks
ID NUMBER(14) Major key
SQLTEXT VARchar2(3000) SQL content
SQLPLANTEXT VARchar2(4000) SQL executive plan content
EXECUTIONS NUMBER(6) Perform number of times
STARTTIME Date Start time
ENDTIME Date End time
Table 3
SQL performs information analysis unit 203, this unit is core of the present invention, it is by reading the information in SQL Information Monitoring table, the search field of SQL statement in tables of data to be analyzed is parsed from SQL statement, and the parsing combined SQL executive plan, the execution number of times of SQL statement, calculate this SQL in setting cycle, treat the inquiry times that in analytical data, certain several field combination is carried out, inquiry times is converted into the field combination frequency of tables of data to be analyzed, and the field combination frequency is updated in combined field frequency table;
Table field combination frequency table structure to be analyzed is as shown in table 4.
Field name Field type Chinese describes Describe in detail Remarks
ID NUMBER(14) Major key
TABNAME VARchar2(200) Table name claims
COMBILED_CODE VARchar2(100) Combined code
EXECUTIONS NUMBER(6) Perform number of times
STARTTIME Date Start time
ENDTIME Date End time
Table 4
Tables of data indexed set closes generation unit 204, this unit is by analyzing the information in combined field frequency table, and in conjunction with the field selection rate in field selective information table, various field and/or combined field are merged, and by parameter that user sets, filter out the tables of data index set of qualified optimum, the proportion such as accounting for search index total amount according to the inquiry meeting certain index situation carries out rule-based filtering, to lower than millesimal situation, show that relevant SQL statement can not the blocking of initiating system if analyzed, then do not need to set up this index; By traveling through all fields and/or combined field and merging, after conditional plan filters, export the tables of data index set of optimum corresponding to the table to be analyzed of suggestion.
The method detailed step of the embodiment of the present invention:
Field selection rate initialization unit 201 periodic scanning tables of data to be analyzed statistical information in a database, to increase or existing table statistical information to be analyzed has renewal as found that there is tables of data to be analyzed, then calling data bank interface obtains the information that relevant information is come in more newer field selective information table.Such as, in oracle database, num_distinct field value can be extracted by dba_tab_columns table, table row numerical value is obtained by the num_rows field in dba_tables, by the acquisition of above-mentioned information can calculate each field of table to be analyzed selection rate, its method flow diagram adopted is as shown in Figure 4.
SQL information acquisition unit 202 is also that the interface provided by database obtains in a certain period (setting cycle), all SQL statement information performed in a database, such as in oracle database, by four view v $ active_session_history, dba_hist_sqltext, dba_hist_sql_plan, the correlation inquiry of dba_hist_sqlstat, gather the SQL statement content performed in certain period, the executive plan content that SQL statement is corresponding, the number of times that SQL statement performs, and relevant information is stored to SQL Information Monitoring table, its method flow diagram adopted as shown in Figure 5.
SQL performs information analysis unit 203 by resolving SQL statement content from SQL Information Monitoring table, and can obtain table name and claim and search field and SQL executive plan, if the inquiry of single table, then the number of times that SQL statement performs is exactly that search field performs number of times; If the correlation inquiry of multilist (as nested circulation association etc.), then need the executive plan of resolving SQL statement, therefrom parsing which table is table-drive, confirms the position residing for analyzed tables of data; If analyzed tables of data is table-drive, and executive plan employing is not nested circulation, then SQL statement performs number of times is exactly that search field performs number of times.Otherwise need the row source number according to the contingency table parsed in executive plan, take advantage of the execution number of times in SQL statement to calculate the execution number of times of search field, its method flow diagram adopted as shown in Figure 6.
Tables of data indexed set closes generation unit 204 by the combination in acquisition combined field frequency table and frequency information, first according to field selection rate height, combination internal sort is carried out to field and/or combined field, then to starting the identical field of field and/or combined field merges.Such as, the combination of field A, B, C, A field selection rate is the highest, and B is minimum.Suppose the retrieval having several combination ABC, BCA, BC, A, C, AC, then by being ACB, ACB, CB, A, AC after sequence, after merging, remain ACB, CB two kinds of combined fields.Wherein, find that again the frequency that CB combination occurs is very low, less than the per mille of all inquiries, then the inquiry of CB field combination can not set up index, and last optimum indexed set is combined into ACB, and its method flow diagram adopted as shown in Figure 7.
The embodiment of the present invention can identify that tables of data to be analyzed application program within certain period (setting cycle) runs the optimal data table index set needed, and the existing index set can treated in analytical data by this optimal data table index set is assessed, find whether there is invalid index, poor efficiency index, lack the index of important information, or the need of being evaded the bad but frequent index used of certain selectivity by application design modifying, the index set in existing tables of data to be analyzed is optimized with this, thus reach the invalid index of solution minimizing, the technical matterss such as poor efficiency index and the unreasonable application data inefficient operation caused of applied logic design, solve storage resources waste, the problems such as query performance is not good.
The embodiment of the present invention makes user can identify the optimal data table index set of tables of data to be analyzed within certain period that application program is run, and solves traditional index analytical approach and can only carry out independent analysis to single index, can not identify the problem of optimum index set.By adjustment existing system invalid index, poor efficiency index, the application program efficiency in tables of data to be analyzed can be made to reach optimum state, to promote application system overall performance, save storage resources and important effect is played in system adjustment and optimization analytical work.The present invention can be used alone in the Performance tuning analytical work of application system, or be integrated in tuning tool software with support system performance evaluation work, also can assess accordingly trial run system software development work, the problem of Timeliness coverage Index Design, possesses very high using value.
In several embodiments that the application provides, should be understood that disclosed equipment and method can realize by another way.Apparatus embodiments described above is only schematic, such as, the division of described unit, be only a kind of logic function to divide, actual can have other dividing mode when realizing, and as: multiple unit or assembly can be in conjunction with, maybe can be integrated into another system, or some features can be ignored, or do not perform.In addition, the coupling each other of shown or discussed each ingredient or direct-coupling or communication connection can be by some interfaces, and the indirect coupling of equipment or unit or communication connection can be electrical, machinery or other form.
The above-mentioned unit illustrated as separating component or can may not be and physically separates, and the parts as unit display can be or may not be physical location, namely can be positioned at a place, also can be distributed in multiple network element; Part or all of unit wherein can be selected according to the actual needs to realize the object of the present embodiment scheme.
In addition, each functional unit in various embodiments of the present invention can all be integrated in a processing module, also can be each unit individually as a unit, also can two or more unit in a unit integrated; Above-mentioned integrated unit both can adopt the form of hardware to realize, and the form that hardware also can be adopted to add SFU software functional unit realizes.
One of ordinary skill in the art will appreciate that: all or part of step realizing said method embodiment can have been come by the hardware that programmed instruction is relevant, aforesaid program can be stored in a computer read/write memory medium, this program, when performing, performs the step comprising said method embodiment; And aforesaid storage medium comprises: movable storage device, ROM (read-only memory) (ROM, Read-OnlyMemory), random access memory (RAM, RandomAccessMemory), magnetic disc or CD etc. various can be program code stored medium.
The above; be only the specific embodiment of the present invention, but protection scope of the present invention is not limited thereto, is anyly familiar with those skilled in the art in the technical scope that the present invention discloses; change can be expected easily or replace, all should be encompassed within protection scope of the present invention.Therefore, protection scope of the present invention should be as the criterion with the protection domain of described claim.

Claims (11)

1. the symphysis of tables of data indexed set becomes a method, it is characterized in that, described method comprises:
Scan tables of data to be analyzed statistical information in a database, according to described statistical information, field selective information table is upgraded;
The SQL information used when acquisition applications program is run from database;
Described SQL information is resolved, obtains the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, the described field combination frequency is updated in combined field frequency table;
According to described combined field frequency table and described field selective information table, the field in described tables of data to be analyzed and/or combined field are optimized, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
2. method according to claim 1, is characterized in that, describedly carries out renewal according to described statistical information to field selective information table and comprises:
The field selection rate of each field in described tables of data to be analyzed is calculated according to described statistical information;
Described field selection rate is inserted described field selective information table, realizes the renewal to described field selective information table.
3. method according to claim 2, is characterized in that, described field selection rate is calculated by p (i)=n (i)/N;
Wherein, p (i) is the field selection rate of i-th field in tables of data to be analyzed; N (i) is the unique value number of i-th field in tables of data to be analyzed; N is tables of data record to be analyzed sum.
4. method according to claim 1, is characterized in that, described SQL information comprises the execution number of times of SQL statement, SQL executive plan and SQL statement.
5. method according to claim 4, it is characterized in that, described SQL information is resolved, obtains the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, the described field combination frequency is updated to combined field frequency table and comprises:
Search field and SQL executive plan is parsed from described SQL information;
The execution number of times of described SQL statement and/or the correlation inquiry execution number of times of SQL statement is obtained according to described search field and SQL executive plan;
The correlation inquiry of the execution number of times of described SQL statement and/or SQL statement is performed the field combination frequency that number of times is converted into tables of data to be analyzed, and is updated in combined field frequency table.
6. method according to claim 5, it is characterized in that, describedly be optimized the field in described tables of data to be analyzed and/or combined field according to described combined field frequency table and described field selective information table, the tables of data index set obtaining the optimum of corresponding described tables of data to be analyzed comprises:
According to the selection rate of field in described field selective information table to the field in described combined field frequency table and/or combined field selectively rate order from high to low sort;
The field identical to the beginning field after sequence and/or combined field merge;
Field after being combined by setup parameter and/or combined field are screened, and set up index to the field after screening and/or combined field, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
7. a tables of data indexed set symphysis apparatus for converting, is characterized in that, described device comprises:
Field selection rate initialization unit, for scanning tables of data to be analyzed statistical information in a database, upgrades field selective information table according to described statistical information;
SQL information acquisition unit, the SQL information used when running for acquisition applications program from database;
SQL performs information analysis unit, for resolving described SQL information, obtaining the field of described SQL information to described tables of data to be analyzed and/or the field combination frequency of combined field, being updated in combined field frequency table by the described field combination frequency;
Tables of data indexed set closes generation unit, for being optimized the field in described tables of data to be analyzed and/or combined field according to described combined field frequency table and described field selective information table, obtain the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
8. device according to claim 7, is characterized in that, described field selection rate initialization unit comprises:
Field selection rate computing module, for calculating the field selection rate of each field in described tables of data to be analyzed according to described statistical information;
Field selective information table update module, for described field selection rate is inserted described field selective information table, realizes the renewal to described field selective information table.
9. device according to claim 8, is characterized in that, described field selection rate is calculated by p (i)=n (i)/N;
Wherein, p (i) is the field selection rate of i-th field in tables of data to be analyzed; N (i) is the unique value number of i-th field in tables of data to be analyzed; N is record sum in tables of data to be analyzed.
10. device according to claim 9, is characterized in that, described SQL performs information analysis unit and comprises:
Parsing module, for parsing search field and SQL executive plan from described SQL information;
Correlation inquiry performs number of times module, and the correlation inquiry for the execution number of times and/or SQL statement that obtain described SQL statement according to described search field and SQL executive plan performs number of times;
Combined field frequency table update module, for the correlation inquiry of the execution number of times of described SQL statement and/or SQL statement is performed the field combination frequency that number of times is converted into tables of data to be analyzed, and is updated in combined field frequency table.
11. devices according to claim 10, is characterized in that, described tables of data indexed set closes generation unit and comprises:
Field is rate order module selectively, for according to the selection rate of field in described field selective information table to the field in described combined field frequency table and/or combined field selectively rate order from high to low sort;
Field merge cells, for merging the identical field of beginning field after sequence and/or combined field;
Tables of data indexed set closes generation module, screens for the field after being combined by setup parameter and/or combined field, sets up index to the field after screening and/or combined field, obtains the tables of data index set of the optimum of corresponding described tables of data to be analyzed.
CN201410331106.XA 2014-07-11 2014-07-11 A kind of tables of data indexed set symphysis is at method and device Active CN105320679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410331106.XA CN105320679B (en) 2014-07-11 2014-07-11 A kind of tables of data indexed set symphysis is at method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410331106.XA CN105320679B (en) 2014-07-11 2014-07-11 A kind of tables of data indexed set symphysis is at method and device

Publications (2)

Publication Number Publication Date
CN105320679A true CN105320679A (en) 2016-02-10
CN105320679B CN105320679B (en) 2019-05-24

Family

ID=55248081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410331106.XA Active CN105320679B (en) 2014-07-11 2014-07-11 A kind of tables of data indexed set symphysis is at method and device

Country Status (1)

Country Link
CN (1) CN105320679B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202403A (en) * 2016-07-11 2016-12-07 浪潮通用软件有限公司 Method for creating database index through dynamic analysis
CN106919678A (en) * 2017-02-27 2017-07-04 武汉珞佳伟业科技有限公司 A kind of database inquiry optimization system and method
CN107239451A (en) * 2016-03-28 2017-10-10 北京京东尚科信息技术有限公司 Database index creation method and device
CN107515886A (en) * 2016-06-17 2017-12-26 阿里巴巴集团控股有限公司 A kind of recognition methods of tables of data, device and system
CN107562762A (en) * 2016-07-01 2018-01-09 中国联合网络通信集团有限公司 Data directory construction method and device
CN107818114A (en) * 2016-09-14 2018-03-20 中国移动通信有限公司研究院 A kind of data processing method, device and database
CN108038135A (en) * 2017-11-21 2018-05-15 平安科技(深圳)有限公司 Electronic device, the method for multilist correlation inquiry and storage medium
CN108073612A (en) * 2016-11-14 2018-05-25 平安科技(深圳)有限公司 The method and apparatus of synchronous SQL executive plans
CN108268515A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 The selection method and device of Aggregation Table dimension
CN108664481A (en) * 2017-03-27 2018-10-16 中国移动通信集团内蒙古有限公司 A kind of data retrieval method and server
CN108920664A (en) * 2018-07-05 2018-11-30 福建星瑞格软件有限公司 A kind of database intelligence index implementation method based on index value
CN108984698A (en) * 2018-07-05 2018-12-11 福建星瑞格软件有限公司 A kind of modeling method of data bank service behavior
CN109063086A (en) * 2018-07-26 2018-12-21 中兴飞流信息科技有限公司 Execution method, apparatus, server and the storage medium of structured query language
CN110909003A (en) * 2019-11-25 2020-03-24 车智互联(北京)科技有限公司 Method for creating data table and computing equipment
CN111427920A (en) * 2020-03-16 2020-07-17 深圳市腾讯计算机系统有限公司 Data acquisition method, device, system, computer equipment and storage medium
CN112783758A (en) * 2019-11-11 2021-05-11 阿里巴巴集团控股有限公司 Test case library and feature library generation method, device and storage medium
CN113590632A (en) * 2021-08-11 2021-11-02 平安普惠企业管理有限公司 Database index creating method, device, equipment and medium
WO2023078130A1 (en) * 2021-11-03 2023-05-11 中兴通讯股份有限公司 Index creation method and apparatus, and computer-readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5560007A (en) * 1993-06-30 1996-09-24 Borland International, Inc. B-tree key-range bit map index optimization of database queries
CN101059810A (en) * 2007-03-16 2007-10-24 华为技术有限公司 System and method for implementing automatic optimization of data base system
CN103164455A (en) * 2011-12-15 2013-06-19 百度在线网络技术(北京)有限公司 Optimization method and optimization device of data base
CN103390066A (en) * 2013-08-08 2013-11-13 上海新炬网络技术有限公司 Database overall automation optimizing early warning device and processing method thereof
CN103810212A (en) * 2012-11-14 2014-05-21 阿里巴巴集团控股有限公司 Automated database index creation method and system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5560007A (en) * 1993-06-30 1996-09-24 Borland International, Inc. B-tree key-range bit map index optimization of database queries
CN101059810A (en) * 2007-03-16 2007-10-24 华为技术有限公司 System and method for implementing automatic optimization of data base system
CN103164455A (en) * 2011-12-15 2013-06-19 百度在线网络技术(北京)有限公司 Optimization method and optimization device of data base
CN103810212A (en) * 2012-11-14 2014-05-21 阿里巴巴集团控股有限公司 Automated database index creation method and system
CN103390066A (en) * 2013-08-08 2013-11-13 上海新炬网络技术有限公司 Database overall automation optimizing early warning device and processing method thereof

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239451A (en) * 2016-03-28 2017-10-10 北京京东尚科信息技术有限公司 Database index creation method and device
CN107515886A (en) * 2016-06-17 2017-12-26 阿里巴巴集团控股有限公司 A kind of recognition methods of tables of data, device and system
CN107562762B (en) * 2016-07-01 2020-10-27 中国联合网络通信集团有限公司 Data index construction method and device
CN107562762A (en) * 2016-07-01 2018-01-09 中国联合网络通信集团有限公司 Data directory construction method and device
CN106202403B (en) * 2016-07-11 2020-03-20 浪潮通用软件有限公司 Method for creating database index through dynamic analysis
CN106202403A (en) * 2016-07-11 2016-12-07 浪潮通用软件有限公司 Method for creating database index through dynamic analysis
CN107818114A (en) * 2016-09-14 2018-03-20 中国移动通信有限公司研究院 A kind of data processing method, device and database
CN108073612A (en) * 2016-11-14 2018-05-25 平安科技(深圳)有限公司 The method and apparatus of synchronous SQL executive plans
CN108268515A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 The selection method and device of Aggregation Table dimension
CN108268515B (en) * 2016-12-30 2020-07-31 北京国双科技有限公司 Selection method and device for dimension of aggregation table
CN106919678A (en) * 2017-02-27 2017-07-04 武汉珞佳伟业科技有限公司 A kind of database inquiry optimization system and method
CN108664481A (en) * 2017-03-27 2018-10-16 中国移动通信集团内蒙古有限公司 A kind of data retrieval method and server
CN108664481B (en) * 2017-03-27 2021-03-23 中国移动通信集团内蒙古有限公司 Data retrieval method and server
CN108038135A (en) * 2017-11-21 2018-05-15 平安科技(深圳)有限公司 Electronic device, the method for multilist correlation inquiry and storage medium
CN108920664B (en) * 2018-07-05 2022-04-15 福建星瑞格软件有限公司 Database intelligent index implementation method based on index value
CN108920664A (en) * 2018-07-05 2018-11-30 福建星瑞格软件有限公司 A kind of database intelligence index implementation method based on index value
CN108984698B (en) * 2018-07-05 2023-06-27 福建星瑞格软件有限公司 Modeling method for database business behavior
CN108984698A (en) * 2018-07-05 2018-12-11 福建星瑞格软件有限公司 A kind of modeling method of data bank service behavior
CN109063086A (en) * 2018-07-26 2018-12-21 中兴飞流信息科技有限公司 Execution method, apparatus, server and the storage medium of structured query language
CN112783758A (en) * 2019-11-11 2021-05-11 阿里巴巴集团控股有限公司 Test case library and feature library generation method, device and storage medium
CN112783758B (en) * 2019-11-11 2024-02-27 阿里云计算有限公司 Test case library and feature library generation method, device and storage medium
CN110909003B (en) * 2019-11-25 2022-06-10 车智互联(北京)科技有限公司 Method for creating data table and computing equipment
CN110909003A (en) * 2019-11-25 2020-03-24 车智互联(北京)科技有限公司 Method for creating data table and computing equipment
CN111427920A (en) * 2020-03-16 2020-07-17 深圳市腾讯计算机系统有限公司 Data acquisition method, device, system, computer equipment and storage medium
CN111427920B (en) * 2020-03-16 2023-08-11 深圳市腾讯计算机系统有限公司 Data acquisition method, device, system, computer equipment and storage medium
CN113590632A (en) * 2021-08-11 2021-11-02 平安普惠企业管理有限公司 Database index creating method, device, equipment and medium
CN113590632B (en) * 2021-08-11 2023-12-19 北京云拓科技有限公司 Database index creation method, device, equipment and medium
WO2023078130A1 (en) * 2021-11-03 2023-05-11 中兴通讯股份有限公司 Index creation method and apparatus, and computer-readable storage medium

Also Published As

Publication number Publication date
CN105320679B (en) 2019-05-24

Similar Documents

Publication Publication Date Title
CN105320679A (en) Data table index set generation method and device
CN103390066B (en) A kind of database overall automation optimizes prior-warning device and disposal route thereof
CN103092867B (en) Method and system for managing data, and data analyzing device
CN104714984A (en) Database optimization method and device
CN104899295B (en) A kind of heterogeneous data source data relation analysis method
CN103678494A (en) Method and device for client side and server side data synchronization
CN105069134A (en) Method for automatically collecting Oracle statistical information
CN104111958A (en) Data query method and device
CN103246745A (en) Device and method for processing data based on data warehouse
CN109376142A (en) Data migration method and terminal device
CN104897817A (en) Chromatographic instrument and reuse method of chromatographic columns
CN105677687A (en) Data processing method and device
CN105095436A (en) Automatic modeling method for data of data sources
CN115237947A (en) SQL database optimization processing method and device, intelligent terminal and medium
CN102004771B (en) Method for querying reverse neighbors of moving object based on dynamic cutting
KR20210060830A (en) Big data intelligent collecting method and device
CN110704442A (en) Real-time acquisition method and device for big data
CN103491564B (en) Self-diagnostic method and system of mobile terminal
CN107085603B (en) Data processing method and device
CN108334565A (en) A kind of data mixing storage organization, data store query method, terminal and medium
EP2761508A1 (en) Dynamic database indexing
CN111078728B (en) Cross-database query method and device in database archiving mode
CN106570160A (en) Mass spatio-temporal data cleaning method and mass spatio-temporal data cleaning device
CN110019192B (en) Database retrieval method and device
CN102054008A (en) Method and device for acquiring network information

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant