CN115905222A - Intelligent index recommendation method and device for airborne database management - Google Patents

Intelligent index recommendation method and device for airborne database management

Info

Publication number
CN115905222A
Authority
CN
China
Prior art keywords
query
database
index
condition
equivalence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211411062.2A
Other languages
Chinese (zh)
Inventor
王晓昱
马望福
高嘉巍
蔺伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Flight Automatic Control Research Institute of AVIC
Original Assignee
Xian Flight Automatic Control Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Flight Automatic Control Research Institute of AVIC
Priority to CN202211411062.2A
Publication of CN115905222A
Legal status: Pending

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides an intelligent index recommendation method and device for airborne database management. The method takes airborne database index design data from different scenarios as samples, uses a naive Bayes model to mine the association between airborne database scenario characteristics and database index selection, and, based on that association, recommends an optimal index method for the scenario characteristics of each airborne database application and builds the database index. This improves airborne database query efficiency and the mission assurance capability of the aircraft flight management system. The invention lowers the technical threshold for database management designers, reduces the cost of software design and analysis, and greatly improves the efficiency of database scheme design.

Description

Intelligent index recommendation method and device for airborne database management
Technical Field
The invention belongs to the technical field of airborne databases, and particularly relates to an intelligent index recommendation method and device for airborne database management.
Background
In addition to the general properties of databases (atomicity, consistency, isolation, durability), airborne databases must satisfy both real-time and reliability requirements. However, a traditional airborne embedded database performs many time-consuming operations at run time, such as file reading and writing, data query, computation, and data exchange. The airborne database is an important data source for every system on the aircraft, so real-time, efficient queries against it are crucial to controlling the flight process and correctly carrying out the flight mission.
Therefore, researching an intelligent index recommendation method and device for airborne database management lowers the technical threshold for database designers, reduces software design and analysis costs, and greatly improves the efficiency of database scheme design; it has important engineering significance and practical value for improving the safety of the airborne system itself.
Disclosure of Invention
The invention provides an intelligent index recommendation method and device for airborne database management, which can be used for reducing the technical threshold of database designers, reducing the software design and analysis cost and greatly improving the design efficiency of database schemes.
The invention provides an intelligent index recommendation method for airborne database management, which comprises the following steps:
acquiring characteristic attributes of a database, the characteristic attributes including the database's own attributes and the database's application attributes;
taking the characteristic attributes of the database as the input of a trained classifier, and obtaining the recommended index type for the database;
wherein the trained classifier is obtained by training a naive Bayes model on training samples.
Optionally, the method further includes:
after detecting that a user has modified the recommended index type of a database, adding the database and the modified index type to the training samples as a new sample;
training a naive Bayes model on the updated training samples, using K-fold cross-validation, to obtain the latest classifier.
Optionally, the database's own attributes include at least one of the following:
the number of database rows, the number of database columns, and the database storage mode.
Optionally, the database application attribute includes at least one of the following:
the application scenario, equality queries, range queries, joint queries, proximity queries, the index column source, and the index column data type;
the application scenario includes: query only, mostly queries, other, and uncertain;
the equality query includes: no equality query, single-condition equality query, 2-condition equality query, 3-condition equality query, 4-condition equality query, 5-condition equality query, and equality query with 6 or more conditions;
the range query includes: no range query, single-condition range query, 2-condition range query, 3-condition range query, 4-condition range query, 5-condition range query, and range query with 6 or more conditions;
the joint query includes: joint query present, and no joint query;
the proximity query includes: proximity query present, and no proximity query;
the index column source includes: original data, generated data, and uncertain;
the index column data type includes: char, int, float, double, text, picture, and bit.
Optionally, the index type includes at least one of the following:
sequential query, hash index, balanced binary tree index, B/B+ tree index, T/T tree index, R tree index, and KD tree index.
In another aspect, the present invention provides an intelligent index recommendation device for airborne database management, including a storage module, a training module, and a recommendation module;
the recommendation module is used for acquiring the characteristic attributes of a database, taking the characteristic attributes of the database as the input of the trained classifier, and obtaining the recommended index type for the database;
wherein the characteristic attributes include the database's own attributes and the database's application attributes; the trained classifier is obtained by the training module, which trains a naive Bayes model on the training samples stored in the storage module.
Optionally, the storage module is further configured to, after detecting that a user has modified the recommended index type of a database, add the database and the modified index type to the training samples as a new sample;
the training module is further configured to train a naive Bayes model on the updated training samples, using K-fold cross-validation, to obtain the latest classifier.
Optionally, the database's own attributes include at least one of the following:
the number of database rows, the number of database columns, and the database storage mode.
Optionally, the database application attribute includes at least one of the following:
the application scenario, equality queries, range queries, joint queries, proximity queries, the index column source, and the index column data type;
the application scenario includes: query only, mostly queries, other, and uncertain;
the equality query includes: no equality query, single-condition equality query, 2-condition equality query, 3-condition equality query, 4-condition equality query, 5-condition equality query, and equality query with 6 or more conditions;
the range query includes: no range query, single-condition range query, 2-condition range query, 3-condition range query, 4-condition range query, 5-condition range query, and range query with 6 or more conditions;
the joint query includes: joint query present, and no joint query;
the proximity query includes: proximity query present, and no proximity query;
the index column source includes: original data, generated data, and uncertain;
the index column data type includes: char, int, float, double, text, picture, and bit.
Optionally, the index type includes at least one of the following:
sequential query, hash index, balanced binary tree index, B/B+ tree index, T/T tree index, R tree index, and KD tree index.
The invention provides an intelligent index recommendation method and device for airborne database management. Airborne databases place strict real-time requirements on data retrieval, yet traditional sequential query or binary tree index query is inefficient. To address this, the invention takes airborne database index design data from different scenarios as samples, uses a naive Bayes model to mine the association between airborne database scenario characteristics and database index selection, and, based on that association, recommends an optimal index method for the scenario characteristics of each airborne database application and builds the database index. This improves airborne database query efficiency and the mission assurance capability of the aircraft flight management system. The invention lowers the technical threshold for database management designers, reduces the cost of software design and analysis, and greatly improves the efficiency of database scheme design.
Drawings
FIG. 1 is a structural diagram of the intelligent index recommendation device for airborne database management provided by the present invention;
FIG. 2 is a flowchart of the intelligent index recommendation method for airborne database management provided by the present invention.
Detailed Description
The intelligent index recommendation method and device for airborne database management provided by the present invention are explained below with reference to the drawings.
FIG. 1 is a structural diagram of the intelligent index recommendation device for airborne database management provided by the present invention. As shown in FIG. 1, the device includes a storage module, a training module, and a recommendation module:
1. Storage module: mainly used for storing and processing training data. Real samples of various airborne databases are collected, sample attributes are extracted according to the database attributes and application scenarios of each database, and the data are preprocessed.
2. Training module: the data samples are divided into training and test sets using K-fold cross-validation; a naive Bayes model is trained on the training set and validated on the test set to obtain the trained classifier, and the classifier parameters are stored.
3. Recommendation module: the attributes of a new airborne database are processed into the form expected by the trained classifier and input into it, and the recommendation module outputs the recommended optimal database index.
Optionally, the optimal database index recommended by the method is input directly into a data management module of the database management system. The data management module functions as follows:
Data management module: starting from the optimal database index recommended by the recommendation module, the software designer makes the final choice of database index according to actual conditions. Database index code for each onboard system is then generated from the finally determined index and the database attribute file.
Optionally, the characteristic attributes (sample attributes) and the finally determined index (sample label) of the airborne database are put into the storage module as a new sample, which expands the training data and improves the accuracy of the classifier in the training module.
The specific steps of the invention are as follows, as shown in FIG. 2:
Step 1: collect airborne database samples.
The samples come from database applications in real onboard systems.
The sample features are the database attributes and application types. The sample labels are the index types actually selected, namely sequential query, hash index, balanced binary tree index, B/B+ tree index, T/T tree index, R tree index, and KD tree index.
Step 2: select characteristic attributes according to the real applications of airborne databases in the field.
The sample characteristic attributes comprise the database's own attributes and the database's application attributes.
The database's own attributes comprise the number of database rows (by default not more than 100,000), the number of database columns (not more than 20), and the database storage mode (memory, disk, or uncertain);
the database application attributes comprise the application scenario (query only, mostly queries, other, uncertain), equality queries (no equality query, single-condition equality query, 2-condition equality query, 3-condition equality query, 4-condition equality query, 5-condition equality query, equality query with 6 or more conditions), range queries (no range query, single-condition range query, 2-condition range query, 3-condition range query, 4-condition range query, 5-condition range query, range query with 6 or more conditions), joint queries (joint query present, no joint query), proximity queries (proximity query present, no proximity query), the index column source (original data, generated data, uncertain), and the index column data type (char, int, float, double, text, picture, bit).
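The attribute list in Step 2 can be turned into a fixed-length categorical feature vector. The following is a minimal sketch, not the patent's implementation: the class name `DatabaseFeatures`, the field names, the category spellings, and in particular the row-count bin edges in `ROW_BIN_EDGES` are assumptions introduced here for illustration. The text only states the upper limits (100,000 rows, 20 columns) and the category values.

```python
import bisect
from dataclasses import dataclass

# Category values follow Step 2; the identifier spellings are assumptions.
STORAGE_MODES = ("memory", "disk", "uncertain")
SCENARIOS = ("query_only", "mostly_queries", "other", "uncertain")
COLUMN_SOURCES = ("original", "generated", "uncertain")
COLUMN_TYPES = ("char", "int", "float", "double", "text", "picture", "bit")
MAX_ROWS = 100_000      # limit stated in the text
MAX_COLUMNS = 20        # limit stated in the text
MAX_CONDITIONS = 6      # "6 or more conditions" is the last bucket
# Hypothetical bin edges for discretizing the row count (not from the source).
ROW_BIN_EDGES = (30, 1_000, 10_000, 100_000)


@dataclass(frozen=True)
class DatabaseFeatures:
    rows: int
    columns: int
    storage_mode: str
    scenario: str
    equality_conditions: int    # 0 means no equality query
    range_conditions: int       # 0 means no range query
    has_joint_query: bool
    has_proximity_query: bool
    column_source: str
    column_type: str

    def encode(self) -> tuple:
        """Map the attributes to a tuple of small non-negative integers."""
        return (
            bisect.bisect_right(ROW_BIN_EDGES, min(self.rows, MAX_ROWS)),
            min(self.columns, MAX_COLUMNS),
            STORAGE_MODES.index(self.storage_mode),
            SCENARIOS.index(self.scenario),
            min(self.equality_conditions, MAX_CONDITIONS),
            min(self.range_conditions, MAX_CONDITIONS),
            int(self.has_joint_query),
            int(self.has_proximity_query),
            COLUMN_SOURCES.index(self.column_source),
            COLUMN_TYPES.index(self.column_type),
        )
```

Keeping every feature categorical lets a count-based naive Bayes model consume the vector directly; a different discretization of the row count would work equally well as long as it is applied consistently to training and new samples.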
Step 3: process the collected data, handle null values, and screen the number of samples of each class to prevent the sample set from being unbalanced. Then divide the data into a training set and a test set.
The data processing includes handling out-of-range data; for example, a database row count greater than 100,000 is treated as 100,000;
it also includes handling uncertain attributes; for example, when a query is actually performed but the number of query conditions is uncertain, it is treated as a single-condition query, and an unknown index column source, index data type, and the like are treated as uncertain.
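The preprocessing rules described for Step 3 can be sketched as follows. This is an illustrative sketch under stated assumptions: the dictionary keys, the use of `None` to mean "the query is used but the condition count is unknown", the clamping of the column count (the text gives the clamp example only for rows), and the per-class cap used for balancing are all hypothetical choices, not details given in the source.

```python
from collections import Counter

MAX_ROWS = 100_000     # limit stated in the text
MAX_COLUMNS = 20       # limit stated in the text; clamping it is an assumption


def preprocess(raw: dict) -> dict:
    """Return a cleaned copy of one raw sample (hypothetical field names)."""
    sample = dict(raw)
    # Out-of-range values are treated as the stated limit.
    sample["rows"] = min(sample["rows"], MAX_ROWS)
    sample["columns"] = min(sample["columns"], MAX_COLUMNS)
    # None is taken to mean: the query is used, but the number of
    # conditions is unknown, so it is treated as a single-condition query.
    for key in ("equality_conditions", "range_conditions"):
        if sample.get(key) is None:
            sample[key] = 1
    # Attributes that list "uncertain" as a category fall back to it.
    for key in ("column_source", "storage_mode"):
        if sample.get(key) is None:
            sample[key] = "uncertain"
    return sample


def cap_per_class(samples, labels, limit):
    """Keep at most `limit` samples per label to reduce class imbalance."""
    kept, counts = [], Counter()
    for sample, label in zip(samples, labels):
        if counts[label] < limit:
            counts[label] += 1
            kept.append((sample, label))
    return kept
```

Capping the majority classes is only one way to limit imbalance; the text does not say which screening rule is used.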
The training set and the test set are divided using N-fold cross-validation. The initial sample set is divided into N subsample sets; a single subsample set is held out as validation data for the model, and the other N-1 subsample sets are used for training. The cross-validation is repeated N times (Steps 4-7) so that each subsample set is used for validation exactly once, and the N results are averaged to give the final validation result. Here N = 5 is used: the initial sample set is divided into 5 subsample sets, and cross-validation is repeated 5 times.
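The fold rotation described here can be sketched as below. This is a minimal sketch: the function names `n_fold_splits` and `cross_validate`, the shuffling with a fixed seed, and the use of accuracy as the averaged result are assumptions; the text only says that the N results are averaged. The `fit` and `predict` arguments stand for any training and prediction routine.

```python
import random


def n_fold_splits(num_samples, n_folds=5, seed=0):
    """Yield (training_indices, validation_indices) for each of the N folds."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal subsample sets.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        validation = folds[held_out]
        training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield training, validation


def cross_validate(samples, labels, fit, predict, n_folds=5):
    """Average validation accuracy over the folds (needs >= n_folds samples)."""
    accuracies = []
    for train_idx, val_idx in n_fold_splits(len(samples), n_folds):
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in val_idx)
        accuracies.append(correct / len(val_idx))
    return sum(accuracies) / n_folds
```

Each sample index appears in exactly one validation fold, which matches the requirement that every subsample set is validated once.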
Let the training set be T = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}, containing N samples, where x denotes the sample features and y denotes the sample label. The joint probability distribution P(X, Y) is learned from the training set.
Step 4: calculate the probability P(C_k) of each class;
Learn the prior probability distribution P(Y = C_k), k = 1, 2, …, K, where K = 7.
Step 5: calculate the conditional probability P(X | C_k) for each characteristic attribute;
The conditional probability is:
$$P(X = x \mid Y = C_k) = P(X^{(1)} = x^{(1)}, \dots, X^{(n)} = x^{(n)} \mid Y = C_k)$$
All features are assumed to be conditionally independent given the class, so that
$$P(X = x \mid Y = C_k) = \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = C_k)$$
Let x^(j) take s_j possible values; if there are m features and k classes, the number of parameters is
(formula given as image BDA0003937786240000072 in the original publication)
Step 6: for each test sample, calculate P(X | C_k) P(C_k) for each class;
From the learned P(X, Y), the posterior probability distribution is obtained by Bayes' theorem:
$$P(Y = C_k \mid X = x) = \frac{P(Y = C_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = C_k)}{\sum_{k'=1}^{K} P(Y = C_{k'}) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = C_{k'})}$$
Step 7: select the class with the largest value as the class of X;
The naive Bayes classifier is obtained:
$$y = f(x) = \arg\max_{C_k} \frac{P(Y = C_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = C_k)}{\sum_{k'=1}^{K} P(Y = C_{k'}) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = C_{k'})}$$
Since the denominator is the same for every C_k, this reduces to
$$y = \arg\max_{C_k} P(Y = C_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = C_k)$$
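Steps 4 to 7 can be sketched as a small count-based classifier for categorical features. This is a hedged illustration, not the patent's code: the class name `NaiveBayes` is hypothetical, the Laplace smoothing term `alpha` is an assumption added so that an attribute value unseen for a class does not produce a zero probability, and log-probabilities are used only for numerical stability (the arg max is unchanged).

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    def __init__(self, alpha=1.0):
        # Laplace smoothing strength; an assumption, not stated in the text.
        self.alpha = alpha

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_counts = Counter(y)          # Step 4: counts behind P(C_k)
        self.total = len(y)
        # value_counts[(j, c)][v]: class-c samples whose feature j equals v
        self.value_counts = defaultdict(Counter)
        self.values = [set() for _ in X[0]]     # distinct values per feature
        for x, c in zip(X, y):
            for j, v in enumerate(x):
                self.value_counts[(j, c)][v] += 1
                self.values[j].add(v)
        return self

    def log_score(self, x, c):
        """log P(C_k) + sum_j log P(x_j | C_k): the posterior numerator."""
        score = math.log(self.class_counts[c] / self.total)
        for j, v in enumerate(x):
            count = self.value_counts[(j, c)][v]
            s_j = len(self.values[j])
            # Step 5: smoothed estimate of P(X_j = v | C_k)
            score += math.log(
                (count + self.alpha) / (self.class_counts[c] + self.alpha * s_j)
            )
        return score

    def predict(self, x):
        # Steps 6-7: the shared denominator is dropped; take the arg max.
        return max(self.classes, key=lambda c: self.log_score(x, c))
```

For example, `NaiveBayes().fit(X, y).predict(x)` returns one of the labels present in `y`, where each row of `X` is a tuple of encoded attribute values.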
Step 8: recommend the optimal database index for a new airborne database and its specific application. For a new airborne database that needs an index type recommendation, extract the features of the new database according to the sample attributes (features) in Step 2 and construct a test sample.
Input the test sample into the classifier, obtain the recommended airborne database index, and send it to the airborne database management system platform.
Step 9: in the airborne database management system platform, the software designer selects a suitable index according to the recommended index and the real application of the database, and the system generates the corresponding database index code for onboard use;
optionally, the database index finally selected in the airborne database management system is also fed back to the storage module of the intelligent index recommendation system and stored as a real airborne navigation database sample
for subsequent training, so that the system is self-learning: the sample library keeps improving, and so does the recommendation capability of the recommendation module.
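The feedback path can be sketched as a small store that records the designer's final choice and retrains. This is an assumption-laden sketch: the class name `FeedbackStore`, the `train` callable, and the decision to retrain after every recorded sample are illustrative choices; the text only says that the final selection is stored as a new sample and used for subsequent training.

```python
class FeedbackStore:
    """Holds training samples and retrains when a final decision is recorded."""

    def __init__(self, train):
        self.train = train      # callable: (samples, labels) -> classifier
        self.samples = []
        self.labels = []
        self.classifier = None

    def record_decision(self, features, recommended, final_index):
        """Store the final index as the label; report whether it was an override."""
        self.samples.append(features)
        self.labels.append(final_index)
        # Retrain on the enlarged sample library (retraining policy is assumed).
        self.classifier = self.train(self.samples, self.labels)
        return final_index != recommended
```

Passing the training routine in as a callable keeps the store independent of the particular classifier, so any naive Bayes training function with this shape can be plugged in.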
Compared with traditional sequential query and binary tree query, selecting an index targeted to the specific scenario improves the query efficiency of the airborne navigation database and the mission assurance capability of the aircraft flight management system.
The method is the first to use feature data (sample attributes) and index selection data (sample labels) from real airborne navigation databases to study database index recommendation.
For example, one possible correspondence between database characteristics and the 7 recommended indexes, obtained using the index recommendation method provided by the present invention, is shown in Table 1 below.
TABLE 1 Correspondence between database characteristics and recommended indexes
(Table 1 is given as image BDA0003937786240000091 in the original publication)
It will be appreciated that the table is derived from the intelligent index recommendation device for airborne database management. The index for a given database should still be obtained from the classifier described in Steps 1-7 above, and cannot be determined from the table alone.
Illustratively, according to the present invention, sequential query is suitable for scenarios in which the database has fewer than 30 rows, the storage mode is in-memory, and the application types are equality query, range query, and joint query;
the hash index is suitable for scenarios in which the database application type is equality query;
the balanced binary tree index is suitable for scenarios in which the storage mode is in-memory, the application type is equality query, and the index column data type is numeric, character, or string;
the B tree and B+ tree indexes are suitable for scenarios in which the application types are equality query and range query, and the index column data type is numeric, character, or string;
the T tree and T-tree indexes are suitable for scenarios in which the storage mode is in-memory, the application types are equality query and range query, and the index column data type is numeric, character, or string;
the R tree index is suitable for scenarios in which the application types are equality query, range query, joint query, and proximity query (a special application of airborne navigation databases), and the index column data type is numeric or character;
the KD tree index is suitable for scenarios in which the storage mode is in-memory, the application types are equality query, joint query, and proximity query, and the index column data type is numeric or character;
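The applicability statements above can be written down as data and queried, for instance as a sanity check on a classifier's output. This sketch is illustrative only: the dictionary layout, the function name `candidates`, and the rule that a scenario matches when all of its query types fall within an index's listed types are assumptions. As the text notes, the actual recommendation should come from the classifier, not from such a table.

```python
# None means the text states no constraint for that attribute.
APPLICABILITY = {
    "sequential query": {"max_rows": 29, "storage": "memory",
                         "queries": {"equality", "range", "joint"}, "columns": None},
    "hash index": {"max_rows": None, "storage": None,
                   "queries": {"equality"}, "columns": None},
    "balanced binary tree index": {"max_rows": None, "storage": "memory",
                                   "queries": {"equality"},
                                   "columns": {"numeric", "character", "string"}},
    "B/B+ tree index": {"max_rows": None, "storage": None,
                        "queries": {"equality", "range"},
                        "columns": {"numeric", "character", "string"}},
    "T/T tree index": {"max_rows": None, "storage": "memory",
                       "queries": {"equality", "range"},
                       "columns": {"numeric", "character", "string"}},
    "R tree index": {"max_rows": None, "storage": None,
                     "queries": {"equality", "range", "joint", "proximity"},
                     "columns": {"numeric", "character"}},
    "KD tree index": {"max_rows": None, "storage": "memory",
                      "queries": {"equality", "joint", "proximity"},
                      "columns": {"numeric", "character"}},
}


def candidates(rows, storage, queries, column_kind):
    """List the index types whose described scenario covers the given one."""
    result = []
    for name, rule in APPLICABILITY.items():
        if rule["max_rows"] is not None and rows > rule["max_rows"]:
            continue
        if rule["storage"] is not None and storage != rule["storage"]:
            continue
        if not set(queries) <= rule["queries"]:
            continue
        if rule["columns"] is not None and column_kind not in rule["columns"]:
            continue
        result.append(name)
    return result
```

A disk-resident database with 5,000 rows, equality and range queries, and a string index column matches only the B/B+ tree entry under these assumed rules.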
Illustratively, the method takes airborne database index design data from different scenarios as samples, uses a naive Bayes model to mine the association between airborne database scenario characteristics and database index selection, and, based on that association, recommends an optimal index method for the scenario characteristics of each airborne database application and builds the database index. This improves airborne database query efficiency and the mission assurance capability of the aircraft flight management system. The invention lowers the technical threshold for database management designers, reduces the cost of software design and analysis, and greatly improves the efficiency of database scheme design.

Claims (10)

1. An intelligent index recommendation method for onboard database management, comprising:
acquiring characteristic attributes of a database, the characteristic attributes including the database's own attributes and the database's application attributes;
taking the characteristic attributes of the database as the input of a trained classifier, and obtaining the recommended index type for the database;
wherein the trained classifier is obtained by training a naive Bayes model on training samples.
2. The method of claim 1, further comprising:
after detecting that a user has modified the recommended index type of a database, adding the database and the modified index type to the training samples as a new sample;
training a naive Bayes model on the updated training samples, using K-fold cross-validation, to obtain the latest classifier.
3. The method of claim 1, wherein the database self-attributes comprise at least one of:
database row number, database column number and database storage mode.
4. The method of claim 1, wherein the database application attribute comprises at least one of:
the application scenario, equality queries, range queries, joint queries, proximity queries, the index column source, and the index column data type;
the application scenario includes: query only, mostly queries, other, and uncertain;
the equality query includes: no equality query, single-condition equality query, 2-condition equality query, 3-condition equality query, 4-condition equality query, 5-condition equality query, and equality query with 6 or more conditions;
the range query includes: no range query, single-condition range query, 2-condition range query, 3-condition range query, 4-condition range query, 5-condition range query, and range query with 6 or more conditions;
the joint query includes: joint query present, and no joint query;
the proximity query includes: proximity query present, and no proximity query;
the index column source includes: original data, generated data, and uncertain;
the index column data type includes: char, int, float, double, text, picture, and bit.
5. The method of claim 1, wherein the index type comprises at least one of:
sequential query, hash index, balanced binary tree index, B/B+ tree index, T/T tree index, R tree index, and KD tree index.
6. An intelligent index recommendation device for airborne database management, comprising a storage module, a training module, and a recommendation module;
the recommendation module is used for acquiring the characteristic attributes of a database, taking the characteristic attributes of the database as the input of the trained classifier, and obtaining the recommended index type for the database;
wherein the characteristic attributes include the database's own attributes and the database's application attributes; the trained classifier is obtained by the training module, which trains a naive Bayes model on the training samples stored in the storage module.
7. The apparatus of claim 6, wherein the storage module is further configured to, after detecting that a user has modified the recommended index type of a database, add the database and the modified index type to the training samples as a new sample;
the training module is further configured to train a naive Bayes model on the updated training samples, using K-fold cross-validation, to obtain the latest classifier.
8. The apparatus of claim 6, wherein the database self-attribute comprises at least one of:
database row number, database column number and database storage mode.
9. The apparatus of claim 6, wherein the database application attribute comprises at least one of:
the application scenario, equality queries, range queries, joint queries, proximity queries, the index column source, and the index column data type;
the application scenario includes: query only, mostly queries, other, and uncertain;
the equality query includes: no equality query, single-condition equality query, 2-condition equality query, 3-condition equality query, 4-condition equality query, 5-condition equality query, and equality query with 6 or more conditions;
the range query includes: no range query, single-condition range query, 2-condition range query, 3-condition range query, 4-condition range query, 5-condition range query, and range query with 6 or more conditions;
the joint query includes: joint query present, and no joint query;
the proximity query includes: proximity query present, and no proximity query;
the index column source includes: original data, generated data, and uncertain;
the index column data type includes: char, int, float, double, text, picture, and bit.
10. The apparatus of claim 6, wherein the index type comprises at least one of:
sequential query, hash index, balanced binary tree index, B/B+ tree index, T/T tree index, R tree index, and KD tree index.
CN202211411062.2A 2022-11-11 2022-11-11 Intelligent index recommendation method and device for airborne database management Pending CN115905222A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211411062.2A CN115905222A (en) 2022-11-11 2022-11-11 Intelligent index recommendation method and device for airborne database management

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211411062.2A CN115905222A (en) 2022-11-11 2022-11-11 Intelligent index recommendation method and device for airborne database management

Publications (1)

Publication Number Publication Date
CN115905222A (en) 2023-04-04

Family

ID=86473770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211411062.2A Pending CN115905222A (en) 2022-11-11 2022-11-11 Intelligent index recommendation method and device for airborne database management

Country Status (1)

Country Link
CN (1) CN115905222A (en)

Similar Documents

Publication Publication Date Title
JP7073576B2 (en) Association recommendation method, equipment, computer equipment and storage media
CN101408885B (en) Modeling topics using statistical distributions
CN107103362B (en) Updating of machine learning systems
Roll et al. Using machine learning to disentangle homonyms in large text corpora
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
KR100903961B1 (en) Indexing And Searching Method For High-Demensional Data Using Signature File And The System Thereof
CN104156415A (en) Mapping processing system and method for solving problem of standard code control of medical data
CN106202514A (en) Accident based on Agent is across the search method of media information and system
KR20190038243A (en) System and method for retrieving documents using context
CN104169948A (en) Methods, apparatus and products for semantic processing of text
KR101679050B1 (en) Personalized log analysis system using rule based log data grouping and method thereof
CN106909609B (en) Method for determining similar character strings, method and system for searching duplicate files
CN103069825B (en) For the system and method for television search assistant
CN106708929B (en) Video program searching method and device
Oard et al. Jointly minimizing the expected costs of review for responsiveness and privilege in e-discovery
Feng et al. Practical duplicate bug reports detection in a large web-based development community
CN104750776A (en) Accessing information content in a database platform using metadata
JP2020512651A (en) Search method, device, and non-transitory computer-readable storage medium
CN108959550B (en) User focus mining method, device, equipment and computer readable medium
CN106570196B (en) Video program searching method and device
CN115422372A (en) Knowledge graph construction method and system based on software test
CN103226748A (en) Associative memory-based project management system
CN112988982B (en) Autonomous learning method and system for computer comparison space
US20040186833A1 (en) Requirements -based knowledge discovery for technology management
AU2015204339A1 (en) Information processing apparatus and information processing program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination