[Summary of the Invention]
The technical problem to be solved by the present invention is as follows:
In conventional structured data search, a one-key search function has to run the query against every field of the two-dimensional table according to the query condition, which results in poor search performance and low search efficiency.
The present invention achieves the above object through the following technical solution:
In a first aspect, the present invention provides a one-key query method for a structured database, comprising:
analyzing each column of data in the structured database to obtain the characteristic attributes of each column;
loading the obtained characteristic attributes of each column into a cache;
matching the characteristic attributes of a keyword to be queried against the characteristic attributes of each column in the cache; and
after the characteristic attributes of one or more columns have been matched successfully, performing a one-key query in the corresponding column data according to the keyword.
Preferably, the characteristic attributes include one or more of: maximum data length, character types, characters that occur, maximum length of consecutive digits, and maximum length of consecutive letters.
Preferably, analyzing each column of data in the structured database to obtain the characteristic attributes of each column specifically comprises:
determining and recording the maximum data length in each column;
determining and recording the character types contained in each column; and
determining and recording the characters that occur in each column,
wherein a character that occurs repeatedly is recorded only once.
Preferably, matching the characteristic attributes of the keyword to be queried against the characteristic attributes of each column in the cache specifically comprises:
obtaining the data length of the keyword, matching it against the maximum data length of each column, retaining the columns that match, and pruning the columns that fail to match;
obtaining the character types contained in the keyword, matching them against the character types contained in each column, retaining the columns that match, and pruning the columns that fail to match; and
obtaining the characters contained in the keyword, matching them against the characters that occur in each column, retaining the columns that match, and pruning the columns that fail to match.
Preferably, when any one characteristic attribute of the keyword fails to match the corresponding characteristic attribute of column k, the keyword fails to match column k, and matching of the remaining characteristic attributes between the keyword and column k is stopped; when every characteristic attribute of the keyword matches the corresponding characteristic attribute of column k, the keyword matches column k; wherein column k is any column in the structured database.
Preferably, when the data length of the keyword is greater than the maximum data length of column k, the data length attribute fails to match between the keyword and column k; otherwise it matches;
when the character types contained in the keyword are consistent with the character types contained in column k, the character type attribute matches between the keyword and column k; otherwise it fails to match;
when the characters that occur in the keyword are consistent with the characters that occur in column k, the character attribute matches between the keyword and column k; otherwise it fails to match; wherein column k is any column in the structured database.
Preferably, after the characteristic attributes of each column are obtained, the method further comprises:
comparing the maximum data lengths of the columns with one another, and establishing a mapping relation between columns whose maximum data lengths are identical;
comparing the character types of the columns with one another, and establishing a mapping relation between columns whose character types are identical; and
comparing the occurring characters of the columns with one another, and establishing a mapping relation between columns whose occurring characters are identical.
Preferably, when any one characteristic attribute of the keyword fails to match the corresponding characteristic attribute of column k, the keyword fails to match column k; at the same time, when column k is determined to have a mapping relation with one or more other columns, it is confirmed that the keyword also fails to match those mapped columns, and matching between the keyword and those mapped columns is skipped;
when any one characteristic attribute of the keyword matches the corresponding characteristic attribute of column k, and column k is determined to have a mapping relation with one or more other columns, it is confirmed that the same characteristic attribute also matches between the keyword and those mapped columns, and matching of that characteristic attribute between the keyword and those mapped columns is skipped; wherein column k is any column in the structured database.
Preferably, before characteristic attribute matching is carried out, the method further comprises: sorting the columns of the structured database according to search frequency or number of searches;
then, when characteristic attribute matching is carried out, matching the characteristic attributes of the keyword against each column in turn, in descending order of search frequency or number of searches.
In a second aspect, the present invention further provides a one-key query apparatus for a structured database, comprising at least one processor and a memory, the at least one processor and the memory being connected by a data bus, the memory storing instructions executable by the at least one processor, and the instructions, when executed by the processor, being used to carry out the one-key query method for a structured database according to the first aspect.
Compared with the prior art, the beneficial effects of the present invention are as follows: by performing data analysis in advance, the invention obtains the characteristic attributes of each column in the structured database; by matching the query keyword against the characteristic attributes of each column, more than half of the fields can be pruned automatically during a one-key search, and the relevant data columns can be located quickly from the keyword, which substantially improves query efficiency and query performance.
Embodiment 1:
An embodiment of the present invention provides a one-key query method for a structured database which, as shown in Fig. 1, specifically comprises the following steps:
Step 10: analyze each column of data in the structured database to obtain the characteristic attributes of each column. The characteristic attributes include one or more of maximum data length, character types, characters that occur, maximum length of consecutive digits, and maximum length of consecutive letters, and the characteristic attributes of the columns differ from one another. When data is cleaned and loaded into the database, the characteristic attributes of each column are obtained in advance through data analysis.
Step 20: load the obtained characteristic attributes of each column into a cache. The characteristic attributes amount to a summary of the important data features of each column and serve as a "table of contents" or "abstract", while the actual data in each column corresponds to the "body text".
Step 30: match the characteristic attributes of the keyword to be queried against the characteristic attributes of each column in the cache. During matching, each characteristic attribute of the keyword is matched against the corresponding characteristic attribute of each column. For any column k in the structured database, when any one characteristic attribute of the keyword fails to match the corresponding characteristic attribute of column k, the keyword is judged to have failed to match column k. As soon as one failure occurs, matching between the keyword and column k stops and column k is pruned; the remaining characteristic attributes between the keyword and column k no longer need to be matched. Only when the previous characteristic attribute has matched between the keyword and column k does matching proceed to the next characteristic attribute. When every characteristic attribute of the keyword matches the corresponding characteristic attribute of column k, the keyword matches column k, and column k is retained.
Step 40: after the characteristic attributes of one or more columns have been matched successfully, perform a one-key query in the corresponding column data according to the keyword. Once the keyword has been matched against every column, all columns that failed to match have been pruned and the one or more columns that matched are retained, so the number of fields is greatly reduced; the one-key query then only needs to be carried out in the remaining matched columns.
In the one-key query method for a structured database provided by the present invention, data analysis is performed in advance to obtain the characteristic attributes of each column in the structured database; by matching the query keyword against the characteristic attributes of each column, more than half of the fields can be pruned automatically during a one-key search, and the relevant data columns can be located quickly from the keyword, which substantially improves query efficiency and query performance.
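Steps 10 to 40 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the table is assumed to be a dict mapping column names to lists of strings, only two characteristic attributes (maximum data length and occurring characters) are used, "matching" is assumed to mean that the keyword is no longer than the column maximum and uses only characters seen in the column, and the final one-key query is assumed to be a substring search. The names `profile_column`, `matches`, and `one_key_query` are hypothetical.

```python
def profile_column(values):
    """Step 10: summarize a column as (maximum data length, set of characters)."""
    max_len = max((len(v) for v in values), default=0)
    chars = set().union(*values)  # each character is recorded once
    return max_len, chars


def matches(keyword, profile):
    """Step 30: compare attributes one by one, stopping at the first failure."""
    max_len, chars = profile
    # `and` short-circuits: the character check is skipped if the length check fails.
    return len(keyword) <= max_len and set(keyword) <= chars


def one_key_query(table, cache, keyword):
    """Step 40: search only the columns whose cached profile matched."""
    kept = [name for name, profile in cache.items() if matches(keyword, profile)]
    hits = []
    for name in kept:
        for row, value in enumerate(table[name]):
            if keyword in value:  # assumed query semantics: substring search
                hits.append((name, row))
    return kept, hits


# Made-up sample data for illustration only.
table = {
    "name": ["Li Lei", "Han Meimei"],
    "phone": ["13800001111", "13900002222"],
    "email": ["lilei@example.com", "han@example.com"],
}
# Step 20: the profiles are computed once and kept in memory as the cache.
cache = {name: profile_column(values) for name, values in table.items()}

kept, hits = one_key_query(table, cache, "1390000")
print(kept, hits)
```

With this sample data, the "name" and "email" columns are pruned because they contain no digits, so only the "phone" column is actually scanned.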
Taking the two-dimensional table data structure shown in Fig. 2 as an example, and assuming that the characteristic attributes include maximum data length, character types, and characters that occur, the process of obtaining the characteristic attributes of each column in step 10 may refer to Fig. 3 and comprises the following steps:
Step 101: determine and record the maximum data length in each column. For example, for the ID card column, since an ID number consists of 18 digits, or of digits plus a letter, every data item in that column has the same data length, and that common length is recorded directly. For columns such as address and e-mail, the data lengths of the items in the column may all differ; in that case the maximum data length is determined by comparison and recorded.
Step 102: determine and record the character types contained in each column. For example, in the ID card column the data usually contains only digits and in some cases also a letter; in the telephone number column the data contains only digits; in the name column the data contains only Chinese characters; the other columns are not listed one by one.
Step 103: determine and record the characters that occur in each column, where a character that occurs repeatedly is recorded only once. For example, for a data item "21032530881" composed entirely of digits, the characters that occur are 0, 1, 2, 3, 5, 8. The same statistics are computed for all data in each column, finally giving all the characters that occur in each column.
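Steps 101 to 103 can be sketched as a single pass over a column. This is an illustrative sketch only: the function names `char_type` and `analyze_column`, the category labels ("digit", "letter", "chinese", "other"), and the use of the Unicode range U+4E00 to U+9FFF to recognize Chinese characters are assumptions, not details given in the description.

```python
def char_type(ch):
    """Classify one character; the category names are illustrative assumptions."""
    if ch.isdigit():
        return "digit"
    if "\u4e00" <= ch <= "\u9fff":  # common CJK ideographs (assumed test for Chinese)
        return "chinese"
    if ch.isalpha():
        return "letter"
    return "other"


def analyze_column(values):
    """Collect the three characteristic attributes of one column in one pass."""
    max_len = 0
    types = set()
    chars = set()
    for value in values:
        max_len = max(max_len, len(value))  # step 101: maximum data length
        for ch in value:
            types.add(char_type(ch))        # step 102: character types
            chars.add(ch)                   # step 103: a set records each character once
    return {"max_len": max_len, "types": types, "chars": chars}


profile = analyze_column(["21032530881"])
print(profile["max_len"], sorted(profile["chars"]))
```

For the sample value from step 103, the recorded characters are 0, 1, 2, 3, 5, 8 and the only character type is "digit".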
The order in which the above three steps are carried out may be interchanged and is not uniquely limited; it can generally be determined by how easily each characteristic attribute is obtained, from the easiest to the hardest. For example, in this solution, because data length is quick to analyze, the data length is determined first and the other characteristic attributes afterwards. When the characteristic attributes also include the maximum length of consecutive digits and the maximum length of consecutive letters, these two items must likewise be determined and recorded during data analysis. For example, for a data item "2103b53acb81" composed of digits and letters, the longest run of consecutive digits is 2103 and the longest run of consecutive letters is acb, so the maximum lengths of consecutive digits and of consecutive letters for that data item are 4 and 3 respectively. The same statistics are computed for all data in each column, finally giving the maximum length of consecutive digits and/or letters for each column.
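The two run-length attributes can be computed with a regular expression, as in the sketch below. The helper names `longest_run` and `column_max_runs` are hypothetical, and restricting "digits" and "letters" to the ASCII ranges 0-9 and A-Z/a-z is an assumption.

```python
import re


def longest_run(value, pattern):
    """Length of the longest substring of `value` matched by `pattern`."""
    return max((len(run) for run in re.findall(pattern, value)), default=0)


def column_max_runs(values):
    """Maximum consecutive-digit and consecutive-letter lengths over a column."""
    digits = max((longest_run(v, r"[0-9]+") for v in values), default=0)
    letters = max((longest_run(v, r"[A-Za-z]+") for v in values), default=0)
    return digits, letters


# The data item from the description: digit runs 2103, 53, 81; letter runs b, acb.
print(longest_run("2103b53acb81", r"[0-9]+"))     # 4
print(longest_run("2103b53acb81", r"[A-Za-z]+"))  # 3
```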
Assume that the two-dimensional table has n columns in total, and let the letters A, B, and C denote the three characteristic attributes of maximum data length, character types, and occurring characters. A1, B1, and C1 denote the three characteristic attributes of the first column of the table, and so on; Ak, Bk, and Ck denote the three characteristic attributes of the k-th column, and An, Bn, and Cn denote the three characteristic attributes of the n-th column. After the characteristic attributes are obtained, the recorded result may refer to Fig. 4. For ease of description, the letters A', B', and C' denote the data length, character types, and occurring characters of the keyword, respectively.
With continued reference to Fig. 5, step 30 specifically comprises the following steps:
Step 301: obtain the data length of the keyword, match it against the maximum data length of each column, retain the columns that match, and prune the columns that fail to match. With reference to Fig. 4, the A' feature of the keyword is compared with the A1, A2, ..., An features of the columns. Assuming that the A' feature matches A1, A2, A5, and A7 and fails to match the other columns, only columns 1, 2, 5, and 7 are retained and the other columns are pruned. For any column k in the structured database, when the data length of the keyword is greater than the maximum data length of column k, the data length attribute fails to match between the keyword and column k; otherwise it matches.
Step 302: obtain the character types contained in the keyword, match them against the character types contained in each column, retain the columns that match, and prune the columns that fail to match. Continuing with Fig. 4, only columns 1, 2, 5, and 7 now remain in the table, so the B' feature of the keyword is compared with the B1, B2, B5, and B7 features of the four remaining columns; the columns that failed on the A feature no longer need to be matched on the B feature. Assuming that the B' feature matches B1 and B7 and fails to match B2 and B5, only columns 1 and 7 are retained, and columns 2 and 5 are pruned. For any column k in the structured database, when the character types contained in the keyword are consistent with the character types contained in column k, the character type attribute matches between the keyword and column k; otherwise it fails to match.
Step 303: obtain the characters contained in the keyword, match them against the characters that occur in each column, retain the columns that match, and prune the columns that fail to match. Continuing with Fig. 4, only columns 1 and 7 now remain in the table, so the C' feature of the keyword is compared with the C1 and C7 features of the two remaining columns; the columns that failed on the B feature no longer need to be matched on the C feature. Assuming that the C' feature matches C1 and fails to match C7, only column 1 is retained and column 7 is pruned. For any column k in the structured database, when the characters that occur in the keyword are consistent with the characters that occur in column k, the character attribute matches between the keyword and column k; otherwise it fails to match.
After the above process, only column 1 remains in the table, so the one-key search only needs to be carried out in column 1 according to the keyword. Because most columns have already been pruned by characteristic attribute matching, the final one-key search takes far less time and query efficiency is improved. The order in which steps 301, 302, and 303 are carried out may be interchanged and is not uniquely limited.
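The staged pruning of steps 301 to 303 can be sketched as follows, using made-up column profiles chosen so that the surviving columns follow the worked example (1, 2, 5, 7, then 1, 7, then 1). The description says only that the keyword's character types and characters must be "consistent" with those of the column; the sketch assumes this means the keyword's set is a subset of the column's set. The names `CHECKS`, `prune`, `keyword_types`, and `col` are hypothetical.

```python
def keyword_types(keyword):
    """Coarse character types of the keyword (illustrative categories)."""
    return {"digit" if ch.isdigit() else "letter" if ch.isalpha() else "other"
            for ch in keyword}


# One check per characteristic attribute: A (length), B (types), C (characters).
CHECKS = [
    ("A", lambda kw, c: len(kw) <= c["max_len"]),
    ("B", lambda kw, c: keyword_types(kw) <= c["types"]),  # assumed: subset test
    ("C", lambda kw, c: set(kw) <= c["chars"]),            # assumed: subset test
]


def prune(keyword, profiles):
    """Apply the checks stage by stage; pruned columns are never checked again."""
    remaining = list(profiles)
    history = []
    for name, check in CHECKS:
        remaining = [k for k in remaining if check(keyword, profiles[k])]
        history.append((name, remaining))
    return history


def col(max_len, types, chars):
    return {"max_len": max_len, "types": set(types), "chars": set(chars)}


# Hypothetical cached profiles for a seven-column table.
profiles = {
    1: col(6, ["letter", "digit"], "ab12xy"),
    2: col(5, ["digit"], "0123456789"),
    3: col(3, ["digit"], "012"),
    4: col(2, ["letter"], "pq"),
    5: col(8, ["letter"], "abcdef"),
    6: col(1, ["letter"], "z"),
    7: col(4, ["letter", "digit"], "cd34"),
}

history = prune("ab12", profiles)
print(history)  # [('A', [1, 2, 5, 7]), ('B', [1, 7]), ('C', [1])]
```

Reordering the entries of `CHECKS` changes the intermediate lists but not the final set of retained columns, which is consistent with the statement that the three steps may be interchanged.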
In combination with the embodiment of the present invention, there is also a preferred implementation in which, after the characteristic attributes of each column are obtained and before characteristic attribute matching is carried out with the keyword, the method further comprises:
a. Compare the maximum data lengths of the columns with one another, and establish a mapping relation between columns whose maximum data lengths are identical. With reference to Fig. 6, assume that the maximum data length of column 1 is the same as that of column 5 and column 7, i.e. the A1 feature is identical to the A5 and A7 features; mapping relations are then established among columns 1, 5, and 7. Specifically, as shown in Fig. 6, the numbers of the mapped columns are marked after the corresponding characteristic attribute of each column: (5, 7) is marked after the A1 feature of column 1, (1, 7) after the A5 feature of column 5, and (1, 5) after the A7 feature of column 7. An unmarked attribute indicates that there is no mapping relation with any other column.
b. Compare the character types of the columns with one another, and establish a mapping relation between columns whose character types are identical. The specific method may refer to the description in step a and is not repeated here.
c. Compare the occurring characters of the columns with one another, and establish a mapping relation between columns whose occurring characters are identical. The specific method may refer to the description in step a and is not repeated here.
The order of steps a, b, and c may be interchanged and is not uniquely limited. Mapping relations may be determined by comparing only one or two of the characteristic attributes, or by comparing all three.
In the preferred implementation in which mapping relations are determined, during matching, for any column k in the structured database: when any one characteristic attribute of the keyword fails to match the corresponding characteristic attribute of column k, the keyword fails to match column k; at the same time, when column k is determined to have a mapping relation with one or more other columns, it is confirmed that the keyword also fails to match those mapped columns, and matching between the keyword and those mapped columns is skipped. When any one characteristic attribute of the keyword matches the corresponding characteristic attribute of column k, and column k is determined to have a mapping relation with one or more other columns, it is confirmed that the same characteristic attribute also matches between the keyword and those mapped columns, and matching of that characteristic attribute between the keyword and those mapped columns is skipped. A specific example is as follows:
With reference to Fig. 6, assume that the A' feature of the keyword matches the A1 feature of column 1. From the mark (5, 7) on column 1 it is known that the A features of columns 5 and 7 are identical to that of column 1, so the A' feature of the keyword necessarily matches the A5 and A7 features as well; A' no longer needs to be matched against A5 and A7, and columns 1, 5, and 7 are retained directly. Similarly, when the A' feature of the keyword fails to match the A1 feature of column 1, the A' feature necessarily fails to match the A5 and A7 features as well; A' no longer needs to be matched against A5 and A7, and columns 5 and 7 are pruned together with column 1. In this way the number of matching operations is reduced, which improves matching efficiency and in turn the final query performance.
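The mapping-relation shortcut can be sketched by grouping columns that share the same attribute value and evaluating the keyword once per group. The sketch uses the A feature (maximum data length) with made-up values; representing the marks of Fig. 6 as groups of column numbers, and the names `build_mapping` and `match_with_mapping`, are assumptions made for illustration.

```python
from collections import defaultdict


def build_mapping(attr_by_column):
    """Group columns whose attribute values are identical (the mapping relation)."""
    groups = defaultdict(list)
    for column, value in attr_by_column.items():
        groups[value].append(column)  # the value must be hashable, e.g. int or frozenset
    return list(groups.values())


def match_with_mapping(keyword_len, max_len_by_column):
    """Match the length attribute once per group instead of once per column."""
    kept = []
    comparisons = 0
    for group in build_mapping(max_len_by_column):
        comparisons += 1  # one comparison decides every column in the group
        if keyword_len <= max_len_by_column[group[0]]:
            kept.extend(group)
    return sorted(kept), comparisons


# Hypothetical A features: columns 1, 5 and 7 share the same maximum length.
max_len = {1: 18, 2: 11, 3: 4, 4: 30, 5: 18, 6: 6, 7: 18}
kept, comparisons = match_with_mapping(12, max_len)
print(kept, comparisons)  # [1, 4, 5, 7] 5
```

Here seven columns are decided with five comparisons, because the result for column 1 is reused for columns 5 and 7. For set-valued attributes such as character types, the sets would need to be stored as `frozenset` to serve as grouping keys.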
In combination with the embodiment of the present invention, there is also a preferred implementation in which, before characteristic attribute matching is carried out, the method further comprises: sorting the columns of the structured database according to search frequency or number of searches. Depending on the search needs of users, one or a few columns of the table may be queried more often than the others, which indicates greater user demand for those columns. After the columns are sorted in descending order of search frequency or number of searches, characteristic attribute matching between the keyword and the columns can be carried out in that order. For example, if column 5 has the highest search frequency and is ranked first, the keyword can be matched against the characteristic attributes of column 5 first. In this way user demand is also taken into account, which can improve query performance to a certain extent.
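The ordering step can be sketched as a sort on a per-column search counter. How the counts are collected is not specified in the description; the `search_counts` dict and the function name `order_by_frequency` are hypothetical.

```python
def order_by_frequency(columns, search_counts):
    """Return the columns in descending order of how often they were searched."""
    # sorted() is stable, so columns with equal counts keep their original order.
    return sorted(columns, key=lambda c: search_counts.get(c, 0), reverse=True)


# Hypothetical counters: column 5 is searched most often.
search_counts = {1: 40, 2: 5, 5: 120, 7: 40}
order = order_by_frequency([1, 2, 5, 7], search_counts)
print(order)  # [5, 1, 7, 2]
```

Characteristic attribute matching would then iterate over `order`, so the most frequently searched column is examined first.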