CN113836144B - Method and device for recommending database standard table based on field - Google Patents

Publication number
CN113836144B
Authority
CN
China
Prior art keywords
fields
database
standard
field
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111146101.6A
Other languages
Chinese (zh)
Other versions
CN113836144A (en)
Inventor
陈毓靖
齐战胜
陈涛涛
吴鸿伟
刁薪予
王海滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meiya Pico Information Co Ltd
Original Assignee
Xiamen Meiya Pico Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meiya Pico Information Co Ltd
Priority to CN202111146101.6A
Publication of CN113836144A
Application granted
Publication of CN113836144B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools

Abstract

The invention provides a method and a device for recommending a database standard table based on fields. The method comprises: a preprocessing step of preprocessing all database standard tables in a database to generate a field library; a recommendation step of performing matching calculation between the fields in a source table and the fields in the field library to obtain a recommended database standard table for the source table; and a feedback step of sending the recommended database standard table to an interface of a user terminal for the user to confirm. Because the field library is obtained by preprocessing all database standard tables in the database and the standard table is then recommended based on that field library, training of an artificial intelligence model is avoided, time is saved, and recommendation is fast. The standard table recommended by the system can be confirmed by the user, which ensures the import accuracy of the data table.

Description

Method and device for recommending database standard table based on field
Technical Field
The invention relates to the technical field of databases, in particular to a method and a device for recommending a database standard table based on fields.
Background
In big data management, the database tables used by each user need to be mapped to a library of standard tables (benchmarking). Manual benchmarking involves a heavy workload and is impractical when the number of database tables is large. The prior art also includes artificial-intelligence benchmarking methods, whose first step is to train an artificial intelligence model (generally a neural network model such as a DNN). Such methods have several drawbacks: a large number of samples must be collected for training; the model must be retrained whenever the fields of a database table change or the database is replaced; recommending a database standard table therefore takes a long time; and if the sample data are scarce, recommendation accuracy is low.
Disclosure of Invention
The present invention proposes the following technical solutions to address one or more technical defects in the prior art.
A method for recommending database standard tables based on fields, the method comprising:
a preprocessing step, namely preprocessing all database standard tables in a database to generate a field library;
and a recommendation step, wherein the fields in the source table and the fields in the field library are subjected to matching calculation to obtain a recommended database standard table of the source table.
Still further, the method further comprises: and a feedback step, namely sending the recommended database standard table to an interface of the user terminal for the user to confirm.
Further, the preprocessing step operates as follows: all fields in all standard tables used in the database are counted and sorted in descending order of occurrence count, a relation (seg, (S1, S2, ..., Sn), num) is constructed, and a field library is built as a dictionary based on the relation, where seg denotes a field, (S1, S2, ..., Sn) denotes the set of standard tables in which seg appears, each of S1, S2, ..., Sn is the table name of one standard table, and num denotes the number of occurrences of the field.
Still further, the recommending step operates as follows: the source table is preprocessed to obtain the field set A of the source table, and each field in field set A is similarity-matched against the fields in the field library to obtain a matching field set B; based on matching field set B and the relation (seg, (S1, S2, ..., Sn), num), the hit count of each standard table corresponding to the fields in matching field set B is calculated, and the standard table with the highest hit count is taken as the recommended database standard table for the source table.
Further, the similarity matching is cosine matching or semantic matching.
The invention also provides a device for recommending a database standard table based on fields, the device comprising:
the preprocessing unit is used for preprocessing all database standard tables in the database to generate a field library;
and the recommending unit is used for matching and calculating the fields in the source table and the fields in the field library to obtain a recommended database standard table of the source table.
Still further, the apparatus further comprises: and the feedback unit is used for sending the recommended database standard table to an interface of the user terminal for the user to confirm.
Further, the preprocessing unit operates as follows: all fields in all standard tables used in the database are counted and sorted in descending order of occurrence count to construct a relation (seg, (S1, S2, ..., Sn), num), and a field library is built as a dictionary based on the relation, where seg denotes a field, (S1, S2, ..., Sn) denotes the set of standard tables in which seg appears, each of S1, S2, ..., Sn is the table name of one standard table, and num denotes the number of occurrences of the field.
Still further, the recommending unit operates as follows: the source table is preprocessed to obtain the field set A of the source table, and each field in field set A is similarity-matched against the fields in the field library to obtain a matching field set B; based on matching field set B and the relation (seg, (S1, S2, ..., Sn), num), the hit count of each standard table corresponding to the fields in matching field set B is calculated, and the standard table with the highest hit count is taken as the recommended database standard table for the source table.
Further, the similarity matching is cosine matching or semantic matching.
The invention also proposes a computer-readable storage medium having stored thereon computer program code which, when executed by a computer, performs any of the methods described above.
The technical effects of the invention are as follows. The invention discloses a method and a device for recommending a database standard table based on fields, wherein the method comprises: a preprocessing step of preprocessing all database standard tables in a database to generate a field library; a recommendation step of performing matching calculation between the fields in a source table and the fields in the field library to obtain a recommended database standard table for the source table; and a feedback step of sending the recommended database standard table to an interface of the user terminal for the user to confirm. In the invention, a field library is obtained by preprocessing all database standard tables in the database, and the standard table is then recommended based on the field library, so that training of an artificial intelligence model is avoided, time is saved, and recommendation is fast. The standard table recommended by the system can be confirmed by the user: if the user confirms the recommended table, the source table is imported into the database currently used by the system based on the recommended standard table; if the user does not confirm it, the next standard table is recommended, until the user selects a suitable standard table, which ensures the import accuracy of the data table. The field library is built as a dictionary based on the relation, and the fields in the dictionary are arranged in descending order of num, which improves matching speed in subsequent matching.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
FIG. 1 is a flow diagram of a method for recommending a database standard table based on fields according to an embodiment of the invention.
FIG. 2 is a block diagram of an apparatus for recommending a database standard table based on fields according to an embodiment of the invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be noted that, for convenience of description, only the portions relevant to the invention are shown in the drawings.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 shows a method for recommending a database standard table based on fields according to the present invention, which comprises:
a preprocessing step S101, preprocessing all database standard tables in a database to generate a field library; the database is, for example, a database currently used by the system, and a large number of data tables exist in the database, and the data tables of the database are used as standard tables.
a recommendation step S102, performing matching calculation between the fields in the source table and the fields in the field library to obtain a recommended database standard table for the source table. The source table may be, for example, a data table from another database that is to be migrated into the database currently used by the system; during migration, the source table must be aligned with a standard table of the database.
In the invention, a field library is obtained by preprocessing all database standard tables in the database, and the standard table is then recommended based on the field library, so that training of an artificial intelligence model is avoided, time is saved, and recommendation is fast. This is one of the important inventive points of the invention.
In one embodiment, the method further comprises: and a feedback step S103, sending the recommended database standard table to an interface of the user terminal for the user to confirm. In the invention, the standard table recommended by the system can be confirmed by the user, if the user confirms to select the recommended table, the source table is imported into the database currently used by the system based on the recommended standard table, and if the user does not confirm the recommended standard table, the next standard table is recommended to the user for the user to select until the user selects the proper standard table, so as to ensure the import accuracy of the data table, which is another important invention point of the invention.
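The confirm-or-try-next behaviour of the feedback step can be sketched as a short loop. This is only an illustration under stated assumptions: the function name `confirm_recommendation`, the `ask_user` callback, and the answer strings "accept", "reject", and "quit" are hypothetical and do not come from the patent, which only says the table is sent to a user-terminal interface for confirmation.

```python
def confirm_recommendation(ranked_tables, ask_user):
    """Offer ranked standard tables one at a time.

    ranked_tables: table names, best candidate first.
    ask_user: hypothetical callback returning "accept", "reject" or "quit".
    Returns the confirmed table name, or None if nothing was confirmed.
    """
    for table_name in ranked_tables:
        answer = ask_user(table_name)
        if answer == "accept":
            return table_name      # user confirmed this standard table
        if answer == "quit":
            return None            # user actively exited the operation
        # any other answer: fall through and offer the next candidate
    return None                    # candidates exhausted without confirmation


# Scripted answers stand in for a real user interface.
answers = iter(["reject", "accept"])
chosen = confirm_recommendation(["total", "sales", "sunm"], lambda name: next(answers))
print(chosen)
```

In this scripted run the first candidate is rejected and the second is accepted, so the import would proceed against the second-ranked table.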
In one embodiment, the preprocessing step operates as follows: all fields in all standard tables used in the database are counted and sorted in descending order of occurrence count to construct a relation (seg, (S1, S2, ..., Sn), num), and a field library is built as a dictionary based on the relation, where seg denotes a field, (S1, S2, ..., Sn) denotes the set of standard tables in which the field seg appears, and each of S1, S2, ..., Sn is the table name of one standard table. For example, if the field 'sales amount' appears in the tables sales, total, and sunm, then (S1, S2, ..., Sn) is (sales, total, sunm). num denotes the number of occurrences of the field. Because the fields in the dictionary are arranged in descending order of num, matching speed is improved in subsequent matching, which is another important inventive point of the invention.
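A minimal sketch of how such a field library might be built is given below. It assumes that the standard tables are available as a mapping from table name to field list, and that num is the number of standard tables containing the field; the function name `build_field_library` and the sample fields other than 'sales amount' are illustrative and are not taken from the patent.

```python
from collections import defaultdict


def build_field_library(standard_tables):
    """Build {seg: ((S1, ..., Sn), num)} ordered by num, descending.

    standard_tables: dict mapping a standard table name to its field names.
    Assumption: num is the number of standard tables in which seg appears.
    """
    tables_by_field = defaultdict(list)
    for table_name, fields in standard_tables.items():
        for seg in set(fields):            # count a field once per table
            tables_by_field[seg].append(table_name)

    # Sort by occurrence count, descending; break ties by field name.
    ordered = sorted(tables_by_field.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    # Python dicts preserve insertion order, so the descending order is kept.
    return {seg: (tuple(sorted(names)), len(names)) for seg, names in ordered}


standard_tables = {
    "sales": ["sales amount", "order id"],
    "total": ["sales amount", "region"],
    "sunm": ["sales amount", "order id"],
}
field_library = build_field_library(standard_tables)
for seg, (tables, num) in field_library.items():
    print(seg, tables, num)
```

A plain dict is used here because insertion order is preserved in Python 3.7 and later, so the most frequent fields are visited first when the library is iterated.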
In one embodiment, the recommending step operates as follows: the source table is preprocessed to obtain the field set A of the source table, and each field in field set A is similarity-matched against the fields in the field library to obtain a matching field set B; based on matching field set B and the relation (seg, (S1, S2, ..., Sn), num), the hit count of each standard table corresponding to the fields in matching field set B is calculated, and the standard table with the highest hit count is taken as the recommended database standard table for the source table. Specifically, all fields in matching field set B are iterated over; for each field, the standard tables in which it appears are looked up and the hit count of each such table is incremented by one, which finally yields the hit count of every standard table with respect to the source table. The standard tables are then sorted in descending order of hit count, and the table with the most hits is fed back to the user as the recommended table. If the user confirms that standard table, the operation ends; if not, the second-ranked standard table is fed back to the user, and so on, until the user selects a standard table or actively exits the operation. This greatly improves the efficiency and accuracy of table recommendation. In practical tests on a database with 2000 standard tables, recommending one data table took about 30 seconds with a DNN method (an average that includes model training time), whereas the present method needs only 30 ms, roughly a thousandfold reduction. The recommendation time is thus greatly reduced and the migration efficiency of data tables is improved, which is another important inventive point of the invention.
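The hit-counting loop described above can be sketched as follows. The field library literal, the contents of matching field set B, and the function name `rank_standard_tables` are assumptions made for illustration; ties are broken alphabetically here, which the patent does not specify.

```python
from collections import Counter


def rank_standard_tables(matched_fields, field_library):
    """Return [(table_name, hits)] sorted by hits, descending."""
    hits = Counter()
    for seg in matched_fields:
        # Look up the standard tables in which this matched field appears.
        table_names, _num = field_library.get(seg, ((), 0))
        for name in table_names:
            hits[name] += 1        # one hit per matched field per table
    # Ties are broken by table name so the ranking is deterministic.
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical field library in the form {seg: ((S1, ..., Sn), num)}.
field_library = {
    "sales amount": (("sales", "sunm", "total"), 3),
    "order id": (("sales", "sunm"), 2),
    "region": (("total",), 1),
}
matched_set_b = ["sales amount", "region"]   # assumed matching field set B
ranking = rank_standard_tables(matched_set_b, field_library)
print(ranking)
```

The first entry of the ranking is the recommended standard table; the remaining entries are the fallbacks offered if the user declines it.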
In the invention, the similarity matching algorithm can be cosine matching or semantic matching, and certainly, other matching algorithms, such as a synonym matching method, can also be adopted.
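The patent does not say how fields are vectorised for cosine matching. The sketch below is one possible reading, assuming character-bigram count vectors, a light normalisation (lower-casing, underscores to spaces), and an arbitrary threshold of 0.6; the function names and all of these choices are illustrative assumptions, not the patented procedure.

```python
import math
from collections import Counter


def bigram_vector(text):
    """Character-bigram counts of a lightly normalised field name."""
    s = text.lower().replace("_", " ").strip()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine_similarity(a, b):
    """Cosine of the angle between the bigram vectors of two field names."""
    va, vb = bigram_vector(a), bigram_vector(b)
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def match_fields(source_fields, library_fields, threshold=0.6):
    """Map each source field (set A) to its best library field, if close enough."""
    matched = []
    for src in source_fields:
        best = max(library_fields,
                   key=lambda seg: cosine_similarity(src, seg),
                   default=None)
        if best is not None and cosine_similarity(src, best) >= threshold:
            matched.append(best)   # contributes to matching field set B
    return matched


matched_set_b = match_fields(["Sales_Amount", "zzz"], ["sales amount", "region"])
print(matched_set_b)
```

Semantic or synonym matching would replace `cosine_similarity` with a different scoring function while leaving the surrounding loop unchanged.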
FIG. 2 shows an apparatus for recommending database standard tables based on fields according to the present invention, which comprises:
a preprocessing unit 201, which preprocesses all database standard tables in the database to generate a field library; the database is, for example, a database currently used by the system, and a large number of data tables exist in the database, and the data tables of the database are used as standard tables.
a recommending unit 202, which performs matching calculation between the fields in the source table and the fields in the field library to obtain a recommended database standard table for the source table. The source table may be, for example, a data table from another database that is to be migrated into the database currently used by the system; during migration, the source table must be aligned with a standard table of the database.
In the invention, a field library is obtained by preprocessing all database standard tables in the database, and the standard table is then recommended based on the field library, so that training of an artificial intelligence model is avoided, time is saved, and recommendation is fast. This is one of the important inventive points of the invention.
In one embodiment, the apparatus further comprises: a feedback unit 203, which sends the recommended database standard table to an interface of the user terminal for the user to confirm. In the invention, the standard table recommended by the system can be confirmed by the user: if the user confirms the recommended table, the source table is imported into the database currently used by the system based on the recommended standard table; if the user does not confirm the recommended standard table, the next standard table is recommended for the user to select, until the user selects a suitable standard table. This ensures the import accuracy of the data table and is another important inventive point of the invention.
In one embodiment, the preprocessing unit operates as follows: all fields in all standard tables used in the database are counted and sorted in descending order of occurrence count to construct a relation (seg, (S1, S2, ..., Sn), num), and a field library is built as a dictionary based on the relation, where seg denotes a field, (S1, S2, ..., Sn) denotes the set of standard tables in which the field seg appears, and each of S1, S2, ..., Sn is the table name of one standard table. For example, if the field 'sales amount' appears in the tables sales, total, and sunm, then (S1, S2, ..., Sn) is (sales, total, sunm). num denotes the number of occurrences of the field. Because the fields in the dictionary are arranged in descending order of num, matching speed is improved in subsequent matching, which is another important inventive point of the invention.
In one embodiment, the recommending unit operates as follows: the source table is preprocessed to obtain the field set A of the source table, and each field in field set A is similarity-matched against the fields in the field library to obtain a matching field set B; based on matching field set B and the relation (seg, (S1, S2, ..., Sn), num), the hit count of each standard table corresponding to the fields in matching field set B is calculated, and the standard table with the highest hit count is taken as the recommended database standard table for the source table. Specifically, all fields in matching field set B are iterated over; for each field, the standard tables in which it appears are looked up and the hit count of each such table is incremented by one, which finally yields the hit count of every standard table with respect to the source table. The standard tables are then sorted in descending order of hit count, and the table with the most hits is fed back to the user as the recommended table. If the user confirms that standard table, the process ends; if not, the second-ranked standard table is fed back to the user, and so on, until the user selects a standard table or actively exits the operation. This greatly improves the efficiency and accuracy of table recommendation. In practical tests on a database with 2000 standard tables, recommending one data table took about 30 seconds with a DNN method (an average that includes model training time), whereas the present method needs only 30 ms, roughly a thousandfold reduction. The recommendation time is thus greatly reduced and the migration efficiency of data tables is improved, which is another important inventive point of the invention.
In the present invention, the similarity matching algorithm may be cosine matching or semantic matching, and of course, other matching algorithms, such as a synonym matching method, may also be adopted.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present application, or the portions that contribute over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some portions of the embodiments, of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, and that all such modifications and substitutions are intended to be covered by the appended claims.

Claims (6)

1. A method for recommending database standard tables based on fields, the method comprising:
a preprocessing step, namely preprocessing all database standard tables in a database to generate a field library;
a recommendation step, carrying out matching calculation on fields in the source table and fields in a field library to obtain a recommendation database standard table of the source table;
wherein the operation of the preprocessing step is as follows: counting all fields in all standard tables used in a database, sorting them in descending order of occurrence count, constructing a relation (seg, (S1, S2, ..., Sn), num), and constructing a field library using a dictionary based on the relation, wherein seg represents a field, (S1, S2, ..., Sn) represents the set of standard tables in which seg appears, S1, S2, ..., Sn each represent the table name of one standard table, and num represents the number of occurrences of the field;
the operation of the recommending step is as follows: preprocessing a source table to obtain a field set A in the source table, and performing similarity matching between each field in the field set A and the fields in the field library to obtain a matching field set B; and calculating the hit count of each standard table corresponding to the fields in the matching field set B based on the matching field set B and the relation (seg, (S1, S2, ..., Sn), num), and taking the standard table with the highest hit count as the recommended database standard table of the source table.
2. The method of claim 1, further comprising:
and a feedback step, namely sending the recommended database standard table to an interface of the user terminal for the user to confirm.
3. The method of claim 2, wherein the similarity match is a cosine match or a semantic match.
4. An apparatus for recommending database standard tables based on fields, the apparatus comprising:
the preprocessing unit is used for preprocessing all database standard tables in the database to generate a field library;
the recommendation unit is used for matching and calculating the fields in the source table and the fields in the field library to obtain a recommendation database standard table of the source table;
wherein the operation of the preprocessing unit is: counting all fields in all standard tables used in a database, sorting them in descending order of occurrence count, constructing a relation (seg, (S1, S2, ..., Sn), num), and constructing a field library using a dictionary based on the relation, wherein seg represents a field, (S1, S2, ..., Sn) represents the set of standard tables in which seg appears, S1, S2, ..., Sn each represent the table name of one standard table, and num represents the number of occurrences of the field;
the operation of the recommendation unit is: preprocessing a source table to obtain a field set A in the source table, and performing similarity matching between each field in the field set A and the fields in the field library to obtain a matching field set B; and calculating the hit count of each standard table corresponding to the fields in the matching field set B based on the matching field set B and the relation (seg, (S1, S2, ..., Sn), num), and taking the standard table with the highest hit count as the recommended database standard table of the source table.
5. The apparatus of claim 4, further comprising:
and the feedback unit is used for sending the recommended database standard table to an interface of the user terminal for the user to confirm.
6. The apparatus of claim 5, wherein the similarity match is a cosine match or a semantic match.
CN202111146101.6A 2021-09-28 2021-09-28 Method and device for recommending database standard table based on field Active CN113836144B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111146101.6A CN113836144B (en) 2021-09-28 2021-09-28 Method and device for recommending database standard table based on field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111146101.6A CN113836144B (en) 2021-09-28 2021-09-28 Method and device for recommending database standard table based on field

Publications (2)

Publication Number Publication Date
CN113836144A (en) 2021-12-24
CN113836144B (en) 2023-01-24

Family

ID=78967176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111146101.6A Active CN113836144B (en) 2021-09-28 2021-09-28 Method and device for recommending database standard table based on field

Country Status (1)

Country Link
CN (1) CN113836144B (en)

Also Published As

Publication number Publication date
CN113836144A (en) 2021-12-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant