CN111767298A - Data dictionary construction method and device - Google Patents

Data dictionary construction method and device

Info

Publication number
CN111767298A
Authority
CN
China
Prior art keywords
data
data fields
dictionary
fields
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010621816.1A
Other languages
Chinese (zh)
Inventor
黄文强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202010621816.1A
Publication of CN111767298A
Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance

Abstract

The embodiments of the present application disclose a data dictionary construction method and device. The method comprises: acquiring data fields in all databases in a bank system; analyzing the data fields, and determining whether any two data fields have an association relationship; if so, querying the sources of all data fields that have an association relationship, and determining the data flow among the data fields according to the sources; and constructing a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields. The method analyzes the data fields across all of a bank's systems and builds an intelligent data dictionary for the bank that records the evolution information (the association relationships among different data fields) and the system information (the sources of the data fields). When a system developer needs to use a data field, the developer can query the data dictionary and thereby avoid using the wrong data.

Description

Data dictionary construction method and device
Technical Field
The present application relates to the field of data processing, and in particular, to a data dictionary construction method and apparatus.
Background
A bank, especially a large bank, runs many systems, and each system holds many data items with similar names. The names may be identical, or only partially similar, while the underlying data differ. As a result, workers developing a system can easily use the wrong data, especially in long-established systems, which tends to cause production problems and brings unnecessary losses to the bank.
Disclosure of Invention
To solve this technical problem, the present application provides a data dictionary construction method and device, so that a system developer who needs to use a data field can query the data dictionary and avoid using the wrong data.
In a first aspect, an embodiment of the present application provides a data dictionary construction method, where the method includes:
acquiring data fields in all databases in a bank system;
analyzing the data fields, and determining whether any two data fields have an association relationship;
if so, querying the sources of all data fields that have an association relationship, and determining the data flow among the data fields according to the sources;
and constructing a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields.
Optionally, the method further includes:
determining a corresponding target data dictionary according to a field to be queried.
Optionally, the method further includes:
if the field to be queried is not found in the data dictionary, adding the field to be queried to the data dictionary.
Optionally, analyzing the data fields to determine whether any two data fields have an association relationship includes:
analyzing the data fields, and determining the similarity between any two data fields;
and determining whether the association relationship exists between the two data fields according to whether the similarity reaches a preset threshold.
Optionally, constructing a data dictionary according to the data flow includes:
presenting the data flow;
and constructing the data dictionary according to feedback information for the data flow.
Optionally, the feedback information is confirmation information or data flow supplementary information.
In a second aspect, an embodiment of the present application provides a data dictionary construction device, where the device includes:
an acquisition unit, configured to acquire data fields in all databases in a bank system;
an analysis unit, configured to analyze the data fields and determine whether any two data fields have an association relationship;
a determining unit, configured to, if two data fields have an association relationship, query the sources of all data fields that have an association relationship, and determine the data flow among the data fields according to the sources;
and a construction unit, configured to construct a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields.
Optionally, the determining unit is further configured to:
determine a corresponding target data dictionary according to a field to be queried.
Optionally, the device further includes:
a supplementing unit, configured to add the field to be queried to the data dictionary if the field to be queried is not found in the data dictionary.
Optionally, the analysis unit is configured to:
analyze the data fields, and determine the similarity between any two data fields;
and determine whether the association relationship exists between the two data fields according to whether the similarity reaches a preset threshold.
Optionally, the construction unit is configured to:
present the data flow;
and construct the data dictionary according to feedback information for the data flow.
Optionally, the feedback information is confirmation information or data flow supplementary information.
As can be seen from the above technical solution, the data dictionary construction method comprises: acquiring data fields in all databases in a bank system; analyzing the data fields, and determining whether any two data fields have an association relationship; if so, querying the sources of all data fields that have an association relationship, and determining the data flow among the data fields according to the sources; and constructing a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields. The method analyzes the data fields across all of a bank's systems and builds an intelligent data dictionary for the bank that records the evolution information (the association relationships among different data fields) and the system information (the sources of the data fields). When a system developer needs to use a data field, the developer can query the data dictionary and thereby avoid using the wrong data.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a data dictionary construction method according to an embodiment of the present application;
Fig. 2 is a structural diagram of a data dictionary construction device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
A bank, especially a large bank, runs many systems, and each system holds many data items with similar names. The names may be identical, or only partially similar, while the underlying data differ. For example, data A and data B have very similar names but come from different systems and, after different processing, have different contents. A worker who intends to use data A may pick data B because of the similar name, even though its content differs. Such data use errors arise easily during system development, especially in long-established systems, and tend to cause production problems that bring unnecessary losses to the bank.
To solve the above technical problem, the present application provides a data dictionary construction method, including: acquiring data fields in all databases in a bank system; analyzing the data fields, and determining whether any two data fields have an association relationship; if so, querying the sources of all data fields that have an association relationship, and determining the data flow among the data fields according to the sources; and constructing a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields. The method analyzes the data fields across all of a bank's systems and builds an intelligent data dictionary for the bank that records the evolution information (the association relationships among different data fields) and the system information (the sources of the data fields). When a system developer needs to use a data field, the developer can query the data dictionary and thereby avoid using the wrong data.
The method provided by the embodiment of the application can be applied to terminal equipment, and the terminal equipment can be equipment such as a computer, a Personal Digital Assistant (PDA for short), a tablet computer and the like.
The method provided by the embodiment of the application can also be applied to a server, and the server executes the method provided by the embodiment of the application.
Next, the data dictionary construction method provided by the present application is described with reference to the accompanying drawings, mainly taking a terminal device as the execution subject. Referring to Fig. 1, the method comprises:
S101, acquiring data fields in all databases in the bank system.
A bank may include a plurality of systems, and the names of data in these systems may be identical, or partially similar while the data differ. If the data fetched from a system by name is not the data actually needed, a data use error results and the bank suffers unnecessary loss. To avoid this, this embodiment builds a data dictionary so that the correct data can be looked up in the data dictionary and then processed.
In this step the data dictionary is being built for existing systems, so all data fields need to be acquired from all databases of the bank system, for example by sampling the data fields at time intervals.
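As a rough illustration of S101, the sketch below collects field metadata from several databases. It assumes SQLite databases purely for the sake of a runnable example; the `FieldRecord` type, the `collect_fields` helper, and the demo table names are hypothetical, and the application does not specify a database engine or a schema inspection mechanism.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRecord:
    """One data field, tagged with the system and table it was found in (hypothetical type)."""
    system: str
    table: str
    name: str
    sql_type: str


def collect_fields(connections):
    """Return a FieldRecord for every column of every table in each database.

    `connections` maps a system name to an open sqlite3 connection.
    """
    records = []
    for system, conn in connections.items():
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        for (table,) in tables:
            # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
            for _cid, name, sql_type, *_rest in conn.execute(
                f"PRAGMA table_info('{table}')"
            ):
                records.append(FieldRecord(system, table, name, sql_type))
    return records


def demo_connections():
    """Build two in-memory databases standing in for two bank systems."""
    core = sqlite3.connect(":memory:")
    core.execute("CREATE TABLE account (cust_id TEXT, acct_balance REAL)")
    loans = sqlite3.connect(":memory:")
    loans.execute("CREATE TABLE loan (cust_id TEXT, acct_bal REAL)")
    return {"core": core, "loans": loans}


fields = collect_fields(demo_connections())
```

Tagging each field with its system at collection time keeps the system information available for the later steps, which need to know where a field comes from.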
S102, analyzing the data fields, and determining whether any two data fields have an association relationship.
The terminal device may analyze all data fields to determine whether any two data fields have an association relationship, where the association relationship may refer to performing service processing on one data field to obtain another data field, or modifying one data field to obtain another data field, and so on.
In a possible implementation, the terminal device determines whether any two data fields have an association relationship by analyzing the data fields to determine the similarity between the two data fields, and then determining whether the association relationship exists according to whether the similarity reaches a preset threshold.
If the similarity exceeds the preset threshold but does not reach one hundred percent, the two data fields are considered to have an association relationship, and the associated data fields are extracted for use in determining the data flow. If the similarity reaches one hundred percent, the two data fields are considered to be the same. If the similarity does not exceed the preset threshold, the two data fields are not considered to have an association relationship.
The similarity may be determined from information related to the data field, for example important information such as the system information of the data field, that is, the name of the system the data field comes from. Analyzing the acquired data fields yields their similarity, from which it can be determined whether two data fields have an association relationship. The preset threshold may be set based on practical experience, for example to forty percent.
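The threshold rule above can be sketched as follows. The application does not name a similarity metric, so this example assumes a character-level comparison of field names using Python's `difflib.SequenceMatcher`; the function names and the sample field names are hypothetical, and only the forty percent threshold comes from the description.

```python
from difflib import SequenceMatcher
from itertools import combinations

PRESET_THRESHOLD = 0.40  # the "forty percent" example from the description


def similarity(name_a, name_b):
    """Assumed metric: character-level similarity of two field names, in [0, 1]."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def classify_pair(name_a, name_b, threshold=PRESET_THRESHOLD):
    """Return 'same', 'associated', or 'unrelated' for a pair of field names."""
    score = similarity(name_a, name_b)
    if score == 1.0:
        return "same"          # one hundred percent: treated as the same field
    if score > threshold:
        return "associated"    # above the threshold but below one hundred percent
    return "unrelated"


def associated_pairs(names, threshold=PRESET_THRESHOLD):
    """Compare every pair of fields and keep the associated ones."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if classify_pair(a, b, threshold) == "associated"
    ]
```

Comparing every pair grows quadratically with the number of fields, so a real deployment would likely need to group or pre-filter fields first; that is outside what the description covers.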
S103, if so, querying the sources of all data fields that have an association relationship, and determining the data flow among the data fields according to the sources.
If two data fields have an association relationship, all the data fields having an association relationship are extracted, an artificial intelligence model is used to query the corresponding systems for the sources of these data fields, and the data flow among the data fields is determined. The source of a data field may refer to the system the data field comes from, for example a branch, or the # # # branch under the head office. Alternatively, it may refer to the data fields from which the data field evolved.
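One way to picture S103 is to turn each associated pair into a directed edge once the source of each field is known. The sketch below assumes the source lookup has already produced a `derived_from` mapping (field to the field it evolved from); that mapping, the `system.field` naming, and the `determine_data_flow` helper are illustrative assumptions, and the artificial intelligence model mentioned above is not modeled here.

```python
def determine_data_flow(pairs, derived_from):
    """Orient each associated pair into an (upstream, downstream) edge.

    `pairs` holds unordered associated field pairs. `derived_from` maps a
    field to the field it was produced from (a stand-in for the source
    lookup). Pairs whose direction cannot be determined are returned
    separately so that they can be reviewed by a person.
    """
    edges, unresolved = [], []
    for a, b in pairs:
        if derived_from.get(b) == a:
            edges.append((a, b))      # a flows into b
        elif derived_from.get(a) == b:
            edges.append((b, a))      # b flows into a
        else:
            unresolved.append((a, b))
    return edges, unresolved
```

Returning the unresolved pairs rather than guessing a direction leaves room for the manual review that the description provides for.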
S104, constructing a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields.
The terminal device constructs the data dictionary from the determined data flow through automated machine analysis.
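A minimal sketch of S104 follows, assuming the data flow is available as a list of (upstream, downstream) pairs and that a hypothetical `system_of` mapping gives the owning system of each field. The entry layout (`system`, `evolved_from`, `evolves_into`) is one possible shape for the system information and evolution information, not a format defined by the application.

```python
from collections import defaultdict


def build_data_dictionary(edges, system_of):
    """Build one dictionary entry per field from (upstream, downstream) edges."""
    evolved_from = defaultdict(list)
    evolves_into = defaultdict(list)
    for upstream, downstream in edges:
        evolved_from[downstream].append(upstream)
        evolves_into[upstream].append(downstream)

    dictionary = {}
    # The field list is computed once, before the defaultdicts are read below.
    for field in sorted(set(evolved_from) | set(evolves_into)):
        dictionary[field] = {
            "system": system_of[field],            # system information
            "evolved_from": evolved_from[field],   # evolution information
            "evolves_into": evolves_into[field],
        }
    return dictionary
```

Storing both directions lets a reader of an entry see where a field came from and which fields depend on it without scanning the whole flow.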
In some possible embodiments, the terminal device may present the data flow and construct the data dictionary according to feedback information for the data flow. The feedback information may be confirmation information or data flow supplementary information.
For example, when the data flow is presented, a worker may judge whether the direction of the data flow is correct and whether any association relationship needs to be supplemented. If the direction is correct and nothing needs to be supplemented, the worker triggers confirmation information on the terminal device, and the terminal device constructs the data dictionary according to the confirmation information (the feedback information). If the direction is incorrect or an association relationship needs to be supplemented, the worker triggers supplementary information on the terminal device; the supplementary information supplements the association relationship or corrects the direction, and the terminal device constructs the data dictionary according to the supplementary information (the feedback information).
The supplementary information may be obtained by having workers use their experience to judge the data fields that could not be analyzed automatically and to verify the data.
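The two kinds of feedback described above can be sketched as a small function that either accepts the presented data flow unchanged or applies corrections to it. The `Feedback` structure and its field names are hypothetical; the application only states that feedback is either confirmation information or supplementary information.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """Reviewer feedback on a presented data flow (hypothetical structure)."""
    confirmed: bool = False
    reversed_edges: list = field(default_factory=list)  # edges whose direction is wrong
    added_edges: list = field(default_factory=list)     # associations to supplement


def apply_feedback(edges, feedback):
    """Return the data-flow edges that the dictionary should be built from."""
    if feedback.confirmed:
        # Confirmation information: the presented flow is used as-is.
        return list(edges)
    # Supplementary information: correct directions, then add missing edges.
    to_flip = set(feedback.reversed_edges)
    corrected = [(b, a) if (a, b) in to_flip else (a, b) for a, b in edges]
    for edge in feedback.added_edges:
        if edge not in corrected:
            corrected.append(edge)
    return corrected
```

The function returns a new list instead of mutating its input, so the originally presented flow remains available for comparison.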
A background system can then query the related data according to the association relationships in the data dictionary to judge whether the data dictionary is correct, and perform a final verification to obtain a valid data dictionary for the bank system.
When a worker needs to develop a new system, the worker only needs to query the required field, that is, the field to be queried. The terminal device determines the corresponding target data dictionary according to the field to be queried. From the system information and evolution information of each data field in the target data dictionary, the worker can tell which system is the most convenient one to call the field from, and can process the obtained field accordingly, which improves efficiency and accuracy.
If the field to be queried is not found in the data dictionary, no target data dictionary can be determined for it, and the field to be queried is added to the data dictionary. That is, if the query result shows that the field is not in the data dictionary, the worker can supply the information of the new field so that it is added to the data dictionary. This makes later queries and additions easier and helps form a unified standard across the bank's systems.
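The query and supplement behavior can be sketched as two small helpers over a dictionary keyed by field name. The entry layout (`system`, `evolved_from`, `evolves_into`) and the helper names are assumptions made for this example only.

```python
def query_field(dictionary, name):
    """Return the entry for `name`, or None if the dictionary has no such field."""
    return dictionary.get(name)


def supplement_field(dictionary, name, system, evolved_from=()):
    """Add a missing field supplied by a worker; existing entries are left untouched."""
    if name in dictionary:
        return dictionary[name]
    entry = {"system": system, "evolved_from": list(evolved_from), "evolves_into": []}
    dictionary[name] = entry
    # Keep known upstream entries consistent with the newly added field.
    for upstream in entry["evolved_from"]:
        if upstream in dictionary:
            dictionary[upstream]["evolves_into"].append(name)
    return entry
```

A caller would typically try `query_field` first and fall back to `supplement_field` only when the result is `None`, which mirrors the order described above.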
Based on the data dictionary construction method provided by the foregoing embodiment, an embodiment of the present application further provides a data dictionary construction device, with reference to fig. 2, where the device includes:
an obtaining unit 201, configured to obtain data fields in all databases in a bank system;
an analyzing unit 202, configured to analyze the data fields, and determine whether any two data fields have an association relationship;
a determining unit 203, configured to, if two data fields have an association relationship, query the sources of all data fields that have an association relationship, and determine the data flow among the data fields according to the sources;
a constructing unit 204, configured to construct a data dictionary according to the data flow, where the data dictionary reflects system information and evolution information of the data fields.
Optionally, the determining unit is further configured to:
determine a corresponding target data dictionary according to a field to be queried.
Optionally, the device further includes:
a supplementing unit, configured to add the field to be queried to the data dictionary if the field to be queried is not found in the data dictionary.
Optionally, the analyzing unit is configured to:
analyze the data fields, and determine the similarity between any two data fields;
and determine whether the association relationship exists between the two data fields according to whether the similarity reaches a preset threshold.
Optionally, the constructing unit is configured to:
present the data flow;
and construct the data dictionary according to feedback information for the data flow.
Optionally, the feedback information is confirmation information or data flow supplementary information.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium may be at least one of the following media: various media that can store program codes, such as read-only memory (ROM), RAM, magnetic disk, or optical disk.
It should be noted that, in the present specification, all the embodiments are described in a progressive manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A method for constructing a data dictionary, the method comprising:
acquiring data fields in all databases in a bank system;
analyzing the data fields, and determining whether any two data fields have an association relationship;
if so, querying the sources of all data fields that have an association relationship, and determining the data flow among the data fields according to the sources;
and constructing a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields.
2. The method of claim 1, further comprising:
determining a corresponding target data dictionary according to a field to be queried.
3. The method of claim 2, further comprising:
if the field to be queried is not found in the data dictionary, adding the field to be queried to the data dictionary.
4. The method of claim 1, wherein analyzing the data fields to determine whether any two data fields have an association relationship comprises:
analyzing the data fields, and determining the similarity between any two data fields;
and determining whether the association relationship exists between any two data fields according to whether the similarity reaches a preset threshold value.
5. The method of claim 1, wherein constructing a data dictionary according to the data flow comprises:
presenting the data flow;
and constructing the data dictionary according to feedback information for the data flow.
6. The method of claim 5, wherein the feedback information is confirmation information or data flow supplementary information.
7. An apparatus for constructing a data dictionary, the apparatus comprising:
an acquisition unit, configured to acquire data fields in all databases in a bank system;
an analysis unit, configured to analyze the data fields and determine whether any two data fields have an association relationship;
a determining unit, configured to, if two data fields have an association relationship, query the sources of all data fields that have an association relationship, and determine the data flow among the data fields according to the sources;
and a construction unit, configured to construct a data dictionary according to the data flow, wherein the data dictionary reflects system information and evolution information of the data fields.
8. The apparatus of claim 7, wherein the determining unit is further configured to:
determine a corresponding target data dictionary according to a field to be queried.
9. The apparatus of claim 8, further comprising:
a supplementing unit, configured to add the field to be queried to the data dictionary if the field to be queried is not found in the data dictionary.
10. The apparatus of claim 7, wherein the analysis unit is configured to:
analyzing the data fields, and determining the similarity between any two data fields;
and determining whether the association relationship exists between any two data fields according to whether the similarity reaches a preset threshold value.
11. The apparatus of claim 7, wherein the construction unit is configured to:
present the data flow;
and construct the data dictionary according to feedback information for the data flow.
12. The apparatus of claim 11, wherein the feedback information is confirmation information or data flow supplementary information.
CN202010621816.1A 2020-07-01 2020-07-01 Data dictionary construction method and device Pending CN111767298A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010621816.1A CN111767298A (en) 2020-07-01 2020-07-01 Data dictionary construction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010621816.1A CN111767298A (en) 2020-07-01 2020-07-01 Data dictionary construction method and device

Publications (1)

Publication Number Publication Date
CN111767298A true CN111767298A (en) 2020-10-13

Family

ID=72723250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010621816.1A Pending CN111767298A (en) 2020-07-01 2020-07-01 Data dictionary construction method and device

Country Status (1)

Country Link
CN (1) CN111767298A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170053002A1 (en) * 2015-08-18 2017-02-23 Fiserv, Inc. Generating integrated data records by correlating source data records from disparate data sources
CN110347695A (en) * 2019-07-18 2019-10-18 山东浪潮通软信息科技有限公司 A kind of processing of data dictionary dynamic and update method of self-defining data relationship
CN110990406A (en) * 2019-11-28 2020-04-10 中国建设银行股份有限公司 Fuzzy query method, device, equipment and medium
CN111046035A (en) * 2019-10-29 2020-04-21 三盟科技股份有限公司 Data automation processing method, system, computer equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN113282461B (en) Alarm identification method and device for transmission network
US20190058609A1 (en) Method and apparatus for pushing information based on artificial intelligence
CN107016018B (en) Database index creation method and device
CN111857880B (en) Dialogue configuration item information management method, device, equipment and storage medium
CN112416778A (en) Test case recommendation method and device and electronic equipment
CN110046155B (en) Method, device and equipment for updating feature database and determining data features
CN113420537A (en) Method, device, equipment and storage medium for processing electronic form data
CN114416703A (en) Method, device, equipment and medium for automatically monitoring data integrity
CN111325031A (en) Resume parsing method and device
CN111190832A (en) Method, device and system for positioning and optimizing performance bottleneck
CN109189849B (en) Standardized and streamlined data entry method and system
JP2018092361A (en) Test script correction apparatus and test script correction program
CN110502538B (en) Method, system, equipment and storage medium for portrait tag generation logic mapping
CN109508204B (en) Front-end code quality detection method and device
CN109189809B (en) Shareholder name association matching method and device
CN111767298A (en) Data dictionary construction method and device
CN114141236B (en) Language model updating method and device, electronic equipment and storage medium
CN109324963A (en) The method and terminal device of automatic test profitable result
CN112152968B (en) Network threat detection method and device
CN111143643A (en) Element identification method and device, readable storage medium and electronic equipment
CN111221894B (en) Time sequence database storage method, device and server based on configuration
CN114116253A (en) Message processing method and system for message queue
CN114416895A (en) Map data processing method and device, electronic equipment and storage medium
JP2018092362A (en) Test script correction apparatus and test script correction program
CN112347056B (en) Automatic file generation method based on time axis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Huang Wenqiang

Inventor after: Xu Chenmin

Inventor after: Ji Yunqing

Inventor after: Hu Luping

Inventor after: Hu Wei

Inventor after: Huang Yanan

Inventor after: Hu Chuanjie

Inventor after: Fu Chenqi

Inventor after: Li Bangbang

Inventor after: Shen Yakun

Inventor before: Huang Wenqiang