CN110008193A - Data normalization method and device - Google Patents
- Publication number
- CN110008193A (application CN201910304451.7A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- data
- professional standard
- standard library
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/176—Support for shared access to files; File sharing support
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/178—Techniques for file synchronisation in file systems
- G06F16/1794—Details of file format conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application provides a data normalization method and device. The metadata of a service database is compared in turn with the metadata of each of multiple professional (industry) standard libraries, and identical metadata is identified as similar metadata. For difference metadata, that is, metadata in the service database that differs from the professional standard library, the similarity between the data corresponding to the difference metadata and the sample data prestored in the professional standard library is calculated. The metadata corresponding to sample data whose similarity exceeds a preset threshold is identified as similar metadata in that professional standard library. The number of metadata items identified as similar metadata in each professional standard library is counted, and the library with the largest count is determined to be the professional standard library closest to the service database.
Description
Technical field
This application relates to the field of data processing, and in particular to a data normalization method and device.
Background
With the spread and development of information technology, government bodies and enterprises are increasingly digitized, and the volume of business data keeps growing. Faced with large amounts of business data, establishing accurate and standardized data models efficiently and quickly has become a trend. However, given the large number of professional standards, manually identifying the relationship between actual business data and existing standards costs a great deal of time and effort.
Summary of the invention
To overcome at least one deficiency in the prior art, a first object of the application is to provide a data normalization method applied to data processing equipment. The data processing equipment prestores multiple professional standard libraries, and each professional standard library prestores sample data. The method includes:
Obtaining a service database;
For each professional standard library, comparing the metadata in the professional standard library with the metadata of the service database;
Identifying metadata in the professional standard library that is identical to metadata of the service database as similar metadata;
For difference metadata in the service database, that is, metadata that differs from the professional standard library, calculating the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library, and identifying the metadata corresponding to sample data whose similarity exceeds a preset threshold as similar metadata in the professional standard library;
Counting the number of metadata items identified as similar metadata in each professional standard library, and determining the professional standard library with the largest number as the professional standard library closest to the service database.
Optionally, the step of calculating the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library includes:
Calculating, by an artificial neural network, the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library.
Optionally, the method further includes:
Creating a standard information database according to the similar metadata in the closest professional standard library;
Obtaining from the service database the data corresponding to the similar metadata in the closest professional standard library, and storing it in the standard information database.
Optionally, the data processing equipment further includes an industry shared information library, and the method further includes:
Comparing the metadata of the industry shared information library with the metadata of the standard information database, and determining the shared metadata in the standard information database that is identical to metadata in the industry shared information library;
Creating a shared data table according to the data corresponding to the shared metadata.
Optionally, the method further includes:
For each shared data table, providing a corresponding interface so that other equipment can obtain the data in the shared data table through the interface.
Optionally, the metadata includes field names, and the step of identifying metadata in the professional standard library that is identical to metadata of the service database as similar metadata includes:
Identifying field names in the professional standard library that are identical to field names of the service database as similar metadata.
Optionally, the metadata further includes table name, field type and field length.
Another object of the embodiments of the application is to provide a data normalization device applied to data processing equipment. The data processing equipment prestores multiple professional standard libraries, and each professional standard library prestores sample data. The data normalization device includes an acquisition module, a comparison module, a mark module, a similarity calculation module and a statistical module;
The acquisition module is used to obtain a service database;
The comparison module is used to compare, for each professional standard library, the metadata in the professional standard library with the metadata of the service database;
The mark module is used to identify metadata in the professional standard library that is identical to metadata of the service database as similar metadata;
The similarity calculation module is used to calculate, for difference metadata in the service database that differs from the professional standard library, the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library, and to identify the metadata corresponding to sample data whose similarity exceeds a preset threshold as similar metadata in the professional standard library;
The statistical module is used to count the number of metadata items identified as similar metadata in each professional standard library, and to determine the professional standard library with the largest number as the professional standard library closest to the service database.
Optionally, the comparison module compares the metadata of the professional standard library with the metadata of the service database in the following manner:
The similarity between the data corresponding to the difference metadata and the sample data in the professional standard library is calculated by an artificial neural network.
Optionally, the data normalization device further includes a creation module and a writing module;
The creation module is used to create a standard information database according to the similar metadata in the closest professional standard library;
The writing module is used to obtain from the service database the data corresponding to the similar metadata in the closest professional standard library, and to store it in the standard information database.
Compared with the prior art, the application has the following advantages:
The embodiments of the application provide a data normalization method and device. The metadata of a service database is compared in turn with the metadata of each of multiple professional standard libraries, and identical metadata is identified as similar metadata. For difference metadata in the service database that differs from the professional standard library, the similarity between the data corresponding to the difference metadata and the sample data prestored in the professional standard library is calculated. The metadata corresponding to sample data whose similarity exceeds a preset threshold is identified as similar metadata in the professional standard library. The number of metadata items identified as similar metadata in each professional standard library is counted, and the library with the largest count is determined to be the professional standard library closest to the service database.
Description of the drawings
To illustrate the technical solutions of the embodiments of the application more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the application and should therefore not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a block diagram of the data processing equipment provided by an embodiment of the application;
Fig. 2 is a flow chart of the steps of the data normalization method provided by an embodiment of the application;
Fig. 3 is a schematic comparison of a business data table and an industry standard data table provided by an embodiment of the application;
Fig. 4 is a first structural schematic diagram of the data normalization device provided by an embodiment of the application;
Fig. 5 is a second structural schematic diagram of the data normalization device provided by an embodiment of the application.
Reference numerals: 100 - data processing equipment; 130 - processor; 120 - memory; 110 - data normalization device; 500 - business data table; 600 - industry standard data table; 1101 - acquisition module; 1102 - comparison module; 1103 - mark module; 1104 - similarity calculation module; 1105 - statistical module; 1106 - creation module; 1107 - writing module.
Detailed description of the embodiments
To make the purposes, technical solutions and advantages of the embodiments of the application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently some, but not all, of the embodiments of the application. The components of the embodiments generally described and illustrated in the drawings can be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the claimed scope of the application, but merely represents selected embodiments. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the application.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings. Once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
Referring to Fig. 1, Fig. 1 is a block diagram of the data processing equipment 100 provided by an embodiment of the application. The data processing equipment 100 includes a data normalization device 110, a memory 120 and a processor 130.
The memory 120 and the processor 130 are electrically connected to each other, directly or indirectly, to transmit or exchange data. For example, these elements may be electrically connected through one or more communication buses or signal lines. The data normalization device 110 includes at least one software function module that can be stored in the memory 120 in the form of software or firmware, or built into the operating system (OS) of the data processing equipment 100. The processor 130 executes the executable modules stored in the memory 120, such as the software function modules and computer programs included in the data normalization device 110.
The data processing equipment 100 may be, but is not limited to, a smartphone, a personal computer (PC), a tablet computer, a personal digital assistant (PDA), a mobile Internet device (MID), and the like.
The memory 120 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The memory 120 stores a program, and the processor 130 executes the program after receiving an execution instruction.
The processor 130 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the application. The general-purpose processor may be a microprocessor or any conventional processor.
Referring to Fig. 2, Fig. 2 is a flow chart of the steps of the data normalization method applied to the data processing equipment 100 shown in Fig. 1. The data processing equipment 100 prestores multiple professional standard libraries, and each professional standard library prestores sample data. Each step of the data normalization method is described in detail below.
Step S100: obtain a service database.
Optionally, a professional standard library is a database recording the typical data of a given industry. For example, in one possible example, the professional standard library of the education sector includes data such as student name, student class, student gender and student grades. The professional standard library of the financial industry includes data such as principal, interest rate, depositor name, gender and term. The data processing equipment 100 connects to the service database and obtains its metadata, which includes the database name, table names, field names and field types.
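Step S100 can be illustrated with a minimal Python sketch. The patent does not name a database engine; SQLite (through the standard `sqlite3` module) is assumed here purely for illustration, and the function name `read_metadata` and the `student` table are hypothetical.

```python
import sqlite3

def read_metadata(conn):
    """Return {table_name: [(field_name, field_type), ...]} for a SQLite database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    metadata = {}
    for table in tables:
        # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        metadata[table] = [(col[1], col[2]) for col in columns]
    return metadata

# Hypothetical service database with a single table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (firstname TEXT, lastname TEXT, age INTEGER)")
business_metadata = read_metadata(conn)
```

Other engines expose the same information through their own catalogs, so only the two queries would need to change.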
Step S200: for each professional standard library, compare the metadata in the professional standard library with the metadata of the service database.
Step S300: identify metadata in the professional standard library that is identical to metadata of the service database as similar metadata.
Optionally, the data processing equipment 100 takes each professional standard library in turn as the target professional standard library, compares the metadata in the service database with the metadata in the target professional standard library, and finds the identical metadata. The data processing equipment 100 marks the identical metadata as similar metadata. For example, referring to Fig. 3, in one possible example the metadata includes field names. The business data table 500 includes the field names "age", "firstname" and "lastname". The industry standard data table 600 includes the field names "age", "number" and "name". The data processing equipment 100 compares the business data table 500 with the industry standard data table 600; the field name "age" is identical in both, so the "age" field is marked as similar metadata.
Optionally, to further ensure that the data corresponding to identical metadata in the service database and the professional standard library is actually similar, the data processing equipment 100 performs a similarity calculation on the data corresponding to the identical metadata in the service database and in the professional standard library. Metadata whose similarity exceeds a preset threshold is identified as similar metadata. Referring to Fig. 2, the data processing equipment 100 performs a similarity calculation between the data of the "age" field in the service database and the data of the "age" field in the professional standard library.
Checking whether metadata is identical quickly filters out the similar metadata in the service database and the professional standard library. However, different developers may name the field for the same data differently. For example, for students' exam results, different developers may name the field "score" or "achievement". A simple metadata comparison cannot determine whether two such fields are similar.
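The exact-match pass of steps S200 and S300, and the naming problem just described, can be illustrated with a minimal Python sketch. The function name `mark_identical_fields` is hypothetical; the field names are those of Fig. 3.

```python
def mark_identical_fields(business_fields, standard_fields):
    """Split business field names into exact matches and difference metadata."""
    standard = set(standard_fields)
    similar = [f for f in business_fields if f in standard]
    difference = [f for f in business_fields if f not in standard]
    return similar, difference

business_table_500 = ["age", "firstname", "lastname"]
standard_table_600 = ["age", "number", "name"]
similar, difference = mark_identical_fields(business_table_500, standard_table_600)

# Fields that hold the same kind of data under different names are not matched.
missed, leftover = mark_identical_fields(["score"], ["achievement"])
```

Here `difference` holds the fields that step S400 goes on to compare by their data rather than by their names.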
Step S400: for difference metadata in the service database that differs from the professional standard library, calculate the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library, and identify the metadata corresponding to sample data whose similarity exceeds a preset threshold as similar metadata in the professional standard library.
Optionally, the service database may contain fields whose names differ from those in the professional standard library but whose actual data is similar. The data processing equipment 100 therefore performs a similarity calculation between the data corresponding to the difference metadata in the service database and all of the sample data in the professional standard library, and identifies the metadata corresponding to sample data whose similarity exceeds the preset threshold as similar metadata in the professional standard library.
In one embodiment provided by the application, the data processing equipment 100 inputs the data corresponding to the difference metadata and all of the sample data in the professional standard library into an artificial neural network, which calculates the similarity between the data corresponding to each difference metadata item and the sample data corresponding to each metadata item in the professional standard library. The data processing equipment 100 identifies the metadata corresponding to sample data whose similarity exceeds the preset threshold as similar metadata.
In another embodiment provided by the application, the data processing equipment 100 selects each difference metadata item in turn as the target difference metadata, performs a similarity calculation between the data corresponding to the target difference metadata and the sample data corresponding to each metadata item in the professional standard library, and identifies the metadata corresponding to sample data whose similarity exceeds the preset threshold as similar metadata. Referring again to Fig. 3, the difference metadata in the business data table 500 is "lastname" and "firstname". The data processing equipment 100 performs a similarity calculation between the data of the "lastname" field and each of the "age", "number" and "name" fields in the industry standard data table 600. It then does the same for the data of the "firstname" field. Suppose the similarities of the "lastname" field to the "age", "number" and "name" fields are 0.2, 0.1 and 0.7 respectively, and the preset similarity threshold is 0.6. The data processing equipment 100 then identifies the "name" field in the industry standard data table 600 as the similar field corresponding to the "lastname" field.
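The field-by-field comparison of step S400 can be sketched in Python as follows. The patent leaves the similarity measure open (one embodiment uses an artificial neural network); this sketch swaps in a plain Jaccard overlap of value sets as a stand-in, which is an assumption and not the patent's method. The names `jaccard` and `match_difference_fields` and all sample values are hypothetical.

```python
def jaccard(values_a, values_b):
    """Stand-in similarity: overlap of the two value sets, between 0.0 and 1.0."""
    a, b = set(values_a), set(values_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_difference_fields(business_data, standard_samples,
                            threshold=0.6, similarity=jaccard):
    """Map each standard field to the difference fields whose data resembles its samples."""
    matches = {}
    for field, values in business_data.items():
        for std_field, samples in standard_samples.items():
            # Only similarities above the preset threshold count as a match.
            if similarity(values, samples) > threshold:
                matches.setdefault(std_field, []).append(field)
    return matches

# Hypothetical data for the difference fields of business data table 500.
business_data = {
    "lastname": ["Li", "Wang", "Zhang", "Chen"],
    "firstname": ["Wei", "Fang", "Min", "Jun"],
}
# Hypothetical sample data prestored in industry standard data table 600.
standard_samples = {
    "age": [18, 19, 20],
    "number": ["001", "002", "003"],
    "name": ["Li", "Wang", "Zhang", "Chen", "Zhao"],
}
matches = match_difference_fields(business_data, standard_samples)
```

Passing a different callable as `similarity` is where a trained model could be substituted without changing the thresholding logic.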
Step S500: count the number of metadata items identified as similar metadata in each professional standard library, and determine the professional standard library with the largest number as the professional standard library closest to the service database.
Optionally, because the data processing equipment 100 prestores multiple professional standard libraries, it counts the number of metadata items marked as similar in each professional standard library and determines the library with the most similar metadata as the professional standard library closest to the service database.
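The counting in step S500 amounts to a maximum over per-library counts, sketched below in Python. The function name `closest_standard_library` and the library names are hypothetical, and the patent does not say how ties are resolved; this sketch simply keeps the first library with the highest count.

```python
def closest_standard_library(similar_by_library):
    """Return the library with the most similar metadata, plus all counts."""
    counts = {name: len(fields) for name, fields in similar_by_library.items()}
    # max() returns the first key with the highest count, so ties keep insertion order.
    best = max(counts, key=counts.get)
    return best, counts

similar_by_library = {
    "education": {"age", "name"},
    "finance": {"age"},
}
best, counts = closest_standard_library(similar_by_library)
```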
Optionally, the data processing equipment 100 creates a standard information database according to the similar metadata in the closest professional standard library. The data processing equipment 100 obtains from the service database the data corresponding to the similar metadata in the closest professional standard library and stores it in the standard information database.
Referring again to Fig. 3, the data processing equipment 100 extracts the "name" field and the "age" field from the professional standard library and creates the standard information database according to these two fields. It then stores the data corresponding to the "age" field and the "lastname" field of the business data table 500 in the standard information database. Note that when the data processing equipment 100 stores data from the business data table 500 in the standard information database, it converts the data accordingly if the data type or data length differs.
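Creating and filling the standard information database can be sketched in Python with the standard `sqlite3` module. The patent only says that differing types or lengths receive "corresponding processing"; the concrete conversions below (casting age to an integer, truncating names to 20 characters), the table names and `FIELD_MAP` are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical mapping: business field -> (standard field, converter).
FIELD_MAP = {
    "lastname": ("name", lambda v: str(v)[:20]),  # assumed length limit of 20
    "age": ("age", int),                          # assumed integer type
}

def build_standard_info(source, target,
                        source_table="student", target_table="standard_info"):
    """Create the standard table in `target` and copy the converted rows from `source`."""
    std_fields = [std for std, _ in FIELD_MAP.values()]
    target.execute(f"CREATE TABLE {target_table} ({', '.join(std_fields)})")
    src_fields = ", ".join(FIELD_MAP)
    placeholders = ", ".join("?" for _ in std_fields)
    for row in source.execute(f"SELECT {src_fields} FROM {source_table}"):
        # Apply each field's converter before writing to the standard table.
        converted = [conv(value) for value, (_, conv) in zip(row, FIELD_MAP.values())]
        target.execute(f"INSERT INTO {target_table} VALUES ({placeholders})", converted)
    target.commit()

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE student (firstname TEXT, lastname TEXT, age TEXT)")
source.execute("INSERT INTO student VALUES ('Wei', 'Li', '18')")
target = sqlite3.connect(":memory:")
build_standard_info(source, target)
rows = target.execute("SELECT name, age FROM standard_info").fetchall()
```

The "firstname" field is left out because it was not matched to any standard field in the Fig. 3 example.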
Optionally, the data processing equipment 100 further includes an industry shared information library. The metadata of the industry shared information library is compared with the metadata of the standard information database to determine the shared metadata in the standard information database that is identical to metadata in the industry shared information library. The data processing equipment 100 creates a shared data table according to the data corresponding to the shared metadata.
Optionally, a corresponding interface is provided for each shared data table, so that other equipment can access the data in the shared data table through the interface.
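The shared data table and its interface can be sketched in Python as follows. The patent does not specify the kind of interface (for example HTTP or RPC); a plain callable stands in for it here, and all function names and sample rows are hypothetical.

```python
def shared_fields(shared_library_fields, standard_info_fields):
    """Fields of the standard information database that also appear in the shared library."""
    shared = set(shared_library_fields)
    return [f for f in standard_info_fields if f in shared]

def build_shared_table(rows, fields):
    """Keep only the shared fields of each row."""
    return [{f: row[f] for f in fields} for row in rows]

def make_interface(shared_table):
    """Return a read-only accessor standing in for the per-table interface."""
    def get_rows():
        # Hand out copies so callers cannot modify the shared table.
        return [dict(r) for r in shared_table]
    return get_rows

standard_rows = [{"name": "Li", "age": 18}, {"name": "Wang", "age": 19}]
fields = shared_fields(["name", "class"], ["name", "age"])
get_rows = make_interface(build_shared_table(standard_rows, fields))
result = get_rows()
```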
The embodiments of the application also provide a data normalization device 110 applied to the data processing equipment 100. The data processing equipment prestores multiple professional standard libraries, and each professional standard library prestores sample data. Referring to Fig. 4, the data normalization device 110 includes an acquisition module 1101, a comparison module 1102, a mark module 1103, a similarity calculation module 1104 and a statistical module 1105.
The acquisition module 1101 is used to obtain a service database.
In this embodiment, the acquisition module 1101 executes step S100 in Fig. 2; for details of the acquisition module 1101, refer to the description of step S100.
The comparison module 1102 is used to compare, for each professional standard library, the metadata in the professional standard library with the metadata of the service database.
In this embodiment, the comparison module 1102 executes step S200 in Fig. 2; for details of the comparison module 1102, refer to the description of step S200.
The mark module 1103 is used to identify metadata in the professional standard library that is identical to metadata of the service database as similar metadata.
In this embodiment, the mark module 1103 executes step S300 in Fig. 2; for details of the mark module 1103, refer to the description of step S300.
The similarity calculation module 1104 is used to calculate, for difference metadata in the service database that differs from the professional standard library, the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library, and to identify the metadata corresponding to sample data whose similarity exceeds a preset threshold as similar metadata in the professional standard library.
In this embodiment, the similarity calculation module 1104 executes step S400 in Fig. 2; for details of the similarity calculation module 1104, refer to the description of step S400.
The statistical module 1105 is used to count the number of metadata items identified as similar metadata in each professional standard library, and to determine the professional standard library with the largest number as the professional standard library closest to the service database.
In this embodiment, the statistical module 1105 executes step S500 in Fig. 2; for details of the statistical module 1105, refer to the description of step S500.
Optionally, the comparison module 1102 compares the metadata of the professional standard library with the metadata of the service database in the following manner:
The similarity between the data corresponding to the difference metadata and the sample data in the professional standard library is calculated by an artificial neural network.
Referring to Fig. 5, the data normalization device 110 further includes a creation module 1106 and a writing module 1107.
The creation module 1106 is used to create a standard information database according to the similar metadata in the closest professional standard library.
The writing module 1107 is used to obtain from the service database the data corresponding to the similar metadata in the closest professional standard library, and to store it in the standard information database.
In conclusion, the embodiments of the application provide a data normalization method and device. The metadata of a service database is compared in turn with the metadata of each of multiple professional standard libraries, and identical metadata is identified as similar metadata. For difference metadata in the service database that differs from the professional standard library, the similarity between the data corresponding to the difference metadata and the sample data prestored in the professional standard library is calculated. The metadata corresponding to sample data whose similarity exceeds a preset threshold is identified as similar metadata in the professional standard library. The number of metadata items identified as similar metadata in each professional standard library is counted, and the library with the largest count is determined to be the professional standard library closest to the service database.
It should be understood that the device and method disclosed in the embodiments provided herein may also be implemented in other ways. The device embodiments described above are merely exemplary. For example, the flow charts and block diagrams in the drawings show the possible architectures, functions and operations of devices, methods and computer program products according to multiple embodiments of the application. In this regard, each box in a flow chart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flow charts, and combinations of such boxes, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the application may be integrated to form one independent part, may each exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented as software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods of the embodiments of the application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to be non-exclusive, so that a process, method, article or equipment that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article or equipment. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or equipment that includes the element.
The above are merely various embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A data normalization method, characterized in that it is applied to data processing equipment, the data processing equipment pre-stores multiple professional standard libraries, and each professional standard library pre-stores sample data; the method comprises:
obtaining a service database;
for each professional standard library, comparing the metadata of the professional standard library with the metadata of the service database;
identifying metadata in the professional standard library that is identical to metadata of the service database as similar metadata;
for difference metadata in the service database that differs from the metadata of the professional standard library, calculating the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library, and identifying the metadata corresponding to sample data whose data similarity exceeds a preset threshold as similar metadata in the professional standard library;
counting the quantity of metadata identified as similar metadata in each professional standard library, and determining the professional standard library with the largest quantity as the professional standard library closest to the service database.
2. The data normalization method according to claim 1, characterized in that the step of calculating the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library comprises:
calculating, by means of an artificial neural network, the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library.
3. The data normalization method according to claim 1, characterized in that the method further comprises:
creating a standard information database according to the similar metadata in the closest professional standard library;
obtaining, from the service database, the data corresponding to the similar metadata in the closest professional standard library, and storing the data in the standard information database.
4. The data normalization method according to claim 3, characterized in that the data processing equipment further includes an industry shared information library, and the method further comprises:
comparing the metadata of the industry shared information library with the metadata of the standard information database, and determining the shared metadata in the standard information database that is identical to metadata of the industry shared information library;
creating a shared data table according to the data corresponding to the shared metadata.
5. The data normalization method according to claim 4, characterized in that the method further comprises:
for each shared data table, providing a corresponding interface, so that other equipment obtains the data in the shared data table through the interface.
6. The data normalization method according to claim 1, characterized in that the metadata includes a field name, and the step of identifying metadata in the professional standard library that is identical to metadata of the service database as similar metadata comprises:
identifying a field name in the professional standard library that is identical to a field name of the service database as similar metadata.
7. The data normalization method according to claim 1, characterized in that the metadata further includes a table name, a field type, and a field length.
8. A data normalization device, characterized in that it is applied to data processing equipment, the data processing equipment pre-stores multiple professional standard libraries, and each professional standard library pre-stores sample data; the data normalization device includes an obtaining module, a comparison module, an identification module, a similarity calculation module, and a statistics module;
the obtaining module is configured to obtain a service database;
the comparison module is configured to, for each professional standard library, compare the metadata of the professional standard library with the metadata of the service database;
the identification module is configured to identify metadata in the professional standard library that is identical to metadata of the service database as similar metadata;
the similarity calculation module is configured to, for difference metadata in the service database that differs from the metadata of the professional standard library, calculate the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library, and identify the metadata corresponding to sample data whose data similarity exceeds a preset threshold as similar metadata in the professional standard library;
the statistics module is configured to count the quantity of metadata identified as similar metadata in each professional standard library, and determine the professional standard library with the largest quantity as the professional standard library closest to the service database.
9. The data normalization device according to claim 8, characterized in that the comparison module compares the metadata of the professional standard library with the metadata of the service database in the following manner:
calculating, by means of an artificial neural network, the similarity between the data corresponding to the difference metadata and the sample data in the professional standard library.
10. The data normalization device according to claim 8, characterized in that the data normalization device further includes a creation module and a writing module;
the creation module is configured to create a standard information database according to the similar metadata in the closest professional standard library;
the writing module is configured to obtain, from the service database, the data corresponding to the similar metadata in the closest professional standard library, and store the data in the standard information database.
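The matching procedure recited in claims 1 and 8 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the only metadata is the field name, models each database as a mapping from field name to a list of values, and replaces the neural-network similarity of claim 2 with a simple string-matching score from Python's `difflib`. The names `data_similarity`, `closest_standard_library`, the 0.8 threshold, and the example data are all hypothetical.

```python
from difflib import SequenceMatcher


def data_similarity(data, samples):
    """Placeholder similarity in [0, 1] between two lists of values.

    Assumption: claim 2 uses a neural network here; this sketch swaps in
    difflib string matching so the example stays dependency-free.
    """
    if not data or not samples:
        return 0.0
    # For each service value, keep its best match among the sample values.
    best = [
        max(SequenceMatcher(None, str(d), str(s)).ratio() for s in samples)
        for d in data
    ]
    return sum(best) / len(best)


def closest_standard_library(service_db, standard_libraries, threshold=0.8):
    """Return (name of the closest library, similar-metadata count per library).

    Both arguments map field names (the metadata) to lists of values.
    """
    counts = {}
    for lib_name, library in standard_libraries.items():
        # Metadata present in both is identified as similar metadata.
        similar = {field for field in library if field in service_db}
        # Difference metadata: service fields the library does not have.
        difference = [field for field in service_db if field not in library]
        for field in difference:
            for lib_field, samples in library.items():
                if lib_field in similar:
                    continue
                # Compare the service data with the library's sample data.
                if data_similarity(service_db[field], samples) > threshold:
                    similar.add(lib_field)
        counts[lib_name] = len(similar)
    # The library with the most similar metadata is taken as the closest.
    best = max(counts, key=counts.get)
    return best, counts


# Hypothetical example data, invented for illustration only.
service_db = {"name": ["Alice", "Bob"], "tel": ["13800001111", "13900002222"]}
standard_libraries = {
    "hospital": {
        "name": ["Carol"],
        "phone": ["13800001111", "13900002222"],
        "ward": ["A1"],
    },
    "bank": {"name": ["Dave"], "account": ["6222020000000000"]},
}
best, counts = closest_standard_library(service_db, standard_libraries)
print(best, counts)
```

In this example the service field `tel` has no identically named field in either library, so it is treated as difference metadata and matched by its data rather than by its name. Swapping `data_similarity` for a trained model would bring the sketch closer to claim 2 without changing the counting logic.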
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910304451.7A CN110008193B (en) | 2019-04-16 | 2019-04-16 | Data standardization method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910304451.7A CN110008193B (en) | 2019-04-16 | 2019-04-16 | Data standardization method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110008193A (en) | 2019-07-12 |
CN110008193B (en) | 2021-06-18 |
Family
ID=67172159
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910304451.7A Active CN110008193B (en) | 2019-04-16 | 2019-04-16 | Data standardization method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110008193B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765118A (en) * | 2019-10-21 | 2020-02-07 | 北京明略软件系统有限公司 | Data revision method, revision device and readable storage medium |
CN111078639A (en) * | 2019-12-03 | 2020-04-28 | 望海康信(北京)科技股份公司 | Data standardization method and device and electronic equipment |
CN112084245A (en) * | 2020-09-03 | 2020-12-15 | 深圳力维智联技术有限公司 | Data management method, device and equipment based on micro-service architecture and storage medium |
CN112699160A (en) * | 2021-03-23 | 2021-04-23 | 中国信息通信研究院 | Metadata template upgrading method and device and readable storage medium |
CN113111636A (en) * | 2021-05-17 | 2021-07-13 | 京东科技控股股份有限公司 | Data uniqueness standard identification method and device |
CN113282650A (en) * | 2020-11-24 | 2021-08-20 | 苏州律点信息科技有限公司 | Service data processing method and device based on big data |
WO2021184995A1 (en) * | 2020-03-19 | 2021-09-23 | 华为技术有限公司 | Data processing method and data standard management system |
CN115185923A (en) * | 2022-07-07 | 2022-10-14 | 中国气象局气象探测中心 | Method, system and intelligent terminal for managing meteorological observation metadata |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2793906A1 (en) * | 1999-05-19 | 2000-11-24 | Bull Sa | SYSTEM AND METHOD FOR MANAGING ATTRIBUTES IN AN OBJECT-ORIENTED ENVIRONMENT |
CN106845058A (en) * | 2015-12-04 | 2017-06-13 | 北大医疗信息技术有限公司 | The standardized method of disease data and modular station |
CN107844560A (en) * | 2017-10-30 | 2018-03-27 | 北京锐安科技有限公司 | A kind of method, apparatus of data access, computer equipment and readable storage medium storing program for executing |
CN109408561A (en) * | 2018-10-17 | 2019-03-01 | 杭州骑轻尘信息技术有限公司 | Business Name matching process and device |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765118A (en) * | 2019-10-21 | 2020-02-07 | 北京明略软件系统有限公司 | Data revision method, revision device and readable storage medium |
CN110765118B (en) * | 2019-10-21 | 2022-05-17 | 北京明略软件系统有限公司 | Data revision method, revision device and readable storage medium |
CN111078639A (en) * | 2019-12-03 | 2020-04-28 | 望海康信(北京)科技股份公司 | Data standardization method and device and electronic equipment |
WO2021184995A1 (en) * | 2020-03-19 | 2021-09-23 | 华为技术有限公司 | Data processing method and data standard management system |
CN112084245A (en) * | 2020-09-03 | 2020-12-15 | 深圳力维智联技术有限公司 | Data management method, device and equipment based on micro-service architecture and storage medium |
CN112084245B (en) * | 2020-09-03 | 2024-03-12 | 深圳力维智联技术有限公司 | Data management method, device, equipment and storage medium based on micro-service architecture |
CN113282650A (en) * | 2020-11-24 | 2021-08-20 | 苏州律点信息科技有限公司 | Service data processing method and device based on big data |
CN112699160A (en) * | 2021-03-23 | 2021-04-23 | 中国信息通信研究院 | Metadata template upgrading method and device and readable storage medium |
CN113111636A (en) * | 2021-05-17 | 2021-07-13 | 京东科技控股股份有限公司 | Data uniqueness standard identification method and device |
CN113111636B (en) * | 2021-05-17 | 2024-04-12 | 京东科技控股股份有限公司 | Data uniqueness standard identification method and device |
CN115185923A (en) * | 2022-07-07 | 2022-10-14 | 中国气象局气象探测中心 | Method, system and intelligent terminal for managing meteorological observation metadata |
CN115185923B (en) * | 2022-07-07 | 2023-03-07 | 中国气象局气象探测中心 | Method and system for managing meteorological observation metadata and intelligent terminal |
Also Published As
Publication number | Publication date |
---|---|
CN110008193B (en) | 2021-06-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||