CN110362601A - Mapping method, device, equipment and the storage medium of metadata standard - Google Patents
Mapping method, device, equipment and storage medium for a metadata standard
- Publication number
- CN110362601A, CN201910533687.8A
- Authority
- CN
- China
- Prior art keywords
- data
- target
- synonymous
- target data
- similarity
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24573—Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a mapping method, device, equipment and storage medium for a metadata standard. The method obtains the target data in a mapping instruction and obtains the synonymous standard data corresponding to the target data from a preset standard library according to preset rules. It then calculates the similarity between the synonymous standard data and the target data, and judges from the similarity whether the synonymous standard data contains target standard data corresponding to the target data. If the target standard data exists, a mapping relationship is established between the target data and the target standard data, so that the target data is mapped to recognizable standard data. The invention can look up the corresponding synonymous standard data in the preset standard library based on the synonyms of the target data. This enables incremental updating of standard metadata, removes the need to search for the corresponding standard data manually, and improves lookup efficiency, the accuracy of lookup results and the user experience.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a mapping method, device, equipment and computer-readable storage medium for a metadata standard.
Background technique
Once information system construction reaches a certain stage, data resources become strategic assets, and effective data governance is a necessary condition for forming data assets. Data governance is the process of moving from scattered data to unified master data, from little or no organization and process governance to integrated data governance across the enterprise, and from attempts to cope with chaotic master data to master data that is in good order. The key to successful data governance is metadata management, that is, the reference framework that gives data its context and meaning. In the data governance systems currently on the market, the standard corresponding to a piece of metadata generally has to be looked up manually in the standards system, and a mapping relationship is then established between the metadata and that standard. Existing methods for mapping metadata to standards are therefore both inefficient and inaccurate.
How to overcome the low efficiency and low accuracy of existing methods for mapping metadata to standards is thus a problem in urgent need of a solution.
Summary of the invention
The main purpose of the present invention is to provide a mapping method, device, equipment and computer-readable storage medium for a metadata standard, with the aim of solving the technical problem that existing methods for mapping metadata to standards are both inefficient and inaccurate.
To achieve the above object, the present invention provides a mapping method for a metadata standard, characterized in that the mapping method comprises the following steps:
When a mapping instruction is detected, obtaining the target data in the mapping instruction, and obtaining the synonymous standard data corresponding to the target data from a preset standard library according to preset rules;
Calculating the similarity between the synonymous standard data and the target data, and judging from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
If the target standard data exists, establishing a mapping relationship between the target data and the target standard data, so that the target data is mapped to recognizable standard data.
Optionally, the step of obtaining the target data in the mapping instruction when a mapping instruction is detected, and obtaining the synonymous standard data corresponding to the target data from the preset standard library according to preset rules, includes:
When a mapping instruction is detected, obtaining the target data in the mapping instruction;
Based on WordNet, an English dictionary grounded in cognitive linguistics, obtaining from WordNet the synonym set (Synset), class words (Class word) and sense explanation (Sense explanation) corresponding to the target data, and performing feature extraction on the synonym set, the class word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the extraction formula for the candidate synonyms is as follows:
Feature(SW) = { {Ws}, {Wc}, {We} }
Where {Ws} is the set of all synonyms of sense SW in WordNet; {Wc} is the set of all class words related to sense SW; and {We} is the set of all content words in the explanation of sense SW;
Matching the candidate synonyms against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data.
Optionally, the step of calculating the similarity between the synonymous standard data and the target data, and judging from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, includes:
Calculating, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data;
Judging from the sense similarity and the word similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the sense similarity and the word similarity between the target standard data and the target data exceed the preset threshold.
Optionally, the step of calculating, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data specifically includes:
Calculating the sense similarity between the synonymous standard data and the target data, wherein the sense similarity is calculated according to the following formula:
Where No(SW) is the ordinal of the sense of W; IDF(Wi) is the inverse of the number of documents in which Wi occurs, obtained by training on WordNet when WordNet was built; Ks is the weight of the synonym feature, Kc is the weight of the class feature, and Ke is the weight of the sense explanation; QU is the index set in which Wi occurs, and QV is the index set in which Wj occurs;
Calculating the word similarity between the synonymous standard data and the target data, wherein the word similarity is calculated according to the following formula:
Where |SW1| is the number of senses of W1, and |SW2| is the number of senses of W2.
Optionally, the step of obtaining the target data in the mapping instruction when a mapping instruction is detected, and obtaining the synonymous standard data corresponding to the target data from the preset standard library according to preset rules, specifically includes:
When metadata that does not conform to the preset standard is detected, judging according to the preset rules whether the standard library contains synonymous standard data corresponding to the target data;
If the standard library contains the synonymous standard data, obtaining the synonymous standard data corresponding to the target data.
Optionally, after the step of judging according to the preset rules, when metadata that does not conform to the preset standard is detected, whether the standard library contains synonymous standard data corresponding to the target data, the method further includes:
If the target standard data does not exist, obtaining the usage frequency of the target data within a preset time period, and, when the usage frequency exceeds a preset threshold, performing word segmentation and data analysis on the metadata by natural language processing (NLP), and judging by the NLP whether each unit of data obtained by segmentation conforms to the naming rules of natural language;
If the target data conforms to the naming rules of natural language, generating corresponding standard data update recommendation information from the target data, and, upon receiving a confirmation instruction fed back by the user in response to the update recommendation information, adding the target data to the preset standard library.
Optionally, after the step of calculating the similarity between the synonymous standard data and the target data, and judging from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, the method further includes:
If the target standard data does not exist, obtaining the synonymous standard data with the highest similarity, and generating corresponding mapping recommendation information from it, so as to prompt the user to decide whether to establish a mapping relationship between the synonymous standard data with the highest similarity and the target data.
In addition, to achieve the above object, the present invention also provides a mapping device for a metadata standard, the mapping device comprising:
A data search module, configured to obtain the target data in a mapping instruction when the mapping instruction is detected, and to obtain the synonymous standard data corresponding to the target data from a preset standard library according to preset rules;
A data judgment module, configured to calculate the similarity between the synonymous standard data and the target data, and to judge from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
A data mapping module, configured to establish, if the target standard data exists, a mapping relationship between the target data and the target standard data, so that the target data is mapped to recognizable standard data.
In addition, to achieve the above object, the present invention also provides mapping equipment for a metadata standard. The mapping equipment includes a processor, a memory, and a mapping program for the metadata standard that is stored in the memory and executable by the processor, wherein the mapping program, when executed by the processor, implements the steps of the mapping method for a metadata standard described above.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium on which a mapping program for a metadata standard is stored, wherein the mapping program, when executed by a processor, implements the steps of the mapping method for a metadata standard described above.
The present invention provides a mapping method for a metadata standard: when a mapping instruction is detected, the target data in the mapping instruction is obtained, and the synonymous standard data corresponding to the target data is obtained from a preset standard library according to preset rules; the similarity between the synonymous standard data and the target data is calculated, and it is judged from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold; if the target standard data exists, a mapping relationship is established between the target data and the target standard data, so that the target data is mapped to recognizable standard data. In this way, the invention can look up the corresponding synonymous standard data in the preset standard library based on the synonyms of the target data, without any manual search for the corresponding standard data. This improves lookup efficiency, the accuracy of lookup results and the user experience, and solves the technical problem that existing standard metadata formulated in advance cannot meet user needs.
Detailed description of the invention
Fig. 1 is a schematic diagram of the hardware structure of the mapping equipment for a metadata standard involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the mapping method for a metadata standard of the present invention;
Fig. 3 is a flow diagram of the second embodiment of the mapping method for a metadata standard of the present invention;
Fig. 4 is a flow diagram of the third embodiment of the mapping method for a metadata standard of the present invention;
Fig. 5 is a functional block diagram of the first embodiment of the mapping device for a metadata standard of the present invention.
The realization of the object, the functional characteristics and the advantages of the present invention will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Specific embodiment
It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The mapping method for a metadata standard in the embodiments of the present invention is mainly applied to mapping equipment for a metadata standard, which may be a PC, a portable computer, a mobile terminal or other equipment with display and processing functions.
Referring to Fig. 1, Fig. 1 is a schematic diagram of the hardware structure of the mapping equipment for a metadata standard involved in the embodiments of the present invention. In the embodiments of the present invention, the mapping equipment may include a processor 1001 (for example a CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 provides connection and communication between these components; the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface); the memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory) such as a disk memory, and may optionally be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the hardware structure shown in Fig. 1 does not limit the mapping equipment for a metadata standard, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
With continued reference to Fig. 1, the memory 1005 in Fig. 1, as a computer-readable storage medium, may include an operating system, a network communication module and the mapping program for the metadata standard.
In Fig. 1, the network communication module is mainly used to connect to a server and exchange data with it, while the processor 1001 can call the mapping program for the metadata standard stored in the memory 1005 and execute the mapping method for a metadata standard provided by the embodiments of the present invention.
The embodiments of the present invention provide a mapping method for a metadata standard.
Referring to Fig. 2, Fig. 2 is a flow diagram of the first embodiment of the mapping method for a metadata standard of the present invention.
In this embodiment, the mapping method for a metadata standard includes the following steps:
Step S10: when a mapping instruction is detected, obtain the target data in the mapping instruction, and obtain the synonymous standard data corresponding to the target data from a preset standard library according to preset rules;
In this embodiment, some non-standard metadata in existing systems that are already in production cannot be changed. A mapping relationship therefore has to be established between this non-standard metadata and the standard data, so that the metadata can be recognized the next time the system data is audited. To address the technical problem that the corresponding standard data currently has to be looked up manually in the standards system, the present invention provides a synonym-based method for looking up the corresponding standard data: it searches the standard library for data synonymous with the target data to be mapped, and so finds the synonymous standard data corresponding to the target data quickly and efficiently. When the target data is English, the preset rules may be to obtain the set of synonymous data corresponding to the target data from WordNet, an English dictionary grounded in cognitive linguistics, and to match that set against the standard library to obtain the synonymous standard data corresponding to the target data. In a specific embodiment, when the target data is Chinese, the set of synonymous data corresponding to the target data is obtained from a Chinese dictionary of near-synonyms or synonyms, and the synonymous standard data corresponding to the target data is obtained from it. In a specific embodiment, the synonymous standard data is obtained as follows: when a mapping instruction is detected, the target data in the mapping instruction is obtained; based on WordNet, the synonym set (Synset), class words (Class word) and sense explanation (Sense explanation) corresponding to the target data are obtained from WordNet, and feature extraction is performed on the synonym set, the class word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the extraction formula for the candidate synonyms is as follows:
Feature(SW) = { {Ws}, {Wc}, {We} }
Where {Ws} is the set of all synonyms of sense SW in WordNet; {Wc} is the set of all class words related to sense SW; and {We} is the set of all content words in the explanation of sense SW. The candidate synonyms are matched against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data. This embodiment mainly uses the interface functions provided by WordNet to draw candidate synonyms from the three sets (synonym set, class words and sense explanation), performs feature extraction on the candidate synonyms, and then determines the synonymous standard data corresponding to the target data by comparing the candidate synonyms with the standard metadata in the preset standard library.
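The lookup in step S10 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Sense` class, the `TOY_LEXICON` dictionary and the function names are hypothetical, and the lexicon is a hand-written stand-in for a real WordNet query. Only the synonym set of "trade" and the standard terms "transaction" and "business" come from the text; the other words are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One sense SW of a word, holding the three feature sets from the text."""
    synonyms: frozenset      # {Ws}: synonyms of the sense
    class_words: frozenset   # {Wc}: related class words
    gloss_words: frozenset   # {We}: content words of the sense explanation


# Hypothetical stand-in for a WordNet lookup (not real WordNet data).
TOY_LEXICON = {
    "trade": [
        Sense(
            synonyms=frozenset({"trade", "transaction", "business", "deal"}),
            class_words=frozenset({"commerce"}),
            gloss_words=frozenset({"buying", "selling", "goods"}),
        )
    ]
}


def features(sense):
    """Feature(SW) = {{Ws}, {Wc}, {We}} as a dictionary of sets."""
    return {
        "Ws": set(sense.synonyms),
        "Wc": set(sense.class_words),
        "We": set(sense.gloss_words),
    }


def candidate_synonyms(word, lexicon):
    """Union of the three feature sets over all senses, minus the word itself."""
    candidates = set()
    for sense in lexicon.get(word, []):
        f = features(sense)
        candidates |= f["Ws"] | f["Wc"] | f["We"]
    candidates.discard(word)
    return candidates


def synonymous_standard_data(word, lexicon, standard_library):
    """Candidate synonyms that also appear in the preset standard library."""
    return sorted(candidate_synonyms(word, lexicon) & set(standard_library))


# "customer" is an invented extra entry to show that unrelated terms drop out.
standard_library = {"transaction", "business", "customer"}
result = synonymous_standard_data("trade", TOY_LEXICON, standard_library)
print(result)
```

Using a set intersection for the matching step is a simplification; the text does not specify how candidates are compared with the standard metadata.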
Step S20: calculate the similarity between the synonymous standard data and the target data, and judge from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
In this embodiment, the similarity includes a sense similarity and a word similarity. The similarity between two senses (Sense) can be obtained by computing their distance in three different sense feature spaces: the smaller the distance, the greater the similarity. The similarity between two words in WordNet can then be calculated from the sense similarity. Once the similarity between the synonymous standard data and the target data has been calculated, it is judged whether the synonymous standard data contains target standard data whose similarity to the target data exceeds the preset threshold.
Step S30: if the target standard data exists, establish a mapping relationship between the target data and the target standard data, so that the target data is mapped to recognizable standard data.
In this embodiment, when it is determined that the synonymous standard data contains the target standard data, a mapping relationship is established between the target standard data and the target data, so that in subsequent audits of the system data the target data can be recognized as the standard data it maps to. For example, WordNet finds the synonym set of trade: trade, transaction, business, deal. These are sorted by similarity and recommended. Of these, transaction and business are standard terms in the system and are highlighted, and transaction can be selected by similarity as the mapping standard for trade. This removes the need to pick out the synonym manually from thousands of standards.
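The threshold check of step S20 and the mapping of step S30 can be sketched as below. The function names, the similarity values and the threshold are illustrative assumptions. The text does not say what happens when several standard terms exceed the threshold, so picking the highest-scoring one is also an assumption of this sketch.

```python
def select_target_standard(scores, threshold):
    """Return the best-scoring standard term above the threshold, or None."""
    above = {term: s for term, s in scores.items() if s > threshold}
    if not above:
        return None
    return max(above, key=above.get)


def map_target(target, scores, threshold, mappings):
    """Record target -> standard in `mappings` when a target standard exists."""
    standard = select_target_standard(scores, threshold)
    if standard is not None:
        mappings[target] = standard
    return standard


mappings = {}
# Illustrative similarity values only; they are not taken from the source.
scores = {"transaction": 0.82, "business": 0.64}
chosen = map_target("trade", scores, 0.7, mappings)
print(chosen, mappings)
```

Keeping the mapping in a plain dictionary mirrors the idea that the non-standard name stays unchanged in the production system and is only translated at audit time.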
This embodiment provides a mapping method for a metadata standard: when a mapping instruction is detected, the target data in the mapping instruction is obtained, and the synonymous standard data corresponding to the target data is obtained from a preset standard library according to preset rules; the similarity between the synonymous standard data and the target data is calculated, and it is judged from the similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the similarity between the target standard data and the target data exceeds a preset threshold; if the target standard data exists, a mapping relationship is established between the target data and the target standard data, so that the target data is mapped to recognizable standard data. In this way, the invention can look up the corresponding synonymous standard data in the preset standard library based on the synonyms of the target data, without any manual search for the corresponding standard data. This improves lookup efficiency, the accuracy of lookup results and the user experience, and solves the technical problem that existing standard metadata formulated in advance cannot meet user needs.
Referring to Fig. 3, Fig. 3 is a flow diagram of the second embodiment of the mapping method for a metadata standard of the present invention.
Based on the embodiment shown in Fig. 2 above, in this embodiment step S20 includes:
Step S21: calculate, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data;
In this embodiment, the corresponding candidate synonyms are extracted based on WordNet's classification of lexical semantics, and the synonymous standard data corresponding to the target data is determined from the standard metadata in the preset standard library. A vector-space-based method is then used to calculate the sense similarity and the word similarity between the target data and each item of synonymous standard data. In a specific embodiment, the sense similarity between the synonymous standard data and the target data is calculated according to the following formula:
Where No(SW) is the ordinal of the sense of W, for example the first sense = 1, the second sense = 2, and so on; IDF(Wi) is the inverse of the number of documents in which Wi occurs, obtained by training on WordNet when WordNet was built; Ks is the weight of the synonym feature, for example 1.5; Kc is the weight of the class feature, for example 1; Ke is the weight of the sense explanation, for example 0.5; QU is the index set in which Wi occurs, and QV is the index set in which Wj occurs;
The word similarity between the synonymous standard data and the target data is calculated according to the following formula:
Where |SW1| is the number of senses of W1, and |SW2| is the number of senses of W2.
Step S22: judge from the sense similarity and the word similarity whether the synonymous standard data contains target standard data corresponding to the target data, wherein the sense similarity and the word similarity between the target standard data and the target data exceed the preset threshold.
In this embodiment, different preset thresholds may be set for the sense similarity and the word similarity, or the same preset threshold may be used for both. Synonymous standard data whose similarity to the target data exceeds the preset threshold is taken as the target standard data, and whether the synonymous standard data contains target standard data is judged from the similarity and the preset threshold.
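Since the two formulas are not reproduced in this text, the sketch below substitutes a plain weighted cosine similarity to show the general shape of a vector-space comparison. It is not the patented formula: the sense ordinal No(SW) and the IDF term are left out, and averaging over all sense pairs for the word similarity is an assumption suggested only by the mention of |SW1| and |SW2|. The weights 1.5, 1 and 0.5 are the example values given above; the sample senses are invented.

```python
import math

KS, KC, KE = 1.5, 1.0, 0.5  # example weights for synonym, class and explanation


def sense_vector(sense):
    """Weighted bag-of-words vector over the three feature sets of a sense."""
    vec = {}
    for weight, key in ((KS, "Ws"), (KC, "Wc"), (KE, "We")):
        for word in sense[key]:
            vec[word] = vec.get(word, 0.0) + weight
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sense_similarity(s1, s2):
    """Stand-in for the sense similarity (assumption: weighted cosine)."""
    return cosine(sense_vector(s1), sense_vector(s2))


def word_similarity(senses1, senses2):
    """Stand-in for the word similarity (assumption: mean over sense pairs)."""
    if not senses1 or not senses2:
        return 0.0
    total = sum(sense_similarity(a, b) for a in senses1 for b in senses2)
    return total / (len(senses1) * len(senses2))


# Invented sample senses for illustration.
trade = {"Ws": {"trade", "deal"}, "Wc": {"commerce"}, "We": {"goods"}}
transaction = {"Ws": {"transaction", "deal"}, "Wc": {"commerce"}, "We": {"goods"}}
print(sense_similarity(trade, transaction))
print(word_similarity([trade], [transaction]))
```

With this stand-in, step S22 reduces to comparing the two returned values against their respective thresholds.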
Referring to Fig. 4, Fig. 4 is a flow diagram of the third embodiment of the mapping method for a metadata standard of the present invention.
Based on the embodiment shown in Fig. 2 above, in this embodiment the method further includes, after step S20:
Step S40: if no target standard data exists, obtain the usage frequency of the target data within a preset time period, and, when the usage frequency exceeds a preset threshold, perform word segmentation and data analysis on the metadata by natural language processing (NLP), and judge by the NLP whether each unit of data obtained by segmentation conforms to the naming rules of natural language;
In this embodiment, if no target standard data exists, the target data is not a synonym of any standard metadata in the preset standard library. The usage frequency of the target data within a preset time period is then obtained, that is, the number of times the target data occurs within the specified period is counted, and it is judged whether the usage frequency exceeds the preset threshold. The preset time period may be one week, one month or three months counted from the current time. The preset threshold can be set according to the actual situation; target data whose usage frequency exceeds the preset threshold is metadata that users use frequently. In a specific embodiment, the usage frequency can also be obtained by counting the occurrences of the target data within the preset time period.
When the usage frequency is determined to exceed the preset threshold, NLP (Natural Language Processing) analysis is performed on the target data. When the target data is a phrase, it is segmented into words and each resulting unit is judged separately, that is, it is judged whether each unit of data obtained by segmenting the target data conforms to the naming rules of natural language. The naming rule may be to judge whether each unit is a Chinese word, an English word or a valid term in another language. In a specific embodiment, a dictionary of the corresponding language can be used to judge whether each unit is a valid term.
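Step S40 can be sketched as a frequency check followed by a dictionary check on the segmented units. Everything here is an assumption made for illustration: the usage log is taken to be already restricted to the preset time period, the regular-expression splitter is a crude substitute for NLP word segmentation, and the small word list stands in for a language dictionary.

```python
import re
from collections import Counter


def usage_frequency(target, usage_log):
    """Count occurrences of the target in a log limited to the preset period."""
    return Counter(usage_log)[target]


def segment(term):
    """Naive segmentation on spaces, underscores, hyphens and camelCase."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", term)
    return [unit.lower() for unit in re.split(r"[\s_\-]+", spaced) if unit]


def meets_naming_rule(term, dictionary):
    """True when every segmented unit is a known word of the language."""
    units = segment(term)
    return bool(units) and all(unit in dictionary for unit in units)


def is_library_candidate(term, usage_log, min_count, dictionary):
    """Frequently used and well-formed terms are candidates for the library."""
    if usage_frequency(term, usage_log) <= min_count:
        return False
    return meets_naming_rule(term, dictionary)


# Invented sample data.
dictionary = {"order", "date", "customer", "name"}
usage_log = ["orderDate", "orderDate", "cust_nm", "orderDate", "cust_nm", "cust_nm"]
print(is_library_candidate("orderDate", usage_log, 2, dictionary))
print(is_library_candidate("cust_nm", usage_log, 2, dictionary))
```

A production system would likely replace `segment` with a real tokenizer for the language in question, in particular for Chinese, where whitespace and case give no word boundaries.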
Step S50: if the target data conforms to the naming rules of natural language, generate corresponding standard data update recommendation information from the target data, and, upon receiving a confirmation instruction fed back by the user in response to the update recommendation information, add the target data to the preset standard library.
In this embodiment, when the target data is determined to conform to the corresponding naming rules, it can be recommended to the administrator, who decides whether to add it to the preset standard library and store it as standard metadata. The recommendation proceeds as follows: corresponding recommendation information is generated from the target data, for example: whether to store "IC" (short for integrated circuit in the electrical appliance service industry) as standard metadata. Confirm and cancel instructions are generated at the same time, so that the administrator can trigger the appropriate one according to the review result. If a confirmation instruction is received, that is, the administrator approves, the target data is saved to the preset standard library, in other words stored as standard metadata, for subsequent use.
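The confirm-or-cancel flow of step S50 can be sketched as below. The function names and message wording are hypothetical, and the administrator's decision is passed in as a boolean instead of coming from a real user interface.

```python
def build_update_recommendation(target):
    """Text shown to the administrator for a proposed new standard term."""
    return f'Store "{target}" as standard metadata?'


def apply_decision(target, confirmed, standard_library):
    """Add the target to the library only when the administrator confirms."""
    if confirmed:
        standard_library.add(target)
    return target in standard_library


library = set()
print(build_update_recommendation("IC"))
rejected = apply_decision("IC", False, library)  # cancel: library unchanged
accepted = apply_decision("IC", True, library)   # confirm: term is stored
print(rejected, accepted, library)
```

Keeping a person in the loop means frequently used but undesirable names never enter the standard library automatically.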
Further, after step S20, the method also includes:
If the target standard data does not exist, obtaining the synonymous standard data with the highest similarity, and generating corresponding mapping recommendation information from it, so as to prompt the user to decide whether to establish a mapping relationship between the synonymous standard data with the highest similarity and the target data.
In this embodiment, if there is no target standard data whose similarity exceeds the preset threshold, the synonymous standard data with the highest similarity to the target data is obtained, and the target data is recommended as mapping data for this most closely related synonymous standard data. That is, mapping recommendation information is generated from the target data and the synonymous standard data with the highest similarity, for example: "The target data is highly similar to standard data X; establish a mapping relationship between the target data and standard data X?" This prompts the administrator of the standard metadata in the preset standard library to decide whether to establish a mapping relationship between the synonymous standard data with the highest similarity and the target data, which makes the target data easier to recognize.
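The fallback described above can be sketched as follows. The function name, the message format and the similarity values are assumptions; the sketch only illustrates picking the closest synonymous standard term when none exceeds the threshold.

```python
def recommend_closest(target, scores, threshold):
    """Return (standard_term, message) when no score exceeds the threshold.

    Returns None when there are no candidates, or when some candidate
    already exceeds the threshold and can be mapped directly.
    """
    if not scores or max(scores.values()) > threshold:
        return None
    best = max(scores, key=scores.get)
    message = (
        f'"{target}" is most similar to standard data "{best}" '
        f"(similarity {scores[best]:.2f}). Establish a mapping relationship?"
    )
    return best, message


# Illustrative similarity values only.
suggestion = recommend_closest("trade", {"transaction": 0.55, "business": 0.40}, 0.7)
print(suggestion)
```

The returned message is only a prompt; the mapping itself would be created after the administrator accepts it.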
In addition, an embodiment of the present invention also provides a mapping device for a metadata standard.
Referring to Fig. 5, Fig. 5 is a functional block diagram of a first embodiment of the mapping device for a metadata standard of the present invention.
In this embodiment, the mapping device for the metadata standard includes:
A data search module 10, configured to, when a mapping instruction is detected, obtain the target data in the mapping instruction, and obtain, according to preset rules, the synonymous standard data corresponding to the target data from a preset standard library;
A data judgment module 20, configured to calculate the similarity between the synonymous standard data and the target data, and to judge, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
A data mapping module 30, configured to, if the target standard data exists, establish a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
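As a rough illustration of how the three modules fit together, the following Python sketch chains a search step, a judgment step and a mapping step. Everything in it is hypothetical: the in-memory `STANDARD_LIBRARY` and `SYNONYMS` tables, the function names, and in particular the use of `difflib.SequenceMatcher` as a placeholder similarity, which is not the sense and word similarity used by the invention.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

# Hypothetical stand-ins: a tiny standard library and a synonym table.
STANDARD_LIBRARY = {"customer name", "order date"}
SYNONYMS = {"client name": ["customer name", "buyer name"]}


def search_synonymous_standard_data(target: str) -> List[str]:
    """Data search module: synonyms of the target that are standard metadata."""
    return [s for s in SYNONYMS.get(target.lower(), []) if s in STANDARD_LIBRARY]


def similarity(a: str, b: str) -> float:
    """Placeholder string similarity (an assumption, see the lead-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def judge_target_standard(target: str, candidates: List[str], threshold: float) -> Optional[str]:
    """Data judgment module: best candidate whose similarity exceeds the threshold."""
    scored = [(similarity(target, c), c) for c in candidates]
    above = [pair for pair in scored if pair[0] > threshold]
    return max(above)[1] if above else None


def map_target(target: str, mappings: Dict[str, str], threshold: float) -> Optional[str]:
    """Data mapping module: record the mapping if target standard data exists."""
    candidates = search_synonymous_standard_data(target)
    standard = judge_target_standard(target, candidates, threshold)
    if standard is not None:
        mappings[target] = standard
    return standard
```

Keeping the similarity function separate makes it straightforward to swap the placeholder for a semantic measure.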
Further, the data search module 10 is also configured to:
When a mapping instruction is detected, obtain the target data in the mapping instruction;
Based on WordNet, an English dictionary grounded in cognitive linguistics, obtain from WordNet the synonym set (Synset), class words (Class word) and sense explanation (Sense explanation) corresponding to the target data, and perform feature extraction on the synonym set, the class word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the extraction formula for the candidate synonyms is as follows:
Feature(SW) = {{Ws}, {Wc}, {We}}
Wherein {Ws} is the set of all synonyms of Sense W in WordNet; {Wc} is the set of all related classes of Sense W; and {We} is the set of all content words (notional words) in the explanation of Sense W;
The candidate synonyms are matched against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data.
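The feature extraction and matching steps can be illustrated with a minimal Python sketch. It deliberately avoids a real WordNet interface: the `LEXICON` dictionary, its single entry, the `STOPWORDS` list and both function names are hypothetical stand-ins chosen for illustration only.

```python
from typing import Dict, Set

# Hypothetical miniature lexicon standing in for WordNet; the entry is illustrative.
LEXICON = {
    "car": {
        "synonyms": ["automobile", "motorcar"],
        "classes": ["motor vehicle"],
        "explanation": "a motor vehicle with four wheels",
    },
}
# Function words dropped so that only content words remain in {We}.
STOPWORDS = {"a", "an", "the", "with", "of", "and", "or", "in"}


def extract_features(word: str) -> Dict[str, Set[str]]:
    """Build Feature(SW) = {{Ws}, {Wc}, {We}} for one sense of `word`."""
    entry = LEXICON.get(word.lower())
    if entry is None:
        return {"Ws": set(), "Wc": set(), "We": set()}
    content_words = {w for w in entry["explanation"].lower().split() if w not in STOPWORDS}
    return {"Ws": set(entry["synonyms"]), "Wc": set(entry["classes"]), "We": content_words}


def match_standard(word: str, standard_library: Set[str]) -> Set[str]:
    """Candidate synonyms that also appear in the preset standard library."""
    features = extract_features(word)
    candidates = features["Ws"] | features["Wc"] | features["We"]
    return candidates & standard_library
```

For example, `match_standard("car", {"automobile", "truck"})` yields the single candidate `automobile`.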
Further, the data judgment module 20 specifically includes:
A similarity calculation unit, configured to calculate, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data;
A target data judging unit, configured to judge, according to the sense similarity and the word similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the sense similarity and the word similarity between the target standard data and the target data exceed a preset threshold.
Further, the similarity calculation unit is also configured to:
Calculate the sense similarity between the synonymous standard data and the target data, wherein the sense similarity is calculated according to the following formula:
(formula not reproduced in this text)
Wherein No(SW) is the ordinal of the sense of W; IDF(Wi) is the reciprocal of the number of documents in which a given Wi appears, obtained by training on WordNet; Ks is the weight of the synonym feature, Kc is the weight of the class feature, and Ke is the weight of the sense explanation; QU is the index set of the occurrences of Wi, and QV is the index set of the occurrences of Wj;
Calculate the word similarity between the synonymous standard data and the target data, wherein the word similarity is calculated according to the following formula:
(formula not reproduced in this text)
Wherein |SW1| is the number of senses of W1, and |SW2| is the number of senses of W2.
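Because the two formulas are not available in this text, the following Python sketch does not attempt to reproduce them. It only illustrates a generic vector space approach under stated assumptions: each sense is turned into a weighted bag of feature words using the weights Ks, Kc and Ke and an IDF table, sense similarity is taken as the cosine of two such vectors, and word similarity is taken as the average over all sense pairs, normalised by the two sense counts. The cosine measure, the averaging, the default weights and all function names are assumptions.

```python
import math
from typing import Dict, List

Features = Dict[str, List[str]]  # keys "Ws", "Wc", "We", as in Feature(SW)


def to_vector(features: Features, idf: Dict[str, float],
              ks: float, kc: float, ke: float) -> Dict[str, float]:
    """Weight every feature word by its feature-type weight times its IDF."""
    weights = {"Ws": ks, "Wc": kc, "We": ke}
    vec: Dict[str, float] = {}
    for kind, words in features.items():
        for w in words:
            vec[w] = vec.get(w, 0.0) + weights[kind] * idf.get(w, 1.0)
    return vec


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def sense_similarity(f1: Features, f2: Features, idf: Dict[str, float],
                     ks: float = 1.0, kc: float = 0.5, ke: float = 0.25) -> float:
    """Assumed sense similarity: cosine of the two weighted feature vectors."""
    return cosine(to_vector(f1, idf, ks, kc, ke), to_vector(f2, idf, ks, kc, ke))


def word_similarity(senses1: List[Features], senses2: List[Features],
                    idf: Dict[str, float]) -> float:
    """Assumed aggregation: mean sense similarity over all |SW1| x |SW2| pairs."""
    if not senses1 or not senses2:
        return 0.0
    total = sum(sense_similarity(a, b, idf) for a in senses1 for b in senses2)
    return total / (len(senses1) * len(senses2))
```

Both scores fall in the range 0 to 1, so either can be compared directly against a preset threshold.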
Further, the data search module 10 further includes:
A data naming judgment module, configured to, if the target standard data does not exist, obtain the frequency of use of the target data within a preset time period, and, when the frequency of use exceeds a preset threshold, perform word segmentation and data analysis on the metadata using natural language processing (NLP), and judge according to the NLP whether each unit of data after segmentation conforms to the naming rules of natural language;
A standard data adding module, configured to, if the target data conforms to the naming rules of natural language, generate corresponding standard data update recommendation information from the target data, and, upon receiving a confirmation instruction fed back by the user in response to the update recommendation information, add the target data to the preset standard library.
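The frequency check and naming check performed by these two modules can be sketched as follows. This is a simplified illustration: the regular-expression tokenizer stands in for real NLP word segmentation, the vocabulary lookup stands in for the natural-language naming rule, and the function names, parameters and prompt text are all hypothetical.

```python
import re
from typing import List, Optional, Set


def tokenize(name: str) -> List[str]:
    """Rough stand-in for word segmentation: split on camelCase, '_', '-' and spaces."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def follows_naming_rule(name: str, vocabulary: Set[str]) -> bool:
    """Assumed rule: every segmented unit must be a known natural-language word."""
    tokens = tokenize(name)
    return bool(tokens) and all(t in vocabulary for t in tokens)


def propose_standard_update(name: str, use_count: int, min_count: int,
                            vocabulary: Set[str]) -> Optional[str]:
    """Return update recommendation text, or None if the data does not qualify."""
    if use_count <= min_count:
        return None  # not used often enough in the preset time period
    if not follows_naming_rule(name, vocabulary):
        return None  # some unit does not conform to the naming rule
    return f"Store '{name}' as standard metadata?"
```

The target data would be added to the standard library only after the user confirms the returned recommendation.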
Further, the mapping device for the metadata standard further includes:
A mapping recommendation module, configured to, if the target standard data does not exist, obtain the synonymous standard data with the greatest similarity, and generate corresponding mapping recommendation information from it, to prompt the user to decide whether to establish a mapping relationship between that synonymous standard data and the target data.
The modules of the above mapping device for a metadata standard correspond to the steps in the embodiments of the above mapping method for a metadata standard; their functions and implementation processes are not repeated here.
In addition, an embodiment of the present invention also provides a computer-readable storage medium.
A mapping program for a metadata standard is stored on the computer-readable storage medium of the present invention, wherein, when the mapping program is executed by a processor, the steps of the mapping method for a metadata standard described above are implemented.
For the method implemented when the mapping program for a metadata standard is executed, reference may be made to the embodiments of the mapping method for a metadata standard of the present invention; details are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A mapping method for a metadata standard, characterized in that the mapping method for the metadata standard comprises the following steps:
When a mapping instruction is detected, obtaining the target data in the mapping instruction, and obtaining, according to preset rules, the synonymous standard data corresponding to the target data from a preset standard library;
Calculating the similarity between the synonymous standard data and the target data, and judging, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
If the target standard data exists, establishing a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
2. The mapping method for a metadata standard according to claim 1, characterized in that the step of, when a mapping instruction is detected, obtaining the target data in the mapping instruction and obtaining, according to preset rules, the synonymous standard data corresponding to the target data from a preset standard library comprises:
When a mapping instruction is detected, obtaining the target data in the mapping instruction;
Based on WordNet, an English dictionary grounded in cognitive linguistics, obtaining from WordNet the synonym set (Synset), class words (Class word) and sense explanation (Sense explanation) corresponding to the target data, and performing feature extraction on the synonym set, the class word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the extraction formula for the candidate synonyms is as follows:
Feature(SW) = {{Ws}, {Wc}, {We}}
Wherein {Ws} is the set of all synonyms of Sense W in WordNet; {Wc} is the set of all related classes of Sense W; and {We} is the set of all content words (notional words) in the explanation of Sense W;
Matching the candidate synonyms against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data.
3. The mapping method for a metadata standard according to claim 2, characterized in that the step of calculating the similarity between the synonymous standard data and the target data, and judging, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, comprises:
Calculating, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data;
Judging, according to the sense similarity and the word similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the sense similarity and the word similarity between the target standard data and the target data exceed a preset threshold.
4. The mapping method for a metadata standard according to claim 3, characterized in that the step of calculating, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data specifically comprises:
Calculating the sense similarity between the synonymous standard data and the target data, wherein the sense similarity is calculated according to the following formula:
(formula not reproduced in this text)
Wherein No(SW) is the ordinal of the sense of W; IDF(Wi) is the reciprocal of the number of documents in which a given Wi appears, obtained by training on WordNet; Ks is the weight of the synonym feature, Kc is the weight of the class feature, and Ke is the weight of the sense explanation; QU is the index set of the occurrences of Wi, and QV is the index set of the occurrences of Wj;
Calculating the word similarity between the synonymous standard data and the target data, wherein the word similarity is calculated according to the following formula:
(formula not reproduced in this text)
Wherein |SW1| is the number of senses of W1, and |SW2| is the number of senses of W2.
5. The mapping method for a metadata standard according to claim 1, characterized in that the step of, when a mapping instruction is detected, obtaining the target data in the mapping instruction and obtaining, according to preset rules, the synonymous standard data corresponding to the target data from a preset standard library specifically comprises:
When metadata that does not conform to the preset standard is detected, judging, according to preset rules, whether synonymous standard data corresponding to the target data exists in the standard library;
If the synonymous standard data exists in the standard library, obtaining the synonymous standard data corresponding to the target data.
6. The mapping method for a metadata standard according to claim 5, characterized in that, after the step of, when metadata that does not conform to the preset standard is detected, judging, according to preset rules, whether synonymous standard data corresponding to the target data exists in the standard library, the method further comprises:
If the target standard data does not exist, obtaining the frequency of use of the target data within a preset time period, and, when the frequency of use exceeds a preset threshold, performing word segmentation and data analysis on the metadata using natural language processing (NLP), and judging according to the NLP whether each unit of data after segmentation conforms to the naming rules of natural language;
If the target data conforms to the naming rules of natural language, generating corresponding standard data update recommendation information from the target data, and, upon receiving a confirmation instruction fed back by the user in response to the update recommendation information, adding the target data to the preset standard library.
7. The mapping method for a metadata standard according to any one of claims 1 to 6, characterized in that, after the step of calculating the similarity between the synonymous standard data and the target data, and judging, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, the method further comprises:
If the target standard data does not exist, obtaining the synonymous standard data with the greatest similarity, and generating corresponding mapping recommendation information from it, to prompt the user to decide whether to establish a mapping relationship between that synonymous standard data and the target data.
8. A mapping device for a metadata standard, characterized in that the mapping device for the metadata standard comprises:
A data search module, configured to, when a mapping instruction is detected, obtain the target data in the mapping instruction, and obtain, according to preset rules, the synonymous standard data corresponding to the target data from a preset standard library;
A data judgment module, configured to calculate the similarity between the synonymous standard data and the target data, and to judge, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
A data mapping module, configured to, if the target standard data exists, establish a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
9. Mapping equipment for a metadata standard, characterized in that the mapping equipment for the metadata standard comprises a processor, a memory, and a mapping program for the metadata standard that is stored in the memory and executable by the processor, wherein, when the mapping program for the metadata standard is executed by the processor, the steps of the mapping method for a metadata standard according to any one of claims 1 to 7 are implemented.
10. A computer-readable storage medium, characterized in that a mapping program for a metadata standard is stored on the computer-readable storage medium, wherein, when the mapping program for the metadata standard is executed by a processor, the steps of the mapping method for a metadata standard according to any one of claims 1 to 7 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533687.8A CN110362601B (en) | 2019-06-19 | 2019-06-19 | Metadata standard mapping method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110362601A (en) | 2019-10-22 |
CN110362601B (en) | 2020-12-18 |
Family
ID=68216679
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533687.8A Active CN110362601B (en) | 2019-06-19 | 2019-06-19 | Metadata standard mapping method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110362601B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180210905A1 (en) * | 2017-01-25 | 2018-07-26 | International Business Machines Corporation | Data mapper |
CN109740143A (en) * | 2018-11-28 | 2019-05-10 | 平安科技(深圳)有限公司 | Based on the sentence of machine learning apart from mapping method, device and computer equipment |
CN109635098A (en) * | 2018-12-20 | 2019-04-16 | 东软集团股份有限公司 | A kind of intelligent answer method, apparatus, equipment and medium |
CN109815491A (en) * | 2019-01-08 | 2019-05-28 | 平安科技(深圳)有限公司 | Answer methods of marking, device, computer equipment and storage medium |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110795524A (en) * | 2019-10-31 | 2020-02-14 | 北京东软望海科技有限公司 | Main data mapping processing method and device, computer equipment and storage medium |
CN110795524B (en) * | 2019-10-31 | 2022-07-05 | 望海康信(北京)科技股份公司 | Main data mapping processing method and device, computer equipment and storage medium |
CN112052645A (en) * | 2020-09-15 | 2020-12-08 | 平安医疗健康管理股份有限公司 | Data standardization method, device, medium and equipment |
CN112434200A (en) * | 2020-11-30 | 2021-03-02 | 北京思特奇信息技术股份有限公司 | Data display method and system and electronic equipment |
CN112668314A (en) * | 2020-12-30 | 2021-04-16 | 深圳市华傲数据技术有限公司 | Data standard conformance detection method, device, system and storage medium |
CN113642327A (en) * | 2021-10-14 | 2021-11-12 | 中国光大银行股份有限公司 | Method and device for constructing standard knowledge base |
CN117454892A (en) * | 2023-12-20 | 2024-01-26 | 深圳市智慧城市科技发展集团有限公司 | Metadata management method, device, terminal equipment and storage medium |
CN117454892B (en) * | 2023-12-20 | 2024-04-02 | 深圳市智慧城市科技发展集团有限公司 | Metadata management method, device, terminal equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110362601B (en) | 2020-12-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||