CN110362601A - Mapping method, device, equipment and storage medium for metadata standards

Publication number
CN110362601A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910533687.8A
Other languages
Chinese (zh)
Other versions
CN110362601B (en)
Inventor
李勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a mapping method, device, equipment and storage medium for metadata standards. When a mapping instruction is detected, the target data in the mapping instruction is obtained, and the synonymous standard data corresponding to the target data is obtained from a preset standard library according to preset rules. The similarity between the synonymous standard data and the target data is calculated, and whether target standard data corresponding to the target data exists among the synonymous standard data is judged according to the similarity. If the target standard data exists, a mapping relationship is established between the target data and the target standard data, so that the target data is mapped to identifiable standard data. The invention can look up the corresponding synonymous standard data in the preset standard library according to the synonyms of the target data, which enables incremental updating of standard metadata, removes the need to look up the corresponding standard data manually, and improves data search efficiency, the accuracy of search results, and the user experience.

Description

Mapping method, device, equipment and storage medium for metadata standards
Technical field
The present invention relates to the field of data processing, and in particular to a mapping method, device, equipment and computer-readable storage medium for metadata standards.
Background
As information system construction develops to a certain stage, data resources become strategic assets, and effective data governance is a necessary condition for forming data assets. Data governance refers to the process of moving from scattered data to unified master data, from little or no organizational and process governance to integrated data governance across the enterprise, and from attempts to handle chaotic master data to well-ordered master data. The key to successful data governance is metadata management, that is, the frame of reference that gives data its context and meaning. In the data governance systems currently on the market, the standard corresponding to a piece of metadata generally has to be looked up manually in the standards system, and the metadata is then mapped to that standard. Existing methods of mapping metadata to standards are therefore both inefficient and inaccurate.
How to overcome the inefficiency and low accuracy of existing methods of mapping metadata to standards is therefore a problem that urgently needs to be solved.
Summary of the invention
The main purpose of the present invention is to provide a mapping method, device, equipment and computer-readable storage medium for metadata standards, aiming to solve the technical problem that existing methods of mapping metadata to standards are both inefficient and inaccurate.
To achieve the above object, the present invention provides a mapping method for metadata standards, the method comprising the following steps:
When a mapping instruction is detected, obtaining the target data in the mapping instruction, and obtaining the synonymous standard data corresponding to the target data from a preset standard library according to preset rules;
Calculating the similarity between the synonymous standard data and the target data, and judging according to the similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
If the target standard data exists, establishing a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
Optionally, the step of obtaining, when a mapping instruction is detected, the target data in the mapping instruction and obtaining the synonymous standard data corresponding to the target data from the preset standard library according to preset rules includes:
When a mapping instruction is detected, obtaining the target data in the mapping instruction;
Based on WordNet, an English dictionary grounded in cognitive linguistics, obtaining from WordNet the synonym set (synset), the category words (class words) and the sense explanation corresponding to the target data, and performing feature extraction on the synonym set, the category word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the candidate synonyms are extracted according to the following formula:
Feature(SW) = { {Ws}, {Wc}, {We} }
where SW is a sense of the word W; {Ws} is the set of all synonyms of that sense in WordNet; {Wc} is the set of all category words related to that sense; and {We} is the set of all content words in the explanation of that sense;
Matching the candidate synonyms against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data.
Optionally, the step of calculating the similarity between the synonymous standard data and the target data, and judging according to the similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, includes:
Based on the vector space method, calculating the sense similarity and the word similarity between the synonymous standard data and the target data;
Judging according to the sense similarity and the word similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the sense similarity and the word similarity between the target standard data and the target data exceed a preset threshold.
Optionally, the step of calculating, based on the vector space method, the sense similarity and the word similarity between the synonymous standard data and the target data specifically includes:
Calculating the sense similarity between the synonymous standard data and the target data, wherein the sense similarity is calculated according to the following formula:
where No(SW) is the rank of the sense among the senses of W; IDF(Wi) is the inverse of the number of documents in which a given Wi appears, obtained by training on WordNet; Ks is the weight of the synonym feature; Kc is the weight of the category feature; Ke is the weight of the sense explanation; QU is the index set of the occurrences of Wi; and QV is the index set of the occurrences of Wj;
Calculating the word similarity between the synonymous standard data and the target data, wherein the word similarity is calculated according to the following formula:
where |SW1| is the number of senses of W1 and |SW2| is the number of senses of W2.
Optionally, the step of obtaining, when a mapping instruction is detected, the target data in the mapping instruction and obtaining the synonymous standard data corresponding to the target data from the preset standard library according to preset rules specifically includes:
When metadata that does not conform to the preset standard is detected, judging according to preset rules whether synonymous standard data corresponding to the target data exists in the standard library;
If the synonymous standard data exists in the standard library, obtaining the synonymous standard data corresponding to the target data.
Optionally, after the step of judging according to preset rules, when metadata that does not conform to the preset standard is detected, whether synonymous standard data corresponding to the target data exists in the standard library, the method further includes:
If the target standard data does not exist, obtaining the usage frequency of the target data within a preset time period, and, when the usage frequency exceeds a preset threshold, performing word segmentation and data analysis on the metadata by natural language processing (NLP), and judging by NLP whether each unit of data after segmentation conforms to the naming rules of natural language;
If the target data conforms to the naming rules of natural language, generating corresponding standard data update recommendation information from the target data, and, when a confirmation instruction fed back by the user in response to the update recommendation information is received, adding the target data to the preset standard library.
Optionally, after the step of calculating the similarity between the synonymous standard data and the target data, and judging according to the similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, the method further includes:
If the target standard data does not exist, obtaining the synonymous standard data with the greatest similarity and generating corresponding mapping recommendation information from it, to prompt the user whether to establish a mapping relationship between that synonymous standard data and the target data.
In addition, to achieve the above object, the present invention also provides a mapping device for metadata standards, the mapping device comprising:
A data search module, configured to obtain, when a mapping instruction is detected, the target data in the mapping instruction, and to obtain the synonymous standard data corresponding to the target data from a preset standard library according to preset rules;
A data judgment module, configured to calculate the similarity between the synonymous standard data and the target data, and to judge according to the similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
A data mapping module, configured to establish, if the target standard data exists, a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
In addition, to achieve the above object, the present invention also provides mapping equipment for metadata standards, the mapping equipment comprising a processor, a memory, and a mapping program for metadata standards that is stored in the memory and executable by the processor, wherein the mapping program, when executed by the processor, implements the steps of the mapping method for metadata standards described above.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium on which a mapping program for metadata standards is stored, wherein the mapping program, when executed by a processor, implements the steps of the mapping method for metadata standards described above.
The present invention provides a mapping method for metadata standards: when a mapping instruction is detected, the target data in the mapping instruction is obtained, and the synonymous standard data corresponding to the target data is obtained from a preset standard library according to preset rules; the similarity between the synonymous standard data and the target data is calculated, and whether target standard data corresponding to the target data exists among the synonymous standard data is judged according to the similarity, wherein the similarity between the target standard data and the target data exceeds a preset threshold; if the target standard data exists, a mapping relationship is established between the target data and the target standard data, so that the target data is mapped to identifiable standard data. In this way, the present invention can look up the corresponding synonymous standard data in the preset standard library according to the synonyms of the target data, without any manual lookup of the corresponding standard data. This improves data search efficiency, the accuracy of search results and the user experience, and solves the technical problem that existing standard metadata formulated in advance cannot meet user needs.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware structure of the mapping equipment for metadata standards involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the mapping method for metadata standards of the present invention;
Fig. 3 is a flow diagram of the second embodiment of the mapping method for metadata standards of the present invention;
Fig. 4 is a flow diagram of the third embodiment of the mapping method for metadata standards of the present invention;
Fig. 5 is a functional block diagram of the first embodiment of the mapping device for metadata standards of the present invention.
The realization of the object, the functional features and the advantages of the present invention are further described below with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.
The mapping method for metadata standards in the embodiments of the present invention is mainly applied to mapping equipment for metadata standards, which may be a PC, a portable computer, a mobile terminal, or other equipment with display and processing functions.
Referring to Fig. 1, Fig. 1 is a schematic diagram of the hardware structure of the mapping equipment for metadata standards involved in the embodiments of the present invention. In the embodiments of the present invention, the mapping equipment may include a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 implements connection and communication between these components; the user interface 1003 may include a display and an input unit such as a keyboard; the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface); the memory 1005 may be a high-speed RAM or a non-volatile memory such as a disk memory, and may optionally be a storage device independent of the processor 1001.
Those skilled in the art will understand that the hardware structure shown in Fig. 1 does not limit the mapping equipment for metadata standards, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
With continued reference to Fig. 1, the memory 1005 in Fig. 1, as a computer-readable storage medium, may include an operating system, a network communication module and a mapping program for metadata standards.
In Fig. 1, the network communication module is mainly used to connect to a server and exchange data with it; the processor 1001 can call the mapping program for metadata standards stored in the memory 1005 and execute the mapping method for metadata standards provided by the embodiments of the present invention.
An embodiment of the present invention provides a mapping method for metadata standards.
Referring to Fig. 2, Fig. 2 is a flow diagram of the first embodiment of the mapping method for metadata standards of the present invention.
In this embodiment, the mapping method for metadata standards includes the following steps:
Step S10: when a mapping instruction is detected, obtain the target data in the mapping instruction, and obtain the synonymous standard data corresponding to the target data from the preset standard library according to preset rules;
In this embodiment, for existing systems that have already been put into production, some metadata that does not conform to the standard cannot be changed. A mapping relationship therefore has to be established between this non-conforming metadata and the standard data, so that the metadata can be recognized the next time the system data is audited. To address the technical problem that the corresponding standard data currently has to be looked up manually in the standards system, the present invention provides a synonym-based method of looking up the corresponding standard data: the synonymous data corresponding to the target data to be mapped is looked up in the standard library, so that the synonymous standard data corresponding to the target data is found quickly and efficiently. The preset rules may be as follows. When the target data is English, the synonymous data set corresponding to the target data is obtained based on WordNet, an English dictionary grounded in cognitive linguistics, and this set is matched against the standard library to obtain the synonymous standard data corresponding to the target data in the system. When the target data is Chinese, the synonymous data set corresponding to the target data is obtained based on a Chinese near-synonym or synonym dictionary, and the synonymous standard data corresponding to the target data is obtained from it. In a specific embodiment, the synonymous standard data is obtained as follows: when a mapping instruction is detected, the target data in the mapping instruction is obtained; based on WordNet, the synonym set (synset), the category words (class words) and the sense explanation corresponding to the target data are obtained from WordNet, and feature extraction is performed on the synonym set, the category word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, where the candidate synonyms are extracted according to the following formula:
Feature(SW) = { {Ws}, {Wc}, {We} }
where SW is a sense of the word W; {Ws} is the set of all synonyms of that sense in WordNet; {Wc} is the set of all category words related to that sense; and {We} is the set of all content words in the explanation of that sense. The candidate synonyms are matched against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data. This embodiment mainly uses the interface functions provided by WordNet to extract candidate synonyms from three sets, namely the WordNet synonym set, the category words and the sense explanations; it then performs feature extraction on the candidate synonyms and determines the synonymous standard data corresponding to the target data by comparing the candidate synonyms with the standard metadata in the preset standard library.
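The extraction and matching in step S10 can be sketched as follows. This is only an illustration under stated assumptions: the `Sense` record, the `TOY_LEXICON` data and the stop-word list are hypothetical stand-ins for the WordNet interface functions, which the text does not name, and the sample entries are not real WordNet content.

```python
from dataclasses import dataclass

# Hypothetical stand-in for one WordNet sense: synonyms, category words, explanation.
@dataclass(frozen=True)
class Sense:
    synonyms: frozenset
    categories: frozenset
    gloss: str

# Assumed stop-word list used to keep only content words of the explanation.
STOPWORDS = frozenset({"a", "an", "the", "of", "or", "and", "in", "to", "for"})

# Toy lexicon (illustrative data only).
TOY_LEXICON = {
    "trade": [
        Sense(frozenset({"trade", "transaction", "business", "deal"}),
              frozenset({"commerce"}),
              "the commercial exchange of goods or services"),
    ],
}

def extract_features(sense):
    """Return Feature(SW) = ({Ws}, {Wc}, {We}) for one sense."""
    content_words = {w for w in sense.gloss.lower().split() if w not in STOPWORDS}
    return set(sense.synonyms), set(sense.categories), content_words

def synonymous_standard_data(target, lexicon, standard_library):
    """Match the candidate synonyms of `target` against the standard library."""
    candidates = set()
    for sense in lexicon.get(target, []):
        synonyms, _, _ = extract_features(sense)
        candidates |= synonyms
    candidates.discard(target)
    return sorted(candidates & set(standard_library))

STANDARD_LIBRARY = {"transaction", "business", "customer"}
print(synonymous_standard_data("trade", TOY_LEXICON, STANDARD_LIBRARY))
# prints ['business', 'transaction']
```

In a real system the lexicon lookup would be replaced by calls to a WordNet interface (or, for Chinese target data, a synonym dictionary); only the set intersection with the standard library is essential to the step.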
Step S20: calculate the similarity between the synonymous standard data and the target data, and judge according to the similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
In this embodiment, the similarity includes sense similarity and word similarity. The similarity between two senses can be obtained by calculating their distance in three different sense feature spaces: the smaller the distance, the greater the similarity. The similarity between two words in WordNet can then be calculated from the sense similarity. Once the similarity between the synonymous standard data and the target data has been calculated, it is judged whether the synonymous standard data contains target standard data whose similarity to the target data exceeds the preset threshold.
Step S30: if the target standard data exists, establish a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
In this embodiment, when it is determined that target standard data exists among the synonymous standard data, a mapping relationship is established between the target standard data and the target data, so that in subsequent audits of system data the target data can be recognized as the standard data it maps to. For example, WordNet finds the synset of trade: trade, transaction, business, deal. These are sorted by similarity and recommended; transaction and business, which are standard terms in the system, are highlighted, and transaction can be selected by similarity as the mapping standard for trade. This removes the need to pick out synonyms manually from thousands of standards.
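One way to picture the mapping relationship of step S30 is a small registry that an audit can query. The class and method names below (`MappingRegistry`, `map`, `resolve`) are hypothetical and chosen only for illustration; the text does not specify how the mapping is stored.

```python
class MappingRegistry:
    """Holds standard terms plus mappings from non-standard metadata to them."""

    def __init__(self, standard_library):
        self.standard_library = set(standard_library)
        self.mappings = {}

    def map(self, target, standard):
        """Record that `target` maps to the standard term `standard`."""
        if standard not in self.standard_library:
            raise ValueError(f"{standard!r} is not in the standard library")
        self.mappings[target] = standard

    def resolve(self, name):
        """Return the standard term for `name`, or None if it is unrecognized."""
        if name in self.standard_library:
            return name
        return self.mappings.get(name)


registry = MappingRegistry({"transaction", "business"})
registry.map("trade", "transaction")
print(registry.resolve("trade"))  # prints transaction
```

During an audit, any metadata name for which `resolve` returns a standard term counts as recognized, while a `None` result marks metadata that still has no mapping.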
This embodiment provides a mapping method for metadata standards: when a mapping instruction is detected, the target data in the mapping instruction is obtained, and the synonymous standard data corresponding to the target data is obtained from a preset standard library according to preset rules; the similarity between the synonymous standard data and the target data is calculated, and whether target standard data corresponding to the target data exists among the synonymous standard data is judged according to the similarity, wherein the similarity between the target standard data and the target data exceeds a preset threshold; if the target standard data exists, a mapping relationship is established between the target data and the target standard data, so that the target data is mapped to identifiable standard data. In this way, the present invention can look up the corresponding synonymous standard data in the preset standard library according to the synonyms of the target data, without any manual lookup of the corresponding standard data. This improves data search efficiency, the accuracy of search results and the user experience, and solves the technical problem that existing standard metadata formulated in advance cannot meet user needs.
Referring to Fig. 3, Fig. 3 is a flow diagram of the second embodiment of the mapping method for metadata standards of the present invention.
Based on the embodiment shown in Fig. 2, in this embodiment step S20 includes:
Step S21: based on the vector space method, calculate the sense similarity and the word similarity between the synonymous standard data and the target data;
In this embodiment, the corresponding candidate synonyms are first extracted based on the lexical semantic classification of WordNet, and the synonymous standard data corresponding to the target data is determined from the standard metadata in the preset standard library. A vector-space-based method is then used to calculate the sense similarity and the word similarity between the target data and each item of synonymous standard data. In a specific embodiment, the sense similarity between the synonymous standard data and the target data is calculated according to the following formula:
where No(SW) is the rank of the sense among the senses of W, for example 1 for the first sense, 2 for the second sense, and so on; IDF(Wi) is the inverse of the number of documents in which a given Wi appears, obtained by training on WordNet; Ks is the weight of the synonym feature, for example 1.5; Kc is the weight of the category feature, for example 1; Ke is the weight of the sense explanation, for example 0.5; QU is the index set of the occurrences of Wi; and QV is the index set of the occurrences of Wj;
The word similarity between the synonymous standard data and the target data is calculated according to the following formula:
where |SW1| is the number of senses of W1 and |SW2| is the number of senses of W2.
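To make the vector-space idea of step S21 concrete, the following sketch computes a sense similarity as a weighted combination of cosine similarities in the three feature spaces, and a word similarity as the average over all |SW1| x |SW2| sense pairs. This is an assumed formulation for illustration, not the formula of the invention: the sense-rank term No(SW) is not modelled, the averaging is an assumption, and only the example weights Ks = 1.5, Kc = 1 and Ke = 0.5 are taken from the text.

```python
import math

KS, KC, KE = 1.5, 1.0, 0.5  # example weights quoted in the text

def cosine(u, v, idf=None):
    """Cosine similarity of two word sets treated as IDF-weighted binary vectors."""
    idf = idf or {}

    def weight(word):
        return idf.get(word, 1.0)  # assumed default weight when no IDF is known

    dot = sum(weight(w) ** 2 for w in u & v)
    norm_u = math.sqrt(sum(weight(w) ** 2 for w in u))
    norm_v = math.sqrt(sum(weight(w) ** 2 for w in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sense_similarity(f1, f2, idf=None):
    """Weighted average of the cosine similarities in the three feature spaces.

    Each argument is a tuple ({Ws}, {Wc}, {We}) as produced by Feature(SW).
    """
    (s1, c1, e1), (s2, c2, e2) = f1, f2
    total = (KS * cosine(s1, s2, idf)
             + KC * cosine(c1, c2, idf)
             + KE * cosine(e1, e2, idf))
    return total / (KS + KC + KE)

def word_similarity(senses1, senses2, idf=None):
    """Average sense similarity over all |SW1| x |SW2| sense pairs (assumption)."""
    if not senses1 or not senses2:
        return 0.0
    total = sum(sense_similarity(a, b, idf) for a in senses1 for b in senses2)
    return total / (len(senses1) * len(senses2))
```

With these definitions, two senses with identical feature sets score approximately 1, and senses with no shared features score 0, so the scores can be compared directly against a threshold between 0 and 1.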
Step S22: judge according to the sense similarity and the word similarity whether target standard data corresponding to the target data exists among the synonymous standard data, wherein the sense similarity and the word similarity between the target standard data and the target data exceed a preset threshold.
In this embodiment, different preset thresholds may be set for the sense similarity and the word similarity, or the same preset threshold may be used for both. Synonymous standard data whose similarity to the target data exceeds the preset threshold is taken as the target standard data, and whether target standard data exists among the synonymous standard data is judged from the similarity and the preset threshold.
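The threshold test of step S22 might look like the sketch below. The function name `select_target_standard`, the default thresholds of 0.6 and the tie-breaking rule (highest sense similarity first, then word similarity) are assumptions made for illustration.

```python
def select_target_standard(scores, sense_threshold=0.6, word_threshold=0.6):
    """Pick the target standard data, if any.

    `scores` maps each synonymous standard term to a tuple
    (sense_similarity, word_similarity). A term qualifies only if both
    values exceed their thresholds; None means no target standard data exists.
    """
    qualifying = {
        term: sims for term, sims in scores.items()
        if sims[0] > sense_threshold and sims[1] > word_threshold
    }
    if not qualifying:
        return None
    # Assumed tie-break: compare (sense, word) tuples lexicographically.
    return max(qualifying, key=lambda term: qualifying[term])


scores = {"transaction": (0.9, 0.8), "business": (0.7, 0.5)}
print(select_target_standard(scores))  # prints transaction
```

Passing the same value for both thresholds corresponds to the single shared threshold mentioned above, while different values give each similarity its own cut-off.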
Referring to Fig. 4, Fig. 4 is a flow diagram of the third embodiment of the mapping method for metadata standards of the present invention.
Based on the embodiment shown in Fig. 2, in this embodiment, after step S20 the method further includes:
Step S40: if the target standard data does not exist, obtain the usage frequency of the target data within a preset time period, and, when the usage frequency exceeds a preset threshold, perform word segmentation and data analysis on the metadata by natural language processing (NLP), and judge by NLP whether each unit of data after segmentation conforms to the naming rules of natural language;
In this embodiment, if no target standard data exists, the target data is not a synonym of any standard metadata in the preset standard library. The usage frequency of the target data within a preset time period is then obtained, that is, the number of times the target data occurs within the specified period is counted, and it is judged whether this usage frequency exceeds the preset threshold. The preset time period may be, for example, a period of one week, one month or three months measured from the current time. The preset threshold can be set according to the actual situation; target data whose usage frequency exceeds it is metadata that users use frequently.
When it is determined that the usage frequency exceeds the preset threshold, NLP (Natural Language Processing) analysis is performed on the target data. When the target data is a phrase, it is segmented into words and each unit of data after segmentation is judged separately, that is, it is judged whether each unit conforms to the naming rules of natural language. The naming rule may be a check of whether each unit is a Chinese word, an English word, or a valid term in another language. In a specific embodiment, whether each unit is a valid term can be judged against a dictionary of the corresponding language.
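A minimal sketch of step S40 for English-style identifiers is given below. Everything here is assumed for illustration: the event log format, the 30-day look-back window, the splitting of snake_case and camelCase names as a stand-in for NLP word segmentation, and the dictionary passed in as a plain set. Chinese target data would need a proper word segmenter instead.

```python
import re
from datetime import datetime, timedelta

def usage_frequency(events, target, now, window=timedelta(days=30)):
    """Count occurrences of `target` among (timestamp, name) events in the window."""
    return sum(1 for ts, name in events
               if name == target and now - window <= ts <= now)

def tokenize(name):
    """Split a snake_case, kebab-case, camelCase or spaced name into lowercase units."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]

def follows_naming_rule(name, dictionary):
    """True if every unit after segmentation is a valid dictionary term."""
    tokens = tokenize(name)
    return bool(tokens) and all(t in dictionary for t in tokens)


now = datetime(2019, 6, 19)
events = [(now - timedelta(days=d), "tradeDate") for d in (1, 5, 40)]
if usage_frequency(events, "tradeDate", now) > 1:
    print(follows_naming_rule("tradeDate", {"trade", "date"}))  # prints True
```

Only names that pass both the frequency check and the naming-rule check would go on to the recommendation in step S50.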
Step S50: if the target data conforms to the naming rules of natural language, generate corresponding standard data update recommendation information from the target data, and, when a confirmation instruction fed back by the user in response to the update recommendation information is received, add the target data to the preset standard library.
In this embodiment, when it is determined that the target data conforms to the corresponding naming rules, the target data can be recommended to the administrator, who decides whether to add it to the preset standard library and store it as standard metadata. The recommendation proceeds as follows: recommendation information is generated from the target data, for example: "Store IC (the abbreviation for integrated circuit in the electrical appliance service industry) as standard metadata?" Confirm and cancel instructions are generated at the same time, so that the administrator can trigger the appropriate one according to the review result. If a confirmation instruction is received, that is, the administrator's review has passed, the target data is stored in the preset standard library as standard metadata for subsequent users to use.
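The confirm-or-cancel flow of step S50 can be sketched with a callback that stands in for the administrator's decision. The function name `recommend_standard_update`, the message wording and the `confirm` callback are hypothetical; the text only requires that the target data is added after a confirmation instruction is received.

```python
def recommend_standard_update(target, standard_library, confirm):
    """Ask the administrator whether `target` should become standard metadata.

    `confirm` is a callable that receives the recommendation message and
    returns True for a confirmation instruction or False for a cancellation.
    """
    message = f'Store "{target}" as standard metadata?'
    if confirm(message):
        standard_library.add(target)
        return True
    return False


library = {"transaction", "business"}
approved = recommend_standard_update("IC", library, confirm=lambda msg: True)
print(approved, "IC" in library)  # prints True True
```

Keeping the decision behind a callback means the same logic can be driven by a user interface prompt, a review queue, or a test stub.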
Further, after step S20, the method further includes:
If the target standard data does not exist, the synonymous standard data with the greatest similarity is obtained, and corresponding mapping recommendation information is generated according to it, so as to prompt the user whether to establish a mapping relationship between the synonymous standard data with the greatest similarity and the target data.
In this embodiment, if there is no target standard data whose similarity exceeds the preset threshold, the synonymous standard data with the greatest similarity to the target data is obtained from the synonymous standard data and is recommended as the mapping data for the target data, being the synonymous standard data most closely associated with it. That is, mapping recommendation information is generated according to the target data and the synonymous standard data with the greatest similarity, for example: "The target data is highly similar to so-and-so standard data; establish a mapping relationship between the target data and so-and-so standard data?" This prompts the administrator of the standard metadata in the preset standard library to decide whether to establish a mapping relationship between the synonymous standard data with the greatest similarity and the target data, which makes the target data easier to identify.
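The fallback described above can be sketched as a small decision function. The function name `decide_mapping`, the dictionary-shaped result and the message wording are assumptions made for illustration; choosing the highest-scoring candidate when several exceed the threshold is likewise an assumption, since the text does not say how ties are resolved.

```python
def decide_mapping(target, similarities, threshold):
    """Decide what to do with `target`.

    `similarities` maps each synonymous standard term to its similarity
    with `target`. Returns a dict describing the action:
      - "map":       a candidate exceeds the threshold, map automatically
      - "recommend": no candidate exceeds it, suggest the closest one
      - "none":      there are no candidates at all
    """
    if not similarities:
        return {"action": "none", "target": target}

    best, score = max(similarities.items(), key=lambda kv: kv[1])
    if score > threshold:
        return {"action": "map", "target": target, "standard": best}

    message = (
        f"'{target}' is most similar to standard data '{best}' "
        f"(similarity {score:.2f}); establish a mapping relationship?"
    )
    return {
        "action": "recommend",
        "target": target,
        "standard": best,
        "message": message,
    }
```

For instance, `decide_mapping("cust_nm", {"customer_name": 0.62, "customer_id": 0.31}, 0.8)` yields a "recommend" action for `customer_name`, which an administrator would then confirm or reject.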
In addition, an embodiment of the present invention further provides a mapping device for a metadata standard.
Referring to Fig. 5, Fig. 5 is a functional block diagram of a first embodiment of the mapping device for a metadata standard of the present invention.
In this embodiment, the mapping device for a metadata standard includes:
a data search module 10, configured to, when a mapping instruction is detected, obtain the target data in the mapping instruction and obtain, according to a preset rule, the synonymous standard data corresponding to the target data from a preset standard library;
a data judgment module 20, configured to calculate the similarity between the synonymous standard data and the target data, and to judge, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
a data mapping module 30, configured to, if the target standard data exists, establish a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
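As a rough picture of how the three modules fit together, the sketch below wires them into one class. The class name `MetadataMapper`, its field names, and the injected `synonyms_of` and `similarity` callables are hypothetical; the actual preset rule and similarity computation are whatever the embodiments specify, and are only plugged in here as parameters.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class MetadataMapper:
    standard_library: Set[str]
    synonyms_of: Callable[[str], Set[str]]      # assumed synonym lookup
    similarity: Callable[[str, str], float]     # assumed similarity function
    threshold: float = 0.8
    mappings: Dict[str, str] = field(default_factory=dict)

    def search(self, target: str) -> List[str]:
        """Data search module 10: synonyms of the target that are
        present in the preset standard library."""
        return sorted(self.synonyms_of(target) & self.standard_library)

    def judge(self, target: str, candidates: List[str]) -> Optional[str]:
        """Data judgment module 20: the best candidate whose similarity
        exceeds the threshold, or None if there is none."""
        scored = [(self.similarity(target, c), c) for c in candidates]
        passing = [pair for pair in scored if pair[0] > self.threshold]
        return max(passing)[1] if passing else None

    def map(self, target: str) -> Optional[str]:
        """Data mapping module 30: record the mapping when target
        standard data exists."""
        standard = self.judge(target, self.search(target))
        if standard is not None:
            self.mappings[target] = standard
        return standard
```

Keeping the synonym lookup and the similarity function as injected callables means the same skeleton can be exercised with toy functions in tests and with a real lexical resource in production.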
Further, the data search module 10 is further configured to:
when a mapping instruction is detected, obtain the target data in the mapping instruction;
based on WordNet, an English dictionary grounded in cognitive linguistics, obtain from WordNet the synonym set (Synset), the class words (Class word) and the sense explanation (Sense explanation) corresponding to the target data, and perform data feature extraction on the synonym set, the class word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the extraction formula for the candidate synonyms is as follows:
Feature(SW) = { {Ws}, {Wc}, {We} }
where {Ws} is the set of all synonyms of sense W in WordNet; {Wc} is the set of all classes related to sense W; and {We} is the set of all content words in the explanation of sense W;
match the candidate synonyms against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data.
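The feature extraction Feature(SW) = { {Ws}, {Wc}, {We} } and the subsequent matching step can be illustrated with a toy lexicon. `TOY_LEXICON` and `STOP_WORDS` below are hypothetical stand-ins for WordNet and for a content-word filter, and treating the union of the three sets as the candidate synonyms is an assumption, since the text does not spell out how the three feature sets are combined.

```python
# Hypothetical miniature lexicon standing in for WordNet (one sense per word).
TOY_LEXICON = {
    "client": {
        "synonyms": {"customer", "patron"},
        "classes": {"person", "consumer"},
        "explanation": "someone who pays for goods or services",
    },
}

# Assumed list of function words to drop when keeping only content words.
STOP_WORDS = {"a", "an", "the", "who", "for", "or", "of", "someone"}


def extract_features(word, lexicon=TOY_LEXICON):
    """Return the three feature sets {Ws}, {Wc}, {We} for `word`."""
    entry = lexicon.get(word)
    if entry is None:
        return {"Ws": set(), "Wc": set(), "We": set()}
    content_words = {
        w for w in entry["explanation"].lower().split() if w not in STOP_WORDS
    }
    return {
        "Ws": set(entry["synonyms"]),
        "Wc": set(entry["classes"]),
        "We": content_words,
    }


def candidate_synonyms(word, lexicon=TOY_LEXICON):
    """Assumed combination rule: union of the three feature sets."""
    features = extract_features(word, lexicon)
    return features["Ws"] | features["Wc"] | features["We"]


def synonymous_standard_data(word, standard_library, lexicon=TOY_LEXICON):
    """Candidates that also appear in the preset standard library."""
    return candidate_synonyms(word, lexicon) & set(standard_library)
```

With a real WordNet interface, the three dictionary fields would be filled from a sense's synonyms, its related classes and its gloss respectively.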
Further, the data judgment module 20 specifically includes:
a similarity calculation unit, configured to calculate, based on a vector space method, the meaning similarity and the word similarity between the synonymous standard data and the target data;
a target data judging unit, configured to judge, according to the meaning similarity and the word similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the meaning similarity and the word similarity between the target standard data and the target data exceed a preset threshold.
Further, the similarity calculation unit is further configured to:
calculate the meaning similarity between the synonymous standard data and the target data, wherein the meaning similarity is calculated according to the following formula:
where No(SW) is the ordinal number of the sense of W; IDF(Wi) is the inverse document frequency of Wi obtained by training on WordNet, i.e. the reciprocal of the number of documents in which Wi appears; Ks is the weight of the synonym feature, Kc is the weight of the class feature, and Ke is the weight of the sense explanation; QU is the index set of occurrences of Wi, and QV is the index set of occurrences of Wj;
calculate the word similarity between the synonymous standard data and the target data, wherein the word similarity is calculated according to the following formula:
where |SW1| is the number of senses of W1, and |SW2| is the number of senses of W2.
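The formulas themselves are not reproduced in this text, so the following is only a generic vector-space sketch that uses the quantities named above, not a reconstruction of the patent's equations. It assumes an IDF-weighted cosine per feature set, a linear combination with weights Ks, Kc and Ke, and a word similarity taken as the best score over all |SW1| x |SW2| sense pairs; the ordinal No(SW) and the index sets QU and QV are not modelled. All function names and default weights are hypothetical.

```python
import math


def idf_cosine(u, v, idf):
    """Cosine similarity of two word sets, each word weighted by its IDF
    (words missing from `idf` get weight 1.0)."""
    dot = sum(idf.get(w, 1.0) ** 2 for w in u & v)
    norm_u = math.sqrt(sum(idf.get(w, 1.0) ** 2 for w in u))
    norm_v = math.sqrt(sum(idf.get(w, 1.0) ** 2 for w in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sense_similarity(f1, f2, idf, ks=0.5, kc=0.3, ke=0.2):
    """Assumed meaning similarity: weighted sum over the three feature
    sets {Ws}, {Wc}, {We} of two senses."""
    return (
        ks * idf_cosine(f1["Ws"], f2["Ws"], idf)
        + kc * idf_cosine(f1["Wc"], f2["Wc"], idf)
        + ke * idf_cosine(f1["We"], f2["We"], idf)
    )


def word_similarity(senses1, senses2, idf, **weights):
    """Assumed word similarity: best meaning similarity over every pair
    of senses of the two words; 0.0 if either word has no senses."""
    return max(
        (sense_similarity(a, b, idf, **weights) for a in senses1 for b in senses2),
        default=0.0,
    )
```

Other aggregations over sense pairs, such as an average normalised by |SW1| and |SW2|, would fit the same skeleton by replacing the `max` in `word_similarity`.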
Further, the data search module 10 further includes:
a data naming judgment module, configured to, if the target standard data does not exist, obtain the frequency of use of the target data within a preset time period, and, when the frequency of use exceeds a preset threshold, perform word segmentation and data analysis on the metadata by natural language processing (NLP), and judge by the NLP whether each unit of data after segmentation conforms to the naming rules of natural language;
a standard data adding module, configured to, if the target data conforms to the naming rules of the natural language, generate corresponding standard data update recommendation information according to the target data, and, when a confirmation instruction fed back by the user in response to the update recommendation information is received, add the target data to the preset standard library.
Further, the mapping device for a metadata standard further includes:
a mapping recommendation module, configured to, if the target standard data does not exist, obtain the synonymous standard data with the greatest similarity and generate corresponding mapping recommendation information according to it, so as to prompt the user whether to establish a mapping relationship between the synonymous standard data with the greatest similarity and the target data.
The modules of the above mapping device for a metadata standard correspond to the steps in the embodiments of the above mapping method for a metadata standard; their functions and implementation processes are not repeated here.
In addition, an embodiment of the present invention further provides a computer-readable storage medium.
A mapping program for a metadata standard is stored on the computer-readable storage medium of the present invention, wherein, when the mapping program for a metadata standard is executed by a processor, the steps of the above mapping method for a metadata standard are implemented.
For the method implemented when the mapping program for a metadata standard is executed, reference may be made to the embodiments of the mapping method for a metadata standard of the present invention, and details are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or system that includes the element.
The serial number of the above embodiments of the invention is only for description, does not represent the advantages or disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A mapping method for a metadata standard, characterized in that the mapping method for a metadata standard comprises the following steps:
when a mapping instruction is detected, obtaining the target data in the mapping instruction, and obtaining, according to a preset rule, the synonymous standard data corresponding to the target data from a preset standard library;
calculating the similarity between the synonymous standard data and the target data, and judging, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
if the target standard data exists, establishing a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
2. The mapping method for a metadata standard according to claim 1, characterized in that the step of, when a mapping instruction is detected, obtaining the target data in the mapping instruction and obtaining, according to a preset rule, the synonymous standard data corresponding to the target data from a preset standard library comprises:
when a mapping instruction is detected, obtaining the target data in the mapping instruction;
based on WordNet, an English dictionary grounded in cognitive linguistics, obtaining from WordNet the synonym set (Synset), the class words (Class word) and the sense explanation (Sense explanation) corresponding to the target data, and performing data feature extraction on the synonym set, the class word set and the sense explanation word set to determine the candidate synonyms corresponding to the target data, wherein the extraction formula for the candidate synonyms is as follows:
Feature(SW) = { {Ws}, {Wc}, {We} }
where {Ws} is the set of all synonyms of sense W in WordNet; {Wc} is the set of all classes related to sense W; and {We} is the set of all content words in the explanation of sense W;
matching the candidate synonyms against the standard metadata in the preset standard library to determine the synonymous standard data corresponding to the target data.
3. The mapping method for a metadata standard according to claim 2, characterized in that the step of calculating the similarity between the synonymous standard data and the target data, and judging, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, comprises:
calculating, based on a vector space method, the meaning similarity and the word similarity between the synonymous standard data and the target data;
judging, according to the meaning similarity and the word similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the meaning similarity and the word similarity between the target standard data and the target data exceed a preset threshold.
4. The mapping method for a metadata standard according to claim 3, characterized in that the step of calculating, based on a vector space method, the meaning similarity and the word similarity between the synonymous standard data and the target data specifically comprises:
calculating the meaning similarity between the synonymous standard data and the target data, wherein the meaning similarity is calculated according to the following formula:
where No(SW) is the ordinal number of the sense of W; IDF(Wi) is the inverse document frequency of Wi obtained by training on WordNet, i.e. the reciprocal of the number of documents in which Wi appears; Ks is the weight of the synonym feature, Kc is the weight of the class feature, and Ke is the weight of the sense explanation; QU is the index set of occurrences of Wi, and QV is the index set of occurrences of Wj;
calculating the word similarity between the synonymous standard data and the target data, wherein the word similarity is calculated according to the following formula:
where |SW1| is the number of senses of W1, and |SW2| is the number of senses of W2.
5. The mapping method for a metadata standard according to claim 1, characterized in that the step of, when a mapping instruction is detected, obtaining the target data in the mapping instruction and obtaining, according to a preset rule, the synonymous standard data corresponding to the target data from a preset standard library specifically comprises:
when metadata that does not conform to the preset standard is detected, judging, according to the preset rule, whether synonymous standard data corresponding to the target data exists in the standard library;
if the synonymous standard data exists in the standard library, obtaining the synonymous standard data corresponding to the target data.
6. The mapping method for a metadata standard according to claim 5, characterized in that, after the step of, when metadata that does not conform to the preset standard is detected, judging, according to the preset rule, whether synonymous standard data corresponding to the target data exists in the standard library, the method further comprises:
if the target standard data does not exist, obtaining the frequency of use of the target data within a preset time period, and, when the frequency of use exceeds a preset threshold, performing word segmentation and data analysis on the metadata by natural language processing (NLP), and judging by the NLP whether each unit of data after segmentation conforms to the naming rules of natural language;
if the target data conforms to the naming rules of the natural language, generating corresponding standard data update recommendation information according to the target data, and, when a confirmation instruction fed back by the user in response to the update recommendation information is received, adding the target data to the preset standard library.
7. The mapping method for a metadata standard according to any one of claims 1 to 6, characterized in that, after the step of calculating the similarity between the synonymous standard data and the target data, and judging, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold, the method further comprises:
if the target standard data does not exist, obtaining the synonymous standard data with the greatest similarity, and generating corresponding mapping recommendation information according to it, so as to prompt the user whether to establish a mapping relationship between the synonymous standard data with the greatest similarity and the target data.
8. A mapping device for a metadata standard, characterized in that the mapping device for a metadata standard comprises:
a data search module, configured to, when a mapping instruction is detected, obtain the target data in the mapping instruction and obtain, according to a preset rule, the synonymous standard data corresponding to the target data from a preset standard library;
a data judgment module, configured to calculate the similarity between the synonymous standard data and the target data, and to judge, according to the similarity, whether target standard data corresponding to the target data exists in the synonymous standard data, wherein the similarity between the target standard data and the target data exceeds a preset threshold;
a data mapping module, configured to, if the target standard data exists, establish a mapping relationship between the target data and the target standard data, so that the target data is mapped to identifiable standard data.
9. Mapping equipment for a metadata standard, characterized in that the mapping equipment for a metadata standard comprises a processor, a memory, and a mapping program for a metadata standard that is stored on the memory and executable by the processor, wherein, when the mapping program for a metadata standard is executed by the processor, the steps of the mapping method for a metadata standard according to any one of claims 1 to 7 are implemented.
10. A computer-readable storage medium, characterized in that a mapping program for a metadata standard is stored on the computer-readable storage medium, wherein, when the mapping program for a metadata standard is executed by a processor, the steps of the mapping method for a metadata standard according to any one of claims 1 to 7 are implemented.
CN201910533687.8A 2019-06-19 2019-06-19 Metadata standard mapping method, device, equipment and storage medium Active CN110362601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910533687.8A CN110362601B (en) 2019-06-19 2019-06-19 Metadata standard mapping method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910533687.8A CN110362601B (en) 2019-06-19 2019-06-19 Metadata standard mapping method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110362601A true CN110362601A (en) 2019-10-22
CN110362601B CN110362601B (en) 2020-12-18

Family

ID=68216679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910533687.8A Active CN110362601B (en) 2019-06-19 2019-06-19 Metadata standard mapping method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110362601B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795524A (en) * 2019-10-31 2020-02-14 北京东软望海科技有限公司 Main data mapping processing method and device, computer equipment and storage medium
CN112052645A (en) * 2020-09-15 2020-12-08 平安医疗健康管理股份有限公司 Data standardization method, device, medium and equipment
CN112434200A (en) * 2020-11-30 2021-03-02 北京思特奇信息技术股份有限公司 Data display method and system and electronic equipment
CN112668314A (en) * 2020-12-30 2021-04-16 深圳市华傲数据技术有限公司 Data standard conformance detection method, device, system and storage medium
CN113642327A (en) * 2021-10-14 2021-11-12 中国光大银行股份有限公司 Method and device for constructing standard knowledge base
CN117454892A (en) * 2023-12-20 2024-01-26 深圳市智慧城市科技发展集团有限公司 Metadata management method, device, terminal equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180210905A1 (en) * 2017-01-25 2018-07-26 International Business Machines Corporation Data mapper
CN109635098A (en) * 2018-12-20 2019-04-16 东软集团股份有限公司 A kind of intelligent answer method, apparatus, equipment and medium
CN109740143A (en) * 2018-11-28 2019-05-10 平安科技(深圳)有限公司 Based on the sentence of machine learning apart from mapping method, device and computer equipment
CN109815491A (en) * 2019-01-08 2019-05-28 平安科技(深圳)有限公司 Answer methods of marking, device, computer equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180210905A1 (en) * 2017-01-25 2018-07-26 International Business Machines Corporation Data mapper
CN109740143A (en) * 2018-11-28 2019-05-10 平安科技(深圳)有限公司 Based on the sentence of machine learning apart from mapping method, device and computer equipment
CN109635098A (en) * 2018-12-20 2019-04-16 东软集团股份有限公司 A kind of intelligent answer method, apparatus, equipment and medium
CN109815491A (en) * 2019-01-08 2019-05-28 平安科技(深圳)有限公司 Answer methods of marking, device, computer equipment and storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795524A (en) * 2019-10-31 2020-02-14 北京东软望海科技有限公司 Main data mapping processing method and device, computer equipment and storage medium
CN110795524B (en) * 2019-10-31 2022-07-05 望海康信(北京)科技股份公司 Main data mapping processing method and device, computer equipment and storage medium
CN112052645A (en) * 2020-09-15 2020-12-08 平安医疗健康管理股份有限公司 Data standardization method, device, medium and equipment
CN112434200A (en) * 2020-11-30 2021-03-02 北京思特奇信息技术股份有限公司 Data display method and system and electronic equipment
CN112668314A (en) * 2020-12-30 2021-04-16 深圳市华傲数据技术有限公司 Data standard conformance detection method, device, system and storage medium
CN113642327A (en) * 2021-10-14 2021-11-12 中国光大银行股份有限公司 Method and device for constructing standard knowledge base
CN117454892A (en) * 2023-12-20 2024-01-26 深圳市智慧城市科技发展集团有限公司 Metadata management method, device, terminal equipment and storage medium
CN117454892B (en) * 2023-12-20 2024-04-02 深圳市智慧城市科技发展集团有限公司 Metadata management method, device, terminal equipment and storage medium

Also Published As

Publication number Publication date
CN110362601B (en) 2020-12-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant