WO2022037103A1 - Time-space boundary-oriented multi-party service value-quality-capability index alignment method

Info

Publication number: WO2022037103A1
Application number: PCT/CN2021/089373
Authority: WO
Inventors: 涂志莹; 李敏; 王忠杰; 徐晓飞; 徐汉川
Original assignee: 哈尔滨工业大学
Priority date: 2020-08-18
Filing date: 2021-04-23
Publication date: 2022-02-24
Also published as: CN111898928A; CN111898928B

Abstract

A time-space boundary-oriented multi-party service value-quality-capability index alignment method. The method is divided into two parts: domain feature-oriented multi-participant service value-quality-capability evaluation index semantic alignment and time space boundary feature-oriented multi-participant service value-quality-capability evaluation index quantification method alignment. The method does not rely on the construction of an ontology, but uses common means of natural language processing to extract key vocabulary contained in a sentence defined and explained by an indicator, and, by virtue of vocabulary information contained in public dictionaries and domain dictionaries and a morpheme relationship, mines the correlation between different indicators. In terms of quantification method alignment, the method summarizes the factors that lead to quantification methods being inconsistent in the collaborative process of multiple participants, and considers, from the perspective of time and space, the mapping between a specific value of an index in a multi-dimensional service implementation environment and the actual service level requiring expression, thus achieving index quantification method alignment.

Description

The Alignment Method of Multi-Party Service Value-Quality-Capability Index Oriented to the Space-Time Boundary

Technical field

The invention belongs to the technical field of enterprise interoperability in software engineering, in particular to the field of multi-participant service non-functional attribute alignment, and relates to a time-space-oriented multi-party service value-quality-capability index alignment method.

Background art

Enterprise interoperability is a prerequisite for service participants to exchange data and information, reach consensus on service requirements and service goals, and establish a stable cooperative relationship and a reliable cooperation model. The "European Interoperability Framework for Pan-European E-Government Services" (EIF) identifies three types of interoperability: organizational, technical and semantic. Organizational interoperability concerns the enterprise's organizational structure and business implementation processes, and can be addressed with modeling specifications and model transformation methods. Technical interoperability covers interactive interfaces and data integration, representation and exchange, and usually relies on standardized metadata formats and meanings as a reference to achieve data consistency. Semantic interoperability aims to eliminate inconsistencies in the information exchanged between different enterprises. A service evaluation index is a statistical index that measures and evaluates service value-quality-capability. It is useful reference information for service decision-making and optimization, and an important subject of negotiation when service providers establish cooperative relations. Evaluation indicators contain not only rich semantic information but also detailed qualitative and quantitative descriptions. Different participants have their own norms and habits in defining, interpreting, quantifying and weighting indicators. A prerequisite for cooperation is therefore to align both the semantics and the quantification methods of the parties' service evaluation indicators, so that each party can accurately understand what the other parties' indicators express and what their values mean during multi-party collaboration.

Traditional research on the semantic interoperability of heterogeneous enterprise models mainly uses an ontology as the semantic model: a domain ontology is established through ontology construction or reconstruction techniques (ontology hybridization, synthesis, mutation, etc.) to provide a semantic reference for model interoperability. Ontology-based semantic mapping rules and strategies then realize semantic alignment between heterogeneous enterprise models, including term alignment, conceptual granularity alignment, perspective alignment and coverage alignment, and resolve semantic conflicts between heterogeneous models, such as the same name with different meanings, different names with the same meaning, and inconsistent concept scope, so that information sharing and business cooperation within an alliance can finally be realized. These alignment schemes, however, cannot align indicator measurement methods. This approach has three important deficiencies: (1) Model semantic interoperability rests on the construction of a domain ontology, whose hierarchy, relevance, authority, integrity and consistency directly affect the quality of semantic alignment. Existing ontology construction schemes and tools make ontology construction very challenging; for vertical domain ontologies in particular, accuracy and integrity are difficult to guarantee. (2) Concepts and instances in existing open ontology resources are generally limited to nouns, whereas service evaluation is inseparable from business activities and evaluation aspects, which do not exist as concepts in the ontology. Moreover, concept attributes and concept relationships are not mined sufficiently: although the overall amount of information is large, for any specific fine-grained concept the related concepts and instances are lacking. (3) Alignment at the semantic level alone cannot ensure the consistency of shared information, and existing work pays little attention to aligning the quantification methods of indicators.

SUMMARY OF THE INVENTION

Aiming at the above-mentioned shortcomings of the prior art, the present invention provides a method for aligning multi-party service evaluation indicators oriented to the space-time boundary.

The object of the present invention is achieved through the following technical solution:

A multi-party service value-quality-capability index alignment method oriented to the space-time boundary, comprising the following steps:

Step 1: Extract keyword groups including service content, business activities, index evaluation aspects and index evaluation rules from the index definition, wherein:

The indicator definition includes indicator name, abbreviation/idiom, English abbreviation, indicator explanation, superior direction, dimension (unit + order of magnitude), value range, and calculation formula;

The four types of keyword groups are: ① service content, including service providers (personnel roles, system tools, software applications, etc.), service carriers (commodities, orders, knowledge, data, etc.) and the service execution environment and context, generally noun phrases; ② business activities, including the specific implementation behaviors of service providers and the detailed handling of service carriers, generally verb phrases; ③ evaluation aspects, which modify service content and business activities, generally expressed as XX rate|proportion|ratio, XX effect|degree, XX size|speed|load, etc.; ④ evaluation rules, including index evaluation criteria, weight, frequency and other statistical units, for example quantifiers such as daily average, monthly average, per capita, quarterly and annual;

Step 2: According to the public dictionary, the domain dictionary and the self-built dictionary, calculate the morpheme relationships between the four keyword groups of two indicators, and obtain the semantic similarity matrix between the indicators, where:

The public dictionaries include the Synonym Cilin (extended version), the HowNet dictionary and the Baidu Chinese dictionary;

The domain dictionaries include the Sogou industry thesaurus and the Baidu industry thesaurus. A domain dictionary is a list of domain-specific concepts established by domain experts based on their understanding and experience of the field, and each entry has six fields: concept identifier, concept name, synonym, English name, semantic description and application field;

Each phrase entry in the self-built dictionary includes ID, phrase, part of speech, category (one of service content, business activity, index evaluation aspect, index evaluation rule), synonyms, antonyms, similar words, hypernyms, hyponyms, causally related phrases, belonging/source-related phrases, usage/tool-related phrases, composition/whole-part-related phrases and execution-dependency-related phrases;

The morpheme relationship includes four types: similar (highly similar), near-similar (weaker than similar), related, and same-kind;

The semantic similarity matrix is a two-dimensional matrix whose two dimensions are the four types of keyword groups of the two indicators;

Step 3: Determine the semantic relationship between the indicators with the help of the semantic similarity matrix, and calculate the relationship confidence, where:

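The semantic relationship includes similarity relationships (① same indicator; ② conjugate indicator; ③ superordinate-subordinate indicator), related relationships (④ service content related; ⑤ business related; ⑥ indicator correlated) and same-kind relationships (⑦ similar evaluation aspects; ⑧ similar business; ⑨ similar service content);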

Step 4: Determine the semantic relationship of all indicators according to Step 3 to obtain a semantic relationship network, delete redundant edges according to the direction and quantity of the semantic relationship between the indicators, and simplify the semantic network, wherein:

The semantic relationship network is a network with indicators as nodes and the semantic relationships between indicators as edges. The edge attributes are the semantic relationship type and the confidence, and an edge may be directed or undirected; ⑤ business-related edges are directed;

Step 5: Fit the distribution characteristics of each indicator in the single domain and the rich domain according to its sample data under different space-time boundaries, where:

Time refers to different time domains, space refers to different geographic domains, and boundary refers to different service implementation environments (online or offline), different service implementation platforms or different service participants;

The single domain distribution feature refers to the probability distribution feature of the indicator in one service domain, and the rich domain distribution feature refers to the probability distribution feature of the indicator in two or more service domains;

Step 6: Establish the alignment relationship between index quantification methods, using probability quantiles as the reference, where:

The alignment relationship between index quantification methods refers to finding the index value range that corresponds to a given service level under different space-time boundary characteristics, or determining the service level that corresponds to an index value under a specific space-time boundary.
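As an illustrative, non-limiting sketch of Steps 5 and 6, the following Python code fits a distribution per space-time boundary and maps a value from one boundary to another at the same probability quantile. The text does not fix a distribution family, so the normal distribution used here is an assumption, and the function names and sample values are hypothetical.

```python
from statistics import NormalDist

def fit_domain(samples):
    """Fit a single-domain distribution (assumed normal) from sample data."""
    return NormalDist.from_samples(samples)

def fit_rich_domain(*sample_sets):
    """Fit a rich-domain distribution by pooling samples from two or more domains."""
    pooled = [x for samples in sample_sets for x in samples]
    return NormalDist.from_samples(pooled)

def align_value(value, source, target):
    """Map a value to the target domain at the same probability quantile."""
    quantile = source.cdf(value)
    return target.inv_cdf(quantile)

# Hypothetical delivery times (minutes) under two boundaries.
online = [28, 30, 32, 35, 31, 29, 33, 34]
offline = [12, 15, 14, 16, 13, 15, 17, 14]
d_online, d_offline = fit_domain(online), fit_domain(offline)
d_rich = fit_rich_domain(online, offline)

# The online mean sits at the 0.5 quantile, so it maps to the offline mean.
aligned = align_value(d_online.mean, d_online, d_offline)
```

Under these assumptions, a value that is typical online is mapped to the value that is equally typical offline, which is one way to read "the same service level" across boundaries.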

Compared with the prior art, the present invention has the following advantages:

Unlike traditional ontology-based methods for the semantic interoperability of enterprise models, the present invention does not depend on ontology construction. Instead, it uses common natural language processing techniques to extract the key words contained in the sentences that define and explain the indicators, and uses the lexical information and morpheme relationships contained in public dictionaries and domain dictionaries to mine the correlations between different indicators. For the alignment of quantification methods, the present invention summarizes the factors that cause quantification methods to be inconsistent when multiple participants collaborate and, from the perspective of time and space, considers the mapping between the specific value of an indicator in a multi-dimensional service implementation environment and the actual service level it is meant to express, thereby aligning the index quantification methods.

Description of drawings

Fig. 1 is the multi-party service value-quality-capability index alignment method framework oriented to the space-time boundary of the present invention;

Fig. 2 is the method framework of the multi-participant service value-quality-capability index semantic alignment oriented to domain features of the present invention;

Fig. 3 is the method framework of the multi-participant service value-quality-capability index quantification method alignment oriented to spatiotemporal features of the present invention;

Fig. 4 is the principle of index relation judgment in the semantic alignment stage of the present invention;

FIG. 5 is an example diagram of the keyword analysis of the domain feature-oriented service evaluation index of the present invention;

FIG. 6 is a schematic diagram of semantic alignment of domain feature-oriented multi-participant service evaluation indicators of the present invention;

FIG. 7 is an example diagram of a single-domain distribution feature of indicators oriented to spatiotemporal features of the present invention;

FIG. 8 is an exemplary diagram of a spatiotemporal feature-oriented index rich domain distribution feature of the present invention;

FIG. 9 is a theoretical diagram of alignment of the spatiotemporal feature-oriented multi-participant service evaluation index quantification method according to the present invention.

Detailed description

The technical solutions of the present invention are further described below with reference to the accompanying drawings, but are not limited thereto. Any modification or equivalent replacement of the technical solutions of the present invention that does not depart from their spirit and scope shall fall within the protection scope of the present invention.

The invention provides a multi-party service value-quality-capability index alignment method oriented to the space-time boundary. The method is divided into two parts: semantic alignment of multi-participant service evaluation indicators oriented to domain features, and alignment of the quantification methods of multi-participant service evaluation indicators oriented to space-time boundary features. The framework is shown in FIGS. 1 to 3.

The purpose of the semantic alignment of the present invention is as follows. Given the multi-domain, multi-participant service value-quality-capability evaluation index systems, the key elements of each indicator are extracted with natural language processing techniques; the relationships between the four types of phrases are then calculated with the help of public dictionaries, domain dictionaries and self-built dictionaries; and the semantic relationship between indicators is finally determined on the basis of the lexical relationship matrix, together with a relationship confidence. The result is a multi-domain, multi-participant index semantic relationship network, from which each participant can learn how its own service indicators relate to those of other parties. These relationships are not limited to the same name with different meanings or different names with the same meaning; richer semantic relationships can also be mined.
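The semantic relationship network described above can be sketched as a list of typed edges. In this minimal Python illustration the relation is stored as the number (1 to 9) of the nine relationship sub-categories defined in this description. The pruning rule shown (keep, for each pair of indicators, the tightest relation and then the highest confidence) is only an assumption, since the text does not spell out how redundant edges are chosen; the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: int        # 1..9; a smaller number is treated as a tighter relationship
    confidence: float    # in [0, 1]
    directed: bool = False

def prune(edges):
    """Keep one edge per indicator pair: tightest relation, then highest confidence."""
    best = {}
    for e in edges:
        key = frozenset((e.source, e.target))
        cur = best.get(key)
        if cur is None or (e.relation, -e.confidence) < (cur.relation, -cur.confidence):
            best[key] = e
    return list(best.values())

def neighbours(edges, indicator):
    """List (other indicator, relation, confidence) for one participant's indicator."""
    out = []
    for e in edges:
        if e.source == indicator:
            out.append((e.target, e.relation, e.confidence))
        elif e.target == indicator:
            out.append((e.source, e.relation, e.confidence))
    return out

# Hypothetical edges between indicators of different participants.
edges = [
    Edge("packaging firmness", "transport integrity", 5, 0.8, directed=True),
    Edge("transport integrity", "packaging firmness", 8, 0.6),
    Edge("packaging rate", "packaging efficiency", 1, 0.9),
]
pruned = prune(edges)
```

A participant can then call `neighbours(pruned, "packaging rate")` to see which indicators of other parties relate to its own indicator, and how.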

The original index definition includes index name, abbreviation/idiom, English abbreviation, index explanation, superior direction, dimension (unit + order of magnitude), value range, calculation formula, etc. The abbreviation/idiom and the English abbreviation carry strong domain expertise, so the relevant explanations in the domain dictionary are needed to help understand them; index names and explanations lack standardization, and the naming conventions and level of explanatory detail differ between participants; the calculation formula also implies relationships between indicators. To eliminate these irregularities in index definitions, the present invention preprocesses the indicators in the first step: the key elements of each indicator are extracted with natural language processing techniques such as word segmentation, part-of-speech tagging, dependency parsing and word frequency statistics, and words that are difficult to understand or irrelevant to service evaluation are eliminated, yielding four types of phrases: [service content, business activities, index evaluation aspects, index evaluation rules].

Service content: the personnel roles involved in service implementation, the resources that service execution depends on, and the tangible products or valuable knowledge and information that accompany service delivery, generally represented by proper nouns.

Business activities: words related to business execution that refer to actions performed by human roles or automated mechanical systems, generally represented by verbs.

Index evaluation aspects: nouns that modify service content or business activities, generally with specific suffixes, such as XX rate, XX degree, XX effect, XX nature.

Index evaluation rules: the specific evaluation frequency and objects of an evaluation indicator, such as daily average, monthly average, annual average; or per person, per order, per case.

The main basis for judging index relationships in the present invention is three types of dictionaries: public dictionaries, domain dictionaries and self-built dictionaries. The lexical richness, the level of detail of lexical relationships and explanations, and the organizational structure of the dictionaries all affect the reliability of the calculation result. Therefore, the present invention selects the Synonym Cilin (extended version), the HowNet dictionary and the Baidu Chinese dictionary as reference public dictionaries, and the Sogou industry thesaurus and the Baidu industry thesaurus as reference domain dictionaries. The self-built dictionary contains ID, phrase, part of speech, category (one of service content, business activity, evaluation aspect, evaluation rule), synonyms, antonyms, similar words, hypernyms, hyponyms, causally related phrases, belonging/source-related phrases, usage/tool-related phrases, composition/whole-part-related phrases, execution-dependency-related phrases, etc. The information in these dictionaries is then used together to calculate the relationships between the four types of phrases.

At the semantic level, the present invention defines three major categories and nine sub-categories of correlation between indicators. The nine sub-categories are explained as follows:

I. Similarity relationships

1. Same indicator: the service content, business activities, index evaluation aspects and modifiers of the two indicators all correspond and are all highly similar in meaning, e.g. food packaging rate and food packaging efficiency.

2. Conjugate indicators: the service content and business activities are highly similar, but the index evaluation aspects are antonyms of each other, e.g. the cleanliness of the restaurant and the degree of clutter of the dining environment.

3. Superordinate-subordinate indicators: the business activities and index evaluation aspects are highly similar, but the service contents are in a superordinate-subordinate relationship (word A is a component of word B, or word A is a subcategory of word B), e.g. commodity defective rate and fresh-goods defective rate.

II. Related relationships

4. Service content related: the business activities are similar (if both exist), the index evaluation aspects are near-similar (weaker than similar), and the service contents are related to some extent. For example, the health status of the chef and the hygiene of the dishes: dishes are made by the chef, and health and hygiene are near-similar.

5. Business related: the service content is similar, the index evaluation aspects are similar, and the business activities are related to some extent. For example, the firmness of food packaging and the degree to which food is undamaged in transit: packaging is a predecessor activity of transportation, and firmness and the undamaged degree are similar.

6. Indicator correlated: there is no obvious correlation between service content and business activities, but the indicator description contains accompanying expressions such as "with XXX" or "the more XX, the more XX", which indicates that the two indicators are correlated. If their trends are consistent, the correlation is positive; otherwise it is negative. For example, the delivery time of dishes is negatively correlated with the degree of quality assurance of the dishes: the longer the delivery time, the worse the quality assurance.

III. Same-kind relationships

7. Similar evaluation aspects: the evaluation aspects are similar, but the service content and business activities are neither similar nor related, or no service content or business activity was extracted. In this case a same-kind relationship can be roughly defined, e.g. dish packaging accuracy and order accounting accuracy.

8. Similar business: the business activities are similar, but the service content and evaluation aspects are neither similar nor related, e.g. the accuracy of food packaging and the firmness of food packaging.

9. Similar service content: the service content is similar, but the business activities and evaluation aspects are neither similar nor related, e.g. commodity storage time and the proportion of finishing commodities.

The tightness of the nine types of relationships above decreases in sequence. Misjudgment of a relationship may be caused by the following: (1) effective information is missing from the index definition; (2) the coverage of the training corpus is limited, which leads to a wrong understanding of word meanings or a wrong determination of the relationship between words. When indicators fail to be related automatically in the semantic alignment stage, or an indicator is related to indicators that have nothing to do with it, the last three types of relationships are the focus of investigation. In such cases, the confidence threshold used for the relationship determination can be lowered, or the explanatory content of the indicators can be enriched, to improve the accuracy of the relationship determination. After this optimization, a semantic relationship network is obtained, in which each node represents an indicator and each edge carries a semantic relationship and its confidence.
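The nine definitions above can be read as a set of decision rules over the per-category word relationships. The following Python sketch is one possible encoding and not the method's actual implementation: the label strings ("high", "near", "antonym", "hyponym", "related", "none") and the function name are hypothetical, "none" covers both "no relationship" and "category absent", and type 6 is omitted because it depends on cue phrases in the indicator description rather than on word relationships.

```python
SIMILAR = {"high", "near"}  # highly similar or near-similar

def classify(content, business, aspect, rule="high"):
    """Return the relationship number (1-9, excluding 6), or None if no rule matches.

    Each argument is the relationship label between the two indicators'
    phrases of that category: service content, business activities,
    evaluation aspects, and evaluation rules (modifiers).
    """
    if content == business == aspect == rule == "high":
        return 1  # same indicator
    if content == "high" and business == "high" and aspect == "antonym":
        return 2  # conjugate indicators
    if business == "high" and aspect == "high" and content == "hyponym":
        return 3  # superordinate-subordinate indicators
    if content == "related" and business in SIMILAR | {"none"} and aspect in SIMILAR:
        return 4  # service content related
    if business == "related" and content in SIMILAR and aspect in SIMILAR:
        return 5  # business related
    if aspect in SIMILAR and content == "none" and business == "none":
        return 7  # similar evaluation aspects
    if business in SIMILAR and content == "none" and aspect == "none":
        return 8  # similar business
    if content in SIMILAR and business == "none" and aspect == "none":
        return 9  # similar service content
    return None
```

The rules are checked from the tightest relationship to the loosest, so a pair of indicators receives the tightest relationship it qualifies for.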

The specific implementation steps of the method for semantic alignment of multi-participant service value-quality-capability evaluation indicators oriented to domain features of the present invention are as follows:

Step 1. Evaluation index preprocessing

Statistical analysis of index content shows that rich information, such as the object, concerns and scope of an evaluation, can be determined from the service content, business activities, index evaluation aspects and index evaluation rules. The main work of the preprocessing stage is therefore to extract the keyword groups for these four types of information from each indicator. Each type yields a group of words rather than a single word because the explanation of an indicator may contain expressions such as "such as XX", "including XX" or "XX, etc.".

The input of the preprocessing stage is a sentence $S_i$ that defines and explains an indicator. The purpose of word segmentation is to extract from the sentence all the words that belong to the above four categories of keywords and to remove unnecessary stop words, giving $WG$ (the group of key words). In the part-of-speech tagging stage, important words that carry actual meaning, such as nouns, verbs, quantifiers, adverbs, adjectives and conjunctions, are identified from $WG$ and assigned to the service content phrase $WG_{services}$, the business activity phrase $WG_{business}$, the index evaluation aspect phrase $WG_{indicators}$ and the modifier phrase $WG_{adjunctword}$. The dependency and modification relationships between words of different parts of speech are obtained in the dependency parsing stage. By combining the analysis results of all evaluation indicators, four types of association can be summarized: ① which business actions are related to a given service content; ② who performs a business activity and who receives it; ③ what the specific evaluation aspects of a service content or business activity are; ④ which evaluation aspects are common (considered for most service content or business activities). In addition, dependency parsing can identify coordinated words joined by conjunctions, so that unimportant words can be further deleted.

The above preprocessing can be completed with natural language processing toolkits such as Stanford CoreNLP and language models trained on large public corpora. Taking the table turnover rate as an example, the original definition of the indicator is: [Table turnover rate; the average number of times each table in a restaurant is used in a day; the table turnover rate is an important indicator of a restaurant's profitability and is closely related to the restaurant's average daily passenger flow; table turnover rate = (number of table uses - total number of tables) ÷ total number of tables]. The four types of phrases obtained after preprocessing are as follows:

$WG_{services}$ = {restaurant, table, dining room, table};

$WG_{business}$ = {use|2, profit};

$WG_{indicators}$ = {number of times, total number of tables, passenger flow};

$WG_{adjunctword}$ = {one day, per table, daily average}.
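To make the grouping step concrete, the sketch below sorts already-tagged tokens into the four phrase types. It is a toy illustration under stated assumptions: the part-of-speech tags are taken as given (in practice they would come from a tagger), the cue lists `ASPECT_SUFFIXES` and `RULE_WORDS` are hypothetical, and words are counted so that repeated words carry a count, which is presumably what the `use|2` notation above records.

```python
from collections import Counter

ASPECT_SUFFIXES = ("rate", "ratio", "degree")             # hypothetical cue list
RULE_WORDS = {"daily", "monthly", "annual", "quarterly"}  # hypothetical cue list

def group_keywords(tagged_tokens, stop_words=frozenset()):
    """Sort (word, part-of-speech) pairs into the four phrase types with counts."""
    groups = {
        "services": Counter(),
        "business": Counter(),
        "indicators": Counter(),
        "adjunctword": Counter(),
    }
    for word, pos in tagged_tokens:
        w = word.lower()
        if w in stop_words:
            continue
        if w in RULE_WORDS:
            groups["adjunctword"][w] += 1      # evaluation rules (frequency, unit)
        elif pos == "VERB":
            groups["business"][w] += 1         # business activities
        elif pos == "NOUN" and w.endswith(ASPECT_SUFFIXES):
            groups["indicators"][w] += 1       # evaluation aspects
        elif pos == "NOUN":
            groups["services"][w] += 1         # service content
    return groups

# Hypothetical tagged tokens from an English paraphrase of an indicator definition.
tokens = [
    ("each", "DET"), ("table", "NOUN"), ("is", "AUX"), ("used", "VERB"),
    ("in", "ADP"), ("the", "DET"), ("restaurant", "NOUN"), ("daily", "ADV"),
    ("used", "VERB"), ("rate", "NOUN"),
]
groups = group_keywords(tokens)
```

Words whose tag falls outside the handled classes (determiners, auxiliaries, prepositions) are simply dropped here, which stands in for stop-word removal.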

Experiments show that, with the above processing alone, the number of words parsed from some indicators is still very large, which makes the subsequent determination of indicator relationships computationally expensive. Some rules can therefore be used to further simplify the phrases. The present invention uses the TF-IDF method to quantify the importance of each word and deletes unimportant words. This importance value is also used in the subsequent determination of indicator relationships. The calculation formulas are as follows:

$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$

$$idf_{i} = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$$

$$\text{tf-idf}_{i,j} = tf_{i,j} \times idf_{i}$$

where $n_{i,j}$ is the number of occurrences of word $i$ in indicator $j$, $n_{k,j}$ is the number of occurrences of each word $k$ in indicator $j$, $|D|$ is the total number of indicators, $|\{j : t_i \in d_j\}|$ is the number of indicators that contain the word $t_i$, $tf_{i,j}$ expresses the importance of the word within the explanation of this indicator, and $idf_i$ expresses how exclusive the word is to the explanation of this indicator.
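A minimal Python sketch of this weighting follows. It assumes the standard TF-IDF definitions, with term frequency as a word's share of all word occurrences in one indicator and inverse document frequency as the natural logarithm of the total number of indicators over the number of indicators containing the word. The logarithm base, the function name and the sample words are assumptions.

```python
import math
from collections import Counter

def tf_idf(indicator_words):
    """Compute TF-IDF per word; indicator_words maps indicator name -> list of words."""
    n_indicators = len(indicator_words)
    # Number of indicators that contain each word (document frequency).
    doc_freq = Counter()
    for words in indicator_words.values():
        doc_freq.update(set(words))
    scores = {}
    for name, words in indicator_words.items():
        counts = Counter(words)
        total = sum(counts.values())
        scores[name] = {
            w: (c / total) * math.log(n_indicators / doc_freq[w])
            for w, c in counts.items()
        }
    return scores

# Hypothetical extracted words for two indicators.
scores = tf_idf({
    "table turnover rate": ["table", "use", "use", "restaurant"],
    "food packaging rate": ["food", "package", "restaurant"],
})
```

In this example "restaurant" appears in both indicators and scores zero, so it would be a candidate for deletion, while "use" is frequent in one indicator and exclusive to it.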

Step 2. Customize other inputs

The determination of index correlation is directly affected by lexical semantic associations. Existing open dictionaries partially meet this need, but most of them only include hyponymy, synonymy, antonymy, same-kind relations and the like; related relationships are not included. The present invention summarizes the lexical semantic relationships commonly found in service evaluation indicators. Since there is currently no good method for accurately extracting these relationships from public sources, a coarse lexical semantic relationship dictionary and a user-built dictionary are used for the time being.

The semantic associations that exist between service contents are as follows:

① Hyponymy (a-kind-of): A is a kind of B; A is a hyponym of B and B is a hypernym of A, such as "ingredients" and "meat products".

② Inclusion relationship (a-part-of): A is a part of B and B contains A; A is the part and B is the whole, such as "dishes" and "drinks".

③ Same-kind relationship: A and B share a common abstract parent class in the tree of hypernym-hyponym relationships, such as "dishes" and "meat products".

④ Similar relationship (different names for the same thing): the meanings expressed by A and B are highly similar or equivalent, such as "supermarket" and "mall".

⑤ Related relationship:

Source related: A is the raw material of B, and B is processed from A, such as "dishes" and "ingredients".

Usage/tool related: A is a tool for business related to B, such as "dish" and "refrigerator".

Composition/whole-part related: A is an accessory that B must include, such as "delivery cart" and "incubator".

The semantic associations that exist between business activities are as follows:

① Timing dependency: activity A is a predecessor of activity B, and activity B is a successor of activity A, such as "packaging" and "delivery";

② Synchronization dependency: activities A and B must be synchronized at the same time or place before subsequent activities can start, otherwise one party must wait, such as "dishes are packaged" and "rider arrives at the restaurant";

③ Compensation dependency: an error in activity A triggers the execution of activity B; if activity A is correct, activity B is not executed, such as "confirmation of receipt" and "after-sales service".

The semantic associations between the evaluation aspects of the indicators are as follows:

①Synonymous relationship: A and B express the same or similar concepts, such as "correct rate" and "accuracy rate";

②Conjugate relationship: A and B express opposite concepts, such as "error rate" and "accuracy rate".

The semantic associations between the index evaluation rules are as follows:

①Conversion relationship: A and B belong to the same class of quantifiers, so they can be converted with the help of conversion formulas, such as "daily average" and "monthly average".

In addition, because different service participants define their index systems by different criteria, the quality of their self-built dictionaries also differs. Therefore, to ensure the confidence of automatic index alignment, a number of configurable parameters are exposed so that existing index relationships are not missed and incorrect index relationships are not mined. Two solutions are given here. On the one hand, the builder of the index system can configure the three similarity judgment thresholds $TH_{hs}$, $TH_{s}$ and $TH_{ls}$ and the related judgment threshold $TH_{r}$. All thresholds take values between 0 and 1; the related judgment threshold is not otherwise constrained, while the other three must satisfy $TH_{hs} > TH_{s} > TH_{ls}$. On the other hand, a "lower limit of relationship number" and an "upper limit of relationship number" can be configured, and the four thresholds are then adjusted automatically so that the number of relationships stays within these limits as far as possible.
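The two configuration options can be sketched as follows. The validation mirrors the stated constraints (values in [0, 1] and TH_hs > TH_s > TH_ls). The automatic adjustment, however, is not specified in the text, so the uniform step-wise shift of all four thresholds shown here is purely an assumption, as are the default values, the class and function names, and the `count_relations` callback.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    th_hs: float = 0.9   # hypothetical default values
    th_s: float = 0.7
    th_ls: float = 0.5
    th_r: float = 0.5

    def validate(self):
        values = (self.th_hs, self.th_s, self.th_ls, self.th_r)
        if not all(0.0 <= v <= 1.0 for v in values):
            raise ValueError("thresholds must lie in [0, 1]")
        if not (self.th_hs > self.th_s > self.th_ls):
            raise ValueError("require TH_hs > TH_s > TH_ls")
        return self

def auto_adjust(th, count_relations, lower, upper, step=0.05, max_iter=20):
    """Shift all thresholds together until the relation count is within [lower, upper].

    count_relations is a caller-supplied function: Thresholds -> number of relations.
    """
    for _ in range(max_iter):
        n = count_relations(th)
        if lower <= n <= upper:
            break
        # Too few relations: loosen (lower) thresholds; too many: tighten them.
        delta = -step if n < lower else step
        candidate = Thresholds(th.th_hs + delta, th.th_s + delta,
                               th.th_ls + delta, th.th_r + delta)
        try:
            th = candidate.validate()
        except ValueError:
            break  # cannot move further without violating the constraints
    return th

# Hypothetical similarity scores; a pair counts as a relation if it exceeds TH_ls.
sample_scores = [0.95, 0.8, 0.62, 0.42]
count = lambda t: sum(s > t.th_ls for s in sample_scores)
adjusted = auto_adjust(Thresholds().validate(), count, lower=4, upper=4)
```

Shifting all thresholds by the same amount preserves their required ordering, which is why this simple strategy was chosen for the sketch.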

Synthesizing the lexical relationships above, the present invention expresses them as the following six categories:

1. High similarity (HS): the calculated similarity between the words is greater than the similarity judgment threshold $TH_{hs}$;

2. Antonyms (AN): adjectives that are antonyms of each other in the dictionary, or whose expressed sentiment values sum to approximately 1;

3. Synonyms (SY): the calculated similarity between the words is less than the threshold $TH_{hs}$ but greater than the threshold $TH_{s}$;

4. Hyponymy (LS): nouns that are in a hypernym-hyponym relationship in the dictionary;

5. Relatedness (RE): words that are related in the dictionary (the semantic associations between service contents and between business activities described above);

6. NULL: there is neither a similar relationship nor a related relationship, or this category of words does not exist in the definition of one of the indicators.

The determination of the above semantic relationship can be obtained by calculating the position, number, identifier and dictionary structure of the word in the dictionary.
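As an illustration of the six categories, the following Python sketch assigns a category to one word pair. It assumes that the dictionary lookups have already produced a similarity score, antonym and hyponymy flags, and a relatedness score; the function name, the argument names, the order in which the checks are applied and the use of TH_r for the RE category are assumptions made here, not details given in the text.

```python
def classify_pair(similarity: float, th_hs: float, th_s: float, th_r: float,
                  is_antonym: bool = False, is_hyponym: bool = False,
                  relatedness: float = 0.0) -> str:
    """Return one of HS, AN, SY, LS, RE or NULL for a pair of words."""
    if is_antonym:              # antonym pair found in the dictionary
        return "AN"
    if similarity > th_hs:      # above the high-similarity threshold
        return "HS"
    if similarity > th_s:       # between TH_s and TH_hs
        return "SY"
    if is_hyponym:              # hypernym-hyponym pair in the dictionary
        return "LS"
    if relatedness > th_r:      # assumed use of the relatedness threshold
        return "RE"
    return "NULL"


# Usage with made-up scores:
label = classify_pair(0.8, th_hs=0.9, th_s=0.7, th_r=0.5)
```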

Step 3: Determining the relationship between indicators

First, the relationships between the four types of words are determined with the help of open public dictionaries. The experiments of the present invention use the Synonym Forest, HowNet and the Baidu Chinese Dictionary, which contain information such as word frequency, part of speech, synonyms, hypernyms, word codes and related words. Users can also build their own dictionaries as a supplement. Let the set of all indexes be I and let I_n be one of the evaluation indexes; preprocessing yields four phrases for I_n, one for each keyword type.

To determine whether a semantic relationship exists between two indicators I_n and I_m, first calculate the semantic association between their phrases of the same type k, where k ∈ {services, business, indicators, adjunctword}. The relationship between two phrases of the same type can be expressed as a matrix whose element in row i and column j is a_i,j.

Here, the index I_n contains p words and the index I_m contains q words, each word has a corresponding TF-IDF value, and the matrix size is p×q. Each element a_i,j of the matrix is a two-tuple <RelationType, Confidence> giving the relation type and the confidence between the two words, where RelationType ∈ {HS, AN, SY, LS, RE, NULL} and Confidence ∈ [0, 1].

Next, the support degree SD_r of each word semantic association type R_r is calculated: over all elements a_i,j with a_i,j.RelationType = R_r, the sum of the products of the TF-IDF values of the corresponding words w_i and w_j is the support of type R_r.

The type r_Max with the maximum value of SD_r is taken as the semantic association type of this type of phrase, and the confidence of this semantic association is the mean of the confidences of all matrix elements of that type (other statistics can also be adopted). Here, n and m denote the index I_n and the index I_m respectively, k refers to the four types of keyword groups, and num refers to the number of words with RelationType = r_Max.
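The support and confidence calculation described above can be sketched in Python as follows. The function name and the data layout (a nested list of (RelationType, Confidence) tuples plus two lists of TF-IDF values) are assumptions chosen for illustration; NULL is treated like any other type here because the text does not say otherwise.

```python
from collections import defaultdict


def phrase_relation(matrix, tfidf_n, tfidf_m):
    """matrix[i][j] is the tuple (relation_type, confidence) for word i of
    index I_n and word j of index I_m; tfidf_n and tfidf_m hold the TF-IDF
    values of those words. Returns (r_max, mean confidence, supports)."""
    support = defaultdict(float)
    confidences = defaultdict(list)
    for i, row in enumerate(matrix):
        for j, (relation, confidence) in enumerate(row):
            # Support of a type: sum of TF-IDF products over its elements.
            support[relation] += tfidf_n[i] * tfidf_m[j]
            confidences[relation].append(confidence)
    r_max = max(support, key=support.get)
    # Confidence: mean over all elements whose type is r_max.
    mean_conf = sum(confidences[r_max]) / len(confidences[r_max])
    return r_max, mean_conf, dict(support)


# Made-up 2x2 example.
example = [[("HS", 0.9), ("NULL", 0.0)],
           [("RE", 0.6), ("HS", 0.8)]]
r_max, conf, sd = phrase_relation(example, [0.5, 0.3], [0.4, 0.2])
```

Replacing the mean with another statistic, as the text allows, only changes the line that computes mean_conf.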

After the relationships of the four types of phrases have been obtained, the semantic relationship between the indicators is determined on this basis; the judgment basis is shown in Figure 4. In particular, when a same-class relationship is being determined, the calculated index semantic confidence must be compared with the similarity judgment threshold TH_ls. If it is greater than this threshold, the same-class relationship is considered to hold; otherwise the two indicators are unrelated. The reason for this comparison is that a same-class relationship requires only one type of phrase to have a high confidence, while the confidence of the other two types of phrases may be high or low. For the other six types of semantic relationship, the confidence of the three types of phrases will not be too low, so this problem does not arise.

Step 4: Optimize the relationship between indicators

In order to quantitatively analyze the effect of the semantic alignment results obtained by the technical framework, the present invention defines the following evaluation indicators:

1. Maximum node in-degree

The in-degree of a node indicates how dependent the node is within the comprehensive index evaluation system: a high in-degree means that many related variables or indicators determine or affect the value of the index. The larger the maximum node in-degree, the shallower the structural hierarchy of the index system, the lower its fault tolerance, and the lower the probability of error propagation.

2. Maximum node out degree

The out-degree of a node indicates the importance of the node within the comprehensive index evaluation system: a high out-degree means that the index determines or affects the values of multiple indicators. The larger the maximum node out-degree, the more complex and unstable the structure of the index system, and the more likely a change in one indicator is to affect the whole system.

3. Coverage

Refers to the proportion of indicators that are associated with other indicators through semantic alignment, relative to the total number of indicators. The higher the coverage, the more closely the indicators are related and the richer the semantic relationships of the indicators; conversely, a low coverage means that there are many isolated indicators and the model contains more unknowns, because in theory a systematic service evaluation index system contains no isolated indicators that neither influence nor are influenced by other indicators. v_i represents the ith node in the index semantic relation network, O(v_i) represents the out-degree of the indicator, I(v_i) represents the in-degree of the indicator, and Λ_v("Condition") represents the number of nodes that meet a certain condition. Coverage is the number of nodes with O(v_i) + I(v_i) > 0 divided by the total number of nodes.

4. Hit rate

Because the user is also allowed to define index relationships and relationship types manually in the service value-quality-capability modeling stage, these manually defined relationships are used as the deterministic set Set_certain. The hit rate is then the proportion of the index relationships in the deterministic set that are included in the index semantic relationships mined by the above method. Here, e_j represents the jth edge in the index semantic relation network, and Λ_e("Condition") represents the number of edges that meet a certain condition.

5. Error rate

Refers to the proportion of the index semantic relationships mined by the above method whose relationship type is misjudged, or which align indicators that are completely irrelevant by human judgment.

6. Novelty

Refers to the proportion of the index semantic relationships mined by the above method that do not belong to the index relationships defined manually in the modeling stage and whose relationship judgment is correct.

7. The number and average confidence of each semantic relation type found

This item only serves to analyze the alignment effect of the above method in more detail. If the proportion of similar-indicator relationships is high, the index evaluation system has high redundancy; a high proportion of index relationships means that the index system is more detailed.
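The evaluation indicators above can be computed on a small relation network, as in the following Python sketch. The network is treated as a set of directed (source, target) edges, and the denominators chosen for hit rate, error rate and novelty are assumptions, since the original formulas are not reproduced in this text; the set wrong stands for the mined relationships that a human reviewer has marked as incorrect.

```python
def degree_metrics(nodes, edges):
    """Maximum in-degree, maximum out-degree and coverage of the network."""
    out_deg = {v: 0 for v in nodes}
    in_deg = {v: 0 for v in nodes}
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    # A node is covered if it has at least one incoming or outgoing edge.
    linked = sum(1 for v in nodes if out_deg[v] + in_deg[v] > 0)
    return {"max_in": max(in_deg.values()),
            "max_out": max(out_deg.values()),
            "coverage": linked / len(nodes)}


def hit_rate(mined, certain):
    """Share of manually defined relationships that were also mined."""
    return len(mined & certain) / len(certain)


def error_rate(mined, wrong):
    """Share of mined relationships judged incorrect by a reviewer."""
    return len(mined & wrong) / len(mined)


def novelty(mined, certain, wrong):
    """Share of mined relationships that are new and judged correct."""
    return len(mined - certain - wrong) / len(mined)


# Made-up example with four indicators.
nodes = ["a", "b", "c", "d"]
mined = {("a", "b"), ("a", "c"), ("b", "c")}
certain = {("a", "b"), ("c", "d")}
wrong = {("b", "c")}
metrics = degree_metrics(nodes, mined)
```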

This method depends heavily on the lexicon and on the judgment thresholds for word semantic association, so the index semantic alignment obtained from the initial manual input may mine too few relationships or mine erroneous ones. The hit rate, error rate and novelty mentioned in the above evaluation of alignment results are all proportional to coverage: the richer the mined index relationships, the higher the hit rate, the higher the novelty, and the higher the error rate. Controlling the number of mined index relationships is therefore a starting point for optimization, and this can be done by resetting the confidence level used for semantic relationship determination.

On the other hand, the richness of the index content also affects the determination of index relationships. If the index content is too concise (the description of service content, business activities and evaluation aspects is incomplete), indicators are easily classified into the similar-indicator relationship. Therefore, if the proportion of similar-indicator relationships is high and the error rate is also high, the index content can be optimized by supplementing the index explanations.

Finally, if an irreducible error rate remains, the alignment results can only be optimized manually, by adding and deleting index relationships.

Taking Hema Xiansheng service as an example, the results of index preprocessing and semantic alignment are shown in Figures 5 and 6.

The purpose of the quantitative alignment method of the present invention is to define the space-time boundary and divide the service domain based on sample data of a known index under different space-time boundary conditions, then use kernel density estimation to fit the space-time boundary characteristic distribution of the index on the single domain and the rich domain, derive the probability distribution function from the fitted probability density function, and finally use the quantile as the benchmark to solve for the corresponding value of the index under different space-time boundary characteristics. The mapping between the specific value of an index and the actual service level is neither unique nor constant. The same index value may correspond to different service levels under different space-time boundary conditions, and different service levels may correspond to the same index value under different space-time boundary conditions. For example, the price level and average price of commodities vary significantly between regions: the same average commodity price counts as high in Harbin but low in Shanghai. Likewise, delivery efficiency and delivery time differ markedly across time, space and field. Taking the time domain as an example, an efficient delivery takes only 20 minutes during off-peak dining periods, generally about 30-40 minutes during peak dining periods, and 50-60 minutes at midnight. If the differences in the characteristic distribution of indicators across space-time boundaries are not considered, service decision-making and optimization will fail or become unbalanced. For example, if an enterprise formulates a unified nationwide commodity price adjustment strategy, low-income regions will perceive an obvious price rise while high-income regions will not feel a significant difference. With the aid of the quantitative alignment method of the present invention, the decision maker can perceive how the distribution of index values differs across space-time boundaries and formulate a reasonable enterprise decision plan according to the alignment mapping function.

The specific implementation steps of the method for aligning the quantification method of the multi-participant service value-quality-capability evaluation index oriented to spatiotemporal characteristics of the present invention are as follows:

Step 1. Definition of space-time boundary and division of service domain

Step 1.1, time domain

The time domain has natural continuity and can be described by interval numbers. The specific definition is as follows:

1. Clock trigger

[T_start, T_end]: take a certain moment in the past or the current moment as T_start, and define a specific deadline as T_end;

[T_start, T_end]_period: define fixed T_start and T_end, and define a clock period period;

[N_i, N_j]_slice: define a fixed time slice slice, starting with the N_i-th slice and ending with the N_j-th slice.

2. Event trigger

[T_E-start, T_E-end]_Event: take the occurrence of the event as T_E-start and the end of the event's influence as T_E-end, where Event is the trigger event in the time domain.

[T_E-start, T_E-start + Δt]_Event: take the occurrence of the event as T_E-start and define the duration Δt of the event's influence; in particular, Δt = 0 means that the influence of Event is abrupt.

3. Activity trigger

[-∞, T_A-start]_Activity: the time period before the activity starts at T_A-start.

[T_A-start, T_A-end]_Activity: the time period during the execution of the activity.

[T_A-start, ∞]_Activity: the time period after the activity starts at T_A-start.

[T_A-end, ∞]_Activity: the time period after the activity ends at T_A-end.
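The interval definitions of the time domain can be represented directly in code. The following Python sketch uses plain numbers (for example hours) as time stamps; the class and function names are hypothetical, and the reading of the periodic form as a window that repeats every period is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeDomain:
    """Closed interval [start, end]; either bound may be infinite."""
    start: float
    end: float
    trigger: str = "clock"

    def contains(self, t: float) -> bool:
        return self.start <= t <= self.end


def event_domain(t_e_start: float, delta_t: float) -> TimeDomain:
    """[T_E-start, T_E-start + delta_t]; delta_t = 0 is an abrupt event."""
    return TimeDomain(t_e_start, t_e_start + delta_t, "event")


def before_activity(t_a_start: float) -> TimeDomain:
    """Everything up to the start of the activity."""
    return TimeDomain(-math.inf, t_a_start, "activity")


def after_activity(t_a_end: float) -> TimeDomain:
    """Everything from the end of the activity onwards."""
    return TimeDomain(t_a_end, math.inf, "activity")


def in_periodic_window(t: float, t_start: float, t_end: float,
                       period: float) -> bool:
    """Assumed reading of [T_start, T_end]_period: the window repeats
    once per period, with t_start and t_end given within one period."""
    return t_start <= t % period <= t_end


def in_slices(t: float, slice_len: float, n_i: int, n_j: int) -> bool:
    """[N_i, N_j]_slice: t falls in slices N_i..N_j of length slice_len."""
    return n_i <= int(t // slice_len) <= n_j


# Hour 42 is 18:00 on the second day, inside a daily 17:00-19:00 window.
peak = in_periodic_window(42.0, 17.0, 19.0, 24.0)
```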

Step 1.2, Spatial Domain

To put it simply, the spatial domain is the geographic domain, which can be described in the form of set algebra. The specific definition is as follows:

1. Location: ① a geographic location with latitude and longitude attributes; ② streets, business districts, communities, etc. with proper names; ③ names of provinces and municipalities determined according to the division of national administrative regions.

2. Neighborhood: a certain geographic range determined by the location s₀ and the neighborhood radius ρ.

3. Regional attributes can be ranked by regional advantages (such as regional economic development, population density, education level, consumption index, etc.), and each region will correspond to a Rank value, thereby determining the partial order relationship.
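For locations given by latitude and longitude, membership in a neighborhood can be checked with a great-circle distance. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; the function names are hypothetical, the city coordinates are approximate values used only for illustration, and the text itself does not prescribe a distance measure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_neighborhood(point, s0, rho_km):
    """True if point lies within radius rho_km of the location s0."""
    return haversine_km(point, s0) <= rho_km


# Approximate city-centre coordinates (illustrative only).
harbin = (45.80, 126.53)
shanghai = (31.23, 121.47)
distance = haversine_km(harbin, shanghai)
```

The Rank-based ordering of regions in item 3 needs no geometry: it is a comparison of the Rank values assigned to the regions.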

Step 1.3, Generalized Domain

The generalized domain divides the service domain into several sub-domains according to certain boundary rules, highlighting the characteristics of the different sub-domains and the fusion and transition between sub-domains that accompany business optimization and service collaboration. Boundary rules can be formulated according to the industry field, the service content and nature, and the technology platform on which service execution depends. The traditional definition of service boundaries is limited to the management boundaries between autonomous organizations, and other boundaries are treated as the separation of technology platforms and service content caused by organizational boundaries. However, with the promotion and popularization of SaaS cloud platforms, organizational boundaries are no longer sufficient to fully describe service boundaries. Richer service boundaries need to be defined to provide a basis for judgment in service collaboration and integration.

Step 2. Fitting the single-domain/rich-domain distribution characteristics of the index

Generally, we cannot know the distribution type of the sample data in advance, nor how many peaks the distribution curve has, so a general parametric estimation scheme is not used. Instead, "gau" is selected as the kernel function and "scott" as the bandwidth calculation function, the sample data DateSet_d' under a certain service domain d' is taken as input, and the KDEUnivariate function is used to fit the probability density function pdf_d' and the probability distribution function cdf_d' of the indicator on that service domain. Taking the refund and change fee standards of the three major domestic airlines as an example, Figure 7 shows the single-domain distribution characteristics of the indicators in the three dimensions of cabin class, departure time and airline, and Figure 8 shows the rich-domain distribution characteristics of the indicators over cabin class, airline and departure time. It can be seen that the distributions of the indicators differ markedly across domains.
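To make the fitting step concrete without depending on a particular library, the following Python sketch implements a Gaussian kernel density estimate by hand. It is not the KDEUnivariate implementation named above: the bandwidth uses a simplified form of Scott's rule based only on the sample standard deviation (library implementations may use a more robust spread estimate), and the function names and sample values are made up for illustration.

```python
import math
import statistics


def scott_bandwidth(samples):
    """Simplified Scott's rule: 1.059 * std * n^(-1/5)."""
    return 1.059 * statistics.stdev(samples) * len(samples) ** (-0.2)


def fit_kde(samples):
    """Return (pdf, cdf) of a Gaussian KDE fitted to the samples."""
    h = scott_bandwidth(samples)
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Average of Gaussian bumps centred on each sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2)
                          for s in samples)

    def cdf(x):
        # Each Gaussian bump integrates to a normal CDF.
        return sum(0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0))))
                   for s in samples) / n

    return pdf, cdf


# Made-up sample of an indicator in one service domain.
pdf, cdf = fit_kde([20.0, 25.0, 30.0, 35.0, 40.0])
```

Because the cumulative function is available in closed form for a Gaussian kernel, no numerical integration of the density is needed in this sketch.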

Step 3. Calculate the alignment relationship of the indicators in terms of quantitative methods

On the basis of step 2, the characteristic distributions of the indicators in the service domains of different space-time boundaries have been obtained. Next, these distribution functions are used to establish the correspondence between index values in different space-time boundaries. In the present invention, the quantile α is used as the alignment reference. Suppose that the indicator I presents two distributions cdf(I_a) and cdf(I_b) on the two service domains a and b. With α ∈ [0,1] as the independent variable, each quantile α' corresponds to two index values i'_a and i'_b, so that the correspondence between the index values on the two service domains can be established, as shown in Figure 9. In the same way, the alignment of indicators across multiple space-time boundaries is also established on the basis of quantiles. The service level can be converted into a number in [0, 1], from which the specific index value corresponding to a given service level under different space-time boundary conditions can be obtained.
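The quantile-based alignment can be sketched in Python as follows: a value on domain a is converted to its quantile with the distribution of domain a, and the value with the same quantile is then looked up on domain b. The Gaussian KDE with a simplified Scott bandwidth, the bisection search, the search bracket and the sample delivery times are all assumptions made for this illustration.

```python
import math
import statistics


def make_cdf(samples):
    """CDF of a Gaussian KDE with a simplified Scott bandwidth."""
    h = 1.059 * statistics.stdev(samples) * len(samples) ** (-0.2)

    def cdf(x):
        return sum(0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0))))
                   for s in samples) / len(samples)

    return cdf


def quantile(cdf, alpha, lo, hi, tol=1e-6):
    """Invert a monotone CDF on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def align(value_a, samples_a, samples_b):
    """Map an index value on domain a to the value on domain b that
    has the same quantile, i.e. expresses the same service level."""
    alpha = make_cdf(samples_a)(value_a)
    span = max(samples_b) - min(samples_b)
    # Wide bracket so that tail quantiles are still found.
    lo, hi = min(samples_b) - 5 * span, max(samples_b) + 5 * span
    return quantile(make_cdf(samples_b), alpha, lo, hi)


# Made-up delivery times in minutes for two time domains.
off_peak = [15.0, 18.0, 20.0, 22.0, 25.0]
peak = [30.0, 33.0, 35.0, 37.0, 40.0]
aligned = align(20.0, off_peak, peak)
```

In this example the peak sample is the off-peak sample shifted by 15 minutes, so a 20-minute off-peak delivery aligns with a delivery of about 35 minutes at peak time.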

Matters not addressed in the present invention are known in the art.

Claims

A multi-party service value-quality-capability index alignment method oriented to the space-time boundary, characterized in that the method comprises the following steps:

Step 1: Extract keyword groups including service content, business activities, index evaluation aspects and index evaluation rules from the value-quality-capability evaluation index definition;

Step 2: According to the public dictionary, the domain dictionary and the self-built dictionary, calculate, for each pair of indicators, the morpheme relationships of the four types of keyword groups respectively, and obtain the semantic similarity matrix between the indicators;

Step 3: Determine the semantic relationship between the indicators with the help of the semantic similarity matrix, and calculate the relationship confidence;

Step 4: Determine the semantic relationship of all indicators according to step 3 to obtain a semantic relationship network, delete redundant edges according to the direction and number of semantic relationships between indicators, and simplify the semantic network;

Step 5: Fitting the distribution characteristics of the index on the single domain and the rich domain according to the sample data of the index in different space-time boundaries;

Step 6: Use the probability quantile as a reference to establish an alignment relationship in terms of index quantification.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, wherein in the step 1, the index definition includes index name, abbreviation/idiom, English abbreviation, index explanation, superiority direction, dimension, value range, and calculation formula.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, wherein in the step 1, the keyword group specifically refers to: ① service content; ② business activity; ③ evaluation side; ④ evaluation rule.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, characterized in that in the step 2, the public dictionary includes the Baidu Chinese Dictionary, HowNet and the Synonym Forest (extended version); the domain dictionary includes the Sogou industry thesaurus and the Baidu industry thesaurus; and the definitions of phrases in the self-built dictionary include several of: ID, phrase, part of speech, described category, synonyms, antonyms, similar words, hypernyms, hyponyms, causally related phrases, belonging/source-related phrases, usage/tool-related phrases, composition/whole-part-related phrases, and execution-dependency-related phrases.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, characterized in that in the step 2, the morpheme relationship includes four types: sameness, similarity, correlation, and same type.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, wherein in the step 3, the semantic relationship includes:

Similar relationship: ①same index; ②conjugate index; ③superior index;

Relevant relationship: ④Service content related; ⑤Business related; ⑥Index related;

Similar indicators: ⑦Similar service evaluation side; ⑧Similar business; ⑨Similar service content.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, characterized in that in the step 3, the relationship confidence is calculated as the mean of the confidences of all matrix elements whose RelationType is r_Max, where n and m represent the index I_n and the index I_m respectively, k refers to the four types of keyword groups, and num refers to the number of words with RelationType = r_Max.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, characterized in that in the step 4, the semantic relation network refers to a network with indexes as nodes and the semantic relationships between indexes as edges, the edge attributes are the semantic relationship type and the confidence, and the edges include both directed and undirected edges.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, characterized in that in the step 5, time refers to different time domains, space refers to different geographical domains, and boundary refers to different service implementation environments, different service implementation platforms, or different service participants; single-domain distribution characteristics refer to the probability distribution characteristics of indicators in one service domain, and rich-domain distribution characteristics refer to the probability distribution characteristics of indicators in two or more service domains.
The multi-party service value-quality-capability index alignment method oriented to the space-time boundary according to claim 1, characterized in that in the step 6, the alignment relationship in terms of index quantification means solving for the value range of the index that corresponds to a certain class of service level under different space-time boundary characteristics, or determining the service level to which an index value under a specific space-time boundary is mapped.