CN113641877B - Intelligent comparison method for relay protection fixed values - Google Patents

Publication number
CN113641877B
Authority
CN
China
Prior art keywords
constant value
word
dictionary
name
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110941813.0A
Other languages
Chinese (zh)
Other versions
CN113641877A (en)
Inventor
戴志辉
方伟
李金铄
耿宏贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University
Priority to CN202110941813.0A
Publication of CN113641877A
Application granted
Publication of CN113641877B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention belongs to the technical field of safe and stable power grid operation and discloses an intelligent comparison method for relay protection fixed values based on Chinese word segmentation. First, Chinese word segmentation is used to process the text of relay protection device fixed-value names; the naming rules of relay protection devices from different manufacturers and of different types are analyzed and sorted out, and a dictionary of fixed-value names is built. Then, on the basis of the dictionary, an improved forward maximum matching algorithm segments the fixed-value names on the fixed-value list and on the operating equipment, yielding a segmentation result and a word array for each name. The word arrays are compared to find the operating-device fixed-value item that matches each item on the fixed-value list, and the values of the matched items are compared. Finally, for the small number of special synonyms that remain, a sequence similarity calculation and an anti-mismatch comparison mechanism are introduced, further improving accuracy and coverage. The effectiveness of the method is verified by analysis of a worked example.

Description

Intelligent comparison method for relay protection fixed values
Technical Field
The invention belongs to the technical field of safe and stable operation of power grids and relates to an intelligent comparison method for relay protection fixed values, in particular to an intelligent comparison method for relay protection device fixed values based on Chinese word segmentation.
Background
Reasonable and accurate relay protection fixed values are essential for ensuring safe and stable operation of the power grid and for making full use of the relay protection device. As the operating mode of the power grid changes, the protection fixed values of a device are changed through the processes of setting, checking, comparison, and downloading to the device. Many organizations at home and abroad have carried out targeted research and proposed online setting and online checking systems, which have greatly improved the operating efficiency of relay protection systems. Because the correctness of the operating fixed values in a relay protection device directly affects the operating mode of the power grid and is important for its safe and stable operation, substations of 220 kV and above are required to carry out a complete comparison between the actual operating fixed values of all station equipment and the latest dispatching fixed-value list every half year, and substations of 110 kV and below are required to do so once a year. In addition, the comparison must be carried out again before equipment is operated and before new equipment is put into service. However, little research has addressed relay protection fixed-value comparison itself, and the comparison is still completed manually by paper, telephone, fax, mail, and similar means [8]. This approach has the following disadvantages:
(1) Operation, maintenance, and repair personnel must print the latest fixed-value list of the protection device from the dispatching system, carry it to the substation site, and read the operating fixed values through the human-machine interface of the protection device to compare them item by item. The workload of checking the fixed values of existing operating equipment is therefore large, while the capacity of the available personnel is limited;
(2) Relay protection devices from different manufacturers and of different types display numerical items, control words, and soft pressing plate function modules in different ways, which easily leads to missing or overlooked items;
(3) The original fixed-value data come from many sources, and the display window and operating flexibility of the equipment human-machine interface are limited, which makes on-site checking more difficult and lowers its accuracy.
In view of this situation, organizations at home and abroad have proposed several solutions. One is an online intelligent comparison method for relay protection fixed values based on a hybrid professional dictionary, which matches the names of setting values and operating values using a forward maximum matching algorithm together with the Jaccard similarity of unknown-word sequences; however, the comparison takes too long, and names that differ because of synonymous technical terms cannot be matched. An online fixed-value comparison method based on multi-source data and a fixed-value curing system based on an expert system have been used to help guarantee safe and stable operation of the smart grid; because that system relies on fuzzy matching, high accuracy cannot be ensured and comparison errors remain possible. A fixed-value reading and comparison technique based on AR text recognition exploits the structure of the table and does not separate the fixed-value name from its data during acquisition, but it requires a paper fixed-value list in table format and is therefore poorly compatible with lists in other formats. Another approach obtains fixed-value parameters through the printing port of the relay protection equipment: the operating-area fixed values, soft pressing plates, and related print data are downloaded from the device's printing interface, and the fixed-value file in the system database is compared with the print data by item-by-item search and fuzzy matching. This method avoids interfering with the protection equipment and is safer, but low-voltage protection equipment has no printing interface, so its applicability is limited.
Disclosure of Invention
The invention aims to provide an intelligent comparison method for relay protection fixed values, characterized in that the method is based on Chinese word segmentation. First, Chinese word segmentation is used to process the text of relay protection device fixed-value names; the naming rules of relay protection devices from different manufacturers and of different types are analyzed and sorted out, and a dictionary of fixed-value names is built. Then, on the basis of the dictionary, an improved forward maximum matching algorithm segments the fixed-value names on the fixed-value list and on the operating equipment, yielding a segmentation result and a word array for each name. Finally, the word arrays are compared to find the operating-device fixed-value item that matches each item on the fixed-value list, and the values of the matched items are compared; if they differ, a warning is issued and a fixed-value download request is raised.
The method specifically comprises the following steps:
1. Mechanical word segmentation algorithm based on character string matching
Given the actual circumstances of fixed-value name matching, the naming rules of most protection fixed-value items differ from everyday language, so segmentation methods based on understanding and semantics cannot be applied to fixed-value name strings. Statistics-based segmentation, in turn, carries considerable uncertainty that can reduce segmentation accuracy and cause name matching to fail or to match falsely. The fixed-value names are therefore standardized and unified, and then segmented to facilitate subsequent name matching.
2. Relay protection fixed value name dictionary establishment
Relay protection fixed-value names are composed of various relay protection terms, so a comprehensive fixed-value name dictionary must be built from these terms before a mechanical algorithm based on character string matching can be applied. To this end, the naming rules of relay protection devices from different manufacturers and of different types are comprehensively sorted out and analyzed, the traditional mechanical segmentation dictionary mechanism is improved, and a relay protection fixed-value name dictionary suited to fixed-value comparison is established, which effectively improves the efficiency of subsequent name matching.
3. Improved dictionary lookup mechanism
Traditional mechanical word segmentation applies a forward maximum matching algorithm to the string to be segmented according to the established dictionary mechanism, but this still cannot meet the requirements of fixed-value comparison. The dictionary lookup mechanism is therefore improved as follows:
3.1 Improved forward maximum matching principle
Taking into account the naming rules of relay protection fixed-value items, the improved forward maximum matching algorithm processes a fixed-value item name $S$ to be segmented as follows. It first takes the first character $A_1$ of $S$ and computes its hash value $H_1$. Using $H_1$, it looks up the position of $A_1$ in the first-character Hash index table and reads the maximum word length $B_1$ stored in item b of that table cell. It then takes from $S$, in the forward direction, a substring $T_1$ of length $B_1$ and searches the fixed-value item name dictionary for it. If the dictionary contains an entry identical to $T_1$, the match succeeds: $T_1$ is cut out as an independent word and a dictionary position number $a_1$ is output. If the entry has no synonym flag, $a_1$ is the position number of $T_1$ itself; if it has a synonym flag, $a_1$ is the position number of the standard synonym. If no entry is identical to $T_1$, the match fails; the last character of $T_1$ is removed (the word-reduction method) and the dictionary is searched again. This is repeated until the substring to be segmented has length 1, in which case $A_1$ is output as a single character, completing one round of segmentation. The next word is then segmented in the same way until $S$ has been fully segmented. Finally, a normalized word sequence with segmentation marks and the corresponding number sequence are output.
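The procedure in 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy dictionary, its position numbers, the `Entry` class, and the handling of unknown characters (number `None`) are hypothetical and not taken from the patent, and a plain Python `dict` stands in for the first-character Hash index table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    number: int          # dictionary position number
    synonym_of: int = 0  # 0: standard form; otherwise number of the standard form


# Hypothetical toy dictionary; words and numbers are illustrative only.
DICTIONARY = {
    "过流": Entry(1),                  # overcurrent (standard form)
    "过电流": Entry(2, synonym_of=1),  # overcurrent (variant form)
    "保护": Entry(3),                  # protection
    "I段": Entry(4),                   # section I (standard form)
    "1段": Entry(5, synonym_of=4),     # section 1 (variant form)
    "电流": Entry(6),                  # current
    "定值": Entry(7),                  # fixed value
}

# Stand-in for the first-character index table: longest word per first character.
MAX_LEN = {}
for _word in DICTIONARY:
    MAX_LEN[_word[0]] = max(MAX_LEN.get(_word[0], 1), len(_word))

# Reverse lookup used to output the normalized (standard) word.
BY_NUMBER = {entry.number: word for word, entry in DICTIONARY.items()}


def segment(name):
    """Return (normalized words, number sequence) for one fixed-value name."""
    words, numbers = [], []
    i = 0
    while i < len(name):
        length = min(MAX_LEN.get(name[i], 1), len(name) - i)
        # Word-reduction step: drop the last character until a dictionary hit.
        while length > 1 and name[i:i + length] not in DICTIONARY:
            length -= 1
        piece = name[i:i + length]
        entry = DICTIONARY.get(piece)
        if entry is None:
            # Unknown single character: kept as-is with no number (assumption).
            words.append(piece)
            numbers.append(None)
        else:
            standard = entry.synonym_of or entry.number
            words.append(BY_NUMBER[standard])
            numbers.append(standard)
        i += length
    return words, numbers


variant = segment("过电流保护1段电流定值")
standard = segment("过流保护I段电流定值")
print("/".join(variant[0]), variant[1])
```

Both names normalize to the same word and number sequences, which is the property the later matching step relies on.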
3.2 Improved forward maximum matching principle
Based on the improved fixed-value item name dictionary, strings are segmented with the improved forward maximum matching algorithm. To fit the subsequent fixed-value name matching mechanism and make it more efficient, the output sequence is adjusted according to the part-of-speech flags of the entries in the dictionary: if the fixed-value name contains serial-number words, those words are moved to the beginning of the segmented word sequence, and the number sequence is adjusted accordingly.
4. Constant value name matching based on word segmentation
Because the improved dictionary mechanism and the improved segmentation algorithm resolve most synonym problems in fixed-value names, the output sequences of a fixed-value list item and the corresponding operating fixed-value item then have a similarity of 1, and the items can be compared and downloaded directly once matched. A very small number of synonymous fixed-value names still cannot be matched completely because of gaps in the dictionary mechanism; to handle these, a method combining sequence similarity calculation with an anti-mismatch comparison mechanism is proposed.
4.1 Number sequence similarity calculation
Measures of similarity between two number sequences fall mainly into two classes: sequence position indexes and sequence numerical indexes. Given the linguistic logic of relay protection fixed-value item names and the reordering of the output sequences in step 3.2, the similarity of two number sequences is measured here with a sequence numerical index. The similarity between the string on the fixed-value list and each target string on the operating equipment is calculated in turn, giving the sequence of fixed-value items that meet a given similarity threshold, arranged from highest to lowest similarity. If no similarity in this sequence reaches 1, matching on the name alone fails, because items whose number sequences are merely similar may interfere and produce a false match. Since the probability that two fixed-value items have exactly the same numerical value and unit is extremely low, an anti-mismatch comparison mechanism is introduced to improve the accuracy of the comparison.
4.2 Anti-mismatch mechanism
As described in step 4.1, fixed-value item matching fails when only items with similar number sequences are found. Since the probability that two fixed-value items have exactly the same numerical value and unit is extremely low, an anti-mismatch comparison mechanism is introduced to improve the accuracy of the comparison. The item with the highest similarity is taken for fixed-value comparison, which determines whether the segmentation-based name matching has succeeded.
4.3 Intelligent comparison flow of relay protection fixed values
The intelligent comparison flow compares the actual operating fixed values of all station equipment with the latest dispatching fixed-value list. First, the fixed-value list items and the operating-device fixed-value items are segmented, translated, and reordered with the improved forward maximum matching algorithm, and the corresponding word sequences are output. The two groups of word sequences are then traversed to check whether they are identical. If they are, the corresponding fixed values are compared. If they are not, the similarity of the two word sequences is calculated and the anti-mismatch comparison mechanism is used to decide whether they satisfy the matching condition: if not, the two items do not match; if so, the corresponding fixed values are compared.
The key idea in step 1 of applying the mechanical word segmentation algorithm based on character string matching to fixed-value name matching is as follows. The naming standards and habits for protection fixed-value item names used by operating-equipment manufacturers and by the dispatching master station are comprehensively sorted out and analyzed, and a fixed-value name dictionary is built from them. Without using grammatical knowledge or statistical information, the fixed-value name string to be segmented is matched one by one, according to a certain strategy, against the entries stored in the dictionary. If an entry identical to the string is found, the match succeeds; otherwise the string is cut again with one character added or removed, and the dictionary is searched again. The mechanical word segmentation algorithm based on character string matching can therefore serve as the optimal segmentation algorithm for fixed-value item name matching.
The step 2 specifically includes:
2.1 Relay protection fixed-value name composition analysis
A relay protection fixed-value name may consist of four parts: the station name, the primary equipment name, the protection device model, and the fixed-value item name. During matching, the parts are matched in that order.
(1) Station name: typically of the form "voltage class + place name + substation/plant/power plant" or "place name + substation/plant/power plant". When the dictionary is built, the three words "substation", "plant", and "power plant" are stored as fixed words, and voltage levels and place names are stored according to the actual place of application.
(2) Primary equipment name: typically of the form "place name + line" or "# + number + main transformer/transformer" (the number is usually no greater than 3 and may be stored in common with the numbers of the protection device model below). When the dictionary is built, "line", "transformer", "main transformer", and "#" are stored as fixed words, and place names are stored according to the actual place of application.
(3) Protection device model: typically of the form "English letters + numbers". When the dictionary is built, the 26 English letters in upper and lower case and the numbers 0-10 are stored as fixed words.
(4) Fixed-value item name: generally composed of various relay protection terms, which are split and stored in the dictionary when it is built. Because operating-equipment manufacturers and the dispatching master station follow different naming standards and habits for protection fixed-value item names, three classes of fixed-value item matching failure arise:
The class 3 problem is matching failure caused by different word order. Because Chinese allows varied expression, the order of several modifiers is often not fixed, for example "overcurrent protection section II current fixed value" versus "section II overcurrent protection current fixed value".
2.2 Fixed-value name dictionary construction
Based on the characteristics of fixed-value item names described in step 2.1, the original mechanical segmentation dictionary mechanism is improved. The improved fixed-value item dictionary has a three-layer structure (as shown in FIG. 1):
(1) First-character Hash index table: each cell mainly contains the following. a: the first character, stored at the position given by the hash value of the first character of the fixed-value item name, as in formula (1):
$$\mathrm{offset} = (c_1 - \mathtt{0xB0}) \times 94 + (c_2 - \mathtt{0xA1}) \qquad (1)$$
In formula (1), offset is the position of the first character in the Hash table, $c_1$ and $c_2$ are the high and low bytes of the internal code of the first character, and 0xB0 and 0xA1 are the starting high byte and low byte of the Chinese character encoding. b: the maximum word length, that is, the length of the longest dictionary word that begins with this character. c: the first-entry pointer, which points to the position of the next-level word index table.
(2) Word index table: each cell mainly contains the following. a: all word lengths, that is, the lengths of all dictionary words that begin with this character. b: the dictionary text pointer, which points to the position in the dictionary text of the words with this first character and this word length.
(3) Dictionary text: each cell mainly contains the following. a: the entry, which is professional vocabulary related to fixed-value names, including Chinese words, English words, and serial-number words. b: the dictionary position number of the entry, that is, its sequence number in the dictionary. c: the synonym flag, where "0" indicates that the entry is the standard form among its synonyms or has no synonym, and a non-zero integer indicates that the entry is not the standard form, the integer being the dictionary position number of the corresponding standard form. d: the part-of-speech flag, where "1" indicates a serial-number word or part of a compound word, such as "I", "II", "III", and "section" in a fixed-value name; "2" indicates a single structural noun, such as "current", "voltage", and "protection"; and "3" indicates a special term, such as "instantaneous current quick-break protection", "time-limited current quick-break protection", and "time-limited overcurrent protection". Synonyms of special terms are handled later, during dictionary lookup.
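As a rough illustration of the three-layer structure, the sketch below computes the offset of formula (1) and builds a small in-memory index. It assumes the internal code is GB2312, which is consistent with the constants 0xB0, 0xA1, and 94 but is not stated in the text; the class and function names (`DictEntry`, `HeadCell`, `build_index`, `first_char_offset`) are hypothetical, and the sketch handles only two-byte Chinese first characters.

```python
from dataclasses import dataclass, field


def first_char_offset(ch: str) -> int:
    """Formula (1), assuming the two bytes are the GB2312 code of the character."""
    c1, c2 = ch.encode("gb2312")  # high byte, low byte
    return (c1 - 0xB0) * 94 + (c2 - 0xA1)


@dataclass
class DictEntry:
    """Layer 3: one cell of the dictionary text."""
    word: str
    number: int            # dictionary position number
    synonym_flag: int = 0  # 0: standard form; otherwise number of the standard form
    pos_flag: int = 2      # 1: serial-number/compound, 2: single noun, 3: special term


@dataclass
class HeadCell:
    """Layer 1 cell; `by_length` plays the role of the layer 2 word index table."""
    first_char: str
    max_len: int = 1
    by_length: dict = field(default_factory=dict)  # word length -> list of DictEntry


def build_index(entries):
    """Group entries by first-character offset, then by word length."""
    table = {}
    for entry in entries:
        key = first_char_offset(entry.word[0])
        cell = table.setdefault(key, HeadCell(entry.word[0]))
        cell.max_len = max(cell.max_len, len(entry.word))
        cell.by_length.setdefault(len(entry.word), []).append(entry)
    return table


# Hypothetical entries: two written forms of "overcurrent".
table = build_index([
    DictEntry("过流", 1),
    DictEntry("过电流", 2, synonym_flag=1),
])
cell = table[first_char_offset("过")]
print(cell.max_len, sorted(cell.by_length))
```

Storing the maximum word length in the first-character cell lets the lookup start from the longest candidate that can possibly match, instead of from a global maximum.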
The three classes of fixed-value item matching failure in step 2.1 (4) include:
The class 1 problem is matching failure caused by synonyms, which can be divided into:
1) Serial-number synonyms with different forms: in "overcurrent protection section I" and "overcurrent protection section 1", "section I" and "section 1" are serial-number synonyms written differently;
2) Chinese synonyms with different written forms, such as the two Chinese forms that are both rendered "overcurrent" and the two that are both rendered "start";
3) English synonyms with different forms, such as "TV" and "PT".
The class 2 problem is matching failure caused by technical terms: "overcurrent protection section I" and "instantaneous current quick-break protection", "overcurrent protection section II" and "time-limited current quick-break protection", and "overcurrent protection section III" and "time-limited overcurrent protection".
In step 3.2, the whole output word sequence is first traversed, using the part-of-speech flags of the entries in the fixed-value item name dictionary, to find whether any serial-number word is present. If one is found, the part-of-speech flag of the dictionary text cell of the word that follows it is checked; if that flag is "1", the two words are combined into a compound word and placed at the first position. After this processing, the single words (entries with part-of-speech flag "2") keep their original order.
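A minimal sketch of this reordering, under simplifying assumptions: each token is a hypothetical `(word, number, pos_flag)` tuple, the position numbers are invented for illustration, and the compound word is represented simply by keeping adjacent flag-"1" tokens together at the front (a stable partition) rather than by merging them into one token.

```python
def reorder(tokens):
    """Move serial-number/compound tokens (pos_flag == 1) to the front.

    tokens: list of (word, number, pos_flag) in segmentation order.
    The relative order inside each group is preserved.
    """
    serial = [t for t in tokens if t[2] == 1]  # e.g. "II" followed by "section"
    others = [t for t in tokens if t[2] != 1]  # single words keep their order
    return serial + others


# Hypothetical tokens for the two word orders quoted in step 2.1.
name_a = [("overcurrent", 1, 2), ("protection", 3, 2), ("II", 8, 1),
          ("section", 9, 1), ("current", 6, 2), ("fixed value", 7, 2)]
name_b = [("II", 8, 1), ("section", 9, 1), ("overcurrent", 1, 2),
          ("protection", 3, 2), ("current", 6, 2), ("fixed value", 7, 2)]

numbers_a = [number for _, number, _ in reorder(name_a)]
numbers_b = [number for _, number, _ in reorder(name_b)]
print(numbers_a == numbers_b)
```

After reordering, the two differently ordered names yield the same number sequence, so they can be compared directly.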
Special terms, that is, entries with part-of-speech flag "3" such as "instantaneous current quick-break protection", "time-limited current quick-break protection", and "time-limited overcurrent protection", make up only a small proportion of the vocabulary and are therefore stored in a dedicated area of the hybrid dictionary. If a string contains a special term, the standard synonym sequence is output according to the dictionary position number of that term. For example, "instantaneous current quick-break protection", at dictionary position number "215", is translated into the segmentation output sequence "198,24,47,10" of "section I overcurrent protection"; "time-limited current quick-break protection", at dictionary position number "216", is translated into the sequence "29,24,47,10" of "section II overcurrent protection"; and "time-limited overcurrent protection", at dictionary position number "217", is translated into the sequence "90,24,47,10" of "section III overcurrent protection".
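The translation of special terms amounts to a table lookup. The sketch below uses the three position numbers and output sequences quoted above; the names `SPECIAL_TERMS` and `expand_special` are hypothetical, and the sketch does not repeat the serial-number reordering of step 3.2, which would still be needed if a special term appeared in the middle of a name.

```python
# Position number of a special term -> standard synonym number sequence.
# The three rows are the examples given in the text.
SPECIAL_TERMS = {
    215: [198, 24, 47, 10],  # instantaneous current quick-break -> section I overcurrent protection
    216: [29, 24, 47, 10],   # time-limited current quick-break -> section II overcurrent protection
    217: [90, 24, 47, 10],   # time-limited overcurrent -> section III overcurrent protection
}


def expand_special(numbers):
    """Replace each special-term number by its standard synonym sequence."""
    expanded = []
    for number in numbers:
        expanded.extend(SPECIAL_TERMS.get(number, [number]))
    return expanded


print(expand_special([215]))
```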
In step 4.1, the calculation of number sequence similarity distinguishes four cases: (1) comparison of a sequence containing a serial-number word with a sequence containing none; (2) comparison of two sequences containing different serial-number words; (3) comparison of two sequences neither of which contains a serial-number word; (4) comparison of two sequences containing the same serial-number word. In the first two cases the similarity of the two sequences is 0. In the last two cases the dot product ratio of sequence A and sequence B is used as the sequence numerical index, as given by formula (2) and formula (3).
[Formula (2): dot product ratio DPR(A, B); equation image not reproduced]
[Formula (3): normalized dot product ratio NDPR(A, B); equation image not reproduced]
In formula (2), DPR(A, B) denotes the dot product ratio of sequence A and sequence B; n is the length of the sequence; the formula is built from the sum of the products of the elements at corresponding positions of the two sequences and the sum of the squares of the sequence elements; i is the index of a sequence element and takes the values 0, 1, …, n-1. The larger the dot product ratio, the more similar the two sequences. In formula (3), NDPR(A, B) denotes the normalized dot product ratio, with value range [0, 1]. For example, for "overcurrent section I protection" and "current section I protection", the output sequences after segmentation are "198,24,47,10" and "198,24,27,10" respectively, and the calculated sequence similarity is 0.9932.
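The four-case rule can be sketched as follows. Because formulas (2) and (3) are not available in this text, the sketch does not implement the patent's NDPR: it plugs in cosine similarity as a stand-in numerical index, so it is not expected to reproduce the value 0.9932 quoted above. The function names, the zero-padding of unequal-length sequences, and the set `SERIAL_NUMBERS` are all assumptions for illustration.

```python
import math


def serial_prefix(seq, serial_numbers):
    """Leading run of serial-number codes (step 3.2 places them first)."""
    prefix = []
    for number in seq:
        if number not in serial_numbers:
            break
        prefix.append(number)
    return tuple(prefix)


def cosine(a, b):
    """Stand-in numerical index (NOT the patent's NDPR); pads with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity(a, b, serial_numbers, index=cosine):
    """Cases (1) and (2) give 0; cases (3) and (4) use the numerical index."""
    if serial_prefix(a, serial_numbers) != serial_prefix(b, serial_numbers):
        return 0.0
    return index(a, b)


# Hypothetical set of serial-number codes, using the leading numbers
# of the three example sequences in the text.
SERIAL_NUMBERS = {198, 29, 90}

same_serial = similarity([198, 24, 47, 10], [198, 24, 27, 10], SERIAL_NUMBERS)
other_serial = similarity([198, 24, 47, 10], [29, 24, 47, 10], SERIAL_NUMBERS)
print(round(same_serial, 4), other_serial)
```

Passing the index as a parameter keeps the case analysis separate from the numerical formula, so the actual NDPR could be substituted once its definition is at hand.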
In step 4.1, for the sequence of fixed-value items that meet the similarity threshold, arranged from highest to lowest similarity, the item whose name has the highest similarity is taken first and its fixed value is compared. If the numerical value and unit are consistent, the name is judged to be matched successfully, the operating-equipment value agrees with the setting value, and no change is needed. Otherwise the next item in the sequence is compared in the same way: if its value and unit are consistent, its name is judged to be the one corresponding to the name on the fixed-value list, and the item needs no change; otherwise the comparison moves on to the following item. If no item in the whole sequence has a consistent value, the item with the highest similarity whose unit is consistent is judged to be the item corresponding to the one on the fixed-value list, and its value is judged to be wrong; alarm information is issued, the fixed value is downloaded after the difference has been confirmed, and after the download the operating fixed value is read back and compared again.
In step 4.2, the item with the highest similarity is taken for fixed-value comparison to judge whether the segmentation-based name matching has succeeded. If both the numerical value and the unit are consistent, the name is judged to be matched successfully and the value agrees with the setting value; no further action is needed. If the units are consistent but the values are not, there are two possibilities: (1) the name has been matched successfully but the operating fixed value differs from the setting value, so alarm information must be issued and the fixed value downloaded; (2) the name matching has failed. If the units are inconsistent, the name matching has failed.
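The decision logic of steps 4.1 and 4.2 can be summarized in a short sketch. It is one possible reading of the text, not a definitive implementation: the `Item` class, the status strings, and the default `threshold` of 0.9 are hypothetical (the text only speaks of "a given similarity threshold").

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    value: float
    unit: str


def check(sheet_item, candidates, threshold=0.9):
    """Anti-mismatch comparison for one fixed-value list item.

    candidates: list of (similarity, Item) read from the operating device.
    Returns (status, matched item or None).
    """
    ranked = sorted(
        (c for c in candidates if c[0] >= threshold),
        key=lambda c: c[0],
        reverse=True,
    )
    # Pass 1: highest similarity first, accept the first item whose value
    # and unit both agree with the fixed-value list.
    for _, item in ranked:
        if item.unit == sheet_item.unit and item.value == sheet_item.value:
            return "consistent", item
    # Pass 2: no value agrees; the most similar item with the same unit is
    # taken as the counterpart and its value is reported as wrong.
    for _, item in ranked:
        if item.unit == sheet_item.unit:
            return "alarm", item
    return "unmatched", None


sheet = Item("section I overcurrent protection current", 5.0, "A")
status, matched = check(sheet, [
    (0.97, Item("a", 4.0, "A")),
    (0.99, Item("b", 6.0, "A")),
    (0.95, Item("c", 5.0, "A")),
])
print(status, matched.name)
```

An "alarm" result corresponds to the case where alarm information is issued and the fixed value is downloaded after the difference has been confirmed.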
The method is simple, practical, and easy to implement. Because protection fixed-value names contain few parts of speech, the constructed dictionary is small and comparatively complete, which effectively overcomes the main weakness of mechanical word segmentation algorithms. In addition, the longest word in a protection fixed-value name is short, which speeds up the mechanical word segmentation algorithm to some extent and reduces the complexity of the matching operation. The invention has the following advantages:
1) Building on the traditional dictionary, the grammatical composition of fixed-value names and both Chinese and English terms are taken into account to form a more comprehensive fixed-value item name dictionary; in addition, data fields such as a synonym flag, an ordinal/compound-word flag, a special-term flag, and a dictionary position number are added to the dictionary text, so that synonymous expressions are handled better.
2) Building on the traditional matching algorithm, processing steps such as word-order adjustment, synonym substitution, special-term translation, and segmented-sequence output are introduced. With the improved dictionary mechanism and matching algorithm, the great majority of item-name matching problems caused by word order, technical terms, interchangeable characters, and the like are handled well.
3) For matches that the dictionary mechanism cannot achieve, an intelligent comparison method combining sequence similarity calculation with an anti-mismatch comparison mechanism is proposed, making fixed-value comparison more comprehensive and more accurate.
Drawings
FIG. 1 shows the fixed-value item name dictionary.
FIG. 2 is a flow chart of the forward maximum matching algorithm.
FIG. 3 is a flow chart of the improved forward maximum matching algorithm.
FIG. 4 is a flow chart of the intelligent comparison of relay protection fixed values.
Detailed Description
The invention provides an intelligent comparison method for relay protection fixed values, based on Chinese word segmentation. It comprises: 1. a mechanical word segmentation algorithm based on character string matching; 2. establishment of a relay protection fixed-value name dictionary; 3. an improved dictionary lookup mechanism; 4. fixed-value name matching based on word segmentation. First, Chinese word segmentation is used to process the text of relay protection device fixed-value names; the naming rules of relay protection devices of different manufacturers and types are analyzed and sorted out, and a dictionary of relay protection device fixed-value names is built. Then, on the basis of the dictionary, an improved forward maximum matching algorithm segments the fixed-value names on the fixed-value sheet and on the operating equipment, yielding segmentation results and segmentation arrays. Finally, by comparing the segmentation arrays, the operating-device fixed-value items that match the items on the fixed-value sheet are screened out and their fixed values are compared; if they differ, a warning is issued and a fixed-value download request is raised. The invention is further described below with reference to the drawings and examples. The method specifically comprises the following:
1. Mechanical word segmentation algorithm based on character string matching
Because the diversity of the components of relay protection device fixed-value names and the existence of synonyms are the main factors limiting online intelligent comparison of fixed values, the invention uses Chinese word segmentation to process the names as text: the names are standardized and unified, then segmented to facilitate subsequent name matching.
Consider the practical situation of fixed-value name matching. The naming rules of most protection fixed-value items differ from everyday language, so word segmentation methods based on understanding or on semantics cannot be applied to fixed-value name strings. Statistics-based word segmentation carries considerable uncertainty, which affects segmentation accuracy and leads to failed or false name matches. These three kinds of method are therefore unsuitable for segmenting fixed-value name strings. In contrast, the core idea of applying mechanical word segmentation based on character string matching to fixed-value name matching is as follows. The naming standards and habits for protection fixed-value item names at the operating-equipment manufacturers and at the dispatching master station are comprehensively sorted out and analyzed, and a fixed-value name dictionary is built from them. Without using grammatical knowledge or statistical information, the fixed-value name string to be segmented is matched one by one, according to a given strategy, against the entries stored in the dictionary. If an entry identical to the string is found, the match succeeds; otherwise the string is re-truncated by adding or removing a character and the dictionary is searched again. The method is simple, practical, and easy to implement. Because protection fixed-value names contain few parts of speech, the constructed dictionary is small and comparatively complete, which effectively overcomes the main weakness of mechanical word segmentation algorithms.
In addition, the longest word in a protection fixed-value name is short, which speeds up the mechanical word segmentation algorithm to some extent and reduces the complexity of the matching operation. Mechanical word segmentation based on character string matching is therefore the most suitable segmentation algorithm for fixed-value item name matching.
2. Relay protection fixed value name dictionary establishment
Relay protection fixed-value names are composed of various relay protection terms, so a complete fixed-value name dictionary must be built from these terms before the mechanical algorithm based on character string matching is applied. This section comprehensively sorts out and analyzes the naming rules of relay protection devices of different manufacturers and types, improves the traditional mechanical word segmentation dictionary mechanism, and builds a relay protection fixed-value name dictionary suited to fixed-value comparison, which effectively improves the efficiency of subsequent name matching.
2.1 Analysis of the composition of relay protection fixed-value names
A relay protection fixed-value name can be divided into four parts: station name, primary equipment name, protection device model, and fixed-value item name. When fixed-value names are matched, matching proceeds in that order.
(1) Station name: typically of the form "voltage class + place name + substation/plant/power plant" or "place name + substation/plant/power plant". When the dictionary is built, the three words "substation", "plant", and "power plant" are stored as fixed words, and voltage classes and place names are stored according to the actual place of application.
(2) Primary equipment name: typically of the form "place name + line" or "# + number + main transformer/transformer" (the number is usually no greater than 3 and can be stored as a synonym of the digits of the protection device model below). When the dictionary is built, "line", "transformer", "main transformer", and "#" are stored as fixed words, and place names are stored according to the actual place of application.
(3) Protection device model: typically of the form "English letters + digits". When the dictionary is built, the 26 English letters in upper and lower case and the numbers 0-10 are stored as fixed words.
(4) Fixed-value item name: generally composed of various relay protection terms; when the dictionary is built, these terms are split up and stored in it. Attention must be paid to three classes of item-name matching failure caused by differences in naming standards and habits between the operating-equipment manufacturers and the dispatching master station. Class 1 is matching failure caused by synonyms with different written forms, which can be divided into ordinal variants (such as "overcurrent protection I segment" and "overcurrent protection 1 segment"), Chinese variants (such as two written forms of "overcurrent", or two written forms of "start"), and English variants (such as "TV" and "PT"). Class 2 is matching failure caused by technical terms, such as "overcurrent protection I segment" versus "current quick-break protection", or "overcurrent protection II segment" versus "time-limited overcurrent protection". Class 3 is matching failure caused by differing word order: because Chinese expression is flexible, when there are several modifiers their order is often not fixed (e.g. "overcurrent protection II segment current fixed value" versus "II segment overcurrent protection current fixed value").
2.2 Construction of the fixed-value name dictionary
The original mechanical word segmentation dictionary mechanism is improved according to the characteristics of fixed-value item names. The improved fixed-value item name dictionary, shown in Fig. 1, has a three-layer structure.
(1) First-character hash index table. Each unit in the table contains the following. a: the first-character entry; the hash value of the first character of the fixed-value item name is computed by formula (1), and the character is stored at the position given by that value.
$$\mathrm{offset} = (c_1 - \mathrm{0xB0}) \times 94 + (c_2 - \mathrm{0xA1}) \tag{1}$$
In formula (1), offset is the position of the first character in the hash table, c_1 and c_2 are the bytes of the internal code of the first character, and 0xB0 and 0xA1 are the starting high byte and low byte of the Chinese character coding range. b: the maximum word length, i.e. the length of the longest word in the dictionary that begins with this character. c: the first-item pointer, which points to the position of the next-level word index table.
(2) Word index table. Each unit in the table contains the following. a: the word lengths, i.e. all lengths of the words in the dictionary text that begin with this character. b: the dictionary text pointer, which points to the position in the dictionary text of the words with this first character and this word length.
(3) Dictionary text. Each unit in the dictionary text contains the following. a: the entry, a professional term related to fixed-value names, including Chinese terms (such as "current" and "voltage"), English terms (such as "PT"), and ordinal terms (such as "I" and "1"). b: the dictionary position number, i.e. the sequence number of the entry in the dictionary. c: the synonym flag; "0" indicates that the entry is the standard form among its synonyms or has no synonym, and a non-zero integer indicates that the entry is not the standard form, the integer being the dictionary position number of the corresponding standard form. d: the part-of-speech flag; "1" indicates an ordinal word or a compound word, such as "I", "II", "III", and "segment" in a fixed-value name; "2" indicates a single-structure term, such as "current", "voltage", and "protection"; "3" indicates a special term, such as "instantaneous current quick-break protection", "time-limited current quick-break protection", and "time-limited overcurrent protection", which is translated during dictionary lookup.
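The first-character hash of formula (1) and the four fields of a dictionary-text unit can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the internal code is GB2312 (consistent with the 0xB0/0xA1 starting bytes), and the names `first_char_offset` and `DictEntry` are hypothetical. Position number 198 for the standard ordinal is inferred from the example sequences in the text; 199 is invented.

```python
from dataclasses import dataclass


def first_char_offset(ch: str) -> int:
    """Slot of a Chinese character in the first-character hash table, per formula (1)."""
    # Assumption: the internal code is GB2312, giving a high byte c1 and a low byte c2.
    c1, c2 = ch.encode("gb2312")
    return (c1 - 0xB0) * 94 + (c2 - 0xA1)


@dataclass(frozen=True)
class DictEntry:
    word: str         # a: the entry text
    position: int     # b: dictionary position number
    synonym_of: int   # c: 0 = standard form or no synonym, else position of the standard form
    pos_flag: int     # d: 1 = ordinal/compound word, 2 = single word, 3 = special term

    def standard_position(self) -> int:
        """Position number to output after synonym normalization."""
        return self.synonym_of or self.position


# Hypothetical entries: "1" is the standard ordinal, "I" points to it via the synonym flag.
one = DictEntry("1", 198, 0, 1)
roman_one = DictEntry("I", 199, 198, 1)

print(first_char_offset("啊"))        # GB2312 code 0xB0A1, the first slot: 0
print(roman_one.standard_position())  # 198
```

Storing the standard form's position number in the synonym flag means normalization costs a single field read at lookup time.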
3. Improved dictionary lookup mechanism
Traditional mechanical word segmentation uses the forward maximum matching algorithm to segment the string according to the established dictionary mechanism. As shown in Fig. 2, the algorithm scans forward and matches the characters to be segmented one at a time, shortening the longest candidate word by one character after each failed lookup. After segmentation by the traditional matching algorithm, the original string becomes merely a word sequence with segmentation symbols, which still cannot meet the needs of fixed-value comparison.
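For reference, traditional forward maximum matching can be sketched in a few lines. The function name `forward_max_match` and the toy vocabulary are hypothetical, and the Chinese terms are assumed back-translations of the English terms used in this text (过流/过电流 for "overcurrent", 保护 for "protection", 段 for "segment", 电流 for "current").

```python
def forward_max_match(text: str, vocab: set[str]) -> list[str]:
    """Plain forward maximum matching: longest candidate first, drop one character on failure."""
    max_len = max(map(len, vocab))
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan can advance.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


vocab = {"过流", "过电流", "保护", "段", "电流"}   # assumed toy vocabulary
print("/".join(forward_max_match("过电流保护I段", vocab)))  # 过电流/保护/I/段
```

The output is only a delimited word sequence: the variant "过电流" and the ordinal "I" are not normalized, which is the limitation the improved algorithm addresses.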
3.1 Improved forward maximum matching principle
Taking into account the naming rules of relay protection fixed-value items, the invention adopts the improved forward maximum matching algorithm shown in Fig. 3. For a fixed-value item name S to be segmented, the first character A_1 is taken and its hash value H_1 is calculated. Using H_1, the position of A_1 in the first-character hash index table is found, and the maximum word length B_1 stored in field b of that unit is read. A substring T_1 of length B_1 is cut from S in the forward direction and looked up in the fixed-value item name dictionary. If the dictionary contains an entry identical to T_1, the match succeeds: T_1 is cut out as an independent substring and its dictionary position number a_1 is output (if the entry has no synonym flag, the position number of T_1 itself is output; if it has a synonym flag, the position number of the standard synonym is output). If no entry is identical to T_1, the match fails: the last character of T_1 is removed and the dictionary is searched again. This is repeated until the substring to be segmented has length 1, in which case A_1 is cut out as a single-character word, completing one round of segmentation. The next word is then segmented and matched in the same way until S is fully segmented. Finally, the standardized word sequence with segmentation symbols and the corresponding number sequence are output.
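The lookup loop described above can be sketched as follows, under stated assumptions: a plain Python dict stands in for the three-layer dictionary, the per-first-character maximum word length is precomputed from it, and unknown single characters are given the number `None` (the text does not say what is output for them). Position numbers 10, 24, 27, 47, and 198 are inferred from the example sequences in the text; 48 and 199 are invented, as are all Chinese back-translations and the name `segment`.

```python
# word -> (position number, synonym flag); flag 0 means "standard form".
DICTIONARY = {
    "保护": (10, 0),     # protection
    "段": (24, 0),       # segment
    "电流": (27, 0),     # current
    "过流": (47, 0),     # overcurrent, standard form
    "过电流": (48, 47),  # overcurrent variant (hypothetical number 48)
    "1": (198, 0),       # ordinal, standard form
    "I": (199, 198),     # ordinal variant (hypothetical number 199)
}

# Field b of the first-character table: longest word starting with each character.
MAX_LEN = {}
for word in DICTIONARY:
    MAX_LEN[word[0]] = max(MAX_LEN.get(word[0], 1), len(word))

BY_POSITION = {pos: word for word, (pos, _) in DICTIONARY.items()}


def segment(name):
    """Return (standardized words, position numbers) for one item name."""
    words, numbers, i = [], [], 0
    while i < len(name):
        size = min(MAX_LEN.get(name[i], 1), len(name) - i)
        while size > 1 and name[i:i + size] not in DICTIONARY:
            size -= 1                      # drop the last character and retry
        piece = name[i:i + size]
        i += size
        if piece in DICTIONARY:
            pos, synonym_of = DICTIONARY[piece]
            std = synonym_of or pos        # follow the synonym flag to the standard form
            words.append(BY_POSITION[std])
            numbers.append(std)
        else:
            words.append(piece)            # unknown single character
            numbers.append(None)
    return words, numbers


print(segment("过电流保护I段"))
# (['过流', '保护', '1', '段'], [47, 10, 198, 24])
```

Compared with plain forward maximum matching, the initial window is bounded per first character rather than globally, and every matched entry is replaced by its standard synonym before output.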
3.2 Improved forward maximum matching algorithm
Based on the improved fixed-value item name dictionary, strings are segmented with the improved forward maximum matching algorithm. So that the result works with the subsequent name matching mechanism and matching is more efficient, the output is post-processed as follows (see the forward maximum matching flow chart of Fig. 2).
The output order is adjusted according to the part of speech of each entry in the fixed-value item name dictionary: if the item name contains an ordinal word, the word order is adjusted so that the ordinal word is placed at the front of the segmented word sequence, and the number sequence is adjusted accordingly. The whole output word sequence is first traversed to check whether an ordinal word is present; if so, it is checked whether the part-of-speech flag of the word that follows it is "1"; if it is, the two words are combined into a compound word and placed in first position. After this processing, the single words (entries with part-of-speech flag "2") keep their original order.
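The word-order adjustment can be sketched as follows. Because the text gives ordinal words and compound words the same part-of-speech flag "1", this sketch assumes a separate set of standard ordinal forms to tell them apart; `ORDINALS`, `POS_FLAG`, and `reorder` are hypothetical names, and the Chinese terms are assumed back-translations.

```python
ORDINALS = {"1", "2", "3"}   # assumed standard ordinal forms
POS_FLAG = {"1": 1, "2": 1, "3": 1, "段": 1, "过流": 2, "保护": 2, "电流": 2}


def reorder(words):
    """Move the first 'ordinal + flag-1 word' pair to the front; keep the rest in order."""
    for i, word in enumerate(words[:-1]):
        following = words[i + 1]
        if word in ORDINALS and POS_FLAG.get(following) == 1:
            return [word, following] + words[:i] + words[i + 2:]
    return list(words)


print(reorder(["过流", "保护", "1", "段"]))   # ['1', '段', '过流', '保护']
```

With the position numbers inferred from the text, this order corresponds to the sequence "198,24,47,10". The same permutation would be applied to the number sequence.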
Special terms (entries with part-of-speech flag "3"), such as "instantaneous current quick-break protection", "time-limited current quick-break protection", and "time-limited overcurrent protection", make up only a small share of the vocabulary and are stored in a dedicated area of the mixed dictionary. If the string contains a special term, the standard synonym sequence is output according to the dictionary position number of that term. For example, "instantaneous current quick-break protection", at dictionary position number "215", is translated to the segmentation output sequence "198,24,47,10" of "I segment overcurrent protection"; "time-limited current quick-break protection", at position number "216", is translated to the sequence "29,24,47,10" of "II segment overcurrent protection"; and "time-limited overcurrent protection", at position number "217", is translated to the sequence "90,24,47,10" of "III segment overcurrent protection".
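The translation of special terms amounts to a table lookup keyed by dictionary position number. A minimal sketch using the three position numbers and output sequences given above (the names `SPECIAL_TERMS` and `translate` are hypothetical):

```python
# Special-term position number -> standard synonym sequence (values from the text).
SPECIAL_TERMS = {
    215: [198, 24, 47, 10],  # instantaneous current quick-break -> I segment overcurrent protection
    216: [29, 24, 47, 10],   # time-limited current quick-break  -> II segment overcurrent protection
    217: [90, 24, 47, 10],   # time-limited overcurrent          -> III segment overcurrent protection
}


def translate(numbers):
    """Expand any special-term number into its standard sequence; pass others through."""
    out = []
    for n in numbers:
        out.extend(SPECIAL_TERMS.get(n, [n]))
    return out


print(translate([215]))   # [198, 24, 47, 10]
```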
4. Fixed-value name matching based on word segmentation
Because the improved dictionary mechanism and improved segmentation algorithm solve most of the synonym problems of fixed-value names (section 2.1), for most fixed-value items the output sequences from the fixed-value sheet and from the operating fixed values have similarity 1, and after matching the items can be compared and downloaded directly. A few synonym problems cannot be fully matched because of gaps in the dictionary mechanism. For example, "overcurrent I segment protection" and "current I segment protection" yield the segmented output sequences "198,24,47,10" and "198,24,27,10" respectively. The invention addresses such cases by combining a sequence similarity calculation with an anti-mismatch comparison mechanism.
4.1 number sequence similarity calculation
Similarity measures for two number sequences fall mainly into two types: sequence position indexes and sequence value indexes. Given the linguistic logic of relay protection fixed-value item names and the processing of the output word sequence in section 3.2, the invention measures the similarity of two number sequences by comparing their value indexes.
The similarity calculation distinguishes four cases: (1) a sequence containing an ordinal word compared with a sequence without one; (2) two sequences that both contain ordinal words, but different ones; (3) two sequences that contain no ordinal words; (4) two sequences that contain the same ordinal word. In the first two cases the similarity of the two sequences is 0. In the last two cases the dot product ratio of sequences A and B is used as the sequence value index, as given by formulas (2) and (3).
[Formulas (2) and (3): equation images not reproduced in this text.]
In formulas (2) and (3), DPR(A, B) denotes the dot product ratio of sequence A and sequence B, and n denotes the sequence length. The formulas are built from two quantities: the sum of the products of the elements at corresponding positions of the two sequences, and the sum of squares of the elements of a sequence. The larger the dot product ratio, the more similar the two sequences. NDPR(A, B) denotes the normalized dot product ratio, with range [0, 1]; i is the index of a sequence element and takes the values 0, 1, …, n-1. For example, "overcurrent I segment protection" and "current I segment protection" give the sequences "198,24,47,10" and "198,24,27,10" after word segmentation, and their sequence similarity is calculated as 0.9932.
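The four-case rule can be sketched as follows. Note the substitution: because the exact form of formulas (2) and (3) is not available in this text, the sketch uses cosine similarity as a stand-in value index. It is built from the same two quantities (sum of products and sums of squares) but is not claimed to be the patent's NDPR and will not necessarily reproduce the 0.9932 figure. The ordinal position numbers 198, 29, and 90 are taken from the example sequences; zero-padding of unequal lengths is an added assumption.

```python
import math
from itertools import zip_longest

ORDINAL_NUMBERS = {198, 29, 90}   # I, II, III, per the example sequences in the text


def ordinals(seq):
    return [n for n in seq if n in ORDINAL_NUMBERS]


def similarity(a, b):
    """Gate on ordinal words, then score with a stand-in value index (cosine similarity)."""
    if ordinals(a) != ordinals(b):
        return 0.0                      # cases (1) and (2): ordinal words disagree
    # Cases (3) and (4). Assumption: shorter sequences are padded with zeros.
    dot = sum(x * y for x, y in zip_longest(a, b, fillvalue=0))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0


print(similarity([198, 24, 47, 10], [29, 24, 47, 10]))    # 0.0, different ordinals
print(similarity([198, 24, 47, 10], [198, 24, 27, 10]))   # close to, but below, 1
```

Gating on the ordinal first keeps items such as segment I and segment II from being matched to each other even though the rest of their sequences is identical.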
4.2 anti-mismatching mechanism
The similarity between the string on the fixed-value sheet and each target string on the operating equipment is calculated in turn, giving a sequence of fixed-value items that meet the given similarity threshold, arranged in descending order of similarity. The item with the highest similarity is taken for fixed-value comparison. If its value and unit are both consistent, the name match is judged successful and the fixed value agrees with the setting value, so no further operation is needed. If the units are consistent but the values are not, there are two possibilities: (1) the name match succeeded but the operating fixed value disagrees with the setting value, so alarm information must be issued and the fixed value downloaded; (2) the name match failed. If the units are inconsistent, the name match has failed. Name matching failures all arise when the sequence similarity meets the threshold but does not reach 1, that is, when similarity matching is disturbed by fixed-value item names with similar number sequences. Since the probability that two different fixed-value items have exactly the same value and unit is extremely low, an anti-mismatch comparison mechanism is introduced to improve the accuracy of fixed-value comparison.
For the sequence of fixed-value items that meet the similarity threshold, arranged in descending order of similarity, the item whose name has the highest similarity is taken first for fixed-value comparison. If its value and unit are both consistent, the name match is judged successful, the fixed value on the operating equipment agrees with the setting value, and no change is needed; otherwise the value consistency of the next item is compared. If the next item is consistent, its name is judged to be the one corresponding to the name on the fixed-value sheet, and the item is correct and needs no change; otherwise the comparison moves on to the following item. If no item in the whole sequence has a consistent value, the item with the highest similarity and a consistent unit is judged to be the corresponding item on the fixed-value sheet, its fixed value is judged to be wrong, and alarm information is issued; after the difference is confirmed the fixed value is downloaded, and once the download is complete the operating fixed value is called up again for comparison.
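The anti-mismatch comparison can be sketched as a two-pass scan over the ranked candidates. The names `Candidate` and `resolve`, the status strings, and the sample values are hypothetical; the default threshold of 0.95 follows the worked example later in this text.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    similarity: float
    value: float
    unit: str


def resolve(sheet_value, sheet_unit, candidates, threshold=0.95):
    """Return (matched candidate or None, status) for one fixed-value sheet item."""
    ranked = sorted((c for c in candidates if c.similarity >= threshold),
                    key=lambda c: c.similarity, reverse=True)
    # Pass 1: the first candidate whose value and unit both agree is the match.
    for c in ranked:
        if c.unit == sheet_unit and c.value == sheet_value:
            return c, "consistent"
    # Pass 2: no value agrees, so take the most similar candidate with the same unit
    # and raise an alarm so the fixed value can be confirmed and downloaded.
    for c in ranked:
        if c.unit == sheet_unit:
            return c, "alarm"
    return None, "unmatched"


# Invented illustrative values, not data from the patent.
candidates = [Candidate("t2", 0.99, 5.0, "A"), Candidate("t4", 0.96, 3.0, "A")]
print(resolve(3.0, "A", candidates))
```

In this invented case the sheet value 3.0 A selects `t4` even though `t2` has the higher similarity, which is the correction the mechanism is meant to provide. A real implementation would likely compare values with a tolerance rather than exact equality.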
4.3 Intelligent comparison flow of relay protection fixed values
The intelligent comparison flow for relay protection fixed values is shown in Fig. 4; it compares the actual operating fixed values of all equipment in the station with the latest dispatching fixed-value sheet. First, the improved forward maximum matching algorithm translates, reorders, and segments the fixed-value items on the sheet and those of the operating devices, and outputs the corresponding word sequence groups. The two word sequence groups are then traversed to check whether they are identical. If they are, the corresponding fixed values are compared. If they differ, the similarity of the two word sequences is calculated, and the anti-mismatch comparison mechanism is used to judge whether they meet the condition. If not, the two do not match; if so, the corresponding fixed values are compared.
Examples
To verify the practicality of the method, the naming standards and habits for fixed-value item names at the equipment manufacturers and at the dispatching master station were comprehensively sorted out and analyzed. The names used by the equipment of several relay protection manufacturers were found to differ from those used by the dispatching center. Several typical, common differences were chosen for verification as examples.
5.1 improved Forward maximum matching Algorithm
The strings "fan start overcurrent", "compound-voltage blocking I segment", and "instantaneous current quick-break protection" are taken as examples for calculation and analysis. The traditional segmentation process is shown in Table 1, and the improved segmentation process in Table 2.
Table 1 Traditional forward maximum matching algorithm
[Table 1 is an image in the original publication and is not reproduced here.]
Table 2 Improved forward maximum matching algorithm
[Table 2 is an image in the original publication and is not reproduced here.]
The traditional segmentation method outputs only a word sequence with segmentation symbols and performs no further processing on the entries or the word sequence. Comparing the segmentation of the first string: the improved algorithm adds a synonym flag and performs synonym lookup and replacement on the matched substrings. For example, when "fan start overcurrent" is segmented, the lookups of "start" and "overcurrent" find synonym flags; the standard synonym entries are obtained from the fixed-value name dictionary text and substituted, giving the segmentation result "fan/start/overcurrent". Synonym replacement solves the name matching failures caused by synonyms with different written forms (section 2.1) and facilitates the subsequent matching.
Comparing the segmentation of the second string: the improved algorithm adjusts the word order of the segmentation result. For example, when "compound-voltage blocking I segment" is segmented, the ordinal entry "I" is replaced by the standard form "1", combined with the compound-word entry "segment" into a compound word, and output in first position. Adjusting the word order solves the name matching failures caused by different word orders with the same meaning (section 2.1) and facilitates the subsequent similarity calculation on the number sequences.
Comparing the segmentation of the third string: the improved algorithm adds the lookup and conversion of special-term flags. For example, when "instantaneous current quick-break protection" is segmented, the term is found in the special-term area of the dictionary and carries a special-term flag. According to its dictionary position, the output segmentation result is "1 segment/overcurrent/protection". Introducing the special-term flag solves the synonym problem caused by technical terms (section 2.1) and facilitates the subsequent matching.
5.2 Special string alignment examples
Let the fixed-value sheet string be s = "overcurrent I segment protection" and the space of fixed-value item strings on the operating equipment be T = {t_1, t_2, t_3, t_4}, where t_1 = "II segment overcurrent protection", t_2 = "current I segment protection", t_3 = "zero sequence I segment time", and t_4 = "zero sequence I segment current". The segmentation results are shown in Table 3. The similarity between the string s and each string in the target space T is calculated; the results are shown in Table 4.
The threshold is set to 0.95, defined by the similarity of the most dissimilar synonym sequences in the dictionary. As Table 4 shows, the strings that meet the threshold are "current I segment protection" and "zero sequence I segment current": by formulas (2) and (3), the similarities of s to t_2 and of s to t_4 both meet the threshold, so the values of the fixed-value items are then compared. As Table 5 shows, the setting value of s agrees with that of t_2 and differs from that of t_4, and the similarity of s to t_2 is the highest, so the target string corresponding to s is determined to be t_2; the setting value is acceptable and no further operation is needed. Checking back against the values of the fixed-value items (the anti-mismatch comparison mechanism) can thus improve the accuracy of the matching result to some extent.
TABLE 3 word segmentation results
Figure BDA0003215176100000231
TABLE 4 similarity calculation results
Figure BDA0003215176100000232
TABLE 5 Fixed value names and setting values of the source string and target strings
Figure BDA0003215176100000233

Claims (7)

1. An intelligent comparison method for relay protection fixed values, characterized in that the method is based on Chinese word segmentation; the diversity of the components of relay protection device fixed value names and the existence of synonyms are the main factors limiting the level of online intelligent fixed value comparison; first, the fixed value names of relay protection devices are processed as text using Chinese word segmentation, the naming rules of relay protection devices of different manufacturers and different types are analyzed and sorted out, and a dictionary of relay protection device fixed value names is built; then, on the basis of the dictionary, an improved forward maximum matching algorithm is used to segment the fixed value names on the fixed value sheet and the fixed value names on the running equipment, yielding word segmentation results and word segmentation arrays; finally, by comparing the word segmentation arrays, the fixed value items of the running device that match the fixed value items on the fixed value sheet are screened out and their values are compared; if the values differ, a warning is issued and a request to download the fixed values is raised; the method specifically comprises the following steps:
1) Mechanical word segmentation algorithm based on character string matching
In view of the practical situation of fixed value name matching, the naming standards and conventions for protection fixed value item names used by operating equipment manufacturers and by the dispatching master station are comprehensively sorted out and analyzed, and a fixed value name dictionary is established accordingly; without using grammatical knowledge or statistical information, the fixed value name string to be segmented is matched one by one against the entries stored in the fixed value name dictionary; the fixed value names are thereby standardized and unified, and segmented to facilitate subsequent name matching;
2) Relay protection fixed value name dictionary establishment
Relay protection fixed value names consist of various relay protection terms, and a complete fixed value name dictionary must be built from these terms before a mechanical algorithm based on string matching can be applied; therefore, the naming rules of relay protection devices of different manufacturers and different types are comprehensively sorted out and analyzed, the traditional mechanical word segmentation dictionary mechanism is improved, and a relay protection fixed value name dictionary suited to fixed value comparison is established, which effectively improves the efficiency of subsequent fixed value name matching;
3) Improved dictionary lookup mechanism
Traditional mechanical word segmentation uses the forward maximum matching algorithm to segment the string according to the established dictionary mechanism, which still cannot meet the requirements of fixed value comparison; the dictionary lookup mechanism is therefore improved as follows:
Step 3.1, improved forward maximum match principle
Taking into account the naming rules of relay protection fixed value items, the improved forward maximum matching algorithm proceeds as follows for a fixed value item name S to be segmented: it first takes the first character A1 of S and calculates its hash value H1; using H1, it looks up the position of A1 in the first-character Hash index table and reads the maximum word length B1 stored in field (B) of that table unit; a substring T1 of length B1 is cut from S in the forward direction and looked up among the entries of the fixed value item name dictionary; if the dictionary contains an entry identical to T1, the match succeeds, T1 is split off as an independent substring and its dictionary position number a1 is output: if the entry has no synonym flag, the dictionary position number of T1 itself is output; if it has a synonym flag, the dictionary position number of the standard synonym is output; if no entry is identical to T1, the match fails, the last character of T1 is removed and the dictionary is searched again; this is repeated until the length of T1 is 1, i.e. A1 is treated as a single-character word, which completes one round of segmentation; the next word is then segmented and matched in the same way until S is completely segmented; finally, the standardized word sequence with segmentation delimiters and the corresponding number sequence are output;
Step 3.2, improved forward maximum match principle
Based on the improved fixed value item name dictionary, the string is segmented according to the improved forward maximum matching algorithm; to suit the subsequent fixed value name matching mechanism and make it more efficient, the output order is adjusted according to the part of speech of each entry in the fixed value item name dictionary: if the fixed value name contains serial-number words, the word sequence is adjusted so that the serial-number words are placed at the beginning of the segmented word sequence, and the number sequence is adjusted correspondingly;
4) Constant value name matching based on word segmentation
Matching is performed by combining a sequence similarity calculation with an error-prevention comparison mechanism, as follows:
Step 4.1, number sequence similarity calculation
Measures of the similarity between two number sequences fall into two types, sequence position measures and sequence value measures; because of the linguistic logic of relay protection fixed value item names and the reordering of the output word sequence in step 3.2, the similarity of two number sequences is measured using sequence value measures; the similarity between the string on the fixed value sheet and each target string on the running equipment is calculated in turn to obtain the fixed value items that meet a given similarity threshold, arranged in descending order of similarity; if the similarity of a given sequence does not reach 1, fixed value item matching may fail, i.e. a false similarity match may arise from interference by fixed value item names with similar number sequences; since the probability that two fixed value items have exactly the same value and unit is extremely low, an error-prevention comparison mechanism is introduced to improve the accuracy of fixed value comparison;
Step 4.2, error-prevention comparison mechanism
As described in step 4.1, similarity matching against fixed value item names with similar number sequences can cause fixed value item matching to fail; since the probability that two fixed value items have exactly the same value and unit is extremely low, an error-prevention comparison mechanism is introduced to improve the accuracy of fixed value comparison; the fixed value item with the greatest similarity is taken for value comparison, which determines whether the segmentation-based fixed value name matching has succeeded;
Step 4.3, intelligent relay protection fixed value comparison flow
The intelligent relay protection fixed value comparison flow compares the actual running fixed values of all equipment in the station with the latest dispatching fixed value sheet; first, the improved forward maximum matching algorithm is used to translate, order and segment the fixed value items on the fixed value sheet and the fixed value items on the running equipment, and the corresponding groups of word sequences are output; the two groups of word sequences are then traversed to check whether two word sequences are identical; if they are identical, the corresponding fixed values are compared; if they differ, the similarity of the two word sequences is calculated and, in combination with the error-prevention comparison mechanism, it is judged whether they meet the matching condition; if they do not, the two word sequences do not match; if they do, the corresponding fixed values are compared.
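Step 3.1 of the claim above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the claimed implementation: each whitespace-separated token stands in for one Chinese character, a plain Python mapping replaces the Hash index table, the toy dictionary, its position numbers and the names `Entry`, `DICTIONARY`, `MAX_LEN` and `segment` are hypothetical, and tokens that are not in the dictionary are given the number 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    position: int    # dictionary position number of this entry
    synonym_of: int  # 0: standard form or no synonym; else position of the standard form


# Toy dictionary (hypothetical entries and numbers, for illustration only).
DICTIONARY = {
    "overcurrent": Entry(1, 0),
    "over-current": Entry(2, 1),  # synonym of "overcurrent"
    "current": Entry(3, 0),
    "protection": Entry(4, 0),
    "I segment": Entry(5, 0),
    "1 segment": Entry(6, 5),     # synonym of "I segment"
    "zero sequence": Entry(7, 0),
}
BY_POSITION = {entry.position: word for word, entry in DICTIONARY.items()}

# First-token index: longest entry length (in tokens) for each first token.
MAX_LEN = {}
for word in DICTIONARY:
    parts = word.split()
    MAX_LEN[parts[0]] = max(MAX_LEN.get(parts[0], 1), len(parts))


def segment(name):
    """Return (standardized word sequence, number sequence) for a name."""
    tokens = name.split()
    words, numbers = [], []
    i = 0
    while i < len(tokens):
        # Start from the longest entry that begins with this token.
        length = min(MAX_LEN.get(tokens[i], 1), len(tokens) - i)
        # On failure, drop the last token and retry until one token is left.
        while length > 1 and " ".join(tokens[i:i + length]) not in DICTIONARY:
            length -= 1
        candidate = " ".join(tokens[i:i + length])
        entry = DICTIONARY.get(candidate)
        if entry is None:
            # Assumption: out-of-dictionary tokens are kept and numbered 0.
            words.append(candidate)
            numbers.append(0)
        else:
            # A synonym flag redirects the output to the standard form.
            position = entry.synonym_of or entry.position
            words.append(BY_POSITION[position])
            numbers.append(position)
        i += length
    return words, numbers


print(segment("over-current 1 segment protection"))
```

In this toy dictionary the synonyms "over-current" and "1 segment" are both redirected to their standard forms, so two differently written names produce the same number sequence.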
2. The method for intelligently comparing relay protection fixed values according to claim 1, wherein the step 2) specifically comprises:
Step 2.1 Relay protection constant value name composition analysis
A relay protection fixed value name consists of four parts made up of various relay protection terms: the station name, the primary equipment name, the protection device model and the fixed value item name; when fixed value names are matched, the station name, the primary equipment name, the protection device model and the fixed value item name must be matched in sequence, specifically:
(1) Station name: appears in the form "voltage level + place name + substation/plant/power plant" or "place name + substation/plant/power plant"; when constructing the dictionary, the three words "substation", "plant" and "power plant" are stored in the fixed value name dictionary as fixed words, and voltage levels and place names are stored in the dictionary according to the actual place of application;
(2) Primary equipment name: appears in the form "place name + line" or "number + main transformer/transformer", where the number is not more than 3 and is stored in the dictionary as a synonym of the numbers used in the protection device model below; when constructing the dictionary, "line", "transformer", "main transformer" and "#" are stored in the dictionary as fixed words, and place names are stored in the dictionary according to the actual place of application;
(3) Protection device model: appears in the form "English letters + numbers"; when constructing the dictionary, the 26 English letters in upper and lower case and the numbers 0-10 are stored in the dictionary as fixed words;
(4) Fixed value item name: composed of various relay protection terms, which must be split and stored in the dictionary when it is built; when building the dictionary, differences between the naming standards and conventions for protection fixed value item names used by operating equipment manufacturers and by the dispatching master station cause three types of fixed value item matching failure;
Step 2.2, fixed value name dictionary construction
Based on the characteristics of the fixed value name terms in step 2.1, the original mechanical word segmentation dictionary mechanism is improved; the improved fixed value item dictionary has a three-layer structure, comprising:
(1) First-character Hash index table: each unit in the first-character Hash index table mainly contains the following: (A) first character: the hash value of the first character of the fixed value item name is calculated by formula (1), and the first character is stored at the sequence number given by that value,
offset = (c1 - 0xB0) * 94 + (c2 - 0xA1)    (1)
in formula (1), offset is the sequence number of the first character in the Hash table, c1 and c2 are the high byte and low byte of the internal code of the first character, and 0xB0 and 0xA1 are the starting high byte and low byte of the Chinese character encoding; (B) maximum word length: the number of characters in the longest dictionary word that begins with this character; (C) first-item pointer: points to the position of the next-level word index table;
(2) Word index table: each unit in the word index table mainly contains the following: (A) word lengths: all lengths of the dictionary words that begin with this character; (B) dictionary text pointer: points to the position in the dictionary text of the entries that have this first character and this word length;
(3) Dictionary text: each unit in the dictionary text mainly contains the following: (A) entry: a technical term related to fixed value names, including Chinese words, English words and serial-number words; (B) dictionary position number of the entry, i.e. the sequence number of the entry in the dictionary; (C) synonym flag: "0" indicates that the entry is the standard expression among its synonyms or has no synonym; a non-zero integer indicates that the entry is not the standard expression, and the integer is the dictionary position number of the corresponding standard expression; (D) part-of-speech flag: "1" indicates a serial-number word or a word forming a compound, such as "I", "II", "III" and "segment" in fixed value names; "2" indicates a single-structure noun, such as "current", "voltage" and "protection" in fixed value names; "3" indicates a special term, such as "instantaneous current quick-break protection", "time-limited current quick-break protection" and "time-limited overcurrent protection"; the synonyms of such terms are handled subsequently during dictionary lookup.
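The three-layer structure and formula (1) can be sketched in Python as follows. This is an illustration under assumptions: the constants 0xB0 and 0xA1 are taken to refer to the GB2312 internal code (which Python exposes as the `gb2312` codec), a Python `dict` keyed by offset and by word length stands in for the pointer-based tables, and the class names, field names and sample Chinese entries (电流 "current", 电压 "voltage", 保护 "protection", 保护装置 "protection device") are hypothetical.

```python
from dataclasses import dataclass, field


def first_char_offset(ch):
    """Formula (1): sequence number of a Chinese character in the index table.

    Assumes the character has a two-byte GB2312 code (high byte c1, low byte c2).
    """
    c1, c2 = ch.encode("gb2312")
    return (c1 - 0xB0) * 94 + (c2 - 0xA1)


@dataclass
class DictEntry:             # layer 3: dictionary text
    word: str
    position: int            # dictionary position number
    synonym_flag: int = 0    # 0, or position number of the standard expression
    pos_flag: int = 2        # 1 serial-number/compound, 2 single noun, 3 special term


@dataclass
class FirstCharUnit:         # layer 1: first-character index unit
    first_char: str
    max_word_len: int = 0
    # layer 2: word length -> entries with this first character and length
    by_length: dict = field(default_factory=dict)


def build_index(entries):
    """Build the offset -> FirstCharUnit table from a list of DictEntry."""
    table = {}
    for entry in entries:
        offset = first_char_offset(entry.word[0])
        unit = table.setdefault(offset, FirstCharUnit(entry.word[0]))
        unit.max_word_len = max(unit.max_word_len, len(entry.word))
        unit.by_length.setdefault(len(entry.word), []).append(entry)
    return table


index = build_index([
    DictEntry("电流", 1),
    DictEntry("电压", 2),
    DictEntry("保护", 3),
    DictEntry("保护装置", 4),
])
unit = index[first_char_offset("保")]
print(unit.max_word_len, sorted(unit.by_length))
```

Looking up a name then needs only the offset of its first character to learn the longest substring worth trying, which is what bounds the forward maximum matching loop.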
3. The intelligent comparison method for relay protection fixed values according to claim 1, wherein the three types of fixed value item matching failure in step 2.1 (4) are:
Class 1 problem: matching failure due to the existence of synonyms, subdivided into:
1) Serial-number synonyms with different forms: "segment I" and "segment 1" in "overcurrent protection segment I" and "overcurrent protection segment 1";
2) Chinese synonyms with different forms: pairs of differently written Chinese terms with the same meaning, such as two written forms of "overcurrent" and two written forms of "start";
3) English synonyms with different forms: "TV" and "PT";
Class 2 problem: matching failure due to technical terms: "overcurrent protection segment I" and "instantaneous current quick-break protection", "overcurrent protection segment II" and "time-limited current quick-break protection", "overcurrent protection segment II" and "time-limited overcurrent protection";
Class 3 problem: matching failure due to different word order; because Chinese can be phrased in many ways, when a name has several modifiers their order is often not fixed, for example "overcurrent protection segment II current fixed value" and "segment II overcurrent protection current fixed value".
4. The intelligent comparison method for relay protection fixed values according to claim 1, wherein in step 3.2 the output order is adjusted according to the part of speech of each entry in the fixed value item name dictionary as follows: the whole output word sequence is first traversed to check whether it contains a serial-number word; if so, it is checked whether the part-of-speech flag in the dictionary text of the word that follows it is 1; if it is, the two words are combined into a compound word and placed at the first position; after this processing, the single words, i.e. the entries whose part-of-speech flag is 2, are kept in their original order;
For special-class entries, i.e. entries whose part-of-speech flag is 3, the entries are stored in a dedicated area of the mixed dictionary because they make up only a small share of it; if the string contains a special-class entry, the standard synonym sequence is output according to the dictionary position number of that entry.
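The reordering in this claim can be sketched in Python as follows. The flag table `POS_FLAGS` and the function `reorder` are hypothetical, the sample words are taken from the examples in the claims, and the sketch assumes that every word that is not a serial-number word simply keeps its original relative order.

```python
# Hypothetical part-of-speech flags: 1 = serial-number/compound word, 2 = single noun.
POS_FLAGS = {
    "I": 1, "II": 1, "III": 1, "segment": 1,
    "overcurrent": 2, "protection": 2, "current": 2, "fixed value": 2,
}


def reorder(words, pos_flags=POS_FLAGS):
    """Move serial-number words (merged into compounds) to the front."""
    serial, others = [], []
    i = 0
    while i < len(words):
        if pos_flags.get(words[i]) == 1:
            # Merge with the following word if it also carries flag 1.
            if i + 1 < len(words) and pos_flags.get(words[i + 1]) == 1:
                serial.append(words[i] + " " + words[i + 1])
                i += 2
            else:
                serial.append(words[i])
                i += 1
        else:
            others.append(words[i])
            i += 1
    return serial + others


a = reorder(["overcurrent", "protection", "II", "segment", "current", "fixed value"])
b = reorder(["II", "segment", "overcurrent", "protection", "current", "fixed value"])
print(a == b, a)
```

Both word orders from the class 3 example end up as the same sequence, which is why the reordering makes the later number-sequence comparison insensitive to where the serial-number word appeared.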
5. The intelligent comparison method for relay protection fixed values according to claim 1, wherein the number sequence similarity calculation in step 4.1 distinguishes four cases: (1) comparison of a sequence containing a serial-number word with a sequence containing none; (2) comparison of two sequences containing different serial-number words; (3) comparison of two sequences neither of which contains a serial-number word; (4) comparison of two sequences that contain the same serial-number word; in the first two cases the similarity of the two sequences is 0; in the latter two cases the dot product ratio of sequence A and sequence B is used as the sequence value measure, as given by formula (2) and formula (3),
Figure QLYQS_1
Figure QLYQS_2
in formula (2), DPR(A, B) denotes the dot product ratio of sequence A and sequence B; n denotes the length of the sequence;
Figure QLYQS_3
denotes the sum of the products of the elements at corresponding positions of the two sequences;
Figure QLYQS_4
denotes the sum of squares of the sequence; i denotes the index of a sequence element and takes the values 0, 1, …, n-1; the larger the dot product ratio, the more similar the two sequences; in formula (3), NDPR(A, B) denotes the normalized dot product ratio, whose value range is [0, 1].
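The four cases of this claim can be sketched in Python as follows. Formulas (2) and (3) survive only as images in this text, so the normalization used here is an assumption and not the claimed formula: the sum of element-wise products is divided by the larger of the two sums of squares, which keeps the result in [0, 1] for non-negative numbers. Padding the shorter sequence with zeros, the set `serial_numbers` and the function names are likewise assumptions.

```python
from itertools import zip_longest


def ndpr(a, b):
    """Assumed normalized dot product ratio of two number sequences."""
    pairs = list(zip_longest(a, b, fillvalue=0))   # pad the shorter sequence
    dot = sum(x * y for x, y in pairs)             # sum of products
    norm = max(sum(x * x for x, _ in pairs),       # larger sum of squares
               sum(y * y for _, y in pairs))
    return dot / norm if norm else 0.0


def similarity(a, b, serial_numbers):
    """Similarity of two number sequences following the four cases."""
    serial_a = [x for x in a if x in serial_numbers]
    serial_b = [x for x in b if x in serial_numbers]
    if serial_a != serial_b:
        return 0.0        # cases (1) and (2): serial-number words differ
    return ndpr(a, b)     # cases (3) and (4)


SERIALS = {5, 6}          # hypothetical position numbers of serial-number words
print(similarity([5, 1, 4], [5, 3, 4], SERIALS))
```

Short-circuiting to 0 when the serial-number words differ keeps names such as "segment I" and "segment II" from ever being treated as near matches, whatever the rest of the sequence looks like.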
6. The intelligent comparison method for relay protection fixed values according to claim 1, wherein in step 4.1, for the sequence of fixed value items that meet the similarity threshold, arranged in descending order of similarity, the fixed value item corresponding to the fixed value name with the greatest similarity is first taken for fixed value comparison; if its value and unit are consistent, it is judged that the fixed value name is matched successfully and that the running fixed value of the equipment is consistent with the setting value and need not be changed; otherwise, the value consistency of the next fixed value item is compared; if the comparison result of the next item is consistent, it is judged that the fixed value name corresponding to that item is the one corresponding to the name on the fixed value sheet, and that the fixed value item is correct and need not be changed; otherwise, the value consistency of the next fixed value item is compared in turn; if no item in the whole sequence has a consistent value, the fixed value item with the greatest similarity and a consistent unit is judged to be the item corresponding to the one on the fixed value sheet, the fixed value item is judged to be wrong, alarm information is issued, the fixed value is downloaded after the difference is confirmed, and after the download is completed the running fixed value is retrieved again for comparison.
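The candidate walk described in this claim can be sketched in Python as follows. The class `FixedValueItem`, the function `resolve`, the status strings and all sample values and similarities are hypothetical; the numbers are illustrative only and are not the contents of Table 4 or Table 5.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedValueItem:
    name: str
    value: float
    unit: str


def same_value(a, b):
    """True if both the unit and the numerical value agree."""
    return a.unit == b.unit and math.isclose(a.value, b.value)


def resolve(sheet_item, candidates):
    """candidates: (similarity, item) pairs that already meet the threshold."""
    ranked = [item for _, item in
              sorted(candidates, key=lambda c: c[0], reverse=True)]
    # Walk the candidates in descending similarity; the first one whose
    # value and unit agree is taken as the match and needs no change.
    for item in ranked:
        if same_value(sheet_item, item):
            return item, "consistent"
    # No value agrees: the most similar candidate with the same unit is
    # taken as the counterpart and reported as wrong.
    for item in ranked:
        if item.unit == sheet_item.unit:
            return item, "alarm"
    return None, "unmatched"


sheet = FixedValueItem("overcurrent I segment protection", 5.0, "A")
running = [
    (0.96, FixedValueItem("zero sequence I segment current", 3.0, "A")),
    (0.98, FixedValueItem("current I segment protection", 5.0, "A")),
]
print(resolve(sheet, running))
```

The `"unmatched"` status for the case where no candidate shares the unit is a choice made for this sketch; the claim itself only describes the first two outcomes.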
7. The intelligent comparison method for relay protection fixed values according to claim 1, wherein in step 4.2 the fixed value item with the greatest similarity is taken for value comparison in order to determine whether the segmentation-based fixed value name matching has succeeded: if both the value and the unit of the fixed value item are consistent, it is judged that the fixed value item name is matched successfully and the value is consistent with the setting value, and no subsequent operation is needed; if the units of the fixed value items are consistent but the values are not, there are two possibilities: (1) the fixed value item name is matched successfully but the running fixed value is inconsistent with the setting value, so alarm information must be issued and the fixed value downloaded; (2) the fixed value item name matching has failed; if the units of the fixed value items are inconsistent, the fixed value item name matching fails.
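The three-way decision in this claim can be sketched in Python as follows. The enumeration `Outcome` and the function `check_top_candidate` are hypothetical names, and the tolerance used for comparing values is the default of `math.isclose`, which is an assumption.

```python
import math
from enum import Enum


class Outcome(Enum):
    CONSISTENT = "name matched, value consistent with the setting value"
    VALUE_DIFFERS = "units agree, values differ: wrong running value or wrong match"
    NAME_MISMATCH = "units differ: name matching failed"


def check_top_candidate(sheet_value, sheet_unit, running_value, running_unit):
    """Classify the most similar candidate by its unit and value."""
    if sheet_unit != running_unit:
        return Outcome.NAME_MISMATCH
    if math.isclose(sheet_value, running_value):
        return Outcome.CONSISTENT
    return Outcome.VALUE_DIFFERS


print(check_top_candidate(5.0, "A", 5.0, "A").name)
print(check_top_candidate(5.0, "A", 3.0, "A").name)
print(check_top_candidate(5.0, "A", 0.5, "s").name)
```

Checking the unit first is a cheap way to reject a wrong name match before the numerical values are compared at all.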
CN202110941813.0A 2021-08-17 2021-08-17 Intelligent comparison method for relay protection fixed values Active CN113641877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110941813.0A CN113641877B (en) 2021-08-17 2021-08-17 Intelligent comparison method for relay protection fixed values

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110941813.0A CN113641877B (en) 2021-08-17 2021-08-17 Intelligent comparison method for relay protection fixed values

Publications (2)

Publication Number Publication Date
CN113641877A CN113641877A (en) 2021-11-12
CN113641877B true CN113641877B (en) 2023-07-14

Family

ID=78422277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110941813.0A Active CN113641877B (en) 2021-08-17 2021-08-17 Intelligent comparison method for relay protection fixed values

Country Status (1)

Country Link
CN (1) CN113641877B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6031703A (en) * 1997-02-10 2000-02-29 Schneider Electric Sa Protection relay and process
CN104539047A (en) * 2014-12-23 2015-04-22 国家电网公司 Intelligent substation fault diagnosing and positioning method based on multi-factor comparison visualization
CN109753683A (en) * 2018-11-29 2019-05-14 国家电网有限公司 A kind of forming method of relay protection setting software protecting equipment model
CN110991184A (en) * 2019-12-10 2020-04-10 国网青海省电力公司 Relay protection fixed value self-adaptive checking method based on comprehensive dictionary characteristics
CN112003234A (en) * 2020-09-02 2020-11-27 广西电网有限责任公司河池供电局 Intelligent calibration system and method for relay protection equipment fixed value information
CN112180186A (en) * 2019-07-05 2021-01-05 国网新疆电力有限公司 Automatic checking method for fixed value of intelligent substation relay protection device
CN112182313A (en) * 2020-09-30 2021-01-05 国网青海省电力公司 Relay protection setting value name matching method and system
CN112415314A (en) * 2020-11-17 2021-02-26 华北电力大学(保定) Hidden fault identification method for relay protection system
CN112436477A (en) * 2020-11-10 2021-03-02 云南电网有限责任公司昆明供电局 Instrument is checked fast to relay protection device definite value of transformer substation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6026068B1 (en) * 2016-04-22 2016-11-16 三菱電機株式会社 Breaker non-operation protection relay and protection relay system
US20170173262A1 (en) * 2017-03-01 2017-06-22 François Paul VELTZ Medical systems, devices and methods

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Research and Application on the Comparison System for Relay Protection Settings Based on Document Comparison Tool";Chao Yang 等;《2020 IEEE IAS Industrial and Commercial Power System Asia Technical Conference》;第954-958页 *
"广西电网继电保护定值在线比对系统";蒙亮 等;《红水河》;第84-88页 *

Also Published As

Publication number Publication date
CN113641877A (en) 2021-11-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant