CN113268614A

CN113268614A - Label system updating method and device, electronic equipment and readable storage medium

Info

Publication number: CN113268614A
Application number: CN202110570241.XA
Authority: CN
Inventors: 李珊
Original assignee: Ping An Bank Co Ltd
Current assignee: Ping An Bank Co Ltd
Priority date: 2021-05-25
Filing date: 2021-05-25
Publication date: 2021-08-17

Abstract

The invention relates to the field of data processing, and discloses a label system updating method, which comprises the following steps: extracting features from the example text of a new label to obtain newly added feature words, and fusing the newly added feature words with the original feature words of the new label to obtain label feature words; performing label text comparison and label semantic comparison on the new label and the labels in a preset original label system according to the label feature words and the label level of the new label to obtain a comparison result; and when the comparison result shows that the labels are not repeated, adding the new label into the original label system according to the label level to obtain a new label system. The invention also relates to blockchain technology, and the example text can be stored in a blockchain node. The invention also provides a label system updating device, electronic equipment and a storage medium. The invention can improve the updating efficiency of the label system.

Description

Label system updating method and device, electronic equipment and readable storage medium

Technical Field

The present invention relates to the field of data processing, and in particular, to a method and an apparatus for updating a tag system, an electronic device, and a readable storage medium.

Background

A label system stores short text fields (labels), which have hierarchical relationships and concisely summarize the content of resources, in a tree structure, thereby enabling effective management of the resources. A label system is an efficient way to integrate resources.

However, when a new label is added, existing label systems can only perform a single-dimension repeatability comparison and cannot accurately screen out repeated labels. As a result, the label system is updated redundantly and the updating efficiency is low.

Disclosure of Invention

The invention provides a method and a device for updating a label system, electronic equipment and a computer readable storage medium, and mainly aims to improve the efficiency of updating the label system.

In order to achieve the above object, the present invention provides a method for updating a tag system, including:

obtaining a tag requirement, wherein the tag requirement comprises: a new label, a label level corresponding to the new label, an original feature word and an example text;

extracting features of the example text to obtain a newly added feature word, and fusing the newly added feature word with the original feature word to obtain a label feature word;

performing label text comparison and label semantic comparison on the new label and a label in a preset original label system according to the label feature words and the label level to obtain a comparison result;

and when the comparison result is that the labels are not repeated, adding the new labels into the original label system according to the label level to obtain a new label system.

Optionally, the performing feature extraction on the example text to obtain a newly added feature word includes:

performing word segmentation on the example text by using different word segmentation granularities to obtain a text word segmentation result corresponding to each word segmentation granularity;

counting the occurrence frequency and the word position of each word in the text word segmentation result in the example text;

and extracting words in the text word segmentation result according to the occurrence frequency and the word positions to obtain the newly added feature words.

Optionally, the counting the occurrence frequency and word position of each word in the text segmentation result in the example text includes:

traversing the example text, and determining the frequency of occurrence of each word in the text word segmentation result in the example text as the occurrence frequency;

dividing the example text into clauses, and sorting the resulting text clauses according to the order in which they appear in the example text;

and determining the word position of each word in the text word segmentation result according to the sorted text clauses.

Optionally, the performing label text comparison and label semantic comparison on the new label and a label in a preset original label system according to the label feature words and the label level to obtain a comparison result includes:

extracting all tags which belong to different levels with the tag level in the original tag system to obtain a first comparison tag set;

performing label text repeatability comparison on the new label and each label in the first comparison label set to obtain a text repeatability result;

if the text repeatability result is label repetition, obtaining the comparison result according to the text repeatability result;

if the text repeatability result is that the labels are not repeated, extracting all labels in the label system which belong to the same level as the label level to obtain a second comparison label set;

and obtaining the feature words corresponding to each label in the second comparison label set to obtain label comparison feature words, performing label semantic repeatability comparison on the label comparison feature words and the label feature words to obtain a semantic repeatability result, and obtaining the comparison result according to the semantic repeatability result.

Optionally, the performing label semantic repeatability comparison on the label comparison feature words and the label feature words to obtain the semantic repeatability result includes:

converting each word in the tag comparison characteristic words into a vector to obtain a comparison vector set;

performing vector compression on all vectors in the comparison vector set to obtain comparison vectors;

converting the labels corresponding to the label comparison characteristic words in the second comparison label set into vectors to obtain original label vectors;

converting each word in the label feature words into a vector to obtain a feature vector set;

performing vector compression on all vectors in the feature vector set to obtain feature vectors;

converting the new label into a vector to obtain a newly added label vector;

calculating the similarity between the feature vector and the comparison vector to obtain a first similarity;

calculating the similarity of the original label vector and the newly added label vector to obtain a second similarity;

and comparing and judging the similarity according to the first similarity and the second similarity to obtain the semantic repeatability result.

Optionally, the comparing and determining the similarity according to the first similarity and the second similarity to obtain the semantic repeatability result includes:

calculating to obtain comparison similarity according to the first similarity and the second similarity;

summarizing all the comparison similarities to obtain a comparison similarity set;

and comparing each comparison similarity in the comparison similarity set with a preset similarity threshold to obtain the semantic repeatability result.

Optionally, the comparing each comparison similarity with a preset similarity threshold to obtain the semantic repeatability result includes:

if any one of the comparison similarity is greater than or equal to the similarity threshold, the semantic repeatability result is label repetition;

and if all the comparison similarities are smaller than the similarity threshold, the semantic repeatability result is that the labels are not repeated.

In order to solve the above problem, the present invention further provides a label system updating apparatus, including:

the feature extraction module is used for acquiring a label requirement, wherein the label requirement comprises: a new label, a label level corresponding to the new label, an original feature word and an example text; extracting features of the example text to obtain a newly added feature word, and fusing the newly added feature word with the original feature word to obtain a label feature word;

the tag comparison module is used for comparing the new tag with a tag in a preset original tag system according to the tag feature words and the tag levels to obtain a comparison result;

and the tag system updating module is used for adding the new tag into the original tag system according to the tag level to obtain a new tag system when the comparison result shows that the tag is not repeated.

In order to solve the above problem, the present invention also provides an electronic device, including:

a memory storing at least one computer program; and

a processor that executes the computer program stored in the memory to implement the above label system updating method.

In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one computer program is stored, and the at least one computer program is executed by a processor in an electronic device to implement the above-mentioned label system updating method.

The embodiment of the invention obtains the newly added feature words by extracting the features of the example text, fuses the newly added feature words and the original feature words to obtain the label feature words, enables the feature expression of the new label to be more comprehensive by extracting the features and fusing the feature words, and improves the accuracy of subsequent comparison; performing label text comparison and label semantic comparison on the new label and a label in a preset original label system according to the label feature words and the label level to obtain a comparison result; and when the comparison result is that the label is not repeated, adding the new label into the original label system according to the label level to obtain a new label system, and performing multi-dimensional comparison by using multiple comparison modes, wherein the comparison result of the repeated label is more accurate, the redundant update of the label system is reduced, and the update efficiency of the label system is improved. Therefore, the method, the device, the electronic equipment and the readable storage medium for updating the label system, provided by the embodiment of the invention, improve the efficiency of updating the label system.

Drawings

Fig. 1 is a schematic flow chart of a tag system updating method according to an embodiment of the present invention;

Fig. 2 is a schematic block diagram of a tag hierarchy updating apparatus according to an embodiment of the present invention;

Fig. 3 is a schematic internal structural diagram of an electronic device implementing a tag hierarchy updating method according to an embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The embodiment of the invention provides a label system updating method. The execution subject of the tag hierarchy updating method includes, but is not limited to, at least one of electronic devices that can be configured to execute the method provided by the embodiment of the present application, such as a server, a terminal, and the like. In other words, the tag hierarchy updating method may be performed by software or hardware installed in the terminal device or the server device, and the software may be a block chain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.

Referring to fig. 1, which is a schematic flow diagram of a label hierarchy updating method according to an embodiment of the present invention, in an embodiment of the present invention, the label hierarchy updating method includes:

S1, acquiring a label requirement, wherein the label requirement comprises: a new label, a label level corresponding to the new label, an original feature word and an example text;

In the embodiment of the present invention, the tag requirement is a request to add a new label to a preset label system, and includes the new label to be added and the original feature words of the new label. The new label is a short text field that concisely summarizes text content; for example, if the new label is "sports", the content corresponding to the new label is sports-related content. The label level is the level corresponding to the new label. The example text is a sample of text whose content the new label summarizes. The original feature words of the new label are preset words that describe the features of the new label.

S2, extracting features of the example text to obtain a new feature word, and performing feature fusion on the new feature word and the original feature word to obtain a label feature word;

In detail, the example text in the embodiment of the present invention is an example of the text content corresponding to the new label to be added. Further, in the embodiment of the present invention, the example text is segmented using different word segmentation granularities to obtain a text word segmentation result corresponding to each word segmentation granularity. The word segmentation granularity is the number of characters contained in each segmented word; it may be a single granularity, for example each segmented word is one character, or a mixed granularity, for example each segmented word is one or two characters. For example, if the example text is "the weather is really good today" and the word segmentation granularity is two characters, the corresponding text word segmentation result is the three words "today", "weather" and "really good". Further, the embodiment of the invention performs feature extraction on the text word segmentation result to obtain the newly added feature words.
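As an illustrative sketch only, fixed-granularity segmentation can be expressed in Python as follows. The function names `segment` and `segment_all` are hypothetical, the Chinese string is an assumed original of the translated example sentence, and a practical embodiment would more likely rely on a dictionary-based segmenter, particularly for mixed granularities.

```python
def segment(text: str, granularity: int) -> list[str]:
    """Split text into consecutive, non-overlapping chunks of `granularity` characters."""
    if granularity < 1:
        raise ValueError("granularity must be a positive integer")
    return [text[i:i + granularity] for i in range(0, len(text), granularity)]


def segment_all(text: str, granularities: list[int]) -> dict[int, list[str]]:
    """Return one text word segmentation result per word segmentation granularity."""
    return {g: segment(text, g) for g in granularities}


# Assumed original of "the weather is really good today" (six characters).
example_text = "今天天气真好"
results = segment_all(example_text, [1, 2])
print(results[2])  # the two-character segmentation: today / weather / really good
```

Each key of `results` is a granularity, so later steps can extract newly added feature words from every segmentation result independently.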

Further, in the embodiment of the present invention, performing feature extraction on the text word segmentation result to obtain the newly added feature word, includes: counting the occurrence frequency and the word position of each word in the text word segmentation result in the example text; and extracting words in the text word segmentation result according to the occurrence frequency and the word positions to obtain newly added feature words.

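In detail, in the embodiment of the present invention, the counting the occurrence frequency and the word position of each word in the text segmentation result in the example text includes: traversing the example text, and determining the number of times each word in the text word segmentation result occurs in the example text as the occurrence frequency; dividing the example text into clauses, and sorting the resulting text clauses according to their order in the example text; and determining the word position of each word in the text word segmentation result according to the sorted text clauses.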

For example, given an example text describing the gold market in 2020, traversing the example text shows that the word "gold" in the text word segmentation result appears 3 times in the example text, so the occurrence frequency of the word "gold" is determined to be 3; the word "gold price" in the text word segmentation result appears 2 times in the example text, so the occurrence frequency of the word "gold price" is determined to be 2.
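The counting step can be sketched as below. This is a minimal illustration, assuming plain substring counting; the function name `occurrence_frequency` and the sample sentence are hypothetical and were chosen so that the counts match the "gold" example above.

```python
def occurrence_frequency(example_text: str, words: list[str]) -> dict[str, int]:
    """Count how many times each segmented word occurs in the example text.

    str.count counts non-overlapping substring occurrences, so "gold" is
    also counted inside "gold price" (an assumption of this sketch).
    """
    return {word: example_text.count(word) for word in words}


# Hypothetical example text about the 2020 gold market.
example_text = (
    "gold rose in 2020; the gold price peaked in august, "
    "then the gold price eased."
)
frequencies = occurrence_frequency(example_text, ["gold", "gold price"])
print(frequencies)
```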

In detail, the example text can be divided into clauses according to preset clause symbols, and the resulting text clauses are sorted, wherein the clause symbols include but are not limited to one or a combination of ",", ";" and ".".

For example, the example text is: "The highest governing body of basketball is the International Basketball Federation, which was established in 1932 and is headquartered in Geneva, Switzerland." The example text may be divided into clauses according to the clause symbols, and the text clauses sorted according to their order in the example text, giving three text clauses: the first text clause, "the highest governing body of basketball is the International Basketball Federation"; the second text clause, "established in 1932"; and the third text clause, "headquartered in Geneva, Switzerland".

The word position of each word in the text word segmentation result can be determined according to the sorted text clauses. For example, if the word "basketball" in the text word segmentation result appears in the first, second and third text clauses, the word position of the word "basketball" is determined to be 1; and if the word "headquarters" in the text word segmentation result appears in the third text clause, the word position of the word "headquarters" is determined to be 3.
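A possible sketch of clause splitting and word-position lookup follows. It assumes ASCII clause symbols, takes the word position to be the 1-based index of the earliest clause containing the word (as in the example above), and uses a hypothetical three-clause English sentence; `split_clauses` and `word_position` are illustrative names.

```python
import re

CLAUSE_SYMBOLS = ",;."  # assumed ASCII stand-ins for the preset clause symbols


def split_clauses(text: str, symbols: str = CLAUSE_SYMBOLS) -> list[str]:
    """Split text at the clause symbols, keeping clauses in their original order."""
    pattern = "[" + re.escape(symbols) + "]"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def word_position(word: str, clauses: list[str]):
    """Return the 1-based index of the first clause containing word, or None."""
    for index, clause in enumerate(clauses, start=1):
        if word in clause:
            return index
    return None


# Hypothetical example text with three clauses.
example_text = (
    "The International Basketball Federation is the highest governing body "
    "of basketball, it was founded in 1932, its headquarters are in Geneva."
)
clauses = split_clauses(example_text)
print(word_position("basketball", clauses), word_position("headquarters", clauses))
```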

In the embodiment of the present invention, the more frequently a word of the text word segmentation result occurs in the example text, or the earlier its position in the example text, the more representative of the example text that word may be. Therefore, the text word segmentation result can be screened using the occurrence frequency and the word position, so that words that are more representative of the example text are selected as newly added feature words.

In detail, in the embodiment of the present invention, the extracting words from the example text according to the occurrence frequency and the word positions to obtain new feature words includes:

calculating a first index of the text word segmentation result according to the occurrence frequency;

acquiring a weight parameter, and calculating a second index of the text word segmentation result according to the word position and the weight parameter;

calculating a characteristic value of the text word segmentation result according to the first index and the second index;

and selecting the text word segmentation result with the characteristic value larger than a preset characteristic threshold value as the newly added characteristic word.

In this embodiment, the first index, the second index and the feature value may all indicate how representative a word of the text word segmentation result is of the example text. For example, the larger the values of the first index, the second index and the feature value, the more representative the word is of the example text; the smaller these values, the less representative the word is of the example text.

In detail, the occurrence frequency can be substituted into a preset keyword algorithm, such as TF-IDF or TextRank, to calculate a first index of each word in the text word segmentation result.

Specifically, the weight parameter identifies the importance weight of a word of the text word segmentation result at different word positions in the example text, and the weight parameter can be preset by a user. In general, the earlier the word position, the larger the weight parameter; for example, if the word positions of the word "gold" in the text word segmentation result are 1, 2 and 3, the weight of the word "gold" at word position 1 is 0.8, the weight at word position 2 is 0.7, and the weight at word position 3 is 0.6.

In this embodiment, the calculating a second indicator of each word of the text word segmentation result according to the word position and the weight parameter includes:

calculating a second index of each word in the text word segmentation result by using the following index algorithm:

F = α * ρ

wherein, F is the second index, α is a weight parameter of each word in the text segmentation result, and ρ is a word position of each word in the text segmentation result.
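The index algorithm can be illustrated with the sample weights given earlier (0.8, 0.7 and 0.6 for word positions 1, 2 and 3). This is a sketch under stated assumptions: the lookup table `WEIGHTS` and the function name `second_index` are hypothetical, and the weights are taken to be keyed by word position.

```python
# Assumed user-preset weight parameters, keyed by word position.
WEIGHTS = {1: 0.8, 2: 0.7, 3: 0.6}


def second_index(position: int, weights: dict[int, float]) -> float:
    """Compute F = alpha * rho for a word at the given word position."""
    alpha = weights[position]  # weight parameter for this word position
    return alpha * position


for rho in sorted(WEIGHTS):
    print(rho, round(second_index(rho, WEIGHTS), 2))
```

With these sample weights F grows with ρ (about 0.8, 1.4 and 1.8), so if earlier positions are meant to produce larger indices, the weights would need to fall off faster than 1/ρ; the text does not state how this is handled.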

Further, calculation of the feature value of each word in the text segmentation result can be achieved by performing operations such as addition and multiplication on the first index and the second index, and the word in the text segmentation result with the feature value larger than the feature threshold value is determined to be a newly added feature word.
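The combination and threshold screening might look like the following sketch. The index values, the threshold and the names `feature_value` and `select_feature_words` are all hypothetical; the text only says that addition or multiplication may be used, so both are offered as modes.

```python
def feature_value(first: float, second: float, mode: str = "add") -> float:
    """Combine the first and second indices by addition or multiplication."""
    if mode == "add":
        return first + second
    if mode == "multiply":
        return first * second
    raise ValueError("mode must be 'add' or 'multiply'")


def select_feature_words(indices: dict[str, tuple[float, float]],
                         threshold: float, mode: str = "add") -> list[str]:
    """Keep words whose feature value is strictly greater than the threshold."""
    return [word for word, (first, second) in indices.items()
            if feature_value(first, second, mode) > threshold]


# Hypothetical (first index, second index) pairs for three segmented words.
indices = {"gold": (3.0, 0.8), "price": (1.0, 1.4), "market": (1.0, 1.8)}
new_feature_words = select_feature_words(indices, threshold=3.0)
print(new_feature_words)
```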

According to the embodiment of the invention, the occurrence frequency and the word position of each word in the text word segmentation result are counted, and the newly added feature word of the example text is extracted according to the occurrence frequency and the word position, so that the words in the text word segmentation result are screened, and the words with higher representativeness to the example text are obtained as the newly added feature word, and the accuracy of the feature word description corresponding to the label is improved.

In detail, in the embodiment of the present invention, the original feature words are feature words corresponding to the new tag in the tag requirement, and if the new tag is "gold", the corresponding feature words are "gold price" and "precious metal".

Further, since the original feature words are manually preset, they may be few in number and may not express the corresponding label comprehensively; therefore, in the embodiment of the present invention, the newly added feature words and the original feature words are subjected to feature fusion to obtain the label feature words.

In detail, in the embodiment of the present invention, the performing feature fusion on the newly added feature words and the original feature words includes: summarizing all the newly added feature words and the original feature words; and carrying out repeated word deletion processing on the summarized words to obtain the label characteristic words. In the embodiment of the invention, because the word segmentation results corresponding to each word segmentation granularity are different, each word segmentation granularity corresponds to one newly added feature word, and a plurality of newly added feature words exist.
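Feature fusion as described (summarize, then delete repeated words) can be sketched in a few lines. The function name `fuse_feature_words` and the sample newly added words are hypothetical; preserving first-seen order is a choice of this sketch, not something the text requires.

```python
def fuse_feature_words(original: list[str], newly_added: list[str]) -> list[str]:
    """Merge both word lists and drop repeated words, keeping first-seen order."""
    return list(dict.fromkeys([*original, *newly_added]))


original_words = ["gold price", "precious metal"]   # from the "gold" example
newly_added_words = ["gold", "gold price", "market"]  # hypothetical extraction output
label_feature_words = fuse_feature_words(original_words, newly_added_words)
print(label_feature_words)
```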

In another embodiment of the invention, the example text is stored in a blockchain node, making use of the high-throughput characteristic of the blockchain to improve data access efficiency.

S3, comparing the new label with a label in a preset original label system according to the label feature word and the label level to obtain a comparison result;

In detail, the original tag system in the embodiment of the present invention is a set of labels organized in a tree structure and divided into three levels: first-level labels, second-level labels and third-level labels. For example, the first-level label is "life"; the second-level labels further refine the first-level label, such as food, fashion, travel, health, entertainment, culture, education, workplace, sports, science and technology, automobile, society and region; and the third-level labels further refine the second-level labels, such as football, basketball and tennis under the second-level label "sports". Each label in the original label system has a level.
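One way to picture such a tree-structured label system is the sketch below. The class `LabelNode` and the helper `labels_by_level` are hypothetical names, and only a few of the example labels are included.

```python
from dataclasses import dataclass, field


@dataclass
class LabelNode:
    """A label in the tree; level 1 is the root level."""
    name: str
    level: int = 1
    children: list["LabelNode"] = field(default_factory=list)

    def add_child(self, name: str) -> "LabelNode":
        """Attach a child label one level below this label."""
        child = LabelNode(name, self.level + 1)
        self.children.append(child)
        return child


def labels_by_level(root: LabelNode) -> dict[int, list[str]]:
    """Group label names by level using a pre-order traversal."""
    grouped: dict[int, list[str]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        grouped.setdefault(node.level, []).append(node.name)
        stack.extend(reversed(node.children))
    return grouped


life = LabelNode("life")
sports = life.add_child("sports")
life.add_child("food")
for name in ("football", "basketball", "tennis"):
    sports.add_child(name)
print(labels_by_level(life))
```

Grouping labels by level in this way gives a convenient starting point for selecting the labels of the same level or of different levels.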

Further, in the embodiment of the present invention, all tags belonging to different levels from the tag level in the original tag system are extracted, so as to obtain a first comparison tag set. For example: and if the label level corresponding to the new label is the second level, extracting the first-level label and the third-level label in the label system to obtain the first comparison label set.

In detail, in the embodiment of the present invention, a label text repeatability comparison is performed between the new label and each label in the first comparison label set to obtain a text repeatability result. For example, if the new label is "gold" and a "gold" label exists in the first comparison label set, the text repeatability result is label repetition; if no "gold" label exists in the first comparison label set, the text repeatability result is that the labels are not repeated. Further, if the text repeatability result is label repetition, the text repeatability result is determined as the comparison result.
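The first comparison label set and the text repeatability check can be sketched as follows, assuming the label system is available as a mapping from level to label names and that text repetition means an exact string match. The names `first_comparison_set` and `is_text_repeated` and the sample system are hypothetical.

```python
def first_comparison_set(system: dict[int, list[str]], new_level: int) -> set[str]:
    """Collect every label whose level differs from the new label's level."""
    return {name
            for level, names in system.items() if level != new_level
            for name in names}


def is_text_repeated(new_label: str, comparison_set: set[str]) -> bool:
    """Exact-match text comparison (an assumption of this sketch)."""
    return new_label in comparison_set


# Hypothetical original label system, keyed by label level.
system = {1: ["life"], 2: ["sports", "finance"], 3: ["football", "gold"]}
candidates = first_comparison_set(system, new_level=2)
print(is_text_repeated("gold", candidates))
```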

Further, in the embodiment of the present invention, if the text repetition result indicates that the tags are not repeated, all tags belonging to the same level as the tag level in the tag system are extracted to obtain a second comparison tag set; for example, if the level corresponding to the new tag is level 2, extracting all the tags of level 2 in the tag system to obtain a second comparison tag set.

To compare repeatability along more dimensions, in the embodiment of the present invention, the feature words corresponding to each label in the second comparison label set are obtained as label comparison feature words; label semantic repeatability comparison is performed between the label comparison feature words and the label feature words to obtain a semantic repeatability result; and the semantic repeatability result is determined as the comparison result.
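The two-stage comparison described so far can be summarized in the sketch below. It is only an outline under assumptions: the label system is a level-to-names mapping, text repetition is an exact match, and the semantic check is passed in as a callable (`is_semantic_repeat`, a hypothetical stand-in for the vector-based comparison); the toy synonym table is invented for illustration.

```python
from typing import Callable


def compare_label(new_label: str, new_level: int,
                  system: dict[int, list[str]],
                  is_semantic_repeat: Callable[[str, str], bool]) -> str:
    """Stage 1: text comparison against other levels; stage 2: semantic comparison at the same level."""
    for level, names in system.items():
        if level != new_level and new_label in names:
            return "label repetition"
    same_level = system.get(new_level, [])
    if any(is_semantic_repeat(new_label, existing) for existing in same_level):
        return "labels not repeated" if False else "label repetition"
    return "labels not repeated"


# Toy semantic check: a hypothetical table of label pairs treated as repeats.
SYNONYMS = {("athletics", "sports")}


def toy_semantic_repeat(new_label: str, existing: str) -> bool:
    return new_label == existing or (new_label, existing) in SYNONYMS


system = {1: ["life"], 2: ["sports", "food"], 3: ["football", "basketball"]}
print(compare_label("finance", 2, system, toy_semantic_repeat))
```

The semantic stage runs only when the text stage finds no repetition, which keeps the more expensive comparison off the common path.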

In detail, in the embodiment of the present invention, a feature word corresponding to each tag of the second comparison tag set is obtained, so as to obtain a tag comparison feature word; converting each Word in the tag comparison characteristic words into a vector to obtain a comparison vector set, optionally, in the embodiment of the invention, vector conversion can be performed by using a Word2vec model formed by transfer learning training based on a preset professional field knowledge text (such as teaching materials and training materials); further, performing vector compression on all vectors in the comparison vector set to obtain comparison vectors; optionally, in the embodiment of the present invention, arithmetic mean calculation is performed on all vectors in the comparison vector set to obtain the comparison vector; further, converting the labels corresponding to the label comparison feature words in the second comparison label set into vectors to obtain original label vectors; converting each word in the label feature words into a vector to obtain a feature vector set; performing vector compression on all vectors in the feature vector set to obtain feature vectors; converting the new label into a vector to obtain a newly added label vector; and calculating the similarity between the feature vector and the comparison vector to obtain a first similarity.
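Vector compression by arithmetic mean can be sketched as below. The tiny embedding table `TOY_EMBEDDINGS` is a hypothetical stand-in for a trained Word2vec model, and `compress` is an illustrative name.

```python
def compress(vectors: list[list[float]]) -> list[float]:
    """Average a set of equal-length vectors element by element."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dimension = len(vectors[0])
    if any(len(vector) != dimension for vector in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical two-dimensional word vectors for the feature words of a label.
TOY_EMBEDDINGS = {
    "gold price": [1.0, 2.0],
    "precious metal": [3.0, 4.0],
}
feature_vector = compress([TOY_EMBEDDINGS[w] for w in ("gold price", "precious metal")])
print(feature_vector)
```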

The similarity calculation in the embodiment of the present invention may be performed by using the following formula:

wherein X_i represents the i-th element of the feature vector X, Y_i represents the i-th element of the comparison vector Y, and Sim represents the similarity between the feature vector X and the comparison vector Y.
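A similarity defined over the elements X_i and Y_i in this way is commonly the cosine similarity. The sketch below assumes cosine similarity for illustration; that choice is an assumption, since the formula itself does not appear in this text, and `cosine_similarity` is a hypothetical name.

```python
import math


def cosine_similarity(x: list[float], y: list[float]) -> float:
    """Cosine of the angle between x and y; 0.0 if either vector is all zeros."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


first_similarity = cosine_similarity([1.0, 1.0], [1.0, 0.0])
print(round(first_similarity, 4))
```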

Specifically, the embodiment of the present invention calculates the similarity between the original tag vector and the newly added tag vector to obtain a second similarity; further, according to the embodiment of the invention, similarity comparison and judgment are carried out according to the first similarity and the second similarity, so as to obtain a semantic repeatability result.

Further, the embodiment of the present invention performs similarity comparison and judgment according to the first similarity and the second similarity to obtain the semantic repeatability result, including: calculating to obtain comparison similarity according to the first similarity and the second similarity; summarizing all the comparison similarities to obtain a comparison similarity set; further, in the embodiment of the present invention, each comparison similarity in the comparison similarity set is compared with a preset similarity threshold, so as to obtain the semantic repeatability result.

Optionally, in the embodiment of the present invention, the comparison similarity is calculated by using the following formula:

S = m·a_j * n·b_j

wherein S is the comparison similarity; j is the label number in the second comparison label set; a_j is the first similarity corresponding to the label numbered j in the second comparison label set; b_j is the second similarity corresponding to the label numbered j in the second comparison label set; and m and n are preset weight parameters.

In detail, the embodiment of the present invention compares each comparison similarity with a preset similarity threshold to obtain a semantic repeatability result, including: if any one of the comparison similarity is greater than or equal to the similarity threshold, the semantic repeatability result is label repetition; and if all the comparison similarities are smaller than the similarity threshold, the semantic repeatability result is that the labels are not repeated.
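The comparison similarity and the threshold judgment can be sketched together as follows. The formula is applied as written in the text (a product of the two weighted similarities); the weights, the threshold, the sample similarity pairs and the function names are hypothetical.

```python
def comparison_similarity(a_j: float, b_j: float, m: float = 1.0, n: float = 1.0) -> float:
    """S = (m * a_j) * (n * b_j), following the formula as written."""
    return (m * a_j) * (n * b_j)


def semantic_repeatability(pairs: list[tuple[float, float]], threshold: float,
                           m: float = 1.0, n: float = 1.0) -> str:
    """Label repetition if any comparison similarity reaches the threshold."""
    similarities = [comparison_similarity(a, b, m, n) for a, b in pairs]
    if any(s >= threshold for s in similarities):
        return "label repetition"
    return "labels not repeated"


# Hypothetical (first similarity, second similarity) pairs, one per same-level label.
pairs = [(0.9, 0.8), (0.3, 0.2)]
print(semantic_repeatability(pairs, threshold=0.7))
```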

S4, judging whether the comparison result is label duplication;

S5, when the comparison result is that the labels are not repeated, adding the new label into the original label system according to the label level to obtain a new label system;

In the embodiment of the present invention, if the comparison result indicates that the labels are not repeated, the new label is added to the original label system according to the label level to obtain the new label system. For example, if the label level is 2, the new label is added to the original label system as a level-2 label to obtain the new label system.
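Adding the new label at its level can be sketched as below, assuming a flat level-to-names mapping. How a parent label is chosen within the tree is not described in the text and is therefore left out; `add_label` and the sample system are hypothetical.

```python
def add_label(system: dict[int, list[str]], new_label: str, level: int) -> dict[int, list[str]]:
    """Return a new label system with new_label appended at the given level."""
    updated = {lvl: list(names) for lvl, names in system.items()}  # copy, keep original intact
    updated.setdefault(level, []).append(new_label)
    return updated


original_system = {1: ["life"], 2: ["sports", "food"]}
new_system = add_label(original_system, "finance", level=2)
print(new_system)
```

Returning a copy rather than mutating the input keeps the original label system available if the update has to be abandoned.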

S6, when the comparison result is label repetition, stopping updating the label system.

In the embodiment of the invention, if the comparison result is that the label is repeated, the response of the label requirement fails, and the updating of the label system is stopped.

Fig. 2 is a functional block diagram of the label system updating apparatus according to the present invention.

The label system updating apparatus 100 of the present invention can be installed in an electronic device. According to the implemented functions, the tag system updating apparatus may include a feature extraction module 101, a tag comparison module 102, and a tag system updating module 103. A module according to the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of the electronic device to perform fixed functions and that are stored in a memory of the electronic device.

In the present embodiment, the functions regarding the respective modules/units are as follows:

the feature extraction module 101 is configured to obtain a tag requirement, where the tag requirement includes: a new label, an original feature word corresponding to the new label, a label level and an example text; extracting features of the example text to obtain a newly added feature word, and fusing the newly added feature word with the original feature word to obtain a label feature word;

In detail, the example text in the embodiment of the present invention is an example of the text content corresponding to the new label to be added. Further, the feature extraction module 101 in the embodiment of the present invention performs word segmentation on the example text using different word segmentation granularities to obtain a text word segmentation result corresponding to each word segmentation granularity. The word segmentation granularity is the number of characters contained in each segmented word; it may be a single granularity, for example each segmented word is one character, or a mixed granularity, for example each segmented word is one or two characters. For example, if the example text is "the weather is really good today" and the word segmentation granularity is two characters, the corresponding text word segmentation result is the three words "today", "weather" and "really good". Further, the embodiment of the invention performs feature extraction on the text word segmentation result to obtain the newly added feature words.

Further, in this embodiment of the present invention, the feature extracting module 101 performs feature extraction on the text word segmentation result to obtain the newly added feature word, including: counting the occurrence frequency and the word position of each word in the text word segmentation result in the example text; and extracting words in the text word segmentation result according to the occurrence frequency and the word positions to obtain newly added feature words.

In detail, in the embodiment of the present invention, the counting, by the feature extraction module 101, the occurrence frequency and the word position of each word in the text segmentation result in the example text includes:

In detail, the feature extraction module 101 may split the example text into sentences according to preset clause symbols and sort the resulting text sentences, where the clause symbols include, but are not limited to, one or more of ";" and ".".

The feature extraction module 101 of the embodiment of the present invention may determine the word position of each word in the text word segmentation result according to the sorted text sentences. For example, if the word 'gold' appears in the first, second, and third text sentences, the word position of 'gold' is determined to be 1; and if the word 'gold price' appears in the third text sentence, the word position of 'gold price' is determined to be 3.

In detail, in the embodiment of the present invention, the extracting words from the example text by the feature extracting module 101 according to the occurrence frequency and the word position to obtain a new feature word includes:

In this embodiment, the calculating, by the feature extraction module 101, a second index of each word of the text word segmentation result according to the word position and the weight parameter includes:

F = α * ρ

The feature extraction module 101 of the embodiment of the present invention counts the occurrence frequency and the word position of each word in the text segmentation result, and extracts the new feature words of the example text according to the occurrence frequency and the word position, so as to implement the screening of the words in the text segmentation result, obtain the words with higher representativeness of the example text as the new feature words, and improve the accuracy of the description of the feature words corresponding to the tags.

Further, because the original feature words are preset manually, they may be few in number and may not express the corresponding tag comprehensively. Therefore, in the embodiment of the present invention, the feature extraction module 101 performs feature fusion on the newly added feature words and the original feature words to obtain the tag feature words.

In detail, in the embodiment of the present invention, the performing, by the feature extraction module 101, feature fusion on the newly added feature words and the original feature words includes: summarizing all the newly added feature words and the original feature words; and removing duplicate words from the summarized words to obtain the tag feature words. In the embodiment of the invention, because the word segmentation result differs for each word segmentation granularity, each granularity yields its own newly added feature word, so a plurality of newly added feature words exist.
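The fusion step (summarize, then remove duplicate words) can be sketched as below. The function name `fuse_feature_words` and the sample words are hypothetical, and preserving first-seen order is a choice made here for readability, not something the description requires.

```python
def fuse_feature_words(new_words_per_granularity: list[list[str]],
                       original_words: list[str]) -> list[str]:
    """Summarize all feature words, then drop repeated words (order preserved)."""
    summarized = list(original_words)
    for words in new_words_per_granularity:
        summarized.extend(words)
    # dict.fromkeys keeps the first occurrence of each word and discards repeats.
    return list(dict.fromkeys(summarized))


# Hypothetical inputs: one list of newly added feature words per granularity.
tag_feature_words = fuse_feature_words(
    [["gold", "price"], ["gold price", "gold"]],
    ["gold", "precious metal"],
)
```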

The tag comparison module 102 is configured to perform tag text comparison and tag semantic comparison on the new tag and a tag in a preset original tag system according to the tag feature words and the tag levels to obtain a comparison result;

Further, the tag comparison module 102 in the embodiment of the present invention extracts all tags in the original tag system whose level differs from the tag level of the new tag to obtain a first comparison tag set. For example, if the tag level corresponding to the new tag is the second level, the first-level tags and the third-level tags in the tag system are extracted to obtain the first comparison tag set.

In detail, in the embodiment of the present invention, the tag comparison module 102 performs a tag text repeatability comparison between the new tag and each tag in the first comparison tag set to obtain a text repeatability result. For example, if the new tag is 'gold' and a 'gold' tag exists in the first comparison tag set, the text repeatability result is that the tag is repeated; if no 'gold' tag exists in the first comparison tag set, the text repeatability result is that the tag is not repeated. Further, if the text repeatability result is that the tag is repeated, the tag comparison module 102 determines the text repeatability result as the comparison result.

Further, in the embodiment of the present invention, if the text repetition result indicates that the tags are not repeated, the tag comparison module 102 extracts all tags in the tag system that belong to the same level as the tag level to obtain a second comparison tag set; for example, if the level corresponding to the new tag is level 2, extracting all the tags of level 2 in the tag system to obtain a second comparison tag set.
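The two comparison tag sets can be sketched as follows. For illustration the tag system is flattened into a hypothetical mapping from level number to the set of tag names at that level; the tree structure of the real tag system, the function names, and the sample tags are assumptions of this sketch.

```python
TagSystem = dict[int, set[str]]  # assumed flat view: level -> tag names at that level


def first_comparison_set(level: int, system: TagSystem) -> set[str]:
    """All tags whose level differs from the level of the new tag."""
    return {tag for lvl, tags in system.items() if lvl != level for tag in tags}


def is_text_duplicate(new_tag: str, level: int, system: TagSystem) -> bool:
    """Tag text comparison: True when an identical tag exists at another level."""
    return new_tag in first_comparison_set(level, system)


def second_comparison_set(level: int, system: TagSystem) -> set[str]:
    """All tags at the same level as the new tag, used for the semantic comparison."""
    return set(system.get(level, set()))


# Hypothetical three-level tag system.
system: TagSystem = {1: {"investment"}, 2: {"gold", "funds"}, 3: {"gold price"}}
```

Only when `is_text_duplicate` returns False does the flow move on to the second comparison tag set and the semantic comparison.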

To compare repetition along more than one dimension, in the embodiment of the present invention, the tag comparison module 102 obtains the feature words corresponding to each tag in the second comparison tag set to obtain tag comparison feature words, performs a tag semantic repeatability comparison between the tag comparison feature words and the tag feature words to obtain a semantic repeatability result, and determines the semantic repeatability result as the comparison result.

In detail, in the embodiment of the present invention, the tag comparison module 102 obtains the feature words corresponding to each tag of the second comparison tag set to obtain the tag comparison feature words, and converts each word in the tag comparison feature words into a vector to obtain a comparison vector set. Optionally, the vector conversion may be performed by a Word2vec model trained by transfer learning on preset professional domain knowledge texts (such as teaching materials and training materials). Further, vector compression is performed on all vectors in the comparison vector set to obtain a comparison vector; optionally, the comparison vector is the arithmetic mean of all vectors in the comparison vector set. Further, the tag in the second comparison tag set that corresponds to the tag comparison feature words is converted into a vector to obtain an original tag vector; each word in the tag feature words is converted into a vector to obtain a feature vector set; vector compression is performed on all vectors in the feature vector set to obtain a feature vector; the new tag is converted into a vector to obtain a newly added tag vector; and the similarity between the feature vector and the comparison vector is calculated to obtain a first similarity.

Specifically, in the embodiment of the present invention, the tag comparison module 102 calculates a similarity between the original tag vector and the newly added tag vector to obtain a second similarity; further, according to the embodiment of the invention, similarity comparison and judgment are carried out according to the first similarity and the second similarity, so as to obtain a semantic repeatability result.
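The vector compression and the two similarities can be sketched as below. Several points are assumptions of this sketch: the description names arithmetic mean as an optional compression but does not name a similarity measure, so cosine similarity is swapped in here; the word vectors are hand-written toy values rather than the output of a trained Word2vec model; and all function names are hypothetical.

```python
import math


def mean_vector(vectors: list[list[float]]) -> list[float]:
    """Compress a vector set into one vector by arithmetic mean."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity; an assumed measure, not specified by the description."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy vectors standing in for Word2vec output (hypothetical values).
comparison_vector_set = [[1.0, 0.0], [0.0, 1.0]]   # feature words of an existing tag
feature_vector_set = [[1.0, 1.0], [3.0, 3.0]]      # tag feature words of the new tag
original_tag_vector = [1.0, 0.0]                   # existing tag itself
new_tag_vector = [0.0, 1.0]                        # new tag itself

comparison_vector = mean_vector(comparison_vector_set)
feature_vector = mean_vector(feature_vector_set)
first_similarity = cosine_similarity(feature_vector, comparison_vector)
second_similarity = cosine_similarity(original_tag_vector, new_tag_vector)
```

In this toy case the feature words point in the same direction while the tag names themselves do not, which is the kind of disagreement the combined comparison similarity is meant to weigh.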

Further, in the embodiment of the present invention, the tag comparison module 102 performs similarity comparison and judgment according to the first similarity and the second similarity to obtain the semantic repeatability result, including: calculating a comparison similarity according to the first similarity and the second similarity; summarizing all the comparison similarities to obtain a comparison similarity set; and comparing each comparison similarity in the comparison similarity set with a preset similarity threshold to obtain the semantic repeatability result.

S = ma_j * nb_j

In detail, the tag comparison module 102 of the embodiment of the present invention compares each comparison similarity with a preset similarity threshold to obtain a semantic repeatability result, including: if any one of the comparison similarity is greater than or equal to the similarity threshold, the semantic repeatability result is label repetition; and if all the comparison similarities are smaller than the similarity threshold, the semantic repeatability result is that the labels are not repeated.
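The threshold rule above reduces to an "any comparison similarity reaches the threshold" check, sketched below. The function name and the numeric values are hypothetical; the description only says the threshold is preset.

```python
def is_semantic_duplicate(comparison_similarities: list[float], threshold: float) -> bool:
    """Tag is repeated if any comparison similarity is >= the threshold."""
    return any(similarity >= threshold for similarity in comparison_similarities)


# Hypothetical comparison similarity set and threshold.
repeated = is_semantic_duplicate([0.20, 0.80], threshold=0.80)
not_repeated = is_semantic_duplicate([0.20, 0.79], threshold=0.80)
```

An empty comparison similarity set (no tags at the same level) yields "not repeated" under this sketch, since no similarity can reach the threshold.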

The tag system updating module 103 is configured to, when the comparison result is that the tag is not repeated, add the new tag to the original tag system according to the tag level to obtain a new tag system.

The tag system updating module 103 determines whether the comparison result is a tag duplication; when the comparison result is that the labels are not repeated, adding the new labels into the original label system according to the label level to obtain a new label system;

In this embodiment of the present invention, if the comparison result is that the tag is not repeated, the tag system updating module 103 adds the new tag to the original tag system according to the tag level to obtain the new tag system. For example, if the tag level is 2, the new tag is added to the original tag system as a level-2 tag to obtain the new tag system. If the comparison result is that the tag is repeated, the response to the tag requirement fails, and the tag system updating module 103 stops updating the tag system.
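The update decision can be sketched as follows, again using a hypothetical flat mapping from level number to tag names in place of the tree-structured tag system. How a new tag is attached to a parent node is not described in this passage, so the sketch does not model it; the function name and the returned success flag are assumptions.

```python
def update_tag_system(system: dict[int, set[str]], new_tag: str, level: int,
                      is_duplicate: bool) -> tuple[bool, dict[int, set[str]]]:
    """Return (success, tag system); the original system is never mutated."""
    if is_duplicate:
        # Tag is repeated: the tag requirement fails and the update stops.
        return False, system
    updated = {lvl: set(tags) for lvl, tags in system.items()}
    updated.setdefault(level, set()).add(new_tag)
    return True, updated


# Hypothetical original tag system and a non-repeated new level-2 tag.
original = {1: {"investment"}, 2: {"funds"}}
ok, new_system = update_tag_system(original, "gold", 2, is_duplicate=False)
failed, unchanged = update_tag_system(original, "funds", 2, is_duplicate=True)
```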

Fig. 3 is a schematic structural diagram of an electronic device implementing the label hierarchy updating method according to the present invention.

The electronic device may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program, such as a tag hierarchy update program, stored in the memory 11 and executable on the processor 10.

The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, for example a removable hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only to store application software installed in the electronic device and various types of data, such as a code of a tag hierarchy updating program, but also to temporarily store data that has been output or will be output.

The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device by running or executing programs or modules (e.g., tag architecture update programs, etc.) stored in the memory 11 and calling data stored in the memory 11.

The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The bus may be divided into an address bus, a data bus, a control bus, etc. The communication bus 12 is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

Fig. 3 shows only an electronic device having components, and those skilled in the art will appreciate that the structure shown in fig. 3 does not constitute a limitation of the electronic device, and may include fewer or more components than those shown, or some components may be combined, or a different arrangement of components.

For example, although not shown, the electronic device may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management and the like are realized through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

Optionally, the communication interface 13 may include a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), which is generally used to establish a communication connection between the electronic device and other electronic devices.

Optionally, the communication interface 13 may further include a user interface, which may be a Display (Display), an input unit (such as a Keyboard (Keyboard)), and optionally, a standard wired interface, or a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable, among other things, for displaying information processed in the electronic device and for displaying a visualized user interface.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The tag hierarchy update program stored in the memory 11 of the electronic device is a combination of a plurality of computer programs, which when executed in the processor 10, can implement:

Specifically, the processor 10 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the computer program, which is not described herein again.

Further, the electronic device integrated module/unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a computer readable storage medium. The computer readable medium may be non-volatile or volatile. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).

Embodiments of the present invention may also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor of an electronic device, the computer program may implement:

Further, the computer usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A label system updating method, comprising:

2. The label system updating method of claim 1, wherein the extracting the features of the example text to obtain the label features comprises:

3. The tag hierarchy updating method of claim 2, wherein the counting of the occurrence frequency and word position of each word in the text segmentation result in the example text comprises:

4. The method for updating a tag system according to claim 1, wherein the comparing the new tag with a tag in a preset original tag system according to the tag feature word and the tag level to obtain a comparison result comprises:

5. The tag system updating method according to claim 1, wherein the tag semantic repeating comparison of the tag comparison feature word with the tag feature word to obtain the semantic repeating result comprises:

converting the new label into a vector to obtain a newly added label vector;

6. The method for updating a tag system according to claim 5, wherein the comparing and determining the similarity according to the first similarity and the second similarity to obtain the semantic repeating result comprises:

7. The tag system updating method according to claim 6, wherein the comparing each of the comparison similarities with a preset similarity threshold to obtain the semantic repeating result comprises:

8. A label system updating apparatus, comprising:

9. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the tag hierarchy updating method of any one of claims 1 to 7.

10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the label system updating method according to any one of claims 1 to 7.