CN116204594A - Data processing method, device and equipment based on block chain - Google Patents
Data processing method, device and equipment based on block chain
- Publication number
- CN116204594A (application number CN202310493762.9A)
- Authority
- CN
- China
- Prior art keywords
- input
- word
- input information
- information
- detected
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a blockchain-based data processing method, device and equipment. The blockchain-based data processing method comprises the following steps: receiving input information sent by a first node on a blockchain; performing, through a smart contract of the blockchain, similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result; and if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, otherwise returning feedback information indicating that the detection was not passed to the first node. By performing compliance detection on the input information, the scheme of the invention can improve the accuracy of data exchange.
Description
Technical Field
The present invention relates to the field of computer information processing technologies, and in particular, to a data processing method, apparatus, and device based on a blockchain.
Background
A blockchain is a chain composed of blocks. Each block stores certain information, and the blocks are linked into a chain in the chronological order in which they were generated. The chain is stored on all servers; as long as one server in the whole system can work, the whole blockchain is safe. These servers are called nodes in the blockchain system, and each node provides storage space and computing support for the whole blockchain system. To modify information in the blockchain, the consent of a plurality of nodes must be obtained and the information in all nodes must be modified; since the nodes are usually held by different parties, the information recorded on the blockchain is more authentic and reliable. The blockchain also features information synchronization and open, transparent information.
However, because wording habits and the focus of the required content differ among the parties, it becomes harder to deliver the corresponding data to the right place during data provision, which in turn reduces accuracy during information interaction. In a scenario where a passenger travels by air and outbound transfer of data is involved, an airport or an airline is usually required to provide relevant data to explain and declare the situation. However, because the data expected by the data provider and the data receiver differ, the accuracy of data exchange is low, which in turn makes data exchange inefficient.
Disclosure of Invention
The invention aims to provide a data processing method, device and equipment based on a block chain, which can improve the accuracy of data exchange by detecting the compliance of input information.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a blockchain-based data processing method, the method comprising:
receiving input information sent by a first node on a blockchain;
performing, through a smart contract of the blockchain, similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result;
and if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, otherwise returning feedback information indicating that the detection was not passed to the first node.
Optionally, receiving the input information sent by the first node on the blockchain includes:
acquiring an initial questionnaire template image;
performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input;
and carrying out recombination processing on the sub-input information corresponding to each field to be input to obtain the input information.
Optionally, performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input, including:
performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
determining a plurality of first fields to be input and a plurality of second fields to be input according to the types of the objects to be input contained in the first scanning frame;
acquiring a coding similarity set of each first field to be input and each second field to be input;
when the maximum coding similarity in the coding similarity set is larger than a coding threshold, replacing a second text mark of a second field to be input corresponding to the maximum coding similarity with the first text mark;
and adding the first field to be input and the second field to be input with the same text mark into the same initial questionnaire template image to obtain a plurality of texts to be input.
Optionally, performing, through a smart contract of the blockchain, similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result includes:
acquiring a comparison set of at least one target comparison information corresponding to the input information;
performing word segmentation processing on the input information to obtain a plurality of words to be detected corresponding to the input information;
performing part-of-speech detection on each word to be detected, and determining the part-of-speech tag corresponding to each word to be detected;
determining the part-of-speech sequence to be detected of the input information according to the part-of-speech tags contained in the input information;
obtaining, according to the words to be detected in the part-of-speech sequence to be detected and the target comparison information, a similarity value sequence formed by the similarity values of the input information and each target comparison information;
and if the maximum value in the similarity value sequence is greater than the similarity threshold, obtaining a detection result indicating that the detection is passed, otherwise obtaining a detection result indicating that the detection is not passed.
Optionally, obtaining, according to the words to be detected in the part-of-speech sequence to be detected and the target comparison information, a similarity value sequence formed by the similarity values of the input information and each target comparison information includes:
obtaining, according to a formula [formula not preserved], the similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein the formula expresses the similarity value of the input information and the i-th target comparison information in terms of the sub-similarity values of the b-th word to be detected in the input information and the i-th target comparison information, b = 1, 2, 3, …, y, and y is the total number of words to be detected in the input information.
acquiring a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part of the part-of-speech class sequence and of the vocabulary sequence that follows the matching point position corresponding to the (b-1)-th word to be detected in the comparison set of the i-th target comparison information corresponding to the input information; the matching point position corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information differs from every comparison part-of-speech tag in the target matching sequence, the sub-similarity value takes the value [formula not preserved];
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the comparison part-of-speech tags in the target matching sequence, determining the comparison word corresponding to that part-of-speech tag as an initial comparison word;
when the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word, the sub-similarity value takes the value [formula not preserved];
when the b-th word to be detected in the input information belongs to the vocabulary corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and the sub-similarity value takes the value [formula not preserved];
Optionally, the similarity threshold satisfies the following condition:
[formula not preserved]
wherein Y1 is the similarity threshold, the condition involves a threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
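The sequential matching described above can be illustrated with a minimal Python sketch. Because the formulas are not preserved in this text, the sketch makes explicit assumptions: a sub-similarity value of 1 is assumed when both the part-of-speech tag and the vocabulary match, 0 otherwise, and the similarity value is assumed to be the sum of the sub-similarity values. The function name `similarity_value` and the data layout are hypothetical, and choosing the first position that satisfies both conditions is also an assumption.

```python
from typing import Dict, List, Set


def similarity_value(
    words: List[str],
    tags: List[str],
    pos_seq: List[str],
    vocab_seq: List[str],
    vocab_tables: Dict[str, Set[str]],
) -> int:
    """Accumulate assumed 0/1 sub-similarity values for one target comparison information."""
    score = 0
    start = 0  # first index after the matching point of the previous matched word
    for word, tag in zip(words, tags):
        # The target matching sequence is the part of the comparison set from `start` on.
        for k in range(start, len(pos_seq)):
            same_pos = pos_seq[k] == tag
            in_vocab = word in vocab_tables.get(vocab_seq[k], set())
            if same_pos and in_vocab:
                score += 1      # assumed sub-similarity value on a full match
                start = k + 1   # later words may only match after this point
                break
        # No match: assumed sub-similarity value of 0, matching point unchanged.
    return score


tables = {
    "airline name": {"A Aviation"},
    "time": {"2000"},
    "airport name": {"A International Airport"},
    "activity name": {"data security self-check activity"},
}
target_pos = ["noun", "noun", "noun", "noun"]
target_vocab = ["airline name", "time", "airport name", "activity name"]

score_ok = similarity_value(
    ["A Aviation", "2000", "A International Airport", "data security self-check activity"],
    ["noun"] * 4, target_pos, target_vocab, tables,
)
score_bad = similarity_value(
    ["A International Airport", "A Aviation", "2000", "service promotion training"],
    ["noun"] * 4, target_pos, target_vocab, tables,
)
```

Under these assumptions, an answer whose keywords follow the order of the comparison set accumulates one point per keyword, while an answer with reordered or unrelated keywords accumulates fewer.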
The invention also provides a data processing device based on the block chain, which comprises:
a transceiver module, configured to receive input information sent by a first node on the blockchain;
a processing module, configured to perform, through a smart contract of the blockchain, similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result; and if the detection result indicates that the compliance detection is passed, to send the input information to a second node on the blockchain, otherwise to return feedback information indicating that the detection was not passed to the first node.
The present invention also provides a computing device comprising: a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method as described above.
The present invention also provides a readable storage medium having stored thereon a program or instructions which when executed by a processor performs the steps of the method as described above.
The scheme of the invention at least comprises the following beneficial effects:
according to the scheme, input information sent by a first node on the blockchain is received; similarity compliance detection processing is performed, through a smart contract of the blockchain, on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result; and if the detection result indicates that the compliance detection is passed, the input information is sent to a second node on the blockchain, otherwise feedback information indicating that the detection was not passed is returned to the first node. Performing compliance detection on the input information can improve the accuracy of data exchange.
Drawings
FIG. 1 is a flow chart of a blockchain-based data processing method provided by an embodiment of the present invention;
FIG. 2 is a block diagram of a block chain based data processing apparatus according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
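As shown in fig. 1, an embodiment of the present invention provides a blockchain-based data processing method, the method including:
step 11, receiving input information sent by a first node on a blockchain;
step 12, performing, through a smart contract of the blockchain, similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result;
step 13, if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, otherwise returning feedback information indicating that the detection was not passed to the first node.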
In the embodiment of the invention, input information sent by a first node on the blockchain is received. After the first node issues an information upload instruction, the corresponding smart contract on the blockchain is triggered and performs compliance detection on the input information to obtain a detection result. If the detection result indicates that the compliance detection is passed, the input information is sent to a second node on the blockchain; otherwise, feedback information indicating that the detection was not passed is returned to the first node. Performing compliance detection on the input information in this way can improve the accuracy of data exchange.
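The forward-or-feedback flow of this embodiment can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Node` class, the `process_input` function and the `toy_check` stand-in for the compliance detection are hypothetical names, and no real blockchain or smart contract platform is modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a blockchain node with a message inbox."""
    name: str
    inbox: List[str] = field(default_factory=list)


def process_input(
    input_info: str,
    first_node: Node,
    second_node: Node,
    compliance_check: Callable[[str], bool],
) -> bool:
    """Forward compliant input to the second node; otherwise notify the first node."""
    passed = compliance_check(input_info)
    if passed:
        second_node.inbox.append(input_info)
    else:
        first_node.inbox.append("compliance detection not passed")
    return passed


def toy_check(text: str) -> bool:
    """Placeholder check for illustration only: any non-blank input passes."""
    return bool(text.strip())


provider = Node("data provider")
receiver = Node("data receiver")
first_result = process_input("A Aviation self-check report", provider, receiver, toy_check)
second_result = process_input("   ", provider, receiver, toy_check)
```

In a real deployment the check would be the similarity compliance detection performed by the smart contract rather than the placeholder used here.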
It should be noted that the blockchain may include: a first node, a second node, and a smart contract; the first node is used for receiving input information; the smart contract is used for performing compliance detection on the input information uploaded by the first node;
the first node may be a data provider, for example: an airport or airline; the second node may be a data receiver or data compliance auditor, for example: the aeronautical credit department or the data compliance audit department;
the smart contract encapsulates the capability of performing compliance detection on the input information provided by the first node; if the compliance detection of the smart contract is passed, the consistency between the input information and the information required by the second node is higher, which improves the accuracy of data exchange. In addition, after passing the compliance detection, the input information is sent to the second node; if the second node is a data compliance auditor, it can audit the input information again, so the two rounds of compliance detection by the smart contract and the second node can further improve the accuracy of data exchange.
In an alternative embodiment of the present invention, step 11 may include:
step 111, acquiring an initial questionnaire template image;
step 112, performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input; here, a field to be input is an item to be filled in within the text to be input;
step 113, performing recombination processing on the sub-input information corresponding to each field to be input to obtain the input information.
In this embodiment, after the initial questionnaire template image is split through the text recognition processing, its items are recombined, according to the relevance of the question content, into a plurality of sub-questionnaires, that is, the texts to be input. This increases how concentrated the required content is in each text to be input, so that the first node can conveniently distribute different texts to be input to operators in the corresponding positions for filling in.
It should be noted that, the initial questionnaire template image may be image information of a questionnaire made by the second node according to the required data; the initial questionnaire template image comprises a plurality of pixel sets of information to be input; the areas of the plurality of information to be input are different;
The pixel set of the information to be input may be the image set of each item to be filled in, and the items differ in length because the collected content differs. For example, the items to be filled in may include simple items such as "Name:" and "Gender:", and may also include complex items such as basic information of the organization or individual and data security management mechanism information. An answer area of corresponding size is set according to the complexity of the item: for example, the answer area for "Name:" is small, so an area of a few characters is reserved after the name; the answer area for the data security management mechanism information is larger, so an area large enough to hold several lines of text is reserved below it.
Therefore, using the initial questionnaire template image ensures that the input information cannot be changed at will, and the typesetting style information in the initial questionnaire template image is preserved.
Note that the text recognition processing may be performed on the initial questionnaire template image by OCR (Optical Character Recognition). This allows questions with similar content to be placed in the same questionnaire, making it easier for the first node to fill in the relevant content.
In an alternative embodiment of the present invention, step 112 may include:
step 1121, performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
step 1122, determining a plurality of first fields to be input and a plurality of second fields to be input according to the type of the object to be input contained in the first scan frame; here, the object type to be input may include: a text object and an input identification object;
step 1123, obtaining a coding similarity set of each first field to be input and each second field to be input;
step 1124, when the maximum coding similarity in the coding similarity set is greater than the coding threshold, replacing the second text label of the second field to be input corresponding to the maximum coding similarity with the first text label;
step 1125, adding the first field to be input and the second field to be input with the same text label to the same initial questionnaire template image, so as to obtain a plurality of texts to be input.
In this embodiment, the similarity between each first field to be input and each second field to be input may be determined, and it is then judged whether the maximum similarity corresponding to each first field to be input is greater than the coding threshold; if it is, the text label is changed so that the first field to be input and the second field to be input that are similar to each other carry the same text label. The first field to be input and the second field to be input with higher relevance can then be added to the same initial questionnaire template image according to that shared text label, so that a plurality of questions with higher relevance are placed into the same text to be input.
It should be noted that, the size of the first scanning frame may be set according to actual needs, and in particular, but not limited to, the size of the first scanning frame may be set to cover a whole line of text, and the scanning mode of performing text recognition scanning on the initial questionnaire template image through the first scanning frame may be progressive scanning from top to bottom;
specifically, for the d-th first field to be input, the coding similarity set with each second field to be input consists of the coding similarities of the d-th first field to be input and the q-th second field to be input, where d = 1, 2, …, x, x is the total number of first fields to be input, q = 1, 2, …, w, and w is the total number of second fields to be input;
specifically, an existing text coding mode may be used to code each first field to be input and each second field to be input, so as to obtain the coding vectors corresponding to each first field to be input and each second field to be input, and the vector similarity between the two coding vectors is then calculated;
when the maximum coding similarity in the coding similarity set of the d-th first field to be input is greater than Y2, the second text label of the corresponding second field to be input is replaced by the first text label of the d-th first field to be input; wherein Y2 is the coding threshold, which may be set according to the actual use scenario.
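The relabelling by coding similarity can be sketched in Python as follows. The source does not fix a similarity measure, so cosine similarity between coding vectors is assumed here; the function names `cosine` and `relabel_second_fields`, the label strings and the example vectors are all hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two coding vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relabel_second_fields(
    first_vecs: List[Sequence[float]],
    second_vecs: List[Sequence[float]],
    second_labels: List[str],
    first_label: str,
    y2: float,
) -> List[str]:
    """Give the first text label to the second field most similar to each first field."""
    labels = list(second_labels)
    if not second_vecs:
        return labels
    for d_vec in first_vecs:
        # Coding similarity set of this first field with every second field.
        sims = [cosine(d_vec, q_vec) for q_vec in second_vecs]
        q_best = max(range(len(sims)), key=sims.__getitem__)
        if sims[q_best] > y2:  # maximum coding similarity exceeds the coding threshold
            labels[q_best] = first_label
    return labels


new_labels = relabel_second_fields(
    first_vecs=[[1.0, 0.0]],
    second_vecs=[[0.9, 0.1], [0.0, 1.0]],
    second_labels=["S1", "S2"],
    first_label="F",
    y2=0.8,
)
```

Fields that end up with the same label would then be grouped into the same text to be input.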
In yet another alternative embodiment of the present invention, step 1122 may include:
step 11221, when at least one object to be input is included in the first scan frame, converting each object to be input into a first field to be input corresponding to the object to be input, and setting a first text mark; the first text labels corresponding to each first field to be input are the same;
step 11222, when the first scan frame includes only text objects, enlarging the scan area of the first scan frame;
step 11223, when the enlarged first scan frame includes an input identification object, converting the object to be input into a second field to be input and setting a second text label; wherein the second text labels corresponding to each second field to be input are different.
In this embodiment, the scanning area of the first scanning frame may be enlarged downward by increasing the length of the first scanning frame in the up-down direction of the initial questionnaire template image, and specifically, the size of the line space may be doubled each time;
because the complex item to be filled can be at least one line of text description, setting a corresponding blank area below the complex item to be filled; when encountering a complex item to be filled, the first scanning frame only has text information in the scanning frame, so that the first scanning frame can be vertically enlarged, and the first scanning frame can completely cover the complex item to be filled.
It should be noted that the object to be input is the item to be filled in; the text object in the object to be input is the question field, for example "Gender", and the input identification object is the input marker, for example ":" or an underline or a blank area of corresponding size. Simple items to be filled in conform to these characteristics and can therefore be converted into first fields to be input having the same first text label.
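The scanning logic of steps 11221 to 11223 can be sketched in Python as follows. The sketch assumes the recognized page is already available as a list of rows, each carrying its recognized text and a flag telling whether an input marker was found; the `Row` class, the `classify_fields` function and the label names are hypothetical, and enlarging the scanning frame is modelled simply as taking in one further row at a time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Row:
    text: str        # recognized question text on this row ("" if none)
    has_input: bool  # True if an input marker (colon, underline, blank area) was found


def classify_fields(rows: List[Row], first_label: str = "F") -> List[Tuple[str, str]]:
    """Return (field text, text label) pairs for simple and complex items."""
    fields: List[Tuple[str, str]] = []
    second_count = 0
    i = 0
    while i < len(rows):
        if rows[i].has_input:
            # The one-row frame already holds a complete item: a first field, shared label.
            fields.append((rows[i].text, first_label))
            i += 1
            continue
        # Only text in the frame: enlarge it downward until an input marker appears.
        parts = [rows[i].text]
        j = i + 1
        while j < len(rows) and not rows[j].has_input:
            parts.append(rows[j].text)
            j += 1
        if j < len(rows):
            parts.append(rows[j].text)
            second_count += 1
            # A second field gets its own distinct label.
            fields.append((" ".join(p for p in parts if p), f"S{second_count}"))
        # Trailing text with no input marker below it is ignored in this sketch.
        i = j + 1
    return fields


page = [
    Row("Name:", True),
    Row("Gender:", True),
    Row("Data security management mechanism information", False),
    Row("", True),
]
result = classify_fields(page)
```

Simple items on a single row share the first text label, while a complex item spanning several rows receives a label of its own.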
In yet another alternative embodiment of the present invention, step 12 may include:
step 121, obtaining a comparison set of at least one target comparison information corresponding to the input information;
step 122, word segmentation processing is performed on the input information, so as to obtain a plurality of words to be detected corresponding to the input information;
step 123, performing part-of-speech detection on each word to be detected, and determining part-of-speech tags corresponding to each word to be detected;
step 124, determining a part-of-speech sequence to be detected of the input information according to the part-of-speech tag contained in the input information;
step 125, obtaining a similarity value sequence formed by the input information and the similarity value of each target comparison information according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information;
step 126, if the maximum value in the similarity value sequence is greater than the similarity threshold, obtaining a detection result indicating that the detection is passed, otherwise obtaining a detection result indicating that the detection is not passed.
Wherein the similarity threshold satisfies the following condition:
[formula not preserved]
wherein Y1 is the similarity threshold, the condition involves a threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
In this embodiment, similarity compliance detection processing can be performed, through a smart contract of the blockchain, on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result. The differences between the data can thus be determined more accurately, which ultimately improves the accuracy of data exchange.
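The decision of step 126 can be sketched in Python as follows. Since the threshold formula is not preserved in this text, the sketch assumes that the similarity threshold Y1 is the product of the threshold coefficient and AVG(a); this product form, the function names and the example numbers are assumptions for illustration only.

```python
from typing import List


def similarity_threshold(coefficient: float, comparison_word_counts: List[int]) -> float:
    """Assumed form: Y1 = coefficient * AVG(a), AVG(a) being the mean comparison-word count."""
    avg_a = sum(comparison_word_counts) / len(comparison_word_counts)
    return coefficient * avg_a


def detection_passed(similarity_values: List[float], y1: float) -> bool:
    """Pass only if the largest similarity value is strictly greater than the threshold."""
    return max(similarity_values) > y1


y1 = similarity_threshold(0.5, [4, 4])   # two target comparison informations, 4 words each
passed = detection_passed([4, 1], y1)    # best match reaches 4
failed = detection_passed([2, 1], y1)    # best match only reaches the threshold
```

Tying the threshold to the average number of comparison words lets it scale with the length of the expected answers, whatever the exact form of the original formula.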
The comparison set of the at least one target comparison information corresponding to the input information may consist of z comparison sets, one for each target comparison information; the comparison set of the i-th target comparison information corresponding to the input information comprises the part-of-speech class sequence corresponding to the comparison words included in the i-th target comparison information and the vocabulary sequence corresponding to the comparison words included in the i-th target comparison information; z is the total number of target comparison information corresponding to the input information, and i = 1, 2, …, z;
It should be noted that when different people answer the same question, they may input different answers because of different wording habits. For example, in answering a question about how self-evaluation work was carried out, the details to be answered may specifically include: start and end time, organization, implementation process, implementation mode and other content. When different people reply, the order of these items may be swapped, different ways of description such as inverted sentences may be used, and several items may be combined in one reply, for example answering the start and end time, implementation process and implementation mode together. Therefore, there may be a plurality of answer patterns meeting the requirements for the same question, so the input information may correspond to at least one target comparison information;
since the contents included in all the answer patterns are substantially the same, the keywords corresponding to the answer contents are substantially the same, and thus, the part of speech class sequence of each target comparison information can be generated based on the part of speech of the keywords included in the answer and the arrangement order of the keywords.
In order to improve the matching precision, a vocabulary sequence of each target comparison information may be generated according to the vocabulary to which each keyword belongs. For example, one correct input is: "A Aviation carried out a data security self-check activity at A International Airport in 2000", in which the keywords are "A Aviation", "2000", "A International Airport" and "data security self-check activity"; the part-of-speech class sequence of the target comparison information corresponding to this input information is noun, noun, noun, noun; the corresponding vocabulary sequence is airline name, time, airport name, activity name. Suppose the answer input by the first node is: "A International Airport and A Aviation jointly carried out service promotion training in 2000". Although the part-of-speech class sequence of its keywords "A International Airport", "A Aviation", "2000" and "service promotion training" is the same as above, the vocabulary sequence corresponding to this input information differs significantly from the one above. Therefore, setting the part-of-speech class sequence and the vocabulary sequence at the same time adds a reference dimension to the similarity calculation and can improve the accuracy of the calculated similarity between the input information and the standard answer;
Specifically, a plurality of vocabularies (word lists) can be set up, each populated with its common words and assigned a vocabulary name; each comparison word in the target comparison information is then assigned the name of the vocabulary to which it belongs, and the corresponding vocabulary sequence is generated from these names.
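The vocabulary configuration described above can be illustrated with a minimal sketch. The vocabulary names follow the airline example in the description, but the dictionary `VOCABULARIES`, its contents, and the helper names `vocabulary_name` and `vocabulary_sequence` are hypothetical and are not taken from the patent.

```python
# Hypothetical vocabularies (word lists); names and contents are illustrative only.
VOCABULARIES = {
    "airline name": {"Airline A", "Airline B"},
    "time": {"2000", "2001"},
    "airport name": {"International Airport A", "International Airport B"},
    "activity name": {"data security self-check activity"},
}


def vocabulary_name(word):
    """Return the name of the first vocabulary that contains the word, or None."""
    for name, words in VOCABULARIES.items():
        if word in words:
            return name
    return None


def vocabulary_sequence(keywords):
    """Map an ordered list of keywords to its vocabulary sequence."""
    return [vocabulary_name(word) for word in keywords]


# Keywords of the reference answer and of the differing answer from the example.
reference = ["Airline A", "2000", "International Airport A",
             "data security self-check activity"]
other = ["International Airport A", "Airline A", "2000",
         "service improvement training"]

print(vocabulary_sequence(reference))
print(vocabulary_sequence(other))
```

In this sketch the two answers yield different vocabulary sequences, which is the extra comparison dimension the description relies on when the part-of-speech class sequences alone cannot tell them apart.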
Word segmentation processing is performed on the input information to obtain a plurality of corresponding words to be detected, the b-th of which is the b-th word to be detected in the input information, where y is the total number of words to be detected in the input information and b = 1, 2, …, y;
part-of-speech detection is performed on each word to be detected to determine the part-of-speech tag corresponding to each word to be detected, the b-th tag being the part-of-speech tag corresponding to the b-th word to be detected in the input information;
specifically, word segmentation processing and part-of-speech tagging processing can be performed by a CRF (conditional random field) model;
the part-of-speech tags contained in the input information are arranged in order to obtain the part-of-speech sequence to be detected of the input information, where y is the total number of words to be detected in the input information and b = 1, 2, …, y.
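The segmentation and tagging steps above can be sketched as follows. This is not a CRF model: as a simplifying assumption, a greedy longest-match lookup against a small hypothetical lexicon stands in for it, and the tag names ("n", "v", "p") and the function name `segment_and_tag` are also assumptions made for illustration.

```python
# Hypothetical lexicon mapping known words to part-of-speech tags.
# "n" = noun, "v" = verb, "p" = preposition (tag names are assumptions).
LEXICON = {
    "Airline A": "n",
    "International Airport A": "n",
    "2000": "n",
    "data security self-check activity": "n",
    "carried out": "v",
    "at": "p",
    "in": "p",
}


def segment_and_tag(text):
    """Greedy longest-match segmentation; returns (word, tag) pairs in order.

    Text not covered by the lexicon is skipped one character at a time.
    """
    entries = sorted(LEXICON, key=len, reverse=True)  # try longest words first
    pairs = []
    i = 0
    while i < len(text):
        for word in entries:
            if text.startswith(word, i):
                pairs.append((word, LEXICON[word]))
                i += len(word)
                break
        else:
            i += 1  # no lexicon entry starts here; skip one character
    return pairs


text = ("Airline A carried out data security self-check activity "
        "at International Airport A in 2000")
pairs = segment_and_tag(text)
words_to_detect = [word for word, _ in pairs]   # the y words to be detected
pos_sequence = [tag for _, tag in pairs]        # part-of-speech sequence to be detected
print(pos_sequence)
```

The list `pos_sequence` plays the role of the part-of-speech sequence to be detected; a production system would replace `segment_and_tag` with a trained sequence-labelling model.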
In yet another alternative embodiment of the present invention, step 125 may include:
step 1251, obtaining a similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein the similarity value of the input information and the i-th target comparison information is the sum, over b = 1, 2, 3, …, y, of the sub-similarity values of the b-th word to be detected in the input information and the i-th target comparison information, and y is the total number of words to be detected in the input information.
step 12511, obtaining a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part of the part-of-speech class sequence and vocabulary sequence located after the matching point corresponding to the (b-1)-th word to be detected, in the comparison set of the i-th target comparison information corresponding to the input information; the matching point corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
step 12512, when the part-of-speech tag corresponding to the b-th word to be detected in the input information is different from any one of the comparison part-of-speech tags in the target matching sequence,;
step 12513, when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the part-of-speech tags in the target matching sequence, determining the comparison word corresponding to the part-of-speech tag as the initial comparison word;
Step 12514, when the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word,;
step 12515, when the b-th word to be detected in the input information belongs to the vocabulary corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and;
In this embodiment, when the sub-similarity value of each word to be detected is determined, the target matching sequence corresponding to that word is determined first, and the target comparison word is then searched for within that target matching sequence. Because the target matching sequence depends on the matching point of the preceding word, the order of the words to be detected in the input information input by the first node is taken into account, which adds a reference factor of one more dimension and further improves the accuracy of the similarity calculated between the input information and the target comparison information. The sub-similarity value of a word to be detected and the target comparison information is related both to the part-of-speech similarity of the word and to the content similarity of the word; the sub-similarity value therefore has higher accuracy, which improves the accuracy of the similarity calculated between the input information and the target comparison information, so that the difference between data is determined more accurately and the accuracy of data exchange is ultimately improved.
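The order-aware matching of steps 12511 to 12515 can be sketched as below. Several points are assumptions, because the formula values are not reproduced in this text: the sub-similarity values 0, 1 and 2 are chosen to agree with the 0/1 relevance example in this description, the "initial comparison word" is taken to be the first entry after the matching point with the same tag, and the matching point only advances when a target comparison word is found. The names `sub_similarities`, `POS_MISMATCH`, `POS_ONLY` and `FULL_MATCH` are hypothetical.

```python
POS_MISMATCH = 0  # assumed value for step 12512
POS_ONLY = 1      # assumed value for step 12514
FULL_MATCH = 2    # assumed value for step 12515


def sub_similarities(words, comparison):
    """Score each (word, tag) pair against one target comparison information.

    `comparison` is an ordered list of (tag, vocabulary) entries, i.e. the
    part-of-speech class sequence paired with the vocabulary sequence.
    """
    match_point = -1  # index of the target comparison word of the previous word
    scores = []
    for word, tag in words:
        # Step 12511: only entries after the previous matching point are eligible.
        remaining = range(match_point + 1, len(comparison))
        # Step 12513: first eligible entry with the same part-of-speech tag.
        initial = next((k for k in remaining if comparison[k][0] == tag), None)
        if initial is None:
            scores.append(POS_MISMATCH)          # step 12512
        elif word not in comparison[initial][1]:
            scores.append(POS_ONLY)              # step 12514
        else:
            scores.append(FULL_MATCH)            # step 12515
            match_point = initial                # becomes the new matching point
    return scores


comparison = [
    ("n", {"Airline A", "Airline B"}),                 # airline name
    ("n", {"2000", "2001"}),                           # time
    ("n", {"International Airport A"}),                # airport name
    ("n", {"data security self-check activity"}),      # activity name
]
in_order = [("Airline A", "n"), ("2000", "n"), ("International Airport A", "n"),
            ("data security self-check activity", "n")]
reordered = [("International Airport A", "n"), ("Airline A", "n"), ("2000", "n"),
             ("service improvement training", "n")]

print(sub_similarities(in_order, comparison), sum(sub_similarities(in_order, comparison)))
print(sub_similarities(reordered, comparison), sum(sub_similarities(reordered, comparison)))
```

Under these assumptions the in-order answer scores higher than the reordered answer with a different activity, which shows how the matching point makes the total sensitive to word order as well as to content.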
It should be noted that the sub-similarity value is positively correlated with the part-of-speech relevance of the word to be detected and the content relevance of the word to be detected; the part-of-speech relevance is generated from the part-of-speech tag of the word to be detected and the corresponding part-of-speech class sequence, and the vocabulary relevance, which reflects the content relevance, is generated from the word to be detected and the vocabulary sequence.
Specifically, for example, if the b-th word to be detected is "Airline A", its part of speech can be compared with the part of speech of the b-th comparison word in the part-of-speech class sequence of the target comparison information to generate the corresponding part-of-speech relevance; for example, if the parts of speech are the same, the part-of-speech relevance is determined to be 1, and if the parts of speech are different, the part-of-speech relevance is determined to be 0.
Meanwhile, the word to be detected can be matched against the b-th vocabulary in the vocabulary sequence corresponding to the target comparison information to generate the corresponding vocabulary relevance; for example, if the word to be detected belongs to the vocabulary, the vocabulary relevance is determined to be 1, and if the word to be detected does not belong to the vocabulary, the vocabulary relevance is determined to be 0;
and finally, the sum of the part-of-speech relevance and the vocabulary relevance of the word to be detected is taken as the sub-similarity value of the word to be detected and the target comparison information.
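The example scoring just described, together with the summation stated elsewhere in this description, can be written compactly. The symbols below are assumptions introduced only for illustration, since the original formula symbols are not reproduced in this text: $p_b^{(i)}$ is the part-of-speech relevance, $v_b^{(i)}$ the vocabulary relevance, $s_b^{(i)}$ the sub-similarity value of the b-th word to be detected with respect to the i-th target comparison information, and $S_i$ the similarity value of the input information and the i-th target comparison information.

```latex
\begin{align}
  s_b^{(i)} &= p_b^{(i)} + v_b^{(i)}, \qquad p_b^{(i)},\ v_b^{(i)} \in \{0, 1\}, \\
  S_i &= \sum_{b=1}^{y} s_b^{(i)}
\end{align}
```

Under this example scoring each word contributes 0, 1 or 2, so $S_i$ lies between 0 and $2y$ for an input with $y$ words to be detected.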
In the above embodiment of the present invention, the blockchain-based data processing method improves the accuracy of data exchange by performing compliance detection on the input information. After the input information passes the compliance detection, it is sent to a second node; if the second node is a data compliance auditor, the input information can be audited again, so that the accuracy of data exchange is further improved through the two compliance checks performed by the smart contract and the second node;
Meanwhile, in the compliance detection of the present invention, the similarity value of the input information and each target comparison information is the sum of the sub-similarity values of the words to be detected included in the input information, and the sub-similarity value is positively correlated with the part-of-speech relevance and the content relevance of the word to be detected. Because the sub-similarity value reflects both the part of speech and the content of the word to be detected, it has higher accuracy; the accuracy of the similarity between the input information and the target comparison information is therefore improved, the difference between data is determined more accurately, and the accuracy of data exchange is ultimately further improved.
As shown in fig. 2, an embodiment of the present invention further provides a data processing apparatus 20 based on a blockchain, the apparatus 20 including:
a transceiver module 21, configured to receive input information sent by a first node on a blockchain;
the processing module 22 is configured to perform, through a smart contract of the blockchain, similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result; and, if the detection result indicates that the compliance detection is passed, to send the input information to a second node on the blockchain, and otherwise to feed back to the first node feedback information indicating that the detection is not passed.
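The routing behaviour of the processing module can be sketched in a few lines. This is a plain-Python illustration, not smart-contract code: the class name `ProcessingModule`, the callable `detect`, and the two lists standing in for the second node and for feedback to the first node are all hypothetical.

```python
class ProcessingModule:
    """Hypothetical stand-in for processing module 22."""

    def __init__(self, detect, second_node, first_node):
        self.detect = detect            # callable: input information -> bool
        self.second_node = second_node  # list standing in for the second node
        self.first_node = first_node    # list standing in for feedback to the first node

    def handle(self, input_information):
        """Forward compliant input to the second node, otherwise feed back a failure."""
        if self.detect(input_information):
            self.second_node.append(input_information)
            return True
        self.first_node.append("detection not passed")
        return False


second_node, first_node = [], []
# Toy detection rule used only for this example: the answer must mention "2000".
module = ProcessingModule(lambda text: "2000" in text, second_node, first_node)
module.handle("Airline A carried out a self-check in 2000")
module.handle("no date given")
print(second_node, first_node)
```

Passing the detection function in as a parameter keeps the routing logic independent of how the similarity compliance detection itself is computed.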
Optionally, receiving the input information sent by the first node on the blockchain includes:
acquiring an initial questionnaire template image;
performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input;
and carrying out recombination processing on the sub-input information corresponding to each field to be input to obtain the input information.
Optionally, performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input, including:
performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
determining a plurality of first fields to be input and a plurality of second fields to be input according to the types of the objects to be input contained in the first scanning frame;
acquiring a coding similarity set of each first field to be input and each second field to be input;
when the maximum coding similarity in the coding similarity set is larger than a coding threshold, replacing a second text mark of a second field to be input corresponding to the maximum coding similarity with the first text mark;
and adding the first field to be input and the second field to be input with the same text mark into the same initial questionnaire template image to obtain a plurality of texts to be input.
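The field-merging steps above can be sketched as follows. Several details are assumptions because the description does not fix them: each field is represented by a hypothetical encoding vector, the coding similarity is taken to be cosine similarity, and the maximum is taken per second field over all first fields. The function names `cosine` and `merge_fields` and the sample data are illustrative only.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed coding similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_fields(first_fields, second_fields, threshold):
    """Group fields by text mark after re-marking sufficiently similar second fields.

    Each field is a dict with keys "name", "mark" and "code" (encoding vector).
    """
    groups = defaultdict(list)
    for field in first_fields:
        groups[field["mark"]].append(field["name"])
    for field in second_fields:
        # Most similar first field for this second field (None if there are none).
        best = max(first_fields,
                   key=lambda f: cosine(f["code"], field["code"]),
                   default=None)
        mark = field["mark"]
        if best is not None and cosine(best["code"], field["code"]) > threshold:
            mark = best["mark"]  # replace the second text mark with the first text mark
        groups[mark].append(field["name"])
    return dict(groups)


first = [{"name": "start time", "mark": "T1", "code": [1.0, 0.0]},
         {"name": "organization", "mark": "T2", "code": [0.0, 1.0]}]
second = [{"name": "end time", "mark": "U1", "code": [0.9, 0.1]},
          {"name": "signature", "mark": "U2", "code": [0.7, 0.7]}]

print(merge_fields(first, second, threshold=0.9))
```

Fields that end up under the same text mark form one text to be input; a second field whose best similarity does not exceed the threshold keeps its own mark and forms a separate group.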
Optionally, performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through a smart contract of the blockchain to obtain a detection result includes:
acquiring a comparison set of at least one target comparison information corresponding to the input information;
word segmentation processing is carried out on the input information, so that a plurality of words to be detected corresponding to the input information are obtained;
performing part-of-speech detection on each word to be detected, and determining part-of-speech tags corresponding to each word to be detected;
determining a part-of-speech sequence to be detected of the input information according to part-of-speech tags contained in the input information;
obtaining a similarity value sequence formed by the input information and the similarity value of each target comparison information according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information;
and if the maximum value in the similarity value sequence is larger than the similarity threshold value, obtaining a detection result passing detection, otherwise, obtaining a detection result not passing detection.
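The final decision step can be sketched as a small function. The similarity function is passed in as a parameter; the toy `shared_word_count` used here is only a placeholder and is not the similarity defined in this description, and treating an empty comparison set as a failed detection is an assumption.

```python
def compliance_detect(words, comparison_set, similarity, threshold):
    """Return (passed, values): passed is True when max(values) > threshold."""
    values = [similarity(words, target) for target in comparison_set]
    # Strictly greater than the threshold; an empty comparison set never passes.
    passed = bool(values) and max(values) > threshold
    return passed, values


def shared_word_count(words, target):
    """Placeholder similarity: number of input words found in the target set."""
    return sum(1 for word in words if word in target)


comparison_set = [
    {"Airline A", "2000"},
    {"Airline A", "2000", "International Airport A"},
]
words = ["Airline A", "2000", "International Airport A"]

print(compliance_detect(words, comparison_set, shared_word_count, threshold=2))
print(compliance_detect(words, comparison_set, shared_word_count, threshold=3))
```

Taking the maximum over the similarity value sequence means an answer passes as soon as it is close enough to any one of the acceptable answer patterns.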
Optionally, according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information, obtaining a similarity value sequence formed by the input information and a similarity value of each target comparison information includes:
obtaining a similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein the similarity value of the input information and the i-th target comparison information is the sum, over b = 1, 2, 3, …, y, of the sub-similarity values of the b-th word to be detected in the input information and the i-th target comparison information, and y is the total number of words to be detected in the input information.
acquiring a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part of the part-of-speech class sequence and vocabulary sequence located after the matching point corresponding to the (b-1)-th word to be detected, in the comparison set of the i-th target comparison information corresponding to the input information; the matching point corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is different from any one of the comparison part-of-speech tags in the target matching sequence,;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the part-of-speech tags in the target matching sequence, determining the comparison word corresponding to the part-of-speech tag as an initial comparison word;
When the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word,;
when the b-th word to be detected in the input information belongs to the word list corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and;
Optionally, the similarity threshold satisfies the following condition:
wherein Y1 is the similarity threshold, the condition also involves a threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
It should be noted that, the device is a device corresponding to the above method, and all implementation manners in the above method embodiments are applicable to the embodiment of the device, so that the same technical effects can be achieved.
Embodiments of the present invention also provide a computing device comprising: a processor, a memory storing a computer program which, when executed by the processor, performs the method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
Embodiments of the present invention also provide a computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform a method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art or in part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, or the like.
Furthermore, it should be noted that in the apparatus and method of the present invention, it is apparent that the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present invention. Also, the steps of performing the series of processes described above may naturally be performed in chronological order in the order of description, but are not necessarily performed in chronological order, and some steps may be performed in parallel or independently of each other. It will be appreciated by those of ordinary skill in the art that all or any of the steps or components of the methods and apparatus of the present invention may be implemented in hardware, firmware, software, or a combination thereof in any computing device (including processors, storage media, etc.) or network of computing devices, as would be apparent to one of ordinary skill in the art after reading this description of the invention.
The object of the invention can thus also be achieved by running a program or a set of programs on any computing device. The computing device may be a well-known general purpose device. The object of the invention can thus also be achieved by merely providing a program product containing program code for implementing said method or apparatus. That is, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. It is apparent that the storage medium may be any known storage medium or any storage medium developed in the future. It should also be noted that in the apparatus and method of the present invention, it is apparent that the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present invention. The steps of executing the series of processes may naturally be executed in chronological order in the order described, but are not necessarily executed in chronological order. Some steps may be performed in parallel or independently of each other.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.
Claims (10)
1. A method of blockchain-based data processing, the method comprising:
receiving input information sent by a first node on a blockchain;
performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through a smart contract of the blockchain to obtain a detection result;
and if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, and otherwise feeding back to the first node feedback information indicating that the detection is not passed.
2. The blockchain-based data processing method of claim 1, wherein receiving the input information sent by the first node on the blockchain includes:
acquiring an initial questionnaire template image;
performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input;
and performing recombination processing on the sub-input information corresponding to each field to be input to obtain the input information.
3. The blockchain-based data processing method of claim 2, wherein performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input comprises:
performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
determining a plurality of first fields to be input and a plurality of second fields to be input according to the types of the objects to be input contained in the first scanning frame;
acquiring a coding similarity set of each first field to be input and each second field to be input;
when the maximum coding similarity in the coding similarity set is larger than a coding threshold, replacing a second text mark of a second field to be input corresponding to the maximum coding similarity with the first text mark;
and adding the first field to be input and the second field to be input with the same text mark into the same initial questionnaire template image to obtain a plurality of texts to be input.
4. The blockchain-based data processing method of claim 1, wherein performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through the smart contract of the blockchain to obtain a detection result includes:
acquiring a comparison set of at least one target comparison information corresponding to the input information;
word segmentation processing is carried out on the input information, so that a plurality of words to be detected corresponding to the input information are obtained;
performing part-of-speech detection on each word to be detected, and determining part-of-speech tags corresponding to each word to be detected;
determining a part-of-speech sequence to be detected of the input information according to part-of-speech tags contained in the input information;
obtaining a similarity value sequence formed by the input information and the similarity value of each target comparison information according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information;
and if the maximum value in the similarity value sequence is larger than the similarity threshold value, obtaining a detection result passing detection, otherwise, obtaining a detection result not passing detection.
5. The blockchain-based data processing method of claim 4, wherein obtaining a sequence of similarity values formed by the input information and the similarity value of each target comparison information according to the word to be tested and the target comparison information in the part-of-speech sequence to be tested, comprises:
obtaining a similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein the similarity value of the input information and the i-th target comparison information is the sum, over b = 1, 2, 3, …, y, of the sub-similarity values of the b-th word to be detected in the input information and the i-th target comparison information, and y is the total number of words to be detected in the input information.
6. The blockchain-based data processing method of claim 5, wherein the sub-similarity value of the b-th word to be detected in the input information and the i-th target comparison information is determined by the following process:
acquiring a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part of the part-of-speech class sequence and vocabulary sequence located after the matching point corresponding to the (b-1)-th word to be detected, in the comparison set of the i-th target comparison information corresponding to the input information; the matching point corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is different from any one of the comparison part-of-speech tags in the target matching sequence,;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the part-of-speech tags in the target matching sequence, determining the comparison word corresponding to the part-of-speech tag as an initial comparison word;
When the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word,;
when the b-th word to be detected in the input information belongs to the word list corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and;
7. The blockchain-based data processing method of claim 5, wherein the similarity threshold satisfies the following condition:
8. A blockchain-based data processing device, the device comprising:
the receiving and transmitting module is used for receiving input information sent by a first node on the block chain;
the processing module is used for performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through a smart contract of the blockchain to obtain a detection result; and, if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, and otherwise feeding back to the first node feedback information indicating that the detection is not passed.
9. A computing device, comprising: a processor, a memory and a program or instruction stored on the memory and executable on the processor, which program or instruction when executed by the processor implements the steps of the method of any of claims 1-7.
10. A readable storage medium, characterized in that it stores thereon a program or instructions, which when executed by a processor, implement the steps of the method according to any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310493762.9A CN116204594A (en) | 2023-05-05 | 2023-05-05 | Data processing method, device and equipment based on block chain |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116204594A true CN116204594A (en) | 2023-06-02 |
Family
ID=86509832
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310493762.9A Pending CN116204594A (en) | 2023-05-05 | 2023-05-05 | Data processing method, device and equipment based on block chain |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116204594A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190130114A1 (en) * | 2017-10-30 | 2019-05-02 | Pricewaterhousecoopers Llp | Implementation of continuous real-time validation of distributed data storage systems |
CN110765244A (en) * | 2019-09-18 | 2020-02-07 | 平安科技(深圳)有限公司 | Method and device for acquiring answering, computer equipment and storage medium |
CN111930809A (en) * | 2020-09-17 | 2020-11-13 | 支付宝(杭州)信息技术有限公司 | Data processing method, device and equipment |
CN112256271A (en) * | 2020-10-19 | 2021-01-22 | 中国科学院信息工程研究所 | Block chain intelligent contract security detection system based on static analysis |
CN112541194A (en) * | 2020-12-16 | 2021-03-23 | 国网河北省电力有限公司建设公司 | Actual measurement data chaining method for engineering construction and engineering detection management method thereof |
CN112883734A (en) * | 2021-01-15 | 2021-06-01 | 成都链安科技有限公司 | Block chain security event public opinion monitoring method and system |
CN114913534A (en) * | 2022-07-19 | 2022-08-16 | 北京嘉沐安科技有限公司 | Block chain-based network security abnormal image big data detection method and system |
US20230050782A1 (en) * | 2021-08-13 | 2023-02-16 | Usscyber Inc. | Server systems and methods for valuing blockchain tokens based on organizational performance |
Non-Patent Citations (2)
Title |
---|
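Wang Aiying (王爱英): "Computer Organization and Structure" (计算机组成与结构), China Machine Press (机械工业出版社), pages: 50 - 51 * |
Wang Xiang (王翔): "'Blockchain Governance' of Health Big Data Platforms" (健康大数据平台的"区块链治理"), Cyberspace Security (网络空间安全), no. 12 * |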
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||