CN110866172A - Data analysis method for block chain system - Google Patents

Info

Publication number
CN110866172A
Authority
CN
China
Prior art keywords
data
block chain
block
nodes
transaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911079968.7A
Other languages
Chinese (zh)
Other versions
CN110866172B (en)
Inventor
高健博
任立峰
李青山
吴振豪
刘世克
冯向军
吴奇泽
司华友
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guoxin Cloud Clothing Technology Co Ltd
Nanjing Boya Blockchain Research Institute Co Ltd
Boya Chain Beijing Technology Co Ltd
Original Assignee
Beijing Guoxin Cloud Clothing Technology Co Ltd
Nanjing Boya Blockchain Research Institute Co Ltd
Boya Chain Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guoxin Cloud Clothing Technology Co Ltd, Nanjing Boya Blockchain Research Institute Co Ltd, Boya Chain Beijing Technology Co Ltd
Priority to CN201911079968.7A
Publication of CN110866172A
Application granted
Publication of CN110866172B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data analysis method for a blockchain system and relates to the technical field of blockchains. The method first deploys a full node of the blockchain, connects it to the blockchain network, and synchronizes it with the other nodes in the network. It then communicates with the deployed blockchain node via RPC and reads the data in each transaction in sequence, starting from the block at height 1. For the data in each transaction, the method determines whether it is invalid data matching smart contract features, decodes the data, and discards data whose proportion of invalid encodings exceeds a set threshold. It then determines the language of the data and, finally, performs sensitive keyword matching and sentiment classification, raising an alarm for data that matches a sensitive keyword or is classified as negative. The method is designed specifically around the characteristics of blockchain public opinion and effectively improves the accuracy of blockchain public opinion data analysis.

Description

Data analysis method for block chain system
Technical Field
The invention relates to the technical field of blockchains, and in particular to a data analysis method for a blockchain system.
Background
Online public opinion has long been regarded as an important expression of social conditions and popular sentiment, and analyzing online public opinion data helps in understanding the emotions, attitudes, opinions, and views of internet users promptly and accurately. At present, organizations collect, analyze, and monitor online public opinion mainly through internet channels such as news sites, forums, blogs, and microblogs.
With the rapid development of blockchain technology, some internet users choose to write public opinion data to a blockchain because data on a blockchain cannot be tampered with; once such data spreads widely, it can have a significant social impact. Analyzing and monitoring public opinion data in blockchain networks is therefore an important part of online public opinion work.
Compared with internet public opinion, blockchain public opinion has the following main characteristics:
1. The collection method is different. Collecting blockchain public opinion requires maintaining a blockchain node and synchronizing data from the blockchain network in real time; the data cannot be collected by an internet crawler system.
2. The encoding is different. Mainstream blockchain systems currently accept only binary data, so public opinion data is usually UTF-8 encoded, converted to binary, and then written to the blockchain. Data read directly from the blockchain is in binary form and becomes human-readable only after decoding.
3. There is a large amount of invalid data. Because blockchain public opinion data and smart contract data are written into the same field, a large amount of smart-contract-related data is present that is invalid for public opinion analysis.
4. The environment is multilingual. Because blockchains cross national borders, data written to the same blockchain appears in many different languages, such as Chinese and English, among various others.
To date, however, there has been no efficient and accurate technique for analyzing public opinion data on blockchains.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data analysis method for a blockchain system that analyzes and monitors public opinion data on a blockchain promptly and accurately.
To solve this technical problem, the invention adopts the following technical solution: a data analysis method for a blockchain system, comprising the following steps:
Step 1: deploy a full node of the blockchain, connect it to the blockchain network whose data is to be analyzed, and synchronize it with the other nodes in that network;
Step 2: communicate with the deployed blockchain node through RPC, read the data in each transaction in sequence starting from the block at height 1, and store the block, the transaction, and the data in the transaction; then perform steps 3-7 in order on the data acquired from each transaction;
Step 3: by feature matching, delete from the acquired data the invalid data related to the features of smart contracts running on the blockchain network under analysis;
the smart contract features are as follows: (1) the data begins with 0x6060; (2) the data begins with 0x6080; (3) excluding the 0x prefix, the hexadecimal length is 8 + 64 × n (n ≥ 0) and the target address is a contract address, i.e., the code field of that address is not empty;
Step 4: decode the acquired data as UTF-8 and discard data whose proportion of invalid encodings exceeds a set threshold;
Step 5: identify the language of the acquired data using a multilingual dictionary and determine the language or languages the data uses;
Step 6: match the acquired data against sensitive keywords and, if a sensitive keyword is matched, raise an alarm for the data;
Step 7: classify the sentiment of the acquired data by cross-language sentiment analysis into three categories (positive, neutral, and negative) and raise an alarm for data classified as negative.
The beneficial effects of the above technical solution are as follows. The data analysis method for a blockchain system provided by the invention takes full account of the characteristics of blockchain public opinion, namely its collection method, its encoding, the large amount of invalid data, and the multilingual environment, and is designed specifically around them, which effectively improves the accuracy of blockchain public opinion data analysis. It also allows blockchain public opinion to be analyzed and monitored promptly and accurately, effectively extends the monitoring scope of online public opinion, and fills a gap in the field of blockchain public opinion data analysis.
Drawings
Fig. 1 is a flowchart of a data analysis method for a blockchain system according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The embodiments of the present invention will be further described with reference to the accompanying drawings.
A data analysis method for a blockchain system, as shown in Fig. 1, includes the following steps:
Step 1: deploy a blockchain full node and connect it to the blockchain network for synchronization.
In this embodiment, the data on the blockchain is 0x426c6f636b636861696ee88886e68385.
Step 2: communicate with the blockchain node through RPC (Remote Procedure Call), read the data in each transaction in sequence starting from the block at height 1, and store the block, transaction, and data information.
In this embodiment, the data 0x426c6f636b636861696ee88886e68385 and its block and transaction information are read during this process.
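Step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes an Ethereum-style node, which the patent does not name (the 0x6060 and 0x6080 features merely suggest it), and it uses the standard Ethereum JSON-RPC method `eth_getBlockByNumber`. The helper names `http_rpc` and `iter_transaction_data`, and the node URL in the usage note, are hypothetical.

```python
import json
from typing import Callable, Iterator, Optional
from urllib import request


def http_rpc(url: str) -> Callable[[str, list], object]:
    """Return a function that performs one JSON-RPC 2.0 call over HTTP."""
    def call(method: str, params: list) -> object:
        payload = json.dumps(
            {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
        ).encode("utf-8")
        req = request.Request(
            url, data=payload, headers={"Content-Type": "application/json"}
        )
        with request.urlopen(req) as resp:
            return json.load(resp)["result"]
    return call


def iter_transaction_data(
    call: Callable[[str, list], object],
    start_height: int = 1,
    end_height: Optional[int] = None,
) -> Iterator[dict]:
    """Yield block, transaction, and data information in block order."""
    height = start_height
    while end_height is None or height <= end_height:
        # The second parameter True asks for full transaction objects.
        block = call("eth_getBlockByNumber", [hex(height), True])
        if block is None:  # the node has no block at this height yet
            break
        for tx in block["transactions"]:
            yield {
                "block": height,
                "tx_hash": tx["hash"],
                "to": tx["to"],       # None for contract-creation transactions
                "data": tx["input"],  # hex string with a 0x prefix
            }
        height += 1
```

With a synchronized local node, `iter_transaction_data(http_rpc("http://127.0.0.1:8545"))` would walk the chain from height 1; passing the transport as a function keeps the reader testable without a network.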
Step 3: by feature matching, delete from the acquired data the invalid data related to the features of smart contracts running on the blockchain network under analysis.
The smart contract features are as follows: (1) the data begins with 0x6060; (2) the data begins with 0x6080; (3) excluding the 0x prefix, the hexadecimal length is 8 + 64 × n (n ≥ 0) and the target address is a contract address (i.e., the code field of that address is not empty). In this embodiment, the data on the blockchain meets none of these conditions and is therefore not invalid smart-contract-related data.
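The three features can be expressed as a small predicate. The sketch below is an interpretation under stated assumptions: on Ethereum-style chains, 8 hexadecimal digits correspond to a 4-byte function selector and each further 64 digits to one 32-byte argument word, which is presumably why feature (3) has the form 8 + 64 × n. The function name `is_contract_related` and the `target_has_code` flag are hypothetical; the flag could be obtained, for example, from the JSON-RPC method `eth_getCode`.

```python
def is_contract_related(data_hex: str, target_has_code: bool) -> bool:
    """Return True if transaction data matches any of the three features."""
    body = data_hex[2:] if data_hex.startswith("0x") else data_hex
    body = body.lower()

    # Features (1) and (2): the data begins with 0x6060 or 0x6080.
    if body.startswith(("6060", "6080")):
        return True

    # Feature (3): hex length is 8 + 64 * n (n >= 0), and the target
    # address holds code, i.e. it is a contract address.
    call_shaped = len(body) >= 8 and (len(body) - 8) % 64 == 0
    return call_shaped and target_has_code
```

The embodiment's data has 32 hexadecimal digits and starts with 0x426c, so it matches none of the features regardless of the target address.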
Step 4: decode the acquired data as UTF-8 and discard data in which more than 10% of the encodings are invalid. In this embodiment, the data decodes to "Blockchain舆情" ("Blockchain public opinion"), with no invalid encodings.
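The decoding step can be sketched as below. The patent does not say how the proportion of invalid encodings is measured; this sketch assumes it is the share of U+FFFD replacement characters produced by a lenient UTF-8 decode, and the function name `decode_or_discard` is hypothetical.

```python
from typing import Optional


def decode_or_discard(data_hex: str, max_invalid_ratio: float = 0.10) -> Optional[str]:
    """Decode hex data as UTF-8; return None if too much of it is invalid."""
    body = data_hex[2:] if data_hex.startswith("0x") else data_hex
    raw = bytes.fromhex(body)
    if not raw:
        return None  # nothing to analyze

    # errors="replace" substitutes U+FFFD for every undecodable sequence.
    text = raw.decode("utf-8", errors="replace")
    invalid = text.count("\ufffd")
    if invalid / len(text) > max_invalid_ratio:
        return None
    return text
```

Applied to the embodiment's data, `decode_or_discard("0x426c6f636b636861696ee88886e68385")` returns `"Blockchain舆情"`: the first ten bytes are ASCII, and the last six are the UTF-8 encodings of 舆 and 情.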
Step 5: identify the language of the acquired data using a multilingual dictionary and determine the language or languages the data uses.
In this embodiment, dictionary matching determines that the data is a mixture of English and Chinese: "Blockchain" is English and "舆情" ("public opinion") is Chinese.
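Dictionary-based language identification might look like the following sketch. The two-language toy dictionary and the name `identify_languages` are hypothetical; a real multilingual dictionary would be far larger, and the patent does not describe its structure or the matching strategy.

```python
import re

# Hypothetical toy dictionary: language code -> known words.
TOY_DICTIONARY = {
    "en": {"blockchain", "opinion"},
    "zh": {"舆情", "区块链"},
}


def identify_languages(text: str, dictionary: dict = TOY_DICTIONARY) -> set:
    """Return the set of languages whose dictionary words occur in the text."""
    lowered = text.lower()
    # Latin-script words are matched as whole tokens; words in scripts
    # written without spaces (such as Chinese) are matched as substrings.
    latin_tokens = set(re.findall(r"[a-z]+", lowered))
    found = set()
    for lang, words in dictionary.items():
        for word in words:
            hit = word in latin_tokens if word.isascii() else word in lowered
            if hit:
                found.add(lang)
                break
    return found
```

For the embodiment's text, `identify_languages("Blockchain舆情")` yields both `"en"` and `"zh"`, matching the mixed-language result described above.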
Step 6: match the acquired data against sensitive keywords, which include important persons, places, events, and the like, and raise an alarm for the data if a sensitive keyword is matched.
In this embodiment, no sensitive keyword is matched in "Blockchain舆情" ("Blockchain public opinion").
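A minimal sketch of the keyword matching step follows. It assumes plain case-insensitive substring matching, which the patent does not specify, and the keywords in the usage note are placeholders rather than real watch-list entries; `match_sensitive_keywords` is a hypothetical name.

```python
def match_sensitive_keywords(text: str, keywords: set) -> list:
    """Return the sensitive keywords found in the text, sorted for stable output."""
    folded = text.casefold()
    return sorted(k for k in keywords if k.casefold() in folded)
```

An alarm would be raised whenever the returned list is non-empty. For example, with the placeholder set `{"placeholder-event"}`, the embodiment's text produces an empty list and therefore no alarm.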
Step 7: classify the sentiment of the data by cross-language sentiment analysis into three categories (positive, neutral, and negative) and raise an alarm for data classified as negative.
In this embodiment, Chinese and English data are classified directly with existing trained sentiment classification models. Data in other languages is translated by a translator into both a Chinese version and an English version, which are then classified by the Chinese sentiment classification model and the English sentiment classification model, respectively. If the two models' results are consistent or close (for example, one positive and one neutral), the agreed result is taken as the final result (one positive and one neutral yields positive); if the two results conflict (one positive and one negative), the data is flagged and submitted for manual processing. In this embodiment, "Blockchain舆情" ("Blockchain public opinion") is neutral, so no alarm is raised.
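The rule for combining the two model outputs can be written as a short function. This sketch covers only the combination logic, not translation or the classifiers themselves. The text states the outcome for positive plus neutral; treating negative plus neutral symmetrically (as negative) is an assumption made here. The names `combine_labels` and `should_alarm` are hypothetical.

```python
from typing import Optional

LABELS = {"positive", "neutral", "negative"}


def combine_labels(zh_label: str, en_label: str) -> Optional[str]:
    """Merge the Chinese-model and English-model labels.

    Returns the final label, or None when the two models conflict and
    the data should be flagged for manual processing.
    """
    if zh_label not in LABELS or en_label not in LABELS:
        raise ValueError("labels must be positive, neutral, or negative")
    pair = {zh_label, en_label}
    if len(pair) == 1:
        return zh_label  # the two models agree
    if pair == {"positive", "negative"}:
        return None      # conflict: submit for manual processing
    # One neutral and one polar label: take the polar one.
    # (Stated for positive in the text; assumed here for negative.)
    pair.discard("neutral")
    return pair.pop()


def should_alarm(label: Optional[str]) -> bool:
    """Only data judged negative raises an alarm."""
    return label == "negative"
```

Returning `None` for conflicts keeps the manual-review case distinct from the three sentiment labels, so a caller cannot mistake it for a classification.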
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions and scope of the present invention as defined in the appended claims.

Claims (2)

1. A data analysis method for a blockchain system, characterized in that the method comprises the following steps:
Step 1: deploying a full node of the blockchain, connecting it to the blockchain network whose data is to be analyzed, and synchronizing it with the other nodes in that network;
Step 2: communicating with the deployed blockchain node through RPC, reading the data in each transaction in sequence starting from the block at height 1, and storing the block, the transaction, and the data in the transaction; then performing steps 3-7 in order on the data acquired from each transaction;
Step 3: by feature matching, deleting from the acquired data the invalid data related to the features of smart contracts running on the blockchain network under analysis;
Step 4: decoding the acquired data as UTF-8 and discarding data whose proportion of invalid encodings exceeds a set threshold;
Step 5: identifying the language of the acquired data using a multilingual dictionary and determining the language or languages the data uses;
Step 6: matching the acquired data against sensitive keywords and, if a sensitive keyword is matched, raising an alarm for the data;
Step 7: classifying the sentiment of the acquired data by cross-language sentiment analysis into three categories (positive, neutral, and negative) and raising an alarm for data classified as negative.
2. The data analysis method for a blockchain system according to claim 1, characterized in that the smart contract features in step 3 are as follows: (1) the data begins with 0x6060; (2) the data begins with 0x6080; (3) excluding the 0x prefix, the hexadecimal length is 8 + 64 × n (n ≥ 0) and the target address is a contract address, i.e., the code field of that address is not empty.
CN201911079968.7A 2019-11-07 2019-11-07 Data analysis method for block chain system Active CN110866172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911079968.7A CN110866172B (en) 2019-11-07 2019-11-07 Data analysis method for block chain system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911079968.7A CN110866172B (en) 2019-11-07 2019-11-07 Data analysis method for block chain system

Publications (2)

Publication Number Publication Date
CN110866172A (en) 2020-03-06
CN110866172B (en) 2023-01-03

Family

ID=69653515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911079968.7A Active CN110866172B (en) 2019-11-07 2019-11-07 Data analysis method for block chain system

Country Status (1)

Country Link
CN (1) CN110866172B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632346A (en) * 2021-01-11 2021-04-09 绵阳沸尔特科技有限公司 Data analysis method for block chain system
CN112925847A (en) * 2021-02-22 2021-06-08 同济大学 Data processing and network analysis tool for block chain
CN114037245A (en) * 2021-11-02 2022-02-11 南京鼎岩信息科技有限公司 System for multidimensional quantitative analysis of block chain common chain project maturity

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018146A (en) * 2017-05-09 2017-08-04 暨南大学 A kind of public sentiment detection platform building method based on block chain technology
CN107103087A (en) * 2017-05-02 2017-08-29 成都中远信电子科技有限公司 Block chain big data analysis of market conditions system
CN108769751A (en) * 2018-05-02 2018-11-06 中广热点云科技有限公司 A kind of network video based on intelligent contract listens Management Support System
CN108776671A (en) * 2018-05-12 2018-11-09 苏州华必讯信息科技有限公司 A kind of network public sentiment monitoring system and method
US20190188787A1 (en) * 2017-12-20 2019-06-20 Accenture Global Solutions Limited Analytics engine for multiple blockchain nodes
CN109992735A (en) * 2019-03-19 2019-07-09 京东数字科技控股有限公司 The processing method of public sentiment data and publicly-owned catenary system

Also Published As

Publication number Publication date
CN110866172B (en) 2023-01-03

Similar Documents

Publication Publication Date Title
CN113420296B (en) C source code vulnerability detection method based on Bert model and BiLSTM
CN106557695B (en) A kind of malicious application detection method and system
CN110866172B (en) Data analysis method for block chain system
CN107423278B (en) Evaluation element identification method, device and system
CN111475649A (en) False news prediction method, system, device and medium based on deep learning
US11003705B2 (en) Natural language processing and classification
CN111177367A (en) Case classification method, classification model training method and related products
CN114757178A (en) Core product word extraction method, device, equipment and medium
CN116561748A (en) Log abnormality detection device for component subsequence correlation sensing
CN116611071A (en) Function-level vulnerability detection method based on multiple modes
CN115292568B (en) Civil news event extraction method based on joint model
CN114742016B (en) Chapter-level event extraction method and device based on multi-granularity entity different composition
CN115359799A (en) Speech recognition method, training method, device, electronic equipment and storage medium
CN115757695A (en) Log language model training method and system
CN110263345B (en) Keyword extraction method, keyword extraction device and storage medium
CN113568969B (en) Information extraction method, apparatus, device and computer readable storage medium
CN111859862B (en) Text data labeling method and device, storage medium and electronic device
CN113434631A (en) Emotion analysis method and device based on event, computer equipment and storage medium
CN113743118B (en) Entity relation extraction method in legal document based on fusion relation information coding
CN111562943B (en) Code clone detection method and device based on event embedded tree and GAT network
CN115115432A (en) Artificial intelligence based product information recommendation method and device
CN109145297B (en) Network vocabulary semantic analysis method and system based on hash storage
CN113901817A (en) Document classification method and device, computer equipment and storage medium
CN115587599B (en) Quality detection method and device for machine translation corpus
KR102575752B1 (en) Examination data classification device and classification method using ensemble classification model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant