CN110866172B - Data analysis method for block chain system - Google Patents

Publication number
CN110866172B
CN110866172B CN201911079968.7A CN201911079968A CN110866172B CN 110866172 B CN110866172 B CN 110866172B CN 201911079968 A CN201911079968 A CN 201911079968A CN 110866172 B CN110866172 B CN 110866172B
Authority
CN
China
Prior art keywords
data
block chain
block
nodes
transaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911079968.7A
Other languages
Chinese (zh)
Other versions
CN110866172A (en)
Inventor
高健博
任立峰
李青山
吴振豪
刘世克
冯向军
吴奇泽
司华友
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guoxin Yunfu Technology Co ltd
Nanjing Boya Blockchain Research Institute Co ltd
Boya Chain Beijing Technology Co ltd
Original Assignee
Beijing Guoxin Yunfu Technology Co ltd
Nanjing Boya Blockchain Research Institute Co ltd
Boya Chain Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-07
Filing date
2019-11-07
Publication date
2023-01-03
Application filed by Beijing Guoxin Yunfu Technology Co ltd, Nanjing Boya Blockchain Research Institute Co ltd, Boya Chain Beijing Technology Co ltd
Priority to CN201911079968.7A
Publication of CN110866172A
Application granted
Publication of CN110866172B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data analysis method for a blockchain system and relates to the technical field of blockchains. The method first deploys a full blockchain node, connects it to the blockchain network, and synchronizes it with the other nodes in the network. It then communicates with the deployed blockchain node through RPC and reads the data in each transaction in sequence, starting from the block at height 1. For the data obtained from each transaction, the method in turn removes invalid data that matches smart contract features, decodes the data and discards data whose proportion of invalid encoding exceeds a set threshold, determines the language used by the data, and finally performs sensitive keyword matching and sentiment classification, raising an alarm for data that matches a sensitive keyword and for data judged negative. The method is designed specifically around the characteristics of blockchain public opinion and effectively improves the accuracy of blockchain public opinion data analysis.

Description

Data analysis method for block chain system
Technical Field
The invention relates to the technical field of blockchains, and in particular to a data analysis method for a blockchain system.
Background
Online public opinion has long been regarded as an important expression of social sentiment. Analyzing online public opinion data makes it possible to understand the emotions, attitudes, opinions, and viewpoints of internet users promptly and accurately. At present, organizations collect, analyze, and monitor online public opinion mainly through internet channels such as news sites, forums, blogs, and microblogs.
With the rapid development of blockchain technology, some internet users choose to write public opinion data into blockchains because data stored there cannot be tampered with. Once such data spreads widely, it can have a significant social impact. Analyzing and monitoring public opinion data in blockchain networks is therefore an important part of online public opinion work.
Compared with internet public opinion, blockchain public opinion has the following main characteristics:
1. The collection form is different. Collecting blockchain public opinion requires maintaining a blockchain node and synchronizing data from the blockchain network in real time; it cannot be done with an internet crawler system.
2. The encoding form is different. Mainstream blockchain systems currently accept only binary data, so public opinion data is usually converted to binary through UTF-8 encoding before being written into a blockchain. Data viewed directly on the blockchain is in binary form and becomes human-readable only after it is decoded.
3. There is a large amount of invalid data. Because ordinary blockchain data and smart contract data are written into the same field, a large amount of the data is smart-contract-related and is invalid for public opinion data analysis.
4. The environment is multilingual. Because blockchains cross national borders, the data written into a single blockchain appears in many different languages, such as Chinese and English, among others.
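The encoding characteristic in item 2 can be illustrated with a short Python sketch. It converts a text string into the 0x-prefixed hexadecimal form in which it would appear in a transaction's data field, and back again. The function names `text_to_chain_hex` and `chain_hex_to_text` are hypothetical; the sample string corresponds to the hexadecimal value used in the embodiment later in this document.

```python
def text_to_chain_hex(text: str) -> str:
    """Encode text as UTF-8 and return it as a 0x-prefixed hex string."""
    return "0x" + text.encode("utf-8").hex()


def chain_hex_to_text(data_hex: str) -> str:
    """Reverse the conversion: strip the 0x prefix and decode the bytes as UTF-8."""
    return bytes.fromhex(data_hex[2:]).decode("utf-8")


sample = "Blockchain舆情"  # "Blockchain public opinion"
encoded = text_to_chain_hex(sample)
print(encoded)                     # hexadecimal form as stored on chain
print(chain_hex_to_text(encoded))  # human-readable form after decoding
```

The ASCII letters take one byte each and each Chinese character takes three bytes, which is why on-chain text is unreadable until it is decoded.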
To date, however, there has been no efficient and accurate public opinion data analysis technology for blockchains.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data analysis method for a blockchain system that analyzes and monitors public opinion data in a blockchain promptly and accurately.
To solve this technical problem, the invention adopts the following technical solution: a data analysis method for a blockchain system, comprising the following steps:
step 1, deploying a full blockchain node, connecting the node to the blockchain network whose data is to be analyzed, and synchronizing the node with the other nodes in that network;
step 2, communicating with the deployed blockchain node through RPC, reading the data in each transaction in sequence starting from the block at height 1, and storing the block, the transaction, and the data carried in the transaction; then performing steps 3-7 in order on the data obtained from each transaction;
step 3, deleting from the obtained data, by feature matching, the invalid data that matches the features of smart contracts running on the blockchain network under analysis;
the smart contract features are as follows: (1) the data begins with 0x6060; (2) the data begins with 0x6080; (3) the length of the hexadecimal representation, not counting the 0x prefix, is 8 + 64n with n ≥ 0, and the target address is a contract address, i.e., the code field of that address is not empty;
step 4, decoding the obtained data as UTF-8 and discarding data whose proportion of invalid encoding exceeds a set threshold;
step 5, identifying the language of the obtained data using a multilingual dictionary and determining the language used by the data;
step 6, matching the obtained data against sensitive keywords and raising an alarm for the data if a sensitive keyword is matched;
step 7, classifying the sentiment of the obtained data by cross-language sentiment analysis into three categories (positive, neutral, and negative) and raising an alarm for data judged negative.
The beneficial effects of the above technical solution are as follows. The data analysis method for a blockchain system provided by the invention takes full account of the characteristics of blockchain public opinion, namely its collection form, its encoding form, the large amount of invalid data, and the multilingual environment, and is designed specifically for them, which effectively improves the accuracy of blockchain public opinion data analysis. It allows blockchain public opinion to be analyzed and monitored promptly and accurately, effectively widens the scope of online public opinion monitoring, and fills a gap in the field of blockchain public opinion data analysis.
Drawings
Fig. 1 is a flowchart of a data analysis method for a blockchain system according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The embodiments of the present invention will be further described with reference to the accompanying drawings.
A data analysis method for a blockchain system, as shown in fig. 1, includes the following steps:
Step 1, deploying a full blockchain node and connecting it to the blockchain network for synchronization.
In this embodiment, the data on the blockchain is taken to be 0x426c6f636b636861696ee88886e68385.
Step 2, communicating with the blockchain node through RPC (Remote Procedure Call), reading the data in each transaction in sequence starting from the block at height 1, and storing the block, transaction, and data information.
In this embodiment, the reading process retrieves the data 0x426c6f636b636861696ee88886e68385 together with the information of the block and the transaction that contain it.
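Step 2 can be sketched in Python as follows. The source does not name a specific blockchain or RPC interface, so this sketch assumes an Ethereum-style JSON-RPC node, where `eth_getBlockByNumber` returns a block and, when the second parameter is `true`, its full transaction objects with an `input` data field. The function names, the record layout, and the sample block (including its placeholder hashes) are hypothetical. No network call is made; the sketch only builds the request and parses a response.

```python
import json


def build_block_request(height: int, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request asking for one block with full transactions."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBlockByNumber",  # assumption: Ethereum-style node
        "params": [hex(height), True],     # True = include full transaction objects
    }
    return json.dumps(payload)


def extract_records(block: dict) -> list:
    """Collect block, transaction, and data information for each transaction."""
    records = []
    for tx in block.get("transactions", []):
        records.append({
            "block_height": int(block["number"], 16),
            "block_hash": block["hash"],
            "tx_hash": tx["hash"],
            "to": tx.get("to"),
            "data": tx.get("input", "0x"),
        })
    return records


# Hypothetical response body for the block at height 1 (placeholder hashes).
sample_block = {
    "number": "0x1",
    "hash": "0xaaa",
    "transactions": [
        {"hash": "0xbbb", "to": "0xccc",
         "input": "0x426c6f636b636861696ee88886e68385"},
    ],
}

request_body = build_block_request(1)
records = extract_records(sample_block)
print(request_body)
print(records[0]["data"])
```

A full reader would send `request_body` to the node for heights 1, 2, 3, and so on, pass each returned block through `extract_records`, and store the resulting records before running steps 3-7 on each `data` value.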
Step 3, deleting from the obtained data, by feature matching, the invalid data that matches the features of smart contracts running on the blockchain network under analysis.
The smart contract features are as follows: (1) the data begins with 0x6060; (2) the data begins with 0x6080; (3) the length of the hexadecimal representation, not counting the 0x prefix, is 8 + 64n (n ≥ 0), and the target address is a contract address (i.e., the code field of that address is not empty). In this embodiment, the data on the blockchain begins with 0x426c and has a hexadecimal length of 32, so it meets none of these conditions and is therefore not invalid data related to smart contracts.
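The three features can be expressed as a small predicate. The following Python sketch is illustrative only: the function name `is_contract_related` and the `target_has_code` flag are hypothetical, and the flag stands in for a separate lookup of the target address's code field. One plausible reading of the 8 + 64n rule, offered here as an assumption because the source does not say so, is a 4-byte function selector (8 hex characters) followed by n 32-byte arguments (64 hex characters each), as in Ethereum-style contract calls.

```python
def is_contract_related(data_hex: str, target_has_code: bool) -> bool:
    """Return True if the data matches any of the three smart contract features."""
    body = data_hex[2:] if data_hex.startswith("0x") else data_hex
    body = body.lower()

    # Features (1) and (2): the data begins with 0x6060 or 0x6080.
    if body.startswith("6060") or body.startswith("6080"):
        return True

    # Feature (3): hex length (without 0x) is 8 + 64n, n >= 0,
    # and the target address is a contract (its code field is not empty).
    fits_call_layout = len(body) >= 8 and (len(body) - 8) % 64 == 0
    return fits_call_layout and target_has_code


embodiment_data = "0x426c6f636b636861696ee88886e68385"
print(is_contract_related(embodiment_data, target_has_code=True))
print(is_contract_related("0x6080604052", target_has_code=False))
```

For the embodiment data the length is 32, and 32 - 8 = 24 is not a multiple of 64, so the data is kept regardless of the target address.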
Step 4, decoding the obtained data as UTF-8 and discarding data in which more than 10% of the encoding is invalid. In this embodiment, the data decodes to "Blockchain舆情" ("Blockchain public opinion"), and no invalid encoding appears.
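A minimal Python sketch of this step follows. The source does not define how the proportion of invalid encoding is measured, so the sketch assumes one simple definition: decode with replacement and take the share of U+FFFD replacement characters in the decoded text. The function names are hypothetical, and the 10% default mirrors the threshold used in this embodiment.

```python
def decode_with_invalid_ratio(data_hex: str):
    """Decode hex data as UTF-8; return (text, share of replacement characters)."""
    body = data_hex[2:] if data_hex.startswith("0x") else data_hex
    raw = bytes.fromhex(body)  # raises ValueError on malformed hex
    text = raw.decode("utf-8", errors="replace")
    if not text:
        return "", 0.0
    # Assumption: each U+FFFD marks one invalid byte sequence.
    invalid = text.count("\ufffd")
    return text, invalid / len(text)


def keep_data(data_hex: str, threshold: float = 0.10):
    """Return the decoded text, or None if the invalid share exceeds the threshold."""
    text, ratio = decode_with_invalid_ratio(data_hex)
    return text if ratio <= threshold else None


print(keep_data("0x426c6f636b636861696ee88886e68385"))
print(keep_data("0xfffe6162"))  # two invalid bytes followed by "ab"
```

Text that legitimately contains U+FFFD would be counted as invalid under this definition; counting undecodable bytes instead would avoid that at the cost of a slightly longer decoder.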
Step 5, identifying the language of the obtained data using a multilingual dictionary and determining the language used by the data.
In this embodiment, dictionary matching determines that the data is a mixture of English and Chinese: "Blockchain" is English and "舆情" ("public opinion") is Chinese.
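Dictionary-based language identification can be sketched as below. The dictionaries here are tiny hypothetical stand-ins for a real multilingual dictionary, and the tokenizer is a simplifying assumption: it splits text into runs of Latin letters and runs of CJK characters, whereas real Chinese text would generally need proper word segmentation.

```python
import re

# Hypothetical toy dictionaries; a real system would load full word lists.
DICTIONARIES = {
    "en": {"blockchain", "public", "opinion"},
    "zh": {"舆情", "区块链"},
}

# Runs of Latin letters, or runs of CJK Unified Ideographs.
TOKEN_RE = re.compile(r"[A-Za-z]+|[\u4e00-\u9fff]+")


def identify_languages(text: str) -> dict:
    """Map each detected language code to the tokens found in its dictionary."""
    found = {}
    for token in TOKEN_RE.findall(text):
        key = token.lower()
        for lang, words in DICTIONARIES.items():
            if key in words:
                found.setdefault(lang, []).append(token)
    return found


print(identify_languages("Blockchain舆情"))
```

Returning the matched tokens per language, rather than a single label, preserves the information that a piece of data mixes languages, as in this embodiment.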
Step 6, matching the obtained data against sensitive keywords, which include important persons, places, events, and the like, and raising an alarm for the data if a sensitive keyword is matched.
In this embodiment, no sensitive keyword is matched in "Blockchain舆情" ("Blockchain public opinion").
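A simple way to picture this step is case-insensitive substring matching against a keyword list, as in the Python sketch below. The keyword set is a hypothetical placeholder, since the source does not list actual keywords, and substring matching is an assumed strategy; a production system with a large keyword list might use a multi-pattern matcher instead.

```python
# Hypothetical placeholder keywords (persons, places, events).
SENSITIVE_KEYWORDS = {"example-person", "example-place", "example-event"}


def match_sensitive_keywords(text: str, keywords=SENSITIVE_KEYWORDS) -> list:
    """Return the sorted list of keywords that occur in the text, ignoring case."""
    lowered = text.casefold()
    return sorted(k for k in keywords if k.casefold() in lowered)


def should_alarm_on_keywords(text: str) -> bool:
    """An alarm is raised when at least one sensitive keyword is matched."""
    return bool(match_sensitive_keywords(text))


print(match_sensitive_keywords("Blockchain舆情"))
print(match_sensitive_keywords("A report about Example-Place"))
```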
Step 7, classifying the sentiment of the data by cross-language sentiment analysis into three categories (positive, neutral, and negative) and raising an alarm for data judged negative.
In this embodiment, Chinese and English data are classified directly with existing trained sentiment classification models. Data in other languages is translated by a translator into a Chinese version and an English version, which are then classified by the Chinese sentiment classification model and the English sentiment classification model respectively. If the two models' results are consistent or close (for example, one positive and one neutral), the consistent result is taken as the final result (one positive and one neutral yields a final result of positive); if the two results conflict (one positive and one negative), the data is flagged and submitted for manual handling. In this embodiment, "Blockchain舆情" ("Blockchain public opinion") is classified as neutral, so no alarm is raised.
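The rule for combining the two model outputs can be sketched in Python as follows. Only the combination logic is shown; the translator and the two sentiment models are outside the sketch. The function names and the `manual_review` marker are hypothetical. The source states the outcome for positive plus neutral (positive) and for positive plus negative (manual handling); treating negative plus neutral as negative is an assumption made here by symmetry.

```python
VALID_LABELS = {"positive", "neutral", "negative"}


def combine_labels(zh_label: str, en_label: str) -> str:
    """Combine the Chinese-model and English-model labels into a final result."""
    labels = {zh_label, en_label}
    if not labels <= VALID_LABELS:
        raise ValueError(f"unexpected label(s): {labels - VALID_LABELS}")
    if len(labels) == 1:
        return zh_label                      # consistent results
    if labels == {"positive", "negative"}:
        return "manual_review"               # conflicting results
    # Close results: one label is neutral, so take the non-neutral one.
    # (negative + neutral -> negative is an assumption, not stated in the source.)
    return (labels - {"neutral"}).pop()


def should_alarm(final_label: str) -> bool:
    """Data judged negative triggers an alarm."""
    return final_label == "negative"


print(combine_labels("positive", "neutral"))
print(combine_labels("positive", "negative"))
print(should_alarm(combine_labels("neutral", "neutral")))
```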
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, and that some or all of their technical features may be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the invention as defined in the appended claims.

Claims (1)

1. A data analysis method for a blockchain system, characterized in that the method comprises the following steps:
step 1, deploying a full blockchain node, connecting the node to the blockchain network to be subjected to data analysis, and synchronizing the node with the other nodes in the blockchain network;
step 2, communicating with the deployed blockchain node through RPC, reading the data in each transaction in sequence starting from the block at height 1, and storing the block, the transaction, and the data carried in the transaction; then performing steps 3-7 in order on the data obtained from each transaction;
step 3, deleting from the obtained data, by feature matching, the invalid data that matches the features of smart contracts running on the blockchain network to be subjected to data analysis;
step 4, decoding the obtained data as UTF-8 and discarding data whose proportion of invalid encoding exceeds a set threshold;
step 5, identifying the language of the obtained data using a multilingual dictionary and determining the language used by the data;
step 6, matching the obtained data against sensitive keywords and raising an alarm for the data if a sensitive keyword is matched;
step 7, classifying the sentiment of the obtained data by cross-language sentiment analysis into three categories (positive, neutral, and negative) and raising an alarm for data judged negative;
wherein, in step 3, the smart contract features comprise the following: (1) the data begins with 0x6060; (2) the data begins with 0x6080; (3) the length of the hexadecimal representation, not counting the 0x prefix, is 8 + 64n with n ≥ 0, and the target address is a contract address, i.e., the code field of that address is not empty.
CN201911079968.7A 2019-11-07 2019-11-07 Data analysis method for block chain system Active CN110866172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911079968.7A CN110866172B (en) 2019-11-07 2019-11-07 Data analysis method for block chain system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911079968.7A CN110866172B (en) 2019-11-07 2019-11-07 Data analysis method for block chain system

Publications (2)

Publication Number Publication Date
CN110866172A CN110866172A (en) 2020-03-06
CN110866172B CN110866172B (en) 2023-01-03

Family

ID=69653515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911079968.7A Active CN110866172B (en) 2019-11-07 2019-11-07 Data analysis method for block chain system

Country Status (1)

Country Link
CN (1) CN110866172B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632346A (en) * 2021-01-11 2021-04-09 绵阳沸尔特科技有限公司 Data analysis method for block chain system
CN112925847B (en) * 2021-02-22 2022-07-05 同济大学 Data processing and network analysis tool for block chain

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018146A (en) * 2017-05-09 2017-08-04 暨南大学 A kind of public sentiment detection platform building method based on block chain technology
CN107103087A (en) * 2017-05-02 2017-08-29 成都中远信电子科技有限公司 Block chain big data analysis of market conditions system
CN108769751A (en) * 2018-05-02 2018-11-06 中广热点云科技有限公司 A kind of network video based on intelligent contract listens Management Support System
CN108776671A (en) * 2018-05-12 2018-11-09 苏州华必讯信息科技有限公司 A kind of network public sentiment monitoring system and method
CN109992735A (en) * 2019-03-19 2019-07-09 京东数字科技控股有限公司 The processing method of public sentiment data and publicly-owned catenary system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3503012A1 (en) * 2017-12-20 2019-06-26 Accenture Global Solutions Limited Analytics engine for multiple blockchain nodes

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103087A (en) * 2017-05-02 2017-08-29 成都中远信电子科技有限公司 Block chain big data analysis of market conditions system
CN107018146A (en) * 2017-05-09 2017-08-04 暨南大学 A kind of public sentiment detection platform building method based on block chain technology
CN108769751A (en) * 2018-05-02 2018-11-06 中广热点云科技有限公司 A kind of network video based on intelligent contract listens Management Support System
CN108776671A (en) * 2018-05-12 2018-11-09 苏州华必讯信息科技有限公司 A kind of network public sentiment monitoring system and method
CN109992735A (en) * 2019-03-19 2019-07-09 京东数字科技控股有限公司 The processing method of public sentiment data and publicly-owned catenary system

Also Published As

Publication number Publication date
CN110866172A (en) 2020-03-06

Similar Documents

Publication Publication Date Title
CN113420296B (en) C source code vulnerability detection method based on Bert model and BiLSTM
CN108885623A (en) The lexical analysis system and method for knowledge based map
CN111177367B (en) Case classification method, classification model training method and related products
CN111723569A (en) Event extraction method and device and computer readable storage medium
CN110866172B (en) Data analysis method for block chain system
US11003705B2 (en) Natural language processing and classification
CN111831902A (en) Recommendation reason screening method and device and electronic equipment
CN115359799A (en) Speech recognition method, training method, device, electronic equipment and storage medium
CN115757695A (en) Log language model training method and system
CN115292568B (en) Civil news event extraction method based on joint model
CN116561748A (en) Log abnormality detection device for component subsequence correlation sensing
CN114742016A (en) Chapter-level event extraction method and device based on multi-granularity entity differential composition
CN111736804A (en) Method and device for identifying App key function based on user comment
CN111562943B (en) Code clone detection method and device based on event embedded tree and GAT network
CN109145297B (en) Network vocabulary semantic analysis method and system based on hash storage
CN115587599B (en) Quality detection method and device for machine translation corpus
KR20120023387A (en) Apparatus and method for disambiguation of morphologically ambiguous korean verbs, and recording medium thereof
KR102575752B1 (en) Examination data classification device and classification method using ensemble classification model
CN113722496B (en) Triple extraction method and device, readable storage medium and electronic equipment
CN111582825B (en) Product information auditing method and system based on deep learning
CN110197192B (en) Natural language processing, query construction and classification
CN113505889A (en) Processing method and device of atlas knowledge base, computer equipment and storage medium
CN116861996A (en) Document-level event extraction method based on co-instruction disambiguation
CN114416174A (en) Model reconstruction method and device based on metadata, electronic equipment and storage medium
CN117193848A (en) Knowledge-enhanced pre-training model-based code abstract automatic generation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant