CN113065171B - Block chain-based big data processing system, method, medium and terminal - Google Patents

Block chain-based big data processing system, method, medium and terminal

Info

Publication number
CN113065171B
Authority
CN
China
Prior art keywords
data
classification
classification model
tamper
block chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110617287.2A
Other languages
Chinese (zh)
Other versions
CN113065171A (en)
Inventor
姚娟娟
钟南山
樊代明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mingping Medical Data Technology Co ltd
Original Assignee
Mingpinyun Beijing Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mingpinyun Beijing Data Technology Co Ltd
Priority to CN202110617287.2A
Publication of CN113065171A
Application granted
Publication of CN113065171B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/65 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services

Abstract

The invention provides a block chain-based big data processing system, method, medium and terminal. The system comprises: a data acquisition module for acquiring data; a data classification module comprising a first classification model for classifying according to data type and a second classification model for classifying according to data content; an information authentication module for authenticating the qualification information of the tamper-resistant data; and a block chain module for processing the tamper-resistant data that passes authentication and storing the processed tamper-resistant data in the nodes of the block chain. Based on natural language processing technology and the block chain, the invention classifies composite data first by data type and then by data content to obtain tamper-resistant data and common data, performs the corresponding data processing on the tamper-resistant data, and stores it in the nodes of the block chain, thereby providing a foundation for solving data security problems in the context of big data.

Description

Block chain-based big data processing system, method, medium and terminal
Technical Field
The present invention relates to the field of big data processing and computer application, and in particular, to a big data processing system, method, medium, and terminal based on a block chain.
Background
In recent years, the continuous development of big data technology has brought great convenience to enterprises and users in various industries. However, as data becomes more centralized and more accessible, the risk of data leakage also grows, and once information is leaked it can cause ethical, legal, national security and other problems.
At present, big data is conventionally stored in a centralized deployment and is exchanged and shared among different organizations by means of data encryption technology. As data interconnection and intercommunication become more frequent, this existing approach cannot ensure that every organization is able to solve the data security problem, so how to strengthen data protection in the context of big data has become an urgent problem to be solved.
Disclosure of Invention
In view of the above-mentioned shortcomings in the prior art, the present invention provides a system, method, medium and terminal for processing big data based on block chain to solve the above-mentioned technical problems.
The invention provides a big data processing system based on a block chain, which comprises:
the data acquisition module is used for acquiring data;
the data classification module comprises a first classification model for classifying according to data types and a second classification model for classifying according to data contents, wherein the output end of the first classification model is connected with the input end of the second classification model, the second classification model performs secondary classification according to the data classification result of the first classification model to obtain a secondary classification result, and the secondary classification result comprises tamper-resistant data and common data;
the information authentication module is used for authenticating the qualification information of the tamper-resistant data;
and the block chain module is used for carrying out data processing on the authenticated anti-tampering data and storing the processed anti-tampering data in the nodes of the block chain.
In an embodiment of the present invention, the data types include audio data and text data, and the system further includes a data processing module comprising a separation unit for separating the audio data from the text data, a conversion unit for converting the audio data into text data, and an extraction unit for extracting keywords from the text data.
In an embodiment of the present invention, the first classification model and the second classification model are trained separately. The training includes: obtaining a target feature vector corresponding to target content; inputting the target feature vector into the first classification model to obtain a plurality of class vectors; obtaining, according to the weights of the class vectors, a primary classification result of the first classification model and feature information of that primary classification result; and inputting the feature information of the primary classification result together with the target feature vector into the second classification model to obtain a secondary classification result.
In an embodiment of the present invention, word segmentation is performed on the text data, and each segmented word is feature-coded to obtain a word vector; a content vector, a position vector and a data type vector corresponding to each segmented word are acquired, and a feature matrix is obtained from them through feature extraction; the word vectors are spliced with the feature matrix to obtain a feature matrix corresponding to each segmented word, and feature extraction is performed on the feature matrix corresponding to each segmented word to obtain the target feature vector.
In an embodiment of the present invention, the information authentication module includes a first authentication module that performs authentication through a fixed digital certificate and a second authentication module that authenticates qualification information through a dynamic digital certificate. The information authentication module performs validity verification through a smart contract, writes the tamper-resistant data that passes verification into the block chain, and synchronizes all nodes.
In an embodiment of the present invention, the system further includes an encryption module configured to encrypt the tamper-resistant data.
The invention also provides a big data processing method based on the block chain, which comprises the following steps:
carrying out data acquisition;
classifying the collected data through a pre-established data classification module, wherein the data classification module comprises a first classification model for classifying according to data types and a second classification model for classifying according to data contents, the second classification model performs secondary classification according to the data classification result of the first classification model to obtain a secondary classification result, and the secondary classification result comprises tamper-resistant data and common data;
authenticating qualification information of the tamper-resistant data;
and performing data processing on the authenticated anti-tampering data, and storing the processed anti-tampering data in the nodes of the blockchain.
In one embodiment of the invention, a first key for encryption is created, wherein the first key comprises a public key of the first key and a private key of the first key;
encrypting the tamper-resistant data through a public key of the first secret key to form first encrypted content;
acquiring a public key of an object needing to be authorized, and encrypting a private key of a first secret key through the public key of the authorized object to form second encrypted content;
merging the first encrypted content and the second encrypted content to form data for authorizing different authorized objects;
obtaining the data authorized for different authorized objects through an authorized object, and decrypting the data authorized for different authorized objects by using a private key of the authorized object to obtain a private key of the first key;
and acquiring the tamper-resistant data through a private key of the first secret key.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of the above.
The present invention also provides an electronic terminal, comprising: a processor and a memory;
the memory is adapted to store a computer program and the processor is adapted to execute the computer program stored by the memory to cause the terminal to perform the method as defined in any one of the above.
The beneficial effects of the invention are as follows: based on natural language processing technology and the block chain, the block chain-based big data processing system, method, medium and terminal classify composite data first by data type and then by data content to obtain tamper-resistant data and common data, perform the corresponding data processing on the tamper-resistant data, and store it in the nodes of the block chain, thereby providing a foundation for solving data security problems in the context of big data.
Drawings
Fig. 1 is a schematic diagram of a block chain-based big data processing system according to an embodiment of the present invention.
Fig. 2 is a schematic flow chart of a block chain-based big data processing method according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
In the following description, numerous details are set forth to provide a more thorough explanation of embodiments of the present invention, however, it will be apparent to one skilled in the art that embodiments of the present invention may be practiced without these specific details, and in other embodiments, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
As shown in fig. 1, the big data processing system based on a block chain in this embodiment includes:
the data acquisition module is used for acquiring data;
the data classification module comprises a first classification model for classifying according to data types and a second classification model for classifying according to data contents, wherein the output end of the first classification model is connected with the input end of the second classification model, the second classification model performs secondary classification according to the data classification result of the first classification model to obtain a secondary classification result, and the secondary classification result comprises tamper-resistant data and common data;
the information authentication module is used for authenticating the qualification information of the tamper-resistant data;
and the block chain module is used for carrying out data processing on the authenticated anti-tampering data and storing the processed anti-tampering data in the nodes of the block chain.
In this embodiment, the big data processing system based on the blockchain may be built on a big data server, and may perform primary classification on the composite data according to the data type, perform secondary classification according to the data content, acquire tamper-resistant data and common data, perform corresponding data processing on the tamper-resistant data, and store the data in the nodes of the blockchain.
In this embodiment, the data classification module mainly includes a first classification model for classifying according to data types and a second classification model for classifying according to data contents. The first classification model classifies data by data type; the data types in this embodiment mainly include audio data and text data. The embodiment mainly targets complex application scenarios in which audio data and text data are tightly combined, for example scenarios in hospitals, banks, airports and the like where text records and audio records exist at the same time. After primary classification by the first classification model, the classification result is input into the second classification model, which performs secondary classification according to the data content to obtain a secondary classification result comprising tamper-resistant data and common data.
In this embodiment, the data processing module mainly includes a separation unit for separating the audio data from the text data, a conversion unit for converting the audio data into text data, and an extraction unit for extracting keywords from the text data. In the field of big data, existing methods are constrained by time, labor and cost and cannot acquire diverse labeled training data, which can cause a complex network with a large number of parameters to overfit and generalize poorly. In this embodiment, the conversion unit may search the speech knowledge base data for the corresponding unified text information, and convert the audio data into text data by checking whether the text information contained in the unified text information classification is consistent with the text information in the speech text retrieval analysis module.
In this embodiment, word segmentation is performed on the text data, and each segmented word is feature-coded to obtain a word vector; a content vector, a position vector and a data type vector corresponding to each segmented word are acquired, and a feature matrix is obtained from them through feature extraction; the word vectors are spliced with the feature matrix to obtain a feature matrix corresponding to each segmented word, and feature extraction is performed on the feature matrix corresponding to each segmented word to obtain a target feature vector. The target feature vector is input into the first classification model to obtain a plurality of classification vectors; a primary classification result of the first classification model and feature information of that primary classification result are obtained according to the weights of the classification vectors; and the feature information of the primary classification result and the target feature vector are input into the second classification model to obtain a secondary classification result.
In this embodiment, the system further includes an information authentication module, which mainly includes a first authentication module that performs authentication through a fixed digital certificate and a second authentication module that authenticates qualification information through a dynamic digital certificate. The information authentication module performs validity verification through a smart contract, writes the tamper-resistant data that passes verification into the block chain, and synchronizes all nodes. In this embodiment, the information authentication module may authenticate organization information through a dynamic digital certificate according to different attributes of the tamper-resistant data. For example, organization information may be authenticated by region and time and issued in the form of a digital certificate, and downloading is automatically closed for expired certificates of different attributes. Preferably, organizations with higher authority use dynamic certificates to issue the digital certificates. A dynamic certificate can be generated according to a given policy, and tamper-resistant data of different types, regions and times can be authorized with different certificates. The digital certificate can be stored in dedicated hardware or in a corresponding terminal, and is kept and used by the role that needs it.
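The feature construction described above can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: whitespace splitting stands in for word segmentation, a hash-derived vector stands in for feature coding, a separate content vector is omitted for brevity, and mean pooling stands in for the final feature extraction step.

```python
import hashlib

DIM = 4  # size of each toy word vector (assumed value)


def token_vector(token, dim=DIM):
    """Deterministic toy embedding: leading SHA-256 digest bytes scaled to [0, 1]."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def target_feature_vector(text, data_type):
    """Build one row per token, [word vector | position | type one-hot], then mean-pool."""
    tokens = text.split()  # stand-in for real word segmentation
    if not tokens:
        raise ValueError("text contains no tokens")
    # Data type vector: one-hot over the two data types named in the embodiment.
    type_vec = [1.0, 0.0] if data_type == "text" else [0.0, 1.0]
    rows = []
    for pos, tok in enumerate(tokens):
        position_vec = [pos / len(tokens)]  # relative position in [0, 1)
        rows.append(token_vector(tok) + position_vec + type_vec)  # vector splicing
    # Mean pooling over tokens stands in for the final feature extraction.
    return [sum(col) / len(rows) for col in zip(*rows)]


features = target_feature_vector("blood pressure record", "text")
```

The resulting vector has `DIM + 3` entries; a real system would replace each stand-in with a trained component.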
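As a rough illustration of authorizing by data type, region and time, the following Python sketch models a dynamic certificate as a plain record and checks whether it covers a given request. The class name, field names and the check itself are hypothetical; the sketch does not verify any signature and is not the certificate format used by the invention.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DynamicCertificate:
    """Hypothetical record for a certificate scoped by region, data type and time."""
    holder: str
    region: str
    data_type: str
    not_before: datetime
    not_after: datetime


def is_authorized(cert, region, data_type, at):
    """Return True only if the certificate matches the attributes and has not expired."""
    return (
        cert.region == region
        and cert.data_type == data_type
        and cert.not_before <= at <= cert.not_after
    )


cert = DynamicCertificate(
    holder="Hospital A",
    region="shanghai",
    data_type="medical",
    not_before=datetime(2021, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2021, 12, 31, tzinfo=timezone.utc),
)
allowed = is_authorized(cert, "shanghai", "medical", datetime(2021, 6, 2, tzinfo=timezone.utc))
```

An expired certificate or a request for a different region simply fails the check, which mirrors the idea that different certificates authorize different types, regions and times.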
In this embodiment, the system further includes an encryption module configured to encrypt the tamper-resistant data. The encryption module first creates a first key for encryption, wherein the first key comprises a public key of the first key and a private key of the first key;
encrypting the tamper-resistant data through a public key of a first secret key to form first encrypted content;
acquiring a public key of an object needing to be authorized, and encrypting a private key of a first secret key through the public key of the authorized object to form second encrypted content;
merging the first encrypted content and the second encrypted content to form data for authorizing different authorized objects;
obtaining data authorized for different authorized objects through the authorized object, and decrypting the data authorized for different authorized objects by using a private key of the authorized object to obtain a private key of a first secret key;
and the plaintext of the tamper-resistant data can be obtained through the private key of the first secret key.
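The encryption and authorization steps above can be traced with a toy Python sketch. It uses textbook RSA with small fixed Mersenne primes and no padding, purely so that the example runs with the standard library; this is an assumption for illustration, it is insecure, and the patent does not name a specific cipher. All function and variable names are hypothetical.

```python
from dataclasses import dataclass

E = 65537  # widely used RSA public exponent


@dataclass(frozen=True)
class KeyPair:
    n: int  # modulus (public)
    e: int  # public exponent
    d: int  # private exponent


def make_keypair(p, q):
    """Textbook RSA key pair from two primes. Toy only: fixed primes, no padding."""
    phi = (p - 1) * (q - 1)
    return KeyPair(n=p * q, e=E, d=pow(E, -1, phi))  # modular inverse, Python 3.8+


def encrypt_bytes(data, n, e, block=8):
    """Encrypt 8-byte chunks; keep each chunk length so decryption can restore it."""
    chunks = [data[i:i + block] for i in range(0, len(data), block)]
    return [(len(c), pow(int.from_bytes(c, "big"), e, n)) for c in chunks]


def decrypt_bytes(blocks, n, d):
    return b"".join(pow(c, d, n).to_bytes(size, "big") for size, c in blocks)


def authorize(data, first, recipient_n, recipient_e):
    """Form the first and second encrypted content and merge them into one package."""
    if first.d >= recipient_n:
        raise ValueError("recipient modulus too small to wrap the first private key")
    first_content = encrypt_bytes(data, first.n, first.e)      # data under first public key
    second_content = pow(first.d, recipient_e, recipient_n)    # first private key under recipient public key
    return {"n": first.n, "first": first_content, "second": second_content}


def open_package(package, recipient):
    """Authorized object: recover the first private key, then the plaintext."""
    first_d = pow(package["second"], recipient.d, recipient.n)
    return decrypt_bytes(package["first"], package["n"], first_d)


first_key = make_keypair(2**31 - 1, 2**61 - 1)        # first key of the encryption module
recipient_key = make_keypair(2**89 - 1, 2**107 - 1)   # key pair of the authorized object
package = authorize(b"patient record 42", first_key, recipient_key.n, recipient_key.e)
recovered = open_package(package, recipient_key)
```

To authorize several objects, the first encrypted content can be reused and only the second encrypted content recomputed for each recipient's public key.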
Correspondingly, as shown in fig. 2, the present embodiment further provides a big data processing method based on a block chain, including:
S1, performing data acquisition;
S2, classifying the acquired data through a pre-established data classification module, wherein the data classification module comprises a first classification model for classifying according to data types and a second classification model for classifying according to data contents, the second classification model performs secondary classification according to the data classification result of the first classification model to obtain a secondary classification result, and the secondary classification result comprises tamper-resistant data and common data;
S3, authenticating qualification information of the tamper-resistant data;
S4, performing data processing on the authenticated tamper-resistant data, and storing the processed tamper-resistant data in the nodes of the block chain.
In this embodiment, the composite data may be classified once according to the data type, and then classified twice according to the data content to obtain the tamper-resistant data and the common data, so as to perform corresponding data processing on the tamper-resistant data, and store the data in the node of the block chain.
In this embodiment, data classification may be performed by a data classification module, which includes a first classification model for classifying according to data type and a second classification model for classifying according to data content. The first classification model classifies data by data type; the data types in this embodiment mainly include audio data and text data. The embodiment mainly targets complex application scenarios in which audio data and text data are tightly combined, for example scenarios in hospitals, banks, airports and the like where text records and audio records exist at the same time. After primary classification by the first classification model, the classification result is input into the second classification model, which performs secondary classification according to the data content to obtain a secondary classification result comprising tamper-resistant data and common data.
In this embodiment, the audio data and the text data need to be separated, the separated audio data is converted into text data, and keywords are extracted from both the converted text data and the text data obtained from the primary classification, so as to provide a data basis for the subsequent secondary classification. In the field of big data, existing methods are constrained by time, labor and cost and cannot acquire diverse labeled training data, which can cause a complex network with a large number of parameters to overfit and generalize poorly. In this embodiment, when converting the audio data, the voice knowledge base data may be searched for the corresponding unified text information, and the conversion of the audio data into text data is completed by checking whether the text information contained in the unified text information classification is consistent with the text information in the voice text retrieval analysis module.
In this embodiment, word segmentation is performed on the text data, and each segmented word is feature-coded to obtain a word vector; a content vector, a position vector and a data type vector corresponding to each segmented word are acquired, and a feature matrix is obtained from them through feature extraction; the word vectors are spliced with the feature matrix to obtain a feature matrix corresponding to each segmented word, and feature extraction is performed on the feature matrix corresponding to each segmented word to obtain a target feature vector. The target feature vector is input into the first classification model to obtain a plurality of classification vectors; a primary classification result of the first classification model and feature information of that primary classification result are obtained according to the weights of the classification vectors; and the feature information of the primary classification result and the target feature vector are input into the second classification model to obtain a secondary classification result. In this embodiment of the application, the first classification model and the second classification model may be implemented by any suitable classification algorithm. For example, classification may be performed by a softmax algorithm, logistic regression or a fully connected layer, so as to obtain a classification result corresponding to each segmented word, where the classification result is a weight value of each segmented word.
In this embodiment, the application scenarios of the second classification model include not only hospitals, banks, airports and the like, but also data content auditing, registration information auditing and similar scenarios. For such information, it is necessary on the one hand to ensure that the information is true and valid, and on the other hand to protect it against arbitrary tampering. During application, the content of the information may also be audited: if the content passes the audit, feedback is returned so that the next operation can be executed; if it does not pass, the reason for the failure is fed back.
In this embodiment, information authentication may be performed through a fixed digital certificate, or the qualification information may be authenticated through a dynamic digital certificate. The information authentication module in this embodiment performs validity verification through a smart contract, writes the tamper-resistant data that passes verification into the block chain, and synchronizes all nodes. The tamper-resistant data must be verified when it is written into the block chain, and what is verified is whether the data is true. Taking medical data as an example, the personal information of hospitals, doctors, patients and the like in the data content needs to be verified. When all of this content is true and valid, the data is considered valid and is written into the block chain and synchronized by all nodes; otherwise, the write is discarded. Optionally, in this embodiment, the tamper-resistant data may be obtained through the client over a P2P connection, and the data verification is executed automatically by the smart contract deployed on the block chain.
In this embodiment, the method further includes encrypting the tamper-resistant data. An encryption module first creates a first key for encryption, the first key including a public key of the first key and a private key of the first key;
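The cascade of the two classification models can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: both models are single linear layers followed by softmax, the "feature information of the primary classification result" is taken to be the primary softmax weights, and the weight matrices are made-up numbers rather than trained parameters.

```python
import math

DATA_TYPES = ["text", "audio"]                 # primary classes (by data type)
CONTENT_CLASSES = ["tamper_resistant", "common"]  # secondary classes (by data content)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def linear(weights, bias, x):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def two_stage_classify(x, w1, b1, w2, b2):
    """Primary classification, then secondary classification on [primary weights | x]."""
    p1 = softmax(linear(w1, b1, x))
    primary = max(range(len(p1)), key=p1.__getitem__)
    p2 = softmax(linear(w2, b2, p1 + x))  # list concatenation feeds both inputs forward
    secondary = max(range(len(p2)), key=p2.__getitem__)
    return primary, secondary, p1, p2


# Illustrative, untrained parameters (assumed values).
x = [1.0, 0.0]
w1, b1 = [[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]
w2, b2 = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]], [0.0, 0.0]
primary, secondary, p1, p2 = two_stage_classify(x, w1, b1, w2, b2)
label = (DATA_TYPES[primary], CONTENT_CLASSES[secondary])
```

Passing the primary weights forward, rather than only the winning label, lets the second model take the first model's uncertainty into account.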
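The verify-then-write behaviour can be approximated with a short Python sketch: a record is appended to a hash-linked list only if the required fields are present, and is otherwise discarded. This is an in-memory stand-in under stated assumptions; it has no consensus, no P2P networking and no smart contract runtime, and the required field names are hypothetical.

```python
import hashlib
import json

REQUIRED_FIELDS = ("hospital", "doctor", "patient")  # hypothetical validity rule
GENESIS = "0" * 64


def is_valid(record):
    """Stand-in for smart contract verification: every required field is non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_if_valid(chain, record):
    """Write the record as a new block if it verifies; otherwise discard the write."""
    if not is_valid(record):
        return False
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": record,
        "hash": block_hash(index, prev_hash, record),
    })
    return True


def chain_is_intact(chain):
    """Recompute every hash and link; any edited payload breaks the check."""
    prev = GENESIS
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev:
            return False
        if block["hash"] != block_hash(i, prev, block["payload"]):
            return False
        prev = block["hash"]
    return True


chain = []
append_if_valid(chain, {"hospital": "A", "doctor": "Li", "patient": "P-001"})
append_if_valid(chain, {"hospital": "A", "doctor": "", "patient": "P-002"})  # discarded
```

Because each block's hash covers the previous hash, changing any stored payload invalidates that block and every later link, which is the property that makes stored data tamper-evident.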
encrypting the tamper-resistant data through a public key of a first secret key to form first encrypted content;
acquiring a public key of an object needing to be authorized, and encrypting a private key of a first secret key through the public key of the authorized object to form second encrypted content;
merging the first encrypted content and the second encrypted content to form data for authorizing different authorized objects;
obtaining data authorized for different authorized objects through the authorized object, and decrypting the data authorized for different authorized objects by using a private key of the authorized object to obtain a private key of a first secret key;
and the plaintext of the tamper-resistant data can be obtained through the private key of the first secret key.
In this way, user data can be effectively protected while data interaction and data sharing remain possible. On the one hand, encryption prevents the user's privacy from being arbitrarily disclosed; on the other hand, block chain storage ensures the authenticity of the data so that it cannot be arbitrarily tampered with, which improves data security and ensures the accuracy and reliability of the data.
The present embodiment also provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements any of the methods in the present embodiments.
The present embodiment further provides an electronic terminal, including: a processor and a memory;
the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the method in the embodiment.
The computer-readable storage medium in the present embodiment can be understood by those skilled in the art as follows: all or part of the steps for implementing the above method embodiments may be performed by hardware associated with a computer program. The aforementioned computer program may be stored in a computer readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The electronic terminal provided by the embodiment comprises a processor, a memory, a transceiver and a communication interface, wherein the memory and the communication interface are connected with the processor and the transceiver and are used for completing mutual communication, the memory is used for storing a computer program, the communication interface is used for carrying out communication, and the processor and the transceiver are used for operating the computer program so that the electronic terminal can execute the steps of the method.
In this embodiment, the Memory may include a Random Access Memory (RAM), and may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In the above embodiments, unless otherwise specified, ordinal numbers such as "first" and "second" used to describe common objects only indicate that different instances of the same object are being referred to; they do not imply that the objects must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner. Reference in the specification to "the embodiment," "an embodiment," "another embodiment," or "other embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments. The various appearances of the phrases "the present embodiment," "one embodiment," or "another embodiment" do not necessarily all refer to the same embodiment.
Although the present invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory structures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments. The embodiments of the invention are intended to embrace all such alternatives, modifications, and variations that fall within the broad scope of the appended claims.
The embodiments in this specification are described progressively: identical and similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described only briefly; for relevant details, refer to the corresponding description of the method embodiment.
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The foregoing embodiments merely illustrate the principles of the present invention and its efficacy, and are not to be construed as limiting the invention. Any person skilled in the art can modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes that those skilled in the art can make without departing from the spirit and technical ideas of the present invention are intended to be covered by the claims of the present invention.

Claims (8)

1. A blockchain-based big data processing system, comprising:
the data acquisition module is used for acquiring data;
the data classification module comprises a first classification model for classifying according to data types and a second classification model for classifying according to data contents, wherein the output end of the first classification model is connected with the input end of the second classification model, the second classification model performs secondary classification according to the data classification result of the first classification model to obtain a secondary classification result, and the secondary classification result comprises tamper-resistant data and common data;
the information authentication module is used for authenticating the qualification information of the tamper-resistant data;
the blockchain module is used for carrying out data processing on the authenticated tamper-resistant data and storing the processed tamper-resistant data in nodes of the blockchain;
the data processing module comprises a separation unit for separating the audio data from the text data, a conversion unit for converting the audio data into the text data, and an extraction unit for extracting keywords from the text data;
wherein the first classification model and the second classification model are trained separately, the training comprising: obtaining a target feature vector corresponding to target content in the text data; inputting the target feature vector into the first classification model to obtain a plurality of class vectors; obtaining, according to the weights of the class vectors, a primary classification result of the first classification model and feature information of the primary classification result; and inputting the feature information of the primary classification result together with the target feature vector into the second classification model to obtain the secondary classification result.
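As an illustration only, the two-stage classification recited above can be sketched as a forward pass in Python. The claim does not specify the model architectures, so this sketch assumes a softmax scorer for the first model and a logistic scorer for the second; the class names `FirstModel` and `SecondModel`, the helper `build_demo_models`, the labels, and all numeric weights are hypothetical and are not taken from the patent. Training of the two models is not shown.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class FirstModel:
    """Stage 1 (assumed form): classifies by data type and exposes class vectors."""

    def __init__(self, labels, score_rows, class_vectors):
        self.labels = labels                # e.g. ["audio", "text"]
        self.score_rows = score_rows        # one scoring row per label
        self.class_vectors = class_vectors  # one class vector per label

    def predict(self, target_vector):
        weights = softmax([dot(row, target_vector) for row in self.score_rows])
        primary = self.labels[weights.index(max(weights))]
        dim = len(self.class_vectors[0])
        # Feature information: class vectors combined according to their weights.
        feature_info = [
            sum(w * vec[i] for w, vec in zip(weights, self.class_vectors))
            for i in range(dim)
        ]
        return primary, feature_info, weights


class SecondModel:
    """Stage 2 (assumed form): classifies by content using stage-1 feature information."""

    def __init__(self, coefficients, bias, threshold=0.5):
        self.coefficients = coefficients
        self.bias = bias
        self.threshold = threshold

    def predict(self, feature_info, target_vector):
        x = feature_info + target_vector  # stage-1 features joined with the target vector
        p = 1.0 / (1.0 + math.exp(-(dot(self.coefficients, x) + self.bias)))
        return "tamper_resistant" if p >= self.threshold else "common"


def build_demo_models():
    """Hypothetical hand-picked weights, for demonstration only."""
    first = FirstModel(
        labels=["audio", "text"],
        score_rows=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
        class_vectors=[[1.0, 0.0], [0.0, 1.0]],
    )
    second = SecondModel(coefficients=[0.0, 1.0, 0.0, 0.0, 1.0], bias=-1.0)
    return first, second


def classify(target_vector, first, second):
    primary, feature_info, _ = first.predict(target_vector)
    return primary, second.predict(feature_info, target_vector)
```

Calling `classify([1.0, 0.0, 2.0], *build_demo_models())` runs both stages in order; the second stage never sees the raw primary label, only the weighted feature information and the original target feature vector, which mirrors the wiring described in the claim.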
2. The blockchain-based big data processing system according to claim 1, wherein word segmentation is performed on the text data, and each segmented word is feature-encoded to obtain a word-segment vector; a content vector, a position vector, and a data type vector corresponding to each segmented word are acquired, and a feature matrix is obtained through feature extraction; and the word-segment vectors are concatenated with the feature matrix to obtain a feature matrix corresponding to each segmented word, and feature extraction is performed on the feature matrix corresponding to each segmented word to obtain the target feature vector.
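A minimal sketch of how such a target feature vector might be assembled is given below. The claim does not name a segmenter, an encoder, or an extraction method, so every choice here is an assumption: whitespace splitting stands in for word segmentation, a SHA-256 digest stands in for feature coding, and mean pooling stands in for the final feature extraction. The function names and the `DATA_TYPES` list are hypothetical.

```python
import hashlib

DIM = 4                                     # assumed width of the word-segment vector
DATA_TYPES = ["text", "audio_transcript"]   # assumed data type vocabulary


def token_vector(token):
    """Deterministic stand-in for feature coding of one segmented word."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]


def content_vector(token):
    """Two simple content cues: scaled length and whether the word is numeric."""
    return [len(token) / 10.0, 1.0 if token.isdigit() else 0.0]


def position_vector(index, length):
    """Relative position of the word in the text, from 0.0 to 1.0."""
    return [index / max(length - 1, 1)]


def type_vector(data_type):
    """One-hot encoding of the data type."""
    return [1.0 if data_type == t else 0.0 for t in DATA_TYPES]


def target_feature_vector(text, data_type):
    tokens = text.split()  # assumption: whitespace segmentation
    if not tokens:
        raise ValueError("text contains no tokens")
    rows = []
    for i, tok in enumerate(tokens):
        features = (
            content_vector(tok)
            + position_vector(i, len(tokens))
            + type_vector(data_type)
        )
        # Concatenate the word-segment vector with its feature row.
        rows.append(token_vector(tok) + features)
    width = len(rows[0])
    # Mean pooling over all words gives one fixed-length vector.
    return [sum(row[j] for row in rows) / len(rows) for j in range(width)]
```

With these assumed sizes the result always has 9 entries (4 encoded, 2 content, 1 position, 2 data type), regardless of how many words the text contains.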
3. The blockchain-based big data processing system according to claim 1, wherein the information authentication module includes a first authentication module for authenticating through a fixed digital certificate and a second authentication module for authenticating qualification information through a dynamic digital certificate, and the information authentication module performs validity verification through a smart contract, writes valid tamper-resistant data into the blockchain, and synchronizes it to all nodes.
4. The blockchain-based big data processing system according to claim 3, further comprising an encryption module for data encryption of the tamper-resistant data.
5. A blockchain-based big data processing method, comprising the following steps:
carrying out data acquisition;
classifying the collected data through a pre-established data classification module, wherein the data classification module comprises a first classification model for classifying according to data type and a second classification model for classifying according to data content, the second classification model performs secondary classification according to the data classification result of the first classification model to obtain a secondary classification result, and the secondary classification result comprises tamper-resistant data and common data;
authenticating qualification information of the tamper-resistant data;
performing data processing on the authenticated tamper-resistant data, and storing the processed tamper-resistant data in a node of a blockchain;
wherein the data processing is carried out by a data processing module, the data types comprise audio data and text data, and the data processing module comprises a separation unit for separating the audio data from the text data, a conversion unit for converting the audio data into text data, and an extraction unit for extracting keywords from the text data;
and wherein the first classification model and the second classification model are trained separately, the training comprising: obtaining a target feature vector corresponding to target content in the text data; inputting the target feature vector into the first classification model to obtain a plurality of class vectors; obtaining, according to the weights of the class vectors, a primary classification result of the first classification model and feature information of the primary classification result; and inputting the feature information of the primary classification result together with the target feature vector into the second classification model to obtain the secondary classification result.
6. The blockchain-based big data processing method according to claim 5, further comprising:
creating a first key for encryption, the first key comprising a public key and a private key;
encrypting the tamper-resistant data with the public key of the first key to form first encrypted content;
acquiring a public key of an object to be authorized, and encrypting the private key of the first key with the public key of the authorized object to form second encrypted content;
merging the first encrypted content and the second encrypted content to form data authorized for different authorized objects;
obtaining, by an authorized object, the data authorized for different authorized objects, and decrypting it with the private key of that authorized object to obtain the private key of the first key;
and obtaining the tamper-resistant data by decrypting with the private key of the first key.
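The key-wrapping flow of this claim can be illustrated with a toy sketch. The claim does not name a cipher, so the sketch assumes textbook RSA built on Python's three-argument `pow`, with fixed Mersenne primes and no padding. It is not secure and only demonstrates the order of operations; a real system would use a vetted cryptographic library. All function names, the package layout, and the recipient names are hypothetical. The modular inverse `pow(E, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd

E = 65537  # assumed public exponent


def make_keypair(p, q):
    """Textbook RSA key pair from two primes (toy sizes, no padding)."""
    n, phi = p * q, (p - 1) * (q - 1)
    assert gcd(E, phi) == 1
    d = pow(E, -1, phi)          # private exponent
    return (E, n), (d, n)        # (public key, private key)


def encrypt_bytes(data, public_key, block=8):
    """Encrypt data block by block with a public key."""
    e, n = public_key
    assert 256 ** block < n      # every block must fit below the modulus
    chunks = [data[i:i + block] for i in range(0, len(data), block)]
    return [(len(c), pow(int.from_bytes(c, "big"), e, n)) for c in chunks]


def decrypt_bytes(cipher_blocks, private_key):
    d, n = private_key
    return b"".join(pow(c, d, n).to_bytes(size, "big") for size, c in cipher_blocks)


def authorize(data, first_public, first_private, recipients):
    """Build one package that several authorized objects can open."""
    first_encrypted = encrypt_bytes(data, first_public)       # first encrypted content
    d1, n1 = first_private
    second_encrypted = {}
    for name, (e_r, n_r) in recipients.items():
        assert d1 < n_r          # wrapped value must fit below the recipient modulus
        second_encrypted[name] = pow(d1, e_r, n_r)            # second encrypted content
    # Merge both parts; the modulus of the first key is public information.
    return {"first": first_encrypted, "second": second_encrypted, "modulus": n1}


def open_package(package, name, recipient_private):
    """Recover the first private key, then the tamper-resistant data."""
    d_r, n_r = recipient_private
    d1 = pow(package["second"][name], d_r, n_r)
    return decrypt_bytes(package["first"], (d1, package["modulus"]))


if __name__ == "__main__":
    first_pub, first_priv = make_keypair(2**31 - 1, 2**61 - 1)
    alice_pub, alice_priv = make_keypair(2**89 - 1, 2**107 - 1)
    pkg = authorize(b"record to protect", first_pub, first_priv, {"alice": alice_pub})
    print(open_package(pkg, "alice", alice_priv))
```

The data is encrypted once, while the private key of the first key is wrapped separately for each authorized object, so adding a recipient only adds one entry to the `second` part of the package.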
7. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program, when executed by a processor, implements the method of any one of claims 5 to 6.
8. An electronic terminal, comprising: a processor and a memory;
the memory is for storing a computer program and the processor is for executing the computer program stored by the memory to cause the terminal to perform the method of any of claims 5 to 6.
CN202110617287.2A 2021-06-03 2021-06-03 Block chain-based big data processing system, method, medium and terminal Active CN113065171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110617287.2A CN113065171B (en) 2021-06-03 2021-06-03 Block chain-based big data processing system, method, medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110617287.2A CN113065171B (en) 2021-06-03 2021-06-03 Block chain-based big data processing system, method, medium and terminal

Publications (2)

Publication Number Publication Date
CN113065171A (en) 2021-07-02
CN113065171B (en) 2021-10-08

Family

ID=76568564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110617287.2A Active CN113065171B (en) 2021-06-03 2021-06-03 Block chain-based big data processing system, method, medium and terminal

Country Status (1)

Country Link
CN (1) CN113065171B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116010520B (en) * 2022-12-30 2023-06-30 航天广通科技(深圳)有限公司 Secret data storage method, device, equipment and storage medium based on block chain

Citations (2)

Publication number Priority date Publication date Assignee Title
CN109472118A (en) * 2018-11-23 2019-03-15 北京奇眸科技有限公司 A kind of copy-right protection method based on block chain
CN111339540A (en) * 2020-02-18 2020-06-26 山东劳动职业技术学院(山东劳动技师学院) Computer accounting data anti-theft device and control method thereof

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
KR101472451B1 (en) * 2010-11-04 2014-12-18 한국전자통신연구원 System and Method for Managing Digital Contents

Also Published As

Publication number Publication date
CN113065171A (en) 2021-07-02

Similar Documents

Publication Publication Date Title
TWI764037B (en) Interaction method and system across blockchain, computer equipment and storage medium
Sun et al. Data security and privacy in cloud computing
CN108809932B (en) Block chain-based deposit certificate system, method and readable medium
WO2018213519A1 (en) Secure electronic transaction authentication
CN112132198A (en) Data processing method, device and system and server
CN109815051A (en) The data processing method and system of block chain
Bergquist Blockchain technology and smart contracts: privacy-preserving tools
Omri et al. Cloud-ready biometric system for mobile security access
CN111680013A (en) Data sharing method based on block chain, electronic equipment and device
Bryson et al. Blockchain technology for government
CN116168820A (en) Medical data interoperation method based on virtual integration and blockchain fusion
Hossain et al. Improving cloud data security through hybrid verification technique based on biometrics and encryption system
CN114500093A (en) Safe interaction method and system for message information
CN113065171B (en) Block chain-based big data processing system, method, medium and terminal
CN113315745A (en) Data processing method, device, equipment and medium
Thuraisingham Building trustworthy semantic webs
Wang et al. A blockchain-based system for secure image protection using zero-watermark
CN113064731B (en) Cloud-edge-architecture-based big data processing terminal device, processing method and medium
Thuraisingham Developing and Securing the Cloud
Li et al. BEIR: A blockchain-based encrypted image retrieval scheme
CN113282959A (en) Service data processing method and device and electronic equipment
Shankar et al. Securing face recognition system using blockchain technology
Ramasamy et al. Cluster based multi layer user authentication data center storage architecture for big data security in cloud computing
Akter et al. Securing Smart Card Management Using Hyperledger Based Private Blockchain
CN110399706A (en) Authorization and authentication method, device and computer system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220707

Address after: 201615 room 1904, G60 Kechuang building, No. 650, Xinzhuan Road, Songjiang District, Shanghai

Patentee after: Shanghai Mingping Medical Data Technology Co.,Ltd.

Address before: 102400 no.86-n3557, Wanxing Road, Changyang, Fangshan District, Beijing

Patentee before: Mingpinyun (Beijing) data Technology Co.,Ltd.