CN115438015A - Computer file storage system and method based on block chain - Google Patents

Computer file storage system and method based on block chain

Publication number
CN115438015A
Authority
CN
China
Prior art keywords
file
data
user
files
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211127852.8A
Other languages
Chinese (zh)
Inventor
邓成正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202211127852.8A
Publication of CN115438015A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/137 Hash-based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G06F16/152 File search processing using file content signatures, e.g. hash values
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a computer file storage system and method based on a block chain. The system comprises a file uploading module, a data storage, a file processing module, a block chain module and a file calling module. The file uploading module authenticates the user's identity, collects the files uploaded by the user and sends the collected files to the data storage; the data storage stores all the collected user information and file data; the file processing module generates a file characteristic value and an access link, performs classification and hash processing, and uploads the processed data to the block chain module; the block chain module reads the address, encrypts it and writes it to the chain, and sends it to the file calling module; the file calling module verifies the user's identity, after which the user enters keywords and the corresponding file information is extracted. This addresses the problems that enterprise files are numerous, slow to search, easily lost and easily tampered with.

Description

Computer file storage system and method based on block chain
Technical Field
The invention relates to the technical field of computer file storage, in particular to a computer file storage system and method based on a block chain.
Background
From the moment an enterprise is established it begins to accumulate information, and as the enterprise develops over time the volume of information grows ever larger. Company files are the true record and accumulated knowledge of activities such as research and development, production and operation; office documents, archives and electronic files are important knowledge assets of the enterprise and play a very important role in its operation and management.
However, files are scattered across different computers, servers or systems and are huge in number, so unified backup is difficult and searching is slow and inefficient. Moreover, relatively sensitive documents cannot be monitored for security, so data may be lost or leaked, which threatens document security. Traditional paper files are easily lost and hard to preserve, and important files cannot be restored once damaged.
Therefore, there is a need for a blockchain-based computer file storage system and method that can solve the above problems, monitor file storage and usage through identification, and prevent file tampering through blockchain technology.
Disclosure of Invention
The present invention is directed to a system and method for storing a computer file based on a block chain, so as to solve the problems mentioned in the background art.
In order to solve the technical problems, the invention provides the following technical scheme: a blockchain-based computer file storage system comprising a file uploading module, a data storage, a file processing module, a block chain module and a file calling module;
the file uploading module authenticates the user's identity, then collects the files uploaded by the user and sends the collected files to the data storage;
the data storage stores all the collected user information and file data;
the file processing module generates a file characteristic value and an access link, performs classification and hash processing, and uploads the processed data to the block chain module;
the block chain module reads the address, encrypts it and writes it to the chain, and sends it to the file calling module;
the file calling module verifies the user's identity, after which the user enters keywords and the corresponding file information is extracted.
Further, the file uploading module comprises a user information collecting unit and a file content uploading unit; the user information collecting unit is used for collecting the identity information of a user who uploads a file; the file content uploading unit is used for collecting file information and uploading the file information to the data storage.
Further, the file processing module comprises a file processing unit, a file classification unit and a file hash unit; the file processing unit generates a file characteristic value and a file access link after processing a file; the file classification unit classifies and summarizes all files according to the file characteristic values; the file hash unit maps the file characteristic value and its corresponding file access link to an address in the data storage using separate chaining;
the file processing unit comprises a characteristic value generating subunit and an access link generating subunit; the characteristic value generating subunit projects the data onto a low-dimensional space using an LDA (linear discriminant analysis) model and then extracts keywords to obtain the file characteristic value, so that data of the same class are as compact as possible and data of different classes are as dispersed as possible; the access link generating subunit generates a link for each file using a link generator.
Further, the block chain module comprises a reading unit, an encryption unit and an uplink (on-chain) unit; the reading unit reads the address from the file hash unit; the encryption unit encrypts the data using an asymmetric encryption algorithm and sends the encrypted data to the uplink unit; the uplink unit collects the data using a block chain collection node so as to store the file characteristic value and the file access link.
Further, the file calling module comprises an identity authentication unit and a data extraction unit; the identity authentication unit identifies the identity information of company personnel and thereby narrows the range of files to be extracted; the data extraction unit presents a link after the user enters a keyword, and when the user clicks the link the information content of the file is displayed.
A block chain based computer file storage method comprises the following steps:
S1: performing identity authentication on a user, and collecting the files uploaded by the user;
S2: storing all collected user information and file data;
S3: generating a file characteristic value and an access link, and performing classification and hash processing;
S4: reading the address, encrypting it and writing it to the chain;
S5: identifying the identity of the user, receiving the keywords entered by the user, and extracting the corresponding file information.
Further, in step S1: the identity information of the user, including the employee's name and department, is verified through the employee ID (work number) to determine whether the user is a company employee; if so, the user can enter the system and upload files, and otherwise the user cannot enter the system.
Further, in step S2: all the collected user identity information and file data are stored in the data storage.
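The gate described in step S1 can be pictured with a minimal Python sketch. It is illustrative only: the `Employee` record, the in-memory `STAFF_DIRECTORY`, the sample IDs and names, and the `authenticate` and `upload_file` helpers are all hypothetical and are not taken from the specification, which does not say how employee records are held.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Employee:
    employee_id: str
    name: str
    department: str


# Hypothetical in-memory staff directory standing in for the company's records.
STAFF_DIRECTORY = {
    "E001": Employee("E001", "Zhang San", "Finance"),
    "E002": Employee("E002", "Li Si", "Research"),
}


def authenticate(employee_id: str) -> Optional[Employee]:
    """Return the employee record if the ID is known, otherwise None."""
    return STAFF_DIRECTORY.get(employee_id)


def upload_file(employee_id: str, filename: str, storage: list) -> bool:
    """Accept the upload only when the uploader is a verified employee."""
    employee = authenticate(employee_id)
    if employee is None:
        return False  # not a company employee: entry to the system is refused
    # Keep the uploader's identity next to the file, as step S2 stores both.
    storage.append(
        {"file": filename, "uploader": employee.name, "department": employee.department}
    )
    return True
```

Storing the uploader's name and department with each file is what later allows retrieval to be narrowed by department.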
Further, in step S3: to obtain the file characteristic value, the data are projected onto a low-dimensional space using an LDA (linear discriminant analysis) model, and keywords are then extracted:
(1) First, the mean and covariance of each class of file data after projection are calculated. Let the data set formed by the file data be $D_j = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $x_i$ is an arbitrary n-dimensional vector and $y_i \in \{0, 1\}$ indicates whether the keyword is contained; $D_j$ denotes the data set of the j-th class of file data and m the number of data in the j-th class, such that:
$$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_i$$
where $\mu_j$ represents the mean of the m data in the j-th class of file data, and
$$\Sigma_j = \sum_{i=1}^{m} (x_i - \mu_j)(x_i - \mu_j)^T$$
represents the covariance matrix of the m data in the j-th class of file data;
let the projection line be the vector $\omega$; for any file data x, its projection onto the line is $\omega^T x$, so the mean and covariance of the j-th class of file data after projection are:
$$\omega^T \mu_j$$
$$\omega^T \Sigma_j \omega$$
(2) Then, the optimization objective function J of the LDA model is calculated:
using binary classification, the uploaded files are gathered together with the files in the system that share the same characteristics. The within-class scatter matrix $S_\omega$ represents how tightly the data points of each class cluster, and the between-class scatter matrix $S_b$ represents how dispersed the files of different classes are:
$$S_\omega = \Sigma_0 + \Sigma_1$$
$$S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^T$$
to bring the projected points of the same class of file data as close together as possible, the covariance of the projected points within each class is made as small as possible; to keep the projected points of different classes as far apart as possible, the distance between the projected class centers is made as large as possible. The optimization objective function of the LDA model is therefore defined as:
$$J = \frac{\omega^T S_b \omega}{\omega^T S_\omega \omega}$$
because J does not change when $\omega$ is rescaled, the constraint $\omega^T S_\omega \omega = 1$ can be imposed, and maximizing J becomes maximizing $\omega^T S_b \omega$, which makes data points of the same class more clustered and data points of different classes more dispersed;
(3) Then, the projection line $\omega$ is solved for:
the objective function is optimized using a Lagrange function $L(\omega) = \omega^T S_b \omega - \lambda(\omega^T S_\omega \omega - 1)$, where $\lambda$ is a single scalar parameter (the Lagrange multiplier); taking the derivative with respect to $\omega$ and setting it to 0 gives, up to a scale factor, $\omega = S_\omega^{-1}(\mu_0 - \mu_1)$;
(4) Finally, the projected data points $Y = \omega^T X$ are obtained, where Y represents the set of characteristic values of each file.
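Steps (1) to (4) can be sketched in plain Python for the two-class, two-dimensional case. This is an illustrative sketch, not the applicant's implementation: the function names are hypothetical, the scatter matrices are taken as unnormalized sums of outer products (an assumption, since the original formula images are not reproduced here), and the 2x2 inverse is written out by hand to avoid external libraries.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    m = len(vectors)
    return [sum(v[k] for v in vectors) / m for k in range(len(vectors[0]))]


def scatter(vectors, mu):
    """Sum of outer products (x - mu)(x - mu)^T for 2-D vectors."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - mu[0], v[1] - mu[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def fisher_direction(class0, class1):
    """Return w = S_w^{-1} (mu0 - mu1) for two classes of 2-D points."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    # Within-class scatter: sum of the two per-class scatter matrices.
    sw = [[s0[r][c] + s1[r][c] for c in range(2)] for r in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu0[0] - mu1[0], mu0[1] - mu1[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(w, x):
    """Scalar projection w^T x of a 2-D point onto the direction w."""
    return w[0] * x[0] + w[1] * x[1]
```

Projecting every point with `project` gives one scalar per file vector; when the classes are well separated, a threshold between the two projected class means divides them.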
Further, in step S3: all files are classified according to their file characteristic values; files that cluster together are placed in one class, and the type and department to which that class belongs are then determined. The file characteristic value and its corresponding file access link are mapped to an address in the data storage using separate chaining, which is a conventional technique for a person skilled in the art and is therefore not described in detail.
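Since the specification leaves the hashing step to the skilled reader, a minimal Python sketch of a separate-chaining table may help. The class name `ChainedIndex`, the toy byte-sum hash and the bucket count are assumptions made for illustration; a real system would use a stronger hash function and resize the table as it fills.

```python
class ChainedIndex:
    """Hash table with separate chaining: each bucket holds a list of entries."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _slot(self, key):
        # Deterministic toy hash (sum of UTF-8 bytes) so bucket positions
        # are reproducible from run to run.
        return sum(key.encode("utf-8")) % len(self.buckets)

    def put(self, feature_value, access_link):
        """Store or update the access link for a file characteristic value."""
        bucket = self.buckets[self._slot(feature_value)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == feature_value:
                bucket[i] = (feature_value, access_link)
                return
        # A colliding key is appended to the same bucket's chain.
        bucket.append((feature_value, access_link))

    def get(self, feature_value):
        """Return the stored access link, or None if the key is absent."""
        for existing_key, link in self.buckets[self._slot(feature_value)]:
            if existing_key == feature_value:
                return link
        return None
```

Chaining keeps colliding characteristic values in the same bucket rather than overwriting one another, so two files whose keys hash to the same slot both remain retrievable.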
Further, in step S4: after the address of the file hash unit is read, the data are encrypted using an asymmetric encryption algorithm, which is a conventional technique for a person skilled in the art and is therefore not described in detail; after encryption, the data are collected by a block chain collection node, so that the file characteristic value and the file access link are stored.
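The tamper-evidence property that step S4 relies on can be illustrated with a hash-linked ledger in Python. This sketch is an assumption-laden simplification: it omits the asymmetric encryption step and any consensus or collection-node logic, treats the payload as already prepared, and the names `block_hash`, `append_block` and `is_valid` are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder previous-hash for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block whose hash covers the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    block = {
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and link; any edited block breaks the check."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["prev"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous block's hash, changing a stored characteristic value or link invalidates that block and every block after it.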
Further, in step S5: the identity information of company personnel is identified; after verification, the files the user is likely to extract are predicted from that identity information, including name, department and extraction records, which narrows the range of files. The system identifies the keyword entered by the user, finds the file characteristic value matching the keyword and generates the file access link; the user clicks the link and the information content of the file is displayed.
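One simple way to picture the narrowing in step S5 is a two-stage lookup, sketched below in Python. The specification does not state how the prediction is made, so this sketch assumes a plain filter on the user's department plus previously extracted files, followed by a case-insensitive keyword match; the record fields and function names are hypothetical.

```python
def candidate_files(records, department, history):
    """Keep files from the user's department or ones the user extracted before."""
    return [
        r for r in records
        if r["department"] == department or r["feature"] in history
    ]


def search(records, department, history, keyword):
    """Return access links whose characteristic value contains the keyword."""
    pool = candidate_files(records, department, history)
    needle = keyword.lower()
    return [r["link"] for r in pool if needle in r["feature"].lower()]
```

Filtering before matching means the keyword comparison only runs over the reduced pool, which is the efficiency gain the step describes.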
Compared with the prior art, the invention has the following beneficial effects:
verifying the user's identity information through the employee ID (work number) makes it easier to monitor file security; projecting the data onto a low-dimensional space with the LDA model and then extracting keywords yields the file characteristic value, so that data points of the same class cluster more tightly and data points of different classes are more dispersed, which facilitates the subsequent classification of files; mapping the file characteristic value and its corresponding file access link to an address in the data storage using separate chaining facilitates the storage of very large files; block chain technology helps prevent users from tampering with files; predicting the files a user is likely to extract from the user's identity information narrows the range of files and improves file extraction efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a block diagram of a blockchain based computer file storage system of the present invention;
FIG. 2 is a flow chart of a method for storing a computer file based on a blockchain according to the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Referring to FIGS. 1-2, the present invention provides a technical solution: a blockchain-based computer file storage system comprising a file uploading module, a data storage, a file processing module, a block chain module and a file calling module;
the file uploading module authenticates the user's identity, then collects the files uploaded by the user and sends the collected files to the data storage;
the data storage stores all the collected user information and file data;
the file processing module generates a file characteristic value and an access link, performs classification and hash processing, and uploads the processed data to the block chain module;
the block chain module reads the address, encrypts it and writes it to the chain, and sends it to the file calling module;
the file calling module verifies the user's identity, after which the user enters keywords and the corresponding file information is extracted.
Further, the file uploading module comprises a user information collecting unit and a file content uploading unit; the user information collecting unit is used for collecting the identity information of the user uploading the files, including the name of the user and the department to which the user belongs, so as to judge whether the user is a company employee or not; the file content uploading unit is used for collecting file information and uploading the file information to the data storage.
Furthermore, the data storage stores all the collected user identity information and file data, so that the user information can be conveniently recorded and the file information can be conveniently extracted by the user.
Further, the file processing module comprises a file processing unit, a file classification unit and a file hash unit; the file processing unit generates a file characteristic value and a file access link after processing the file, so that large files can subsequently be recorded on the block chain conveniently; the file classification unit classifies and summarizes all files according to the file characteristic values, so that the files are tidier and better organized; the file hash unit maps the file characteristic value and its corresponding file access link to an address in the data storage using separate chaining;
the file processing unit comprises a characteristic value generating subunit and an access link generating subunit; the characteristic value generating subunit projects the data onto a low-dimensional space using an LDA (linear discriminant analysis) model and then extracts keywords to obtain the file characteristic value, which is more convenient for storing file data and makes data of the same class as compact as possible and data of different classes as dispersed as possible; the access link generating subunit generates a link for each file using a link generator, thereby reducing the storage space required.
Further, the block chain module comprises a reading unit, an encryption unit and an uplink (on-chain) unit; the reading unit reads the address from the file hash unit; the encryption unit encrypts the data using an asymmetric encryption algorithm and sends the encrypted data to the uplink unit; the uplink unit collects the data using a block chain collection node so that the file characteristic value and the file access link are stored, which effectively addresses the problem of file tampering.
Further, the file calling module comprises an identity authentication unit and a data extraction unit; the identity authentication unit identifies the identity information of company personnel, which helps predict the files likely to be requested and thereby narrows the file extraction range; the data extraction unit presents a link after the user enters a keyword, and when the user clicks the link the information content of the file is displayed.
A block chain based computer file storage method comprises the following steps:
S1: performing identity authentication on a user, and collecting the files uploaded by the user;
S2: storing all the collected user information and file data;
S3: generating a file characteristic value and an access link, and performing classification and hash processing;
S4: reading the address, encrypting it and writing it to the chain;
S5: identifying the identity of the user, receiving the keywords entered by the user, and extracting the corresponding file information.
Further, in step S1: the identity information of the user, including the employee's name and department, is verified through the employee ID (work number) to determine whether the user is a company employee; if so, the user can enter the system and upload files once the verification is passed; otherwise, the user cannot enter the system.
Further, in step S2: all the collected user identity information and file data are stored through the data storage, so that the user information can be conveniently recorded and the file information can be conveniently extracted by the user.
Further, in step S3: to obtain the file characteristic value, the data are projected onto a low-dimensional space using an LDA (linear discriminant analysis) model, and keywords are then extracted:
(1) First, the mean and covariance of each class of file data after projection are calculated. Let the data set formed by the file data be $D_j = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $x_i$ is an arbitrary n-dimensional vector and $y_i \in \{0, 1\}$ indicates whether the keyword is contained; $D_j$ denotes the data set of the j-th class of file data and m the number of data in the j-th class, such that:
$$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_i$$
where $\mu_j$ represents the mean of the m data in the j-th class of file data, and
$$\Sigma_j = \sum_{i=1}^{m} (x_i - \mu_j)(x_i - \mu_j)^T$$
represents the covariance matrix of the m data in the j-th class of file data;
let the projection line be the vector $\omega$; for any file data x, its projection onto the line is $\omega^T x$, so the mean and covariance of the j-th class of file data after projection are:
$$\omega^T \mu_j$$
$$\omega^T \Sigma_j \omega$$
(2) Then, the optimization objective function J of the LDA model is calculated:
using binary classification, the uploaded files are gathered together with the files in the system that share the same characteristics. The within-class scatter matrix $S_\omega$ represents how tightly the data points of each class cluster, and the between-class scatter matrix $S_b$ represents how dispersed the files of different classes are:
$$S_\omega = \Sigma_0 + \Sigma_1$$
$$S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^T$$
to bring the projected points of the same class of file data as close together as possible, the covariance of the projected points within each class is made as small as possible; to keep the projected points of different classes as far apart as possible, the distance between the projected class centers is made as large as possible, which facilitates the subsequent classification of files. The optimization objective function of the LDA model is therefore defined as:
$$J = \frac{\omega^T S_b \omega}{\omega^T S_\omega \omega}$$
because J does not change when $\omega$ is rescaled, the constraint $\omega^T S_\omega \omega = 1$ can be imposed, and maximizing J becomes maximizing $\omega^T S_b \omega$, which makes data points of the same class more clustered and data points of different classes more dispersed;
(3) Then, the projection line $\omega$ is solved for:
the objective function is optimized using a Lagrange function $L(\omega) = \omega^T S_b \omega - \lambda(\omega^T S_\omega \omega - 1)$, where $\lambda$ is a single scalar parameter (the Lagrange multiplier); taking the derivative with respect to $\omega$ and setting the result to 0 gives, up to a scale factor, $\omega = S_\omega^{-1}(\mu_0 - \mu_1)$;
(4) Finally, the projected data points $Y = \omega^T X$ are obtained, where Y represents the set of characteristic values of each file.
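The step from the Lagrange function to the closed-form direction is stated without its intermediate algebra. The derivation below fills it in using the standard textbook argument for two-class LDA; it assumes that $S_\omega$ is invertible and that both scatter matrices are symmetric, neither of which the specification states explicitly, and $c$ is a scalar introduced here only for illustration.

```latex
\begin{align*}
\frac{\partial L}{\partial \omega}
  &= 2 S_b \omega - 2 \lambda S_\omega \omega = 0
  \quad\Longrightarrow\quad S_b \omega = \lambda S_\omega \omega, \\
S_b \omega
  &= (\mu_0 - \mu_1)\,\bigl[(\mu_0 - \mu_1)^T \omega\bigr]
   = c\,(\mu_0 - \mu_1), \qquad c = (\mu_0 - \mu_1)^T \omega \in \mathbb{R}, \\
\omega
  &= \frac{c}{\lambda}\, S_\omega^{-1} (\mu_0 - \mu_1)
  \;\propto\; S_\omega^{-1} (\mu_0 - \mu_1).
\end{align*}
```

Only the direction of $\omega$ matters for the projection, so the scalar factor $c/\lambda$ can be dropped.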
Further, in step S3: all files are classified according to their file characteristic values; files that cluster together are placed in one class, and the type and department to which that class belongs are then determined. The file characteristic values and their corresponding file access links are mapped to addresses in the data storage using separate chaining, which is a conventional technique for a person skilled in the art and is therefore not described in detail.
Further, in step S4: after the address of the file hash unit is read, the data are encrypted using an asymmetric encryption algorithm, which is a conventional technique for a person skilled in the art and is therefore not described in detail; after encryption, the data are collected by a block chain collection node, so that the file characteristic value and the file access link are stored.
Further, in step S5: the identity information of company personnel is identified; after verification, the files the user is likely to extract are predicted from that identity information, including name, department and extraction records, which narrows the range of files the system has to search and improves its operating efficiency. The system identifies the keyword entered by the user, finds the file characteristic value matching the keyword and generates the file access link; the user clicks the link and the information content of the file is displayed.
The first embodiment is as follows:
In step S1: the identity information of the user is verified through the employee ID (work number); the user is recognized as Zhang San, a member of the finance department and an employee of the company, and is therefore allowed to upload the file.
In step S2: the identity information of the user and the uploaded file data are stored in the data storage.
In step S3: to obtain the file characteristic value, the data are projected onto a low-dimensional space using an LDA (linear discriminant analysis) model, and keywords are then extracted:
(1) First, the mean and covariance of each class of file data after projection are calculated. Let the data set formed by the file data be $D_j = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $x_i$ is an arbitrary n-dimensional vector and $y_i \in \{0, 1\}$ indicates whether it is a keyword; $D_j$ denotes the data set of the j-th class of file data and m the number of data in the j-th class, such that:
$$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_i$$
where $\mu_j$ represents the mean of the m data in the j-th class of file data, and
$$\Sigma_j = \sum_{i=1}^{m} (x_i - \mu_j)(x_i - \mu_j)^T$$
represents the covariance matrix of the m data in the j-th class of file data;
let the projection line be the vector $\omega$; for any file data x, its projection onto the line is $\omega^T x$, so the mean and covariance of the j-th class of file data after projection are:
$$\omega^T \mu_j$$
$$\omega^T \Sigma_j \omega$$
(2) Then, the optimization objective function J of the LDA model is calculated:
using binary classification, the uploaded files are gathered together with the files in the system that share the same characteristics. The within-class scatter matrix $S_\omega$ represents how tightly the data points of each class cluster, and the between-class scatter matrix $S_b$ represents how dispersed the files of different classes are:
$$S_\omega = \Sigma_0 + \Sigma_1$$
$$S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^T$$
to bring the projected points of the same class of file data as close together as possible, the covariance of the projected points within each class is made as small as possible; to keep the projected points of different classes as far apart as possible, the distance between the projected class centers is made as large as possible, which facilitates the subsequent classification of files. The optimization objective function of the LDA model is therefore defined as:
$$J = \frac{\omega^T S_b \omega}{\omega^T S_\omega \omega}$$
because J does not change when $\omega$ is rescaled, the constraint $\omega^T S_\omega \omega = 1$ can be imposed, and maximizing J becomes maximizing $\omega^T S_b \omega$, which makes data points of the same class more clustered and data points of different classes more dispersed;
(3) Then, the projection line $\omega$ is solved for:
the objective function is optimized using a Lagrange function $L(\omega) = \omega^T S_b \omega - \lambda(\omega^T S_\omega \omega - 1)$, where $\lambda$ is a single scalar parameter (the Lagrange multiplier); taking the derivative with respect to $\omega$ and setting it to 0 gives, up to a scale factor, $\omega = S_\omega^{-1}(\mu_0 - \mu_1)$;
(4) Finally, the projected data points $Y = \omega^T X$ are obtained, where Y represents the set of characteristic values of each file;
at this time, the obtained file characteristic value is "A → B invoice".
The file is classified according to the characteristic value "A → B invoice", which confirms that it belongs to the "invoice" class and to the finance department; the file characteristic value and its corresponding file access link are mapped to an address in the data storage using separate chaining.
In step S4: after the address of the file hash unit is read, the data are encrypted using an asymmetric encryption algorithm, which is a conventional technique for a person skilled in the art and is therefore not described in detail; after encryption, the data are collected by a block chain collection node, so that the file characteristic value and the file access link are stored.
In step S5: the identity information of company personnel is identified; the user is recognized as "Liquan", a member of the finance department. After verification, the files the user is likely to extract are predicted from the identity information, including name, department and extraction records, to be invoices, financial bills and the like, which narrows the range of files. The system recognizes that the keyword entered by "Liquan" is "A → B invoice" and automatically presents the file access link; when the user clicks the link, the specific information content of the invoice is displayed.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described above, or equivalents may be substituted for elements thereof. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A blockchain-based computer file storage system, characterized in that the system comprises: a file uploading module, a data storage, a file processing module, a block chain module and a file calling module;
the file uploading module authenticates the identity of the user, collects the files uploaded by the user, and sends the collected files to the data memory;
the data memory stores all the collected user information and file data;
the file processing module generates a file characteristic value and an access link, performs classification and hash processing, and uploads the processed data to the block chain module;
the block chain module reads the address, encrypts it and puts it on the chain, and sends it to the file calling module;
the file calling module identifies the identity of the user, after which the user inputs keywords and the corresponding file information is extracted.
2. The blockchain-based computer file storage system of claim 1, wherein: the file uploading module comprises a user information collecting unit and a file content uploading unit; the user information collecting unit is used for collecting the identity information of a user uploading files; the file content uploading unit is used for collecting file information and uploading the file information to the data memory.
3. The blockchain-based computer file storage system of claim 1, wherein: the file processing module comprises a file processing unit, a file classification unit and a file hash unit; the file processing unit generates a file characteristic value and a file access link after processing a file; the file classification unit classifies and summarizes all files according to the file characteristic values; the file hash unit maps the file characteristic value and the file access link corresponding to the file characteristic value to the data memory address by using the separate chaining (separation linking) method;
the file processing unit comprises a characteristic value generating subunit and an access link generating subunit; the characteristic value generation subunit projects data to a low-dimensional space by using an LDA model, and then extracts keywords, so as to obtain a file characteristic value; the access link generation subunit generates a link for each file using a link generator.
4. The blockchain-based computer file storage system of claim 1, wherein: the block chain module comprises a reading unit, an encryption unit and an uplink unit; the reading unit is used for reading the address of the file hash unit; the encryption unit encrypts data by using an asymmetric encryption algorithm and sends the encrypted data to the uplink unit; and the uplink unit is used for acquiring data by using the block chain acquisition node so as to store the file characteristic value and the file access link.
5. The blockchain-based computer file storage system of claim 1, wherein: the file calling module comprises an identity authentication unit and a data extraction unit; the identity authentication unit is used for identifying the identity information of company personnel and narrowing the range of file extraction; and the data extraction unit is used for presenting the file access link after the user inputs the keyword, the information content of the file being displayed when the user clicks the link.
6. A blockchain-based computer file storage method, characterized in that the method comprises the following steps:
S1: performing identity authentication on a user, and collecting a file uploaded by the user;
S2: storing all the collected user information and file data;
S3: generating a file characteristic value and an access link, and carrying out classification and hash processing;
S4: reading the address, encrypting it, and putting it on the chain;
S5: identifying the identity of the user, receiving the input keywords, and extracting the corresponding file information.
7. The blockchain-based computer file storage method of claim 6, wherein in step S1: the identity information of the user is verified through the work number to determine whether the user is a company employee; if the user is a company employee, the user can enter the system to upload files; otherwise, the user cannot enter the system.
8. The blockchain-based computer file storage method of claim 6, wherein in step S3: in order to obtain the file characteristic value, the data is projected to a low-dimensional space by using an LDA model, and keyword extraction is then performed:
(1) First, the mean and covariance of each group of file data are calculated: let the data set formed by the file data be $D_j = \{(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)\}$, where $x_i$ is an arbitrary $n$-dimensional vector and $y_i \in \{0, 1\}$ indicates whether the keyword is contained; $D_j$ denotes the data set of the $j$th file data and $m$ denotes the number of data in the $j$th file data, such that:
$$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_i$$
where $\mu_j$ represents the mean value of the $m$ data in the $j$th file data, and
$$\Sigma_j = \sum_{i=1}^{m} (x_i - \mu_j)(x_i - \mu_j)^T$$
represents the covariance matrix of the $m$ data in the $j$th file data;
let the projection line be the vector $\omega$; for any file data $x$, its projection onto the line is $\omega^T x$, so the mean and covariance of the file data after projection are:
$$\omega^T \mu_j$$
$$\omega^T \Sigma_j \omega$$
(2) Then, the optimization objective function $J$ of the LDA model is calculated:
using binary classification, the uploaded file is grouped with the files in the system that have the same characteristics; the within-class scatter matrix is defined as $S_\omega$, representing the degree of aggregation of the data points within each class of files, and the between-class scatter matrix is defined as $S_b$, representing the degree of dispersion between files of different classes:
$$S_\omega = \Sigma_0 + \Sigma_1$$
$$S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^T$$
in order to aggregate files of the same class and scatter files of different classes, the optimization objective function of the LDA model is defined as:
$$J = \frac{\omega^T S_b \omega}{\omega^T S_\omega \omega}$$
because $J$ is unchanged when $\omega$ is rescaled, the constraint $\omega^T S_\omega \omega = 1$ can be imposed, and maximizing $J$ then reduces to maximizing $\omega^T S_b \omega$;
(3) Then, the projection line $\omega$ is solved:
the objective function is optimized by using a Lagrange function: from $L(\omega) = \omega^T S_b \omega - \lambda(\omega^T S_\omega \omega - 1)$, where $\lambda$ is a parameter (the Lagrange multiplier), it follows that $\omega = S_\omega^{-1}(\mu_0 - \mu_1)$;
(4) Finally, the projected data points $Y = \omega^T X$ are obtained, where $Y$ represents the set of characteristic values of each file.
9. The blockchain-based computer file storage method of claim 8, wherein in step S3: all files are classified according to the file characteristic values, files of the same class that are gathered together are grouped into one class, and the type of the class and the department to which it belongs are then confirmed; and the file characteristic value and the file access link corresponding to the file characteristic value are mapped to the data memory address by using the separate chaining (separation linking) method.
10. The blockchain-based computer file storage method of claim 6, wherein in step S5: the identity information of company personnel is identified; after the verification passes, the files the user is likely to extract are predicted from the identity information of the company personnel, including the name, the department, and the extraction records, so that the range of files is narrowed; the system identifies the keyword input by the user, finds the corresponding file characteristic value, and presents the file access link, and the information content of the file is displayed when the user clicks the link.
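As an illustration of the two-class LDA projection recited in claim 8, the following is a minimal sketch, assuming two-dimensional feature vectors so that the within-class scatter matrix can be inverted by hand. The function names and the sample points are hypothetical; the direction is computed as $\omega = S_\omega^{-1}(\mu_0 - \mu_1)$.

```python
def mean(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(2)]


def scatter(points, mu):
    # Sum of (x - mu)(x - mu)^T over one class, as a 2x2 matrix.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def lda_direction(class0, class1):
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    # Within-class scatter: S_w = Sigma_0 + Sigma_1.
    sw = [[s0[r][c] + s1[r][c] for c in range(2)] for r in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu0[0] - mu1[0], mu0[1] - mu1[1]]
    # omega = S_w^{-1} (mu_0 - mu_1)
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(omega, points):
    # One-dimensional projection omega^T x for each point.
    return [omega[0] * p[0] + omega[1] * p[1] for p in points]
```

Projecting both classes with `project` places them in disjoint intervals on the line when the classes are linearly separable, which is what allows files of the same class to be grouped by their projected values.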
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211127852.8A CN115438015A (en) 2022-09-16 2022-09-16 Computer file storage system and method based on block chain

Publications (1)

Publication Number Publication Date
CN115438015A true CN115438015A (en) 2022-12-06

Family

ID=84249733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211127852.8A Pending CN115438015A (en) 2022-09-16 2022-09-16 Computer file storage system and method based on block chain

Country Status (1)

Country Link
CN (1) CN115438015A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115952140A (en) * 2023-01-09 2023-04-11 弘泰信息技术(天津)有限公司 Computer resource management system and method based on big data
CN115952140B (en) * 2023-01-09 2023-10-27 华苏数联科技有限公司 Big data-based computer resource management system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination