CN110109888B - File processing method and device - Google Patents

Info

Publication number
CN110109888B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910270167.2A
Other languages
Chinese (zh)
Other versions
CN110109888A (en
Inventor
刘新
梁欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Launch Technology Co Ltd
Original Assignee
Shenzhen Launch Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a file processing method and device. The method is applied to a blockchain node device and comprises the following steps: receiving an audit request from a user, wherein the audit request requests an audit of a first file and comprises a user identity; after the audit request passes verification, judging whether the first file meets an uploading standard; if the uploading standard is met, registering the first file in a blockchain and broadcasting it across the whole network; and if not, warning the user. A device corresponding to the method is also disclosed. Implementing the scheme can effectively curb manuscript washing and maintain order on self-media platforms.

Description

File processing method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a file processing method and a related device.
Background
With the rapid development of the self-media industry, the value of high-quality original content keeps rising. However, "manuscript washing" has long plagued original authors: a large number of unscrupulous self-media operators alter and trim other people's original content so that it looks completely different, while the most valuable parts are in fact plagiarized from that original content. How to curb the trend of manuscript washing is a problem to be solved at present.
Currently, self-media platforms are disorderly, and there is no dedicated supervision and penalty mechanism for manuscript washing. Even if sufficient evidence is provided, a "washed" article is not necessarily judged to be infringing, and the infringed original author finds it hard to defend their rights.
Disclosure of Invention
The embodiment of the application provides a file processing method and device, which can effectively inhibit manuscript washing behaviors and maintain the order of a self-media platform.
In a first aspect, an embodiment of the present application provides a file processing method, applied to a blockchain node device, including: receiving an audit request of a user, wherein the audit request is used for requesting an audit of a first file and comprises a user identity; after the audit request passes verification, judging whether the first file meets an uploading standard; if the uploading standard is met, registering the first file in a blockchain and broadcasting it across the whole network; and if not, warning the user.
In another possible implementation manner, before the determining whether the first file meets the uploading standard, the method further includes: obtaining a file type corresponding to the first file, wherein the file type comprises one or more of the following: factual messages, newspapers, and news works; and acquiring the uploading standard corresponding to the first file according to the file type.
In yet another possible implementation manner, the determining whether the first file meets an upload criterion includes: word segmentation is carried out on the first file to obtain first information, wherein the first information comprises characteristic words of the first file; searching a second file with similarity exceeding a threshold value with the first file according to the first information; comparing the first file and the second file to generate an audit result, wherein the audit result comprises one or more of the following: paragraph similarity results, role similarity results and content similarity results; and judging whether the first file meets uploading standards according to the auditing result.
In yet another possible implementation, comparing the first file and the second file to generate an audit result includes one or more of the following: comparing the paragraph structures of the first file and the second file to generate a paragraph similarity result, wherein the paragraph structures include parallel, general-specific, contrastive, and progressive structures; comparing the role relations of the first file and the second file to generate a role similarity result, wherein a role relation includes the number of characters, the gender of the characters, and the actions of the characters; and comparing the contents of the first file and the second file to generate a content similarity result, wherein the content includes the number of synonyms and the number of similar sentences.
In yet another possible implementation, the method further includes: and filtering the content belonging to the fact message in the first file to obtain a filtered first file.
In yet another possible implementation, the comparing the first file and the second file generates an audit result, including: and comparing the filtered first file with the second file to generate an auditing result.
In yet another possible implementation, after the alerting the user, the method further includes: and storing the violation records of the user according to the user identity.
In a second aspect, an embodiment of the present application provides a blockchain node device, including:
the receiving unit is used for receiving an audit request of a user, wherein the audit request is used for requesting an audit of the first file and comprises a user identity; the judging unit is used for judging whether the first file meets an uploading standard after the audit request passes verification; the registration unit is used for registering the first file in a blockchain and broadcasting it across the whole network if the uploading standard is met; and the warning unit is used for warning the user if the uploading standard is not met.
In one possible implementation, the apparatus further includes: the acquiring unit is configured to acquire a file type corresponding to the first file, where the file type includes one or more of the following: factual messages, newspapers, and news works; the obtaining unit is further configured to obtain, according to the file type, an upload standard corresponding to the first file.
In another possible implementation manner, the determining unit includes: the word segmentation subunit is used for segmenting the first file to obtain first information, wherein the first information comprises characteristic words of the first file; the searching subunit is used for searching a second file with similarity exceeding a threshold value with the first file according to the first information; the comparison subunit is used for comparing the first file and the second file to generate an audit result, and the audit result comprises one or more of the following: paragraph similarity results, role similarity results and content similarity results; and the judging subunit is used for judging whether the first file accords with uploading standards according to the auditing result.
In yet another possible implementation manner, the comparing subunit is specifically configured to: compare the paragraph structures of the first file and the second file to generate a paragraph similarity result, wherein the paragraph structures include parallel, general-specific, contrastive, and progressive structures; compare the role relations of the first file and the second file to generate a role similarity result, wherein a role relation includes the number of characters, the gender of the characters, and the actions of the characters; and compare the contents of the first file and the second file to generate a content similarity result, wherein the content includes the number of synonyms and the number of similar sentences.
In yet another possible implementation manner, the apparatus further includes: the filtering unit is used for filtering the content belonging to the fact message in the first file and obtaining the filtered first file.
In yet another possible implementation manner, the comparing subunit is further configured to compare the filtered first file and the second file to generate an audit result.
In yet another possible implementation manner, the apparatus further includes: and the storage unit is used for storing the violation records of the user according to the user identity.
In a third aspect, embodiments of the present application provide a blockchain node device, including: a processor, an input device, an output device, and a memory, wherein the memory is used for storing a computer program that supports a server in executing the method, the computer program comprises program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having instructions stored therein, which when run on a computer, cause the computer to perform the method of the above aspects.
In a fifth aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.
The embodiment of the application has the following beneficial effects:
by comparing the uploaded file with the files on the blockchain network, manuscript washing can be effectively curbed and order on the self-media platform maintained.
Drawings
FIG. 1 is a schematic flowchart of a file processing method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of another file processing method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a blockchain node device according to an embodiment of the present application;
FIG. 4 is a schematic hardware structure diagram of a blockchain node device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be understood that in the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system architectures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
At present, "manuscript washing" continues to plague original authors: a large number of unscrupulous self-media operators alter and trim other people's original content so that it looks completely different, while the most valuable parts are in fact plagiarized from that content, and there is no dedicated supervision and penalty mechanism for such behavior. To address these problems, embodiments of the present application provide a file processing method and device that can effectively curb manuscript washing and maintain order on self-media platforms.
Referring to FIG. 1, FIG. 1 is a flowchart of a file processing method according to an embodiment of the present application, which is applied to a blockchain node device. Wherein:
S101, receiving an audit request of a user, wherein the audit request is used for requesting an audit of the first file and comprises a user identity.
In this embodiment of the present application, the blockchain node device is an electronic device capable of connecting to the internet, including, but not limited to, a server, a portable tablet computer, a notebook computer, a desktop computer, a smart phone, a vehicle-mounted terminal, an OBD device, a wearable bracelet, a wearable watch, and an earphone. It can be appreciated that in the embodiments of the present application, the blockchain node device is not specifically limited.
In one possible implementation, the identity information of the user needs to be verified before the user uploads the first file. The device looks up whether the user is registered according to the user identity; if the user is not registered, it generates prompt information prompting the user to register.
In another possible implementation manner, the device looks up whether the user has violation records according to the user identity; if the number of the user's violation records exceeds a preset threshold, the user loses the qualification to upload files.
For example, a user who has violation records but has not exceeded the preset threshold is a low-credit user, who needs a certain number of high-credit users to vouch by signature before uploading a file. Further, when a file uploaded by the low-credit user does not meet the uploading standard, credit is correspondingly deducted from the high-credit users who vouched.
For another example, the user's credit level is taken as an indication of the quality of the files the user uploads: for high-credit users, the audit strictness can be reduced accordingly; for low-credit users with violation records, the audit strictness can be increased accordingly.
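The checks described for S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `User` type, the function name, and the numeric values of `VIOLATION_LIMIT` and `VOUCHES_REQUIRED` are hypothetical, since the text only speaks of "a preset threshold" and "a certain number" of vouching users.

```python
from dataclasses import dataclass

# Hypothetical values; the source does not specify them.
VIOLATION_LIMIT = 3    # violation records allowed before uploads are blocked
VOUCHES_REQUIRED = 2   # signature vouches a low-credit user must collect


@dataclass
class User:
    user_id: str
    registered: bool = True
    violations: int = 0


def check_upload_eligibility(user: User, vouchers: int = 0) -> str:
    """Return 'register', 'blocked', 'needs_vouchers' or 'allowed'."""
    if not user.registered:
        return "register"        # prompt the user to register first
    if user.violations > VIOLATION_LIMIT:
        return "blocked"         # upload qualification is lost
    if user.violations > 0 and vouchers < VOUCHES_REQUIRED:
        return "needs_vouchers"  # low-credit user lacks enough vouches
    return "allowed"
```

For instance, under these assumed values a registered user with one violation record and two vouches would be allowed to proceed to the audit in S102.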
S102, after the audit request passes verification, judging whether the first file meets the uploading standard.
After determining that the user may upload files, the device compares the specific content of the file uploaded by the user with the news content in the blockchain network to determine whether the file can be stored in the blockchain.
Specifically, the first file is segmented to obtain first information, wherein the first information comprises characteristic words of the first file; searching a second file with similarity exceeding a threshold value with the first file according to the first information; comparing the first file with the second file to generate an audit result, wherein the audit result comprises one or more of the following: paragraph similarity results, role similarity results and content similarity results; and judging whether the first file meets the uploading standard according to the auditing result.
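The text does not say how the three similarity results are combined into a decision. One possible reading, shown purely as an assumption, is that the file fails when any single result reaches a limit; the `AuditResult` type, the 0-to-1 score scale, and the `limit` value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    # Scores in [0, 1]; the scale is an assumption, not stated in the source.
    paragraph: float = 0.0
    role: float = 0.0
    content: float = 0.0


def meets_upload_standard(result: AuditResult, limit: float = 0.8) -> bool:
    """Pass only if every similarity score stays below the (assumed) limit."""
    return max(result.paragraph, result.role, result.content) < limit
```

A weighted sum of the three scores would be an equally plausible rule; the choice depends on how strictly the platform wants to treat a single strong match.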
In one possible implementation manner, the range of the first information is extended according to the credit level of the user.
For example, when the user is a high-credit user, the first information used for searching similar files is a plurality of core feature words of the first file, the core feature words being words that express the central idea of the first file, and the search range for the second file is limited to files whose topic is identical or highly similar to that of the first file.
For another example, when the user is a low-credit user, the first information used for searching similar files is a plurality of feature words of the first file, the feature words being words that occur with higher frequency in the first file, and the search range for the second file covers any file whose topic bears some similarity to that of the first file.
In another possible implementation manner, the blockchain node device derives a judgment result for the first file from the audit result, and the judgment result is regarded as credible only after it has been authenticated by an authorization node. The authorization node may be a plurality of blockchain node devices in the blockchain network that represent different authorities or trusted institutions, or may be blockchain node devices trusted by such authorities.
S103, if the uploading standard is met, registering the first file in the blockchain and broadcasting it across the whole network.
Specifically, when the blockchain node device determines that the first file meets the uploading standard, the first file is not a washed manuscript; the first file may be registered in a block and broadcast across the whole network, and the block is added to the blockchain as a new block after passing the consensus mechanism. It can be understood that a block may include information on at least one file uploaded by users.
In one possible implementation, when the blockchain node device registers the first file, the identifier of the blockchain node device is registered in the block together with the first file. Further, when the blockchain node device determines that the first file meets the uploading standard, it packages the first file according to a preset rule to generate a verification request, which asks the uploader of the second file to further confirm the originality of the first file. After confirming the first file, the uploader of the second file sends confirmation information to the blockchain node device. On receiving the confirmation information, the blockchain node device decides to register the first file in a block.
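The registration step can be pictured with a toy hash-linked chain. This sketch assumes SHA-256 over a JSON encoding and omits the consensus mechanism, the network broadcast, and the confirmation by the second file's uploader; the field names and both functions are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def make_block(prev_hash: str, node_id: str, files: list[dict]) -> dict:
    """Bundle file records with the registering node's identifier."""
    body = {"prev_hash": prev_hash, "node_id": node_id, "files": files}
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return {**body, "hash": hashlib.sha256(encoded).hexdigest()}


def append_block(chain: list[dict], block: dict) -> bool:
    """Append the block only if it links to the current chain tip."""
    expected_prev = chain[-1]["hash"] if chain else GENESIS_HASH
    if block["prev_hash"] != expected_prev:
        return False
    chain.append(block)
    return True
```

In a real network each node would run `append_block` only after the block had passed consensus; here the link check stands in for that step.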
S104, if the uploading standard is not met, warning the user.
If the blockchain node device determines that the first file does not meet the uploading standard, the uploader of the first file can be considered to have washed a manuscript, and the blockchain node device sends a warning to the user.
In one possible implementation, if the number of warnings received by the user exceeds a preset threshold, the user is considered to have seriously infringed the copyright of others. The blockchain node device may send the user's warning records to the user whose copyright was infringed, and that user may send an infringement compensation contract to the blockchain node device.
According to the file processing method provided by the embodiment of the application, the uploaded file is compared with the file in the blockchain network, and an alarm is sent to a user related to manuscript washing. By implementing the scheme, the manuscript washing behavior can be effectively restrained, and the order of the self-media platform is maintained.
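Steps S101 to S104 can be summarized as a small control-flow sketch. The function name and the four callables passed in are hypothetical placeholders for the verification, auditing, registration, and warning logic described above.

```python
from typing import Callable


def process_audit_request(
    user_id: str,
    file_text: str,
    verify_user: Callable[[str], bool],
    meets_standard: Callable[[str], bool],
    register: Callable[[str, str], None],
    warn: Callable[[str], None],
) -> str:
    """Return 'rejected', 'registered' or 'warned'."""
    if not verify_user(user_id):      # S101: request fails verification
        return "rejected"
    if meets_standard(file_text):     # S102: audit against the uploading standard
        register(user_id, file_text)  # S103: register in a block and broadcast
        return "registered"
    warn(user_id)                     # S104: warn the uploader
    return "warned"
```

Passing the steps in as callables keeps the flow independent of how each step is actually implemented on a given node.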
Referring to FIG. 2, FIG. 2 is a flowchart of another file processing method according to an embodiment of the present application. Wherein:
S201, receiving an audit request of a user, wherein the audit request is used for requesting an audit of the first file and comprises a user identity.
The specific implementation of this step may refer to step S101 in the foregoing embodiment of fig. 1, and will not be described herein.
S202, according to the file type corresponding to the first file, obtaining an uploading standard corresponding to the first file.
Specifically, a file type corresponding to the first file is obtained, where the file type includes one or more of the following: factual messages, newspapers, and news works. According to different file types of the first file, corresponding uploading standards are also different.
In one possible implementation, the file type of the first file is a factual message. Because a factual message reports a piece of news as the most concise factual description, the uploading standard corresponding to a factual message is that at least one item of the reported content (time, place, person, event) differs from the files already in the blockchain network, or that the file type differs from that of the files already in the blockchain network.
In another possible implementation manner, the file type of the first file is a colleague article. Because a colleague article includes the thinking, comments, and opinions of the author, the uploading standard for this type requires that the file reflect the user's own selection of and reflection on the relevant facts and issues, and that the chosen structure and vocabulary reflect the user's own conception and expression. Files that largely coincide with existing files in the expression of ideas and in vocabulary do not meet the uploading standard for this type.
In still another possible implementation manner, the file type of the first file is a news work, where news works include a plurality of types such as news charts and picture news, and the uploading standard changes according to the specific type of the news work.
For example, when the first file is picture news, the uploading standard is that the pictures in the file must not be the same as pictures already in the blockchain network, or that reprint permission has been obtained for the pictures in the file. It should be understood that the foregoing examples are merely illustrative, and the present embodiment does not specifically limit the uploading standard.
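The per-type standards can be sketched as a lookup table of checks. Only the two standards that the text states concretely are shown; the dictionary keys, function names, and the use of picture hashes to test whether two pictures are "the same" are assumptions made for illustration.

```python
FACT_FIELDS = ("time", "place", "person", "event")


def fact_message_ok(new: dict, existing: list[dict]) -> bool:
    """Pass if the message differs from every stored one in at least one field."""
    return all(
        any(new.get(f) != old.get(f) for f in FACT_FIELDS) for old in existing
    )


def picture_news_ok(pictures: set[str], known: set[str], licensed: set[str]) -> bool:
    """Pass if each picture hash is new to the network or has reprint permission."""
    return all(h not in known or h in licensed for h in pictures)


# Hypothetical mapping from file type to its uploading-standard check.
UPLOAD_STANDARDS = {
    "fact_message": fact_message_ok,
    "picture_news": picture_news_ok,
}
```

A node would look up the check by file type in S202 and apply it when judging the file; further file types would add further entries to the table.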
S203, word segmentation is carried out on the first file, and first information is obtained.
Specifically, the first information includes feature words of the first file. A word segmentation tool segments the first file into a plurality of words. The tool uses an understanding-based word segmentation method, in which the computer simulates human understanding of a sentence in order to recognize words. The basic idea is to perform syntactic and semantic analysis while segmenting, and to use the syntactic and semantic information to resolve ambiguity. The segmentation results are then weighted by jointly considering word frequency, collection frequency, and word length, and the words with higher weights are taken as the first information representing the first file. It should be understood that the word segmentation tool may be any mature word segmentation product on the market, which is not limited herein.
Optionally, before the segmentation results are weighted, stop words and low-frequency words are filtered out, and interference words unrelated to the text content, such as prepositions and conjunctions, are deleted from the first file.
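The filtering and weighting in S203 can be sketched as below. Several simplifications are assumptions: a regular expression over lowercase English letters stands in for the understanding-based segmentation tool, the weight is simply frequency multiplied by word length, collection frequency is left out, and the stop-word list is illustrative only.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "or", "in", "on", "to", "is", "with"}


def feature_words(text: str, top_k: int = 5, min_freq: int = 2) -> list[str]:
    """Return up to top_k words ranked by frequency * length (assumed weight)."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]
    counts = Counter(tokens)
    # Drop low-frequency words, then weight the remainder.
    scored = {w: c * len(w) for w, c in counts.items() if c >= min_freq}
    # Sort by descending weight; ties are broken alphabetically.
    return sorted(scored, key=lambda w: (-scored[w], w))[:top_k]
```

For Chinese text the regular expression would have to be replaced by a proper segmentation tool, since words are not separated by spaces.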
S204, searching a second file with similarity exceeding a threshold value with the first file according to the first information.
In one possible implementation, for each file in the blockchain network, a set of feature words of the same size as the first information is selected, and its degree of overlap with the first information is calculated using a vector space model. One or more blockchain network files with a high degree of overlap are taken as the second file.
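A vector space comparison of feature-word sets is commonly done with cosine similarity over term counts. The sketch below assumes that measure; the function names, the dictionary of candidate files, and the threshold value are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(words_a: list[str], words_b: list[str]) -> float:
    """Cosine of the angle between two term-count vectors."""
    va, vb = Counter(words_a), Counter(words_b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_similar(
    first_words: list[str], candidates: dict[str, list[str]], threshold: float = 0.5
) -> list[tuple[str, float]]:
    """Return (file_id, score) pairs above the threshold, best match first."""
    scored = [(fid, cosine_similarity(first_words, ws)) for fid, ws in candidates.items()]
    return sorted((p for p in scored if p[1] > threshold), key=lambda p: -p[1])
```

The files returned by `find_similar` would play the role of the second file in the comparison of S205.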
In another possible implementation, the files in the blockchain network may be categorized using a trained neural network model, with the first file mapped into a corresponding category. And selecting the files with the same category as the first file, and calculating the text similarity with the first file.
It can be understood that the neural network may be AlexNet, a deep convolutional neural network from the Visual Geometry Group (VGG), a residual network (ResNet), a Single Shot MultiBox Detector (SSD), a You Only Look Once (YOLO) object detection network, a recurrent neural network (RNN), or the like. The embodiments of the present application are not limited to a particular neural network.
S205, comparing the first file with the second file to generate an auditing result.
Specifically, the audit result includes one or more of the following: a paragraph similarity result, a role similarity result, and a content similarity result. The paragraph structures of the first file and the second file are compared to generate the paragraph similarity result, wherein the paragraph structures include parallel, general-specific, contrastive, and progressive structures; the role relations of the first file and the second file are compared to generate the role similarity result, wherein a role relation includes the number of characters, the gender of the characters, and the actions of the characters; and the contents of the first file and the second file are compared to generate the content similarity result, wherein the content includes the number of synonyms and the number of similar sentences.
In one possible implementation manner, article summaries of the first file and the second file are obtained respectively, and according to the distribution condition of the content in the article summaries, the paragraph similarity results of the first file and the second file can be judged.
Specifically, an original word vector formed by words in the first information is used as input, and a neural network is adopted for processing, so that word vectors with context background information of corresponding words are obtained. Further, word vectors with context background information are converted into sentence vectors, the sentence vectors are used as input, and a neural network is adopted for processing, so that sentence vectors with context background information of corresponding sentences are obtained. And converting the sentence vector into a paragraph vector, and processing the input paragraph vector by adopting a neural network to obtain the paragraph vector with the context background information of the corresponding paragraph. And inputting the word vector, the sentence vector and the paragraph vector into a neural network model to obtain the article abstract of the first file. Similarly, an article abstract of the second file is obtained, and paragraph structures of the first file and the second file can be judged according to the distribution condition of the content of the article abstract.
For example, the first file and the second file both have a general-specific structure, but the first file gives the general statement first and then the specifics, while the second file gives the specifics first and then the general statement. The general part of the first file is compared with the general part of the second file, the specific part of the first file is compared with the specific part of the second file, and the two comparison results are combined to obtain the similarity of the first file and the second file in terms of paragraph structure.
In another possible implementation manner, the role relationships of the first file and the second file are respectively acquired, and the role similarity results of the first file and the second file can be obtained through comparison.
Specifically, when two files cite the same character event as an example, the arguments drawn from that example indicate the idea and gist of the files, so the similarity of the roles chosen by the files can further help judge the similarity of the two files.
For example, if the first file and the second file both cite the story of Kong Rong giving up the larger pears, the two files are highly similar in the number of characters, the gender of the characters, the actions of the characters, and so on, and their similarity can be further judged from the arguments drawn from that story.
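A role comparison over the three listed attributes can be sketched as follows. The role dictionaries, the use of Jaccard overlap, and the equal weighting of the three components are assumptions; extracting roles from text is outside the scope of this sketch.

```python
def jaccard(a: set, b: set) -> float:
    """Set overlap; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def role_similarity(roles_a: list[dict], roles_b: list[dict]) -> float:
    """Each role is a dict such as {"gender": "male", "actions": {"yield pear"}}."""
    larger = max(len(roles_a), len(roles_b))
    count_sim = min(len(roles_a), len(roles_b)) / larger if larger else 1.0
    gender_sim = jaccard({r["gender"] for r in roles_a}, {r["gender"] for r in roles_b})
    actions_a = set().union(*(r["actions"] for r in roles_a))
    actions_b = set().union(*(r["actions"] for r in roles_b))
    # Equal weighting of the three components is an assumption.
    return (count_sim + gender_sim + jaccard(actions_a, actions_b)) / 3
```

Two files that both retell the same anecdote with the same cast would score close to 1 under this measure, which matches the intuition in the example above.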
In another possible implementation manner, the number of synonyms and the number of similar sentences in the first file and the second file are compared, and the comparison results form the content similarity result.
Specifically, the synonyms that appear can be marked in the first file to generate a marked file, and a synonym similarity result is obtained by weighting according to the order in which the synonyms occur. Meanwhile, the number of similar sentences is counted, and the proportion of similar sentences in the content of the first file is calculated.
Further, if changing the word order of a sentence in the first file yields a sentence in the second file, that sentence is counted toward the number of similar sentences.
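The reordered-sentence rule can be sketched by comparing sentences as sorted word lists, so that a change of word order alone does not hide a match. The sketch assumes space-separated words and simple punctuation-based sentence splitting, and it leaves out the synonym marking; all names are hypothetical.

```python
import re


def sentence_key(sentence: str) -> tuple[str, ...]:
    """Order-insensitive fingerprint: the sentence's words, lowercased and sorted."""
    return tuple(sorted(re.findall(r"\w+", sentence.lower())))


def split_sentences(text: str) -> list[str]:
    """Split on common Western and Chinese sentence-ending punctuation."""
    return [s for s in re.split(r"[.!?。！？]+", text) if s.strip()]


def similar_sentence_ratio(first_text: str, second_text: str) -> float:
    """Share of sentences in the first file that match one in the second file."""
    second_keys = {sentence_key(s) for s in split_sentences(second_text)}
    first = split_sentences(first_text)
    hits = sum(1 for s in first if sentence_key(s) in second_keys)
    return hits / len(first) if first else 0.0
```

Extending `sentence_key` to map each word to a canonical synonym before sorting would be one way to fold the synonym comparison into the same measure.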
In yet another possible implementation manner, the content of the fact message in the first file is filtered, and the filtered first file is obtained. And comparing the filtered first file with the second file to generate an auditing result.
Specifically, the text in the first file that describes the course, time, place, and persons of an event is deleted to obtain the filtered first file. The filtered first file is then compared with the second file in terms of paragraph structure, role relation, and article content.
S206, judging whether the first file meets uploading standards according to the auditing result.
The specific implementation of this step may refer to step S102 in the above embodiment of fig. 1, which is not described herein.
S207, if the uploading standard is met, registering the first file in a blockchain and broadcasting it across the whole network.
The specific implementation of this step may refer to step S103 in the above embodiment of fig. 1, which is not described herein.
S208, if the uploading standard is not met, warning the user.
The specific implementation of this step may refer to step S104 in the above embodiment of fig. 1, and will not be described herein.
S209, storing the violation records of the user according to the user identity.
Specifically, the violation record includes the second files used for comparison and the audit result of the first file.
In one possible implementation, the user's credit is deducted according to the number of the user's violation records. Further, if the number of the user's violation records exceeds a preset threshold, the blockchain node device corresponding to the user is prohibited from uploading files.
In another possible implementation, the violation records are classified into two types, executed and unexecuted, and only unexecuted violation records are counted in the number of violation records. If the blockchain node device suspected of manuscript washing executes the infringement compensation contract, the corresponding violation record may, upon a confirmation message from the user whose copyright was infringed, be excluded from the number of violation records.
According to the file processing method provided by the embodiment of the application, the uploaded file is compared with the files in the blockchain network, and users whose files do not meet the uploading standard are warned and their violation records are stored. Implementing this scheme makes file processing transparent, effectively curbs manuscript washing, and maintains order on self-media platforms.
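The bookkeeping for violation records can be sketched as follows. The class names, the starting credit of 100, the fixed penalty of 10 per record, and the default threshold of 3 are all illustrative assumptions; the text only states that credit is deducted according to the number of records and that uploading is prohibited once a preset threshold is exceeded.

```python
from dataclasses import dataclass, field


@dataclass
class ViolationRecord:
    first_file_id: str
    second_file_ids: list      # files the first file was compared against
    audit_result: dict
    executed: bool = False     # True once compensation has been confirmed


@dataclass
class UserAccount:
    user_id: str
    credit: int = 100          # assumed starting credit
    records: list = field(default_factory=list)

    def add_violation(self, record, penalty=10):
        """Store the record and deduct an assumed fixed penalty."""
        self.records.append(record)
        self.credit -= penalty

    def counted_violations(self):
        """Only unexecuted records count toward the violation total."""
        return sum(1 for r in self.records if not r.executed)

    def may_upload(self, threshold=3):
        """Uploading is prohibited once the count exceeds the threshold."""
        return self.counted_violations() <= threshold
```

Marking a record as executed after the infringed user confirms compensation removes it from the count without deleting the record itself.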
Referring to fig. 3, fig. 3 is a schematic structural diagram of a blockchain node device according to an embodiment of the present application. The blockchain node device includes: receiving unit 301, judging unit 302, registering unit 303, and warning unit 304; optionally, the blockchain node device further includes: an acquisition unit 305; optionally, the blockchain node device further includes: a filtering unit 306; optionally, the blockchain node device further includes: a storage unit 307. Wherein:
the receiving unit 301 is configured to receive an audit request of a user, where the audit request is used to request audit of the first file, and the audit request includes a user identity;
a judging unit 302, configured to judge whether the first file meets an upload criterion after the verification of the audit request is passed;
a registration unit 303, configured to register the first file in a blockchain for performing a whole network broadcast if the uploading standard is met;
and the warning unit 304 is configured to warn the user if the uploading standard is not met.
In one possible implementation manner, the blockchain node device further includes: an obtaining unit 305, configured to obtain a file type corresponding to the first file, where the file type includes one or more of the following: factual messages, newspapers, and news works; the obtaining unit 305 is further configured to obtain, according to the file type, an upload standard corresponding to the first file.
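One way to picture the lookup performed by the obtaining unit is a table keyed by file type. The sketch below assumes, purely for illustration, that an uploading standard is a maximum permitted similarity score; the text does not define the form of the standard, and the numeric values and names here are invented placeholders.

```python
# Hypothetical standards: the maximum similarity to existing files that is
# still acceptable for each file type. The values are placeholders.
UPLOAD_STANDARDS = {
    "factual_message": 0.9,
    "newspaper": 0.6,
    "news_work": 0.4,
}


def upload_standard_for(file_types):
    """Return the strictest (lowest) limit among the file's types."""
    return min(UPLOAD_STANDARDS[t] for t in file_types)
```

Because a file may carry more than one type, the sketch picks the strictest limit; another policy could be substituted without changing the lookup itself.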
In another possible implementation manner, the judging unit 302 includes:
a word segmentation subunit 3021, configured to segment the first file to obtain first information, where the first information includes feature words of the first file;
a searching subunit 3022, configured to search, according to the first information, a second file having a similarity with the first file that exceeds a threshold;
a comparing subunit 3023, configured to compare the first file and the second file, and generate an audit result, where the audit result includes one or more of the following: paragraph similarity results, role similarity results and content similarity results;
and the judging subunit 3024 is configured to judge whether the first file meets an upload standard according to the auditing result.
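The four subunits form a small pipeline, which the following sketch imitates. It swaps in deliberately simple techniques that the text does not prescribe: feature words are a set of lowercase words, similarity is the Jaccard index of two word sets, and the file is judged to fail whenever any stored file exceeds the threshold. All function names and the in-memory dictionary standing in for the blockchain are assumptions.

```python
import re


def segment(text):
    """Stand-in for word segmentation: the set of lowercase words."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard index of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def search_similar(first_words, stored_files, threshold):
    """Ids of stored files whose similarity to the first file exceeds the threshold."""
    return [fid for fid, text in stored_files.items()
            if jaccard(first_words, segment(text)) > threshold]


def audit(first_text, stored_files, threshold=0.5):
    """Segment, search, compare, and judge; returns (meets_standard, result)."""
    first_words = segment(first_text)
    matches = search_similar(first_words, stored_files, threshold)
    result = {fid: {"content_similarity": jaccard(first_words, segment(stored_files[fid]))}
              for fid in matches}
    meets_standard = not matches   # assumed rule: any match fails the audit
    return meets_standard, result
```

A fuller version would add the paragraph and role comparisons to `result` and apply the uploading standard for the file's type instead of a single threshold.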
In yet another possible implementation manner, the comparing subunit 3023 is specifically configured to: compare the paragraph structures of the first file and the second file to generate a paragraph similarity result, where the paragraph structures include a parallel type, a total division type, a comparison type, and a progressive type; compare the role relations of the first file and the second file to generate a role similarity result, where the role relation includes the number of characters, the gender of the characters, and the actions of the characters; and compare the contents of the first file and the second file to generate a content similarity result, where the content includes the number of near-synonyms and the number of similar sentences.
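The role comparison can be sketched by describing each file with a small profile and scoring three aspects: the number of characters, their genders, and their actions. The `RoleProfile` type, the equal weighting of the three aspects, and the use of set overlap for actions are assumptions; the text names the aspects but not how they are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleProfile:
    genders: tuple        # one entry per character, e.g. ("male", "male")
    actions: frozenset    # actions attributed to the characters


def role_similarity(a, b):
    """Average of three checks: same character count, same gender mix,
    and overlap of actions (equal weights are an assumption)."""
    same_count = 1.0 if len(a.genders) == len(b.genders) else 0.0
    same_gender = 1.0 if sorted(a.genders) == sorted(b.genders) else 0.0
    union = a.actions | b.actions
    action_overlap = len(a.actions & b.actions) / len(union) if union else 1.0
    return (same_count + same_gender + action_overlap) / 3
```

Extracting the profile from raw text is the hard part and is outside the scope of this sketch.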
In yet another possible implementation manner, the blockchain node device further includes: and a filtering unit 306, configured to filter the content belonging to the fact message in the first file, and obtain a filtered first file.
In yet another possible implementation manner, the comparing subunit 3023 is further configured to compare the filtered first file with the second file to generate an audit result.
In yet another possible implementation manner, the blockchain node device further includes: and the storage unit 307 is configured to store the violation record of the user according to the user identity.
The more detailed descriptions of the receiving unit 301, the judging unit 302, the registering unit 303, the warning unit 304, the obtaining unit 305, the filtering unit 306 and the storing unit 307 may be directly obtained by referring to the related descriptions of the file processing method in the method embodiment shown in fig. 1 or fig. 2, which are not described herein.
According to the blockchain node device provided by the embodiment of the application, the uploaded file is compared with the files in the blockchain network, and users whose files do not meet the uploading standard are warned and their violation records are stored. Implementing this scheme makes file processing transparent, effectively curbs manuscript washing, and maintains order on self-media platforms.
Referring to fig. 4, fig. 4 is a schematic diagram of the hardware structure of a blockchain node device according to an embodiment of the present application. The device includes a processor 401 and may further include an input device 402, an output device 403, and a memory 404. The input device 402, the output device 403, the memory 404 and the processor 401 are connected to each other via a bus.
The memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used to store the associated instructions and data.
The input device is used to input data and/or signals, and the output device is used to output data and/or signals. The output device and the input device may be separate devices or an integrated device.
The processor may include one or more processors, for example one or more central processing units (CPUs). Where the processor is a single CPU, that CPU may be a single-core CPU or a multi-core CPU.
The memory is used to store program code and data for the blockchain node device.
The processor is configured to call the program code and data in the memory and perform the following steps: controlling the input device to receive an audit request of a user, where the audit request is used to request an audit of a first file and includes a user identity; after the audit request passes verification, judging whether the first file meets the uploading standard; if the uploading standard is met, registering the first file in the blockchain for whole-network broadcasting; and if the uploading standard is not met, controlling the output device to warn the user.
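The control flow executed by the processor can be summarized in a few lines. In this sketch the four callables (`verify`, `meets_standard`, `register`, `warn`) are hypothetical hooks standing in for request verification, the audit, blockchain registration, and the output device; the "rejected" outcome for a request that fails verification is also an assumption, since the text does not say what happens in that case.

```python
def handle_audit_request(request, verify, meets_standard, register, warn):
    """Dispatch one audit request and return the outcome as a string.

    `request` is assumed to be a dict with "user_id" and "file" keys.
    """
    if not verify(request):
        return "rejected"              # assumed handling of a failed verification
    if meets_standard(request["file"]):
        register(request["file"])      # register on chain and broadcast
        return "registered"
    warn(request["user_id"])           # warn the uploading user
    return "warned"
```

Passing the steps in as callables keeps the flow testable without a real blockchain or input and output hardware.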
In one possible implementation, before the step of determining whether the first file meets the upload criterion, the processor further performs the following steps: obtaining a file type corresponding to the first file, wherein the file type comprises one or more of the following: factual messages, newspapers, and news works; and acquiring the uploading standard corresponding to the first file according to the file type.
In another possible implementation manner, the step of determining whether the first file meets an upload criterion includes: word segmentation is carried out on the first file to obtain first information, wherein the first information comprises characteristic words of the first file; searching a second file with similarity exceeding a threshold value with the first file according to the first information; comparing the first file with the second file to generate an audit result, wherein the audit result comprises one or more of the following: paragraph similarity results, role similarity results and content similarity results; and judging whether the first file meets the uploading standard according to the auditing result.
In yet another possible implementation manner, the step, performed by the processor, of comparing the first file with the second file to generate the audit result includes one or more of the following operations: comparing the paragraph structures of the first file and the second file to generate a paragraph similarity result, where the paragraph structures include a parallel type, a total division type, a comparison type, and a progressive type; comparing the role relations of the first file and the second file to generate a role similarity result, where the role relation includes the number of characters, the gender of the characters, and the actions of the characters; and comparing the contents of the first file and the second file to generate a content similarity result, where the content includes the number of near-synonyms and the number of similar sentences.
In yet another possible implementation manner, the above processor is further configured to perform the following steps: and filtering the content belonging to the fact message in the first file to obtain a filtered first file.
In yet another possible implementation manner, the step, performed by the processor, of comparing the first file with the second file to generate an audit result includes: comparing the filtered first file with the second file to generate the audit result.
In yet another possible implementation, after the processor performs the step of alerting the user, the processor is further configured to perform the following steps: and storing the violation records of the user according to the user identity.
It will be appreciated that fig. 4 shows only a simplified design of a blockchain node device. In practical applications, the blockchain node device may also include other necessary elements, including but not limited to any number of input/output devices, processors, controllers, memories, etc., and all blockchain node devices that can implement the embodiments of the present application fall within the scope of protection of the present application.
The computer readable storage medium may be an internal storage module of the terminal device according to any one of the foregoing embodiments, for example, a hard disk or a memory of the terminal device. The computer readable storage medium may also be an external storage device of the terminal device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the terminal device. Further, the computer readable storage medium may further include both an internal storage module and an external storage device of the terminal device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the terminal device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the above-described module may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of modules is merely a division by logical function, and other divisions are possible in an actual implementation; multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and components displayed as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over multiple network modules. Some or all modules in the system can be selected according to actual needs to achieve the purpose of the scheme of the embodiment of the application.
In addition, each functional module in each embodiment of the present application may be integrated in one processing module, or each module may exist alone physically, or two or more modules may be integrated in one module. The integrated module can be realized in a hardware mode or a software function module mode.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A file processing method applied to a blockchain node device, comprising:
receiving an audit request of a user, wherein the audit request is used for requesting to audit a first file, and the audit request comprises a user identity;
after the verification of the audit request is passed, judging whether the first file meets an uploading standard, wherein comparing the content of the first file with the content in the blockchain comprises: performing word segmentation on the first file to obtain first information, wherein the first information comprises feature words of the first file; searching, according to the first information, for a second file whose similarity with the first file exceeds a threshold, wherein the range of the first information can be expanded according to the credit level of the user; comparing the first file with the second file to generate an audit result, wherein the audit result comprises one or more of the following: a paragraph similarity result, a role similarity result, and a content similarity result; and judging whether the first file meets the uploading standard according to the audit result;
if the uploading standard is met, registering the first file in a blockchain for full network broadcasting;
and if not, warning the user.
2. The method of claim 1, wherein prior to determining whether the first file meets an upload criterion, the method further comprises:
obtaining a file type corresponding to the first file, wherein the file type comprises one or more of the following: factual messages, newspapers, and news works;
and acquiring the uploading standard corresponding to the first file according to the file type.
3. The method of claim 1, wherein the comparing the first file and the second file generates an audit result comprising one or more of:
comparing the paragraph structures of the first file and the second file to generate a paragraph similarity result, wherein the paragraph structures comprise a parallel type, a total division type, a comparison type, and a progressive type;
comparing the role relations of the first file and the second file to generate a role similarity result, wherein the role relation comprises the number of characters, the gender of the characters, and the actions of the characters;
and comparing the contents of the first file and the second file to generate a content similarity result, wherein the content comprises the number of near-synonyms and the number of similar sentences.
4. A method according to claim 3, characterized in that the method further comprises:
filtering the content belonging to the fact message in the first file to obtain a filtered first file;
the comparing the first file and the second file to generate an audit result comprises:
and comparing the filtered first file with the second file to generate an auditing result.
5. The method of claim 1, wherein after alerting the user, the method further comprises:
and storing the violation records of the user according to the user identity.
6. A blockchain node device, comprising:
the receiving unit is used for receiving an auditing request of a user, wherein the auditing request is used for requesting to audit the first file, and the auditing request comprises a user identity;
the judging unit is configured to judge whether the first file meets an uploading standard after the verification of the audit request is passed, wherein comparing the content of the first file with the content in the blockchain comprises: performing word segmentation on the first file to obtain first information, wherein the first information comprises feature words of the first file; searching, according to the first information, for a second file whose similarity with the first file exceeds a threshold, wherein the range of the first information can be expanded according to the credit level of the user; comparing the first file with the second file to generate an audit result, wherein the audit result comprises one or more of the following: a paragraph similarity result, a role similarity result, and a content similarity result; and judging whether the first file meets the uploading standard according to the audit result;
the registration unit is used for registering the first file in a blockchain for full network broadcasting if the uploading standard is met;
and the warning unit is used for warning the user if the uploading standard is not met.
7. The apparatus of claim 6, wherein the apparatus further comprises:
the obtaining unit is configured to obtain a file type corresponding to the first file, where the file type includes one or more of the following: factual messages, newspapers, and news works;
the obtaining unit is further configured to obtain, according to the file type, an upload standard corresponding to the first file.
8. A blockchain node device, comprising: a processor, an input device, an output device and a memory, wherein the memory is for storing a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1 to 5.
9. A computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the method according to any one of claims 1 to 5.
CN201910270167.2A 2019-04-04 2019-04-04 File processing method and device Active CN110109888B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910270167.2A CN110109888B (en) 2019-04-04 2019-04-04 File processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910270167.2A CN110109888B (en) 2019-04-04 2019-04-04 File processing method and device

Publications (2)

Publication Number Publication Date
CN110109888A CN110109888A (en) 2019-08-09
CN110109888B true CN110109888B (en) 2023-05-05

Family

ID=67485126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910270167.2A Active CN110109888B (en) 2019-04-04 2019-04-04 File processing method and device

Country Status (1)

Country Link
CN (1) CN110109888B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674140B (en) * 2019-09-29 2022-04-15 腾讯科技(深圳)有限公司 Block chain-based content processing method, device, equipment and storage medium
CN111126962B (en) * 2019-12-24 2024-03-12 南方电网科学研究院有限责任公司 New energy grid-connected standard declaration system and method based on blockchain
CN114598693B (en) * 2020-12-07 2023-11-21 国家广播电视总局广播电视科学研究院 File content auditing method and device and electronic equipment
CN114598699B (en) * 2020-12-07 2023-07-28 国家广播电视总局广播电视科学研究院 File content auditing method and device and electronic equipment
CN114221956A (en) * 2021-11-08 2022-03-22 北京中合谷投资有限公司 Content examination method of distributed network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109150607A (en) * 2018-08-22 2019-01-04 中链科技有限公司 Classification management-control method and device for block chain network

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107086920A (en) * 2017-06-20 2017-08-22 无锡井通网络科技有限公司 Copyright based on block chain really weighs method
US10121025B1 (en) * 2018-02-22 2018-11-06 Capital One Services, Llc Content validation using blockchain
CN109002693B (en) * 2018-07-17 2021-03-26 大连理工大学 Manuscript protection method based on block chain
CN108876560B (en) * 2018-07-18 2020-10-02 阿里巴巴集团控股有限公司 Method and device for performing credit evaluation on work publisher based on block chain
CN109086459B (en) * 2018-09-17 2020-01-14 中国科学院重庆绿色智能技术研究院 News collecting, editing and releasing method based on block chain
CN109359948A (en) * 2018-10-26 2019-02-19 深圳市元征科技股份有限公司 A kind of measure of managing contract and relevant device based on block chain

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
宋俊典 ; 戴炳荣 ; 蒋丽雯 ; 赵尧 ; 李超 ; 王晓强 ; .基于区块链的数据治理协同方法.计算机应用.2018,(09),全文. *
李超 ; 戴炳荣 ; 王泓机 ; 王晓强 ; .基于区块链的数字版权保护与交易系统.现代计算机(专业版).2018,(29),全文. *

Also Published As

Publication number Publication date
CN110109888A (en) 2019-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant