CN111353031A - Thesis management method, server and system based on big data - Google Patents

Info

Publication number: CN111353031A
Authority: CN (China)
Legal status: Granted (as listed by Google Patents; an assumption, not a legal conclusion)
Application number: CN202010122369.5A
Other languages: Chinese (zh)
Other versions: CN111353031B (en)
Inventors: 林瀚, 谷俊, 薛忍霞
Current Assignee: Hainan Yizhimai Technology Co ltd
Original Assignee: Hainan Yizhimai Technology Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a thesis management method, server and system based on big data. The method comprises: receiving authentication information sent by a client, the authentication information comprising an authentication code and instruction information; sending a micro-service module to the client, which deploys the module after receiving it; receiving feature information uploaded by the micro-service module, the feature information being obtained by the micro-service module preprocessing the thesis data locally at the client, where the preprocessing extracts feature words from the thesis text; retrieving related existing documents from a database according to the feature information and sending them to the micro-service module; and the micro-service module receiving the related existing documents, calculating the similarity between the thesis text and the related existing documents to obtain a duplicate-checking result, and sending the duplicate-checking result to the client.

Description

Thesis management method, server and system based on big data
Technical Field
The invention relates to the technical field of thesis management, in particular to a thesis management method, a server and a system based on big data.
Background
A thesis is an article that investigates a problem in an academic field and describes the results of academic research; it is both a means of conducting research and a tool for presenting results and for academic exchange. At present, colleges and universities in China review students' graduation theses before graduation, and the repetition rate found by duplicate checking is one of the important factors in assessing the quality and originality of a thesis. Many journals likewise check a paper for duplication before publishing it. The common practice is for the user to send the paper over the network to a website such as CNKI or Wanfang for duplicate checking, and these websites essentially all perform plain-text duplicate checking. Because the user has to upload the paper over an external network before the check, confidentiality is poor and the user's unpublished paper is at considerable risk of being leaked.
Disclosure of Invention
The invention aims to provide a thesis management method, a server and a system based on big data, so as to solve the problems in the prior art that duplicate checking of theses offers poor confidentiality and that a user's unpublished thesis is easily leaked.
A first aspect of the invention provides a thesis management method based on big data, which comprises the following steps:
receiving authentication information sent by a client, wherein the authentication information comprises an authentication code and instruction information;
sending a micro-service module to the client, the client deploying the micro-service module after receiving it;
receiving feature information uploaded by a micro-service module, wherein the feature information is obtained by the micro-service module through local preprocessing of thesis data on a client side, and the preprocessing of the thesis data is to extract feature words from a thesis text;
retrieving related existing documents from a database according to the characteristic information, and sending the related existing documents to the micro-service module;
and the micro-service module receives the related existing documents, calculates the similarity between the thesis text and the related existing documents to obtain a duplicate checking result, and sends the duplicate checking result to the client.
Further, the sending the micro service module to the client specifically includes:
acquiring client attribute information, and extracting a corresponding micro-service module from a micro-service warehouse according to the attribute information;
registering the micro service module according to the authentication information to generate a configuration file;
and sending the registered micro service module to a client for deployment.
Further, when no related existing documents are retrieved from the database according to the feature information, the related existing documents are retrieved from a third-party server, which specifically includes:
generating connection information, sending the connection information to the client, and sending the connection information and the characteristic information to a third-party server;
the third-party server retrieves the related existing documents from the third-party database according to the characteristic information, and if the related existing documents are retrieved, the third-party server sends connection information to the client;
and the client performs matching verification on the received two pieces of connection information, if the two pieces of connection information are matched, connection is established with a third-party server, and the third-party server sends related existing documents to the client.
Further, the micro-service module receives the related existing documents, calculates the similarity between the paper text and the related existing documents to obtain the duplication checking result, and specifically includes:
sequentially receiving the related existing documents sent by the server, the server having entered the related existing documents into a transmission queue in order for sending;
creating a cache region, and storing the received related existing documents in the cache region;
and extracting related existing documents from the cache region to perform similarity calculation with the thesis text, while receiving new related existing documents and storing them in the cache region, until the transmission queue has been sent in full.
A second aspect of the present invention provides a server, comprising:
the first receiving module is used for receiving authentication information sent by a client, wherein the authentication information comprises an authentication code and instruction information;
the sending module is used for sending the micro-service module to a client, and the client deploys after receiving the micro-service module;
the second receiving module is used for receiving the feature information uploaded by the micro-service module, wherein the feature information is obtained by preprocessing the thesis data locally at the client by the micro-service module, and the preprocessing of the thesis data is to extract feature words from the thesis text;
the retrieval module is used for retrieving related existing documents from the database according to the characteristic information and sending the related existing documents to the microservice module;
the microservice module is used for receiving related existing documents, calculating the similarity between the thesis text and the related existing documents to obtain a duplicate checking result, and sending the duplicate checking result to the client.
Further, the sending module further includes:
the acquisition submodule is used for acquiring the attribute information of the client and extracting a corresponding micro-service module from the micro-service warehouse according to the attribute information;
the registration submodule is used for registering the micro-service module according to the authentication information to generate a configuration file;
and the sending submodule is used for sending the registered micro-service module to the client for deployment.
Further, the retrieval module is further configured to retrieve the relevant existing document from the third-party server when the relevant existing document is not retrieved from the database according to the feature information, and the retrieval module specifically includes a generation sub-module configured to generate connection information, send the connection information to the client, and send the connection information and the feature information to the third-party server,
the third-party server is used for retrieving related existing documents from a third-party database according to the characteristic information, and sending connection information to the client if the related existing documents are retrieved;
the client is also provided with a matching verification module which is used for matching and verifying the received two pieces of connection information, if the two pieces of connection information are matched, the connection is established with a third-party server, and the third-party server sends related existing documents to the client.
Further, the micro service module specifically further includes:
the receiving submodule is used for sequentially receiving related existing documents sent by the server, and the related existing documents are sequentially recorded into a transmission queue by the server for sending;
the creating submodule is used for creating a cache region and storing the received related existing documents into the cache region;
and the calculation sub-module is used for extracting related existing documents from the cache region to perform similarity calculation with the thesis text, while receiving new related existing documents and storing them in the cache region, until the transmission queue has been sent in full.
The third aspect of the present invention provides a big data based thesis management system, which includes the server and the client described in the second aspect.
Compared with the prior art, the invention has the beneficial effects that:
A micro-service module is sent to the client for deployment. The micro-service module preprocesses the thesis text locally at the client to obtain feature information and uploads it; the server retrieves related existing documents from the database according to the feature information and sends them to the micro-service module; and the micro-service module calculates the similarity between the related existing documents and the thesis text to obtain the duplicate-checking result. The user can therefore check a thesis for duplication without uploading the thesis text over the network, which improves the security of the user's unpublished data and effectively reduces the load on the server.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only preferred embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart illustrating a thesis management method based on big data according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating a big data based thesis management method according to another embodiment of the present invention.
Fig. 3 is a flowchart illustrating a big data based thesis management method according to another embodiment of the present invention.
Fig. 4 is a flowchart illustrating a big data based thesis management method according to another embodiment of the present invention.
Fig. 5 is a schematic diagram of an overall structure of a server according to an embodiment of the present invention.
Detailed Description
The principles and features of the invention are described below with reference to the drawings. The illustrated embodiments serve to explain the invention and do not limit its scope.
Fig. 1 is a schematic flow chart of a thesis management method based on big data according to an embodiment of the present invention.
In this embodiment, the server may be a computer, a server, or another device, and the client may be a computer, a tablet computer, an intelligent handheld terminal, or another device. The server and the client exchange data over a network.
As shown in fig. 1, the thesis management method based on big data, applied to a server, includes the following steps:
S11, receiving authentication information sent by the client, wherein the authentication information comprises an authentication code and instruction information.
In this embodiment, the instruction information instructs the server to check a thesis text for duplication; in other embodiments, it may instead request searching and downloading existing documents, viewing personal account information, and the like.
A database is established in the server in advance. On the one hand, the database stores the account information of all registered users, which includes at least unique identification information; on the other hand, it stores existing literature data, including but not limited to articles, books, periodicals, scientific reports and patents.
In addition, the client may process the authentication information before sending it to the server, for example by encrypting the authentication code and the instruction information with an encryption algorithm, by adding characters to them, or in some other way.
S12, sending the micro-service module to the client, the client deploying the micro-service module after receiving it.
The server sends the micro-service module to the client as a package. Deployment means that the client, after receiving the package, unpacks it and installs it locally so that the micro-service module can run.
S13, receiving feature information uploaded by the micro service module, wherein the feature information is obtained by preprocessing the paper data locally by the micro service module at the client, and the preprocessing of the paper data is to extract feature words from the paper text.
When the user of a client checks a thesis for duplication, the user opens the locally stored thesis text through the client. The micro-service module performs format recognition and word segmentation on the thesis text and extracts the feature words that occur most frequently, so that related existing documents can be retrieved in the subsequent steps according to those feature words. Multiple feature words may be extracted so as to cover the technical scope of the thesis as fully as possible and to avoid missed matches, which would make the duplicate-checking result inaccurate.
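The frequency-based extraction of feature words described above can be sketched as follows. This is only an illustration under stated assumptions: the embodiment does not specify a segmentation method, so a simple regular-expression tokenizer for English text stands in for word segmentation, and the stop-word list, the function name `extract_feature_words` and the `top_k` parameter are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def extract_feature_words(text: str, top_k: int = 5) -> list[str]:
    """Return the top_k most frequent non-stop-word tokens in text."""
    # Assumption: a lowercase alphanumeric tokenizer stands in for word segmentation.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    # Sort by descending count, then alphabetically, so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


sample = "Graph neural networks learn graph structure. Neural graph models scale."
features = extract_feature_words(sample, top_k=2)
print(features)
```

Only the resulting word list would be uploaded in this sketch; the thesis text itself stays on the client.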
In addition, the client may process the feature information before sending it to the server, for example by encrypting it with an encryption algorithm, by adding characters to it, or in some other way.
S14, retrieving related existing documents from the database according to the feature information, and sending the related existing documents to the micro-service module.
After the initial retrieval of related existing documents from the database according to the feature information, the server may perform a secondary screening of the results. The retrieved documents are sorted by their similarity to the feature words, and only the documents whose similarity rank falls within a preset value are kept as comparison objects for the duplicate check. Documents ranked outside the preset value are screened out, which avoids comparing against documents of low similarity and improves duplicate-checking efficiency.
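The secondary screening can be illustrated with a short sketch. The embodiment does not say how the similarity between a document and the feature words is measured, so the fraction of feature words that appear in the document is used here purely as an assumed stand-in; `screen_documents`, the `keep` parameter and the document layout are hypothetical.

```python
def screen_documents(feature_words, documents, keep=2):
    """Rank documents by overlap with the feature words and keep the top `keep`."""
    wanted = {w.lower() for w in feature_words}

    def score(doc):
        # Assumed similarity: share of feature words that occur in the document.
        words = set(doc["text"].lower().split())
        return len(wanted & words) / len(wanted) if wanted else 0.0

    # sorted() is stable, so documents with equal scores keep their original order.
    ranked = sorted(documents, key=score, reverse=True)
    return ranked[:keep]


docs = [
    {"id": "d1", "text": "graph neural networks"},
    {"id": "d2", "text": "cooking recipes"},
    {"id": "d3", "text": "graph theory basics"},
]
kept = screen_documents(["graph", "neural"], docs, keep=2)
print([d["id"] for d in kept])
```

Here `keep` plays the role of the preset value: documents ranked below it are never sent to the client for comparison.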
S15, the micro-service module receives the related existing documents, calculates the similarity between the thesis text and the related existing documents to obtain a duplicate-checking result, and sends the duplicate-checking result to the client.
With the thesis management method based on big data provided by this embodiment, the server sends a micro-service module to the client, where it is deployed locally. The micro-service module preprocesses the thesis text locally at the client, obtains the feature information of the thesis text and uploads it to the server. The server retrieves related existing documents according to the feature information and sends them to the client, and the micro-service module compares them locally with the thesis text and calculates the similarity to obtain the duplicate-checking result. The comparison is thus completed locally without uploading the thesis text, so the content of the user's unpublished thesis is better kept confidential. Because the server's computing resources are used mainly for retrieving related existing documents according to the feature information, the server's load is reduced and its response speed improved, so the user obtains the duplicate-checking result sooner and waits less.
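The local similarity calculation in S15 might look like the sketch below. The embodiment does not name a similarity measure, so Jaccard similarity over word trigrams (shingles) is used here as an assumed, commonly used stand-in; the function names and the shape of the result are hypothetical.

```python
def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of n-word shingles (word n-grams) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the shingle sets of a and b, in [0, 1]."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 0.0  # both texts shorter than n words: nothing to compare
    return len(sa & sb) / len(sa | sb)


def check_duplicates(thesis: str, existing: dict[str, str]) -> dict[str, float]:
    """Compare the local thesis text against each received document."""
    return {doc_id: jaccard(thesis, text) for doc_id, text in existing.items()}


thesis_text = "the quick brown fox jumps"
received = {
    "same": "the quick brown fox jumps",
    "near": "the quick brown fox sleeps",
    "other": "completely unrelated words appear here",
}
result = check_duplicates(thesis_text, received)
print(result)
```

Because the comparison runs inside the micro-service module on the client, only the received documents cross the network in this sketch, never the thesis text.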
Fig. 2 is a flowchart illustrating a thesis management method based on big data according to another embodiment of the present invention.
As shown in fig. 2, the sending the micro service module to the client specifically includes:
S21, acquiring the client attribute information, and extracting the corresponding micro-service module from the micro-service warehouse according to the attribute information.
The micro-service warehouse is used for storing pre-created and packaged micro-service modules. The client attribute information includes, but is not limited to, device information, operating system information, and network information of the client, the device information may be a device model, and the operating system information may be an operating system version.
In this embodiment, the micro service warehouse stores a plurality of micro service module packages developed for different devices, operating systems, or networks in advance, and the server extracts corresponding micro service modules from the micro service warehouse according to attribute information after acquiring attribute information of the client, so that the micro service modules can be deployed according to characteristics of different types of clients, and device performance of the clients can be better utilized.
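A minimal sketch of this selection step is given below. It assumes, for illustration only, that the warehouse is a mapping keyed by operating system name; the package file names, the `ClientAttributes` fields and the `select_package` function are hypothetical and do not come from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientAttributes:
    device_model: str
    os_name: str
    os_version: str


# Hypothetical micro-service warehouse: one pre-built package per operating system.
WAREHOUSE = {
    "windows": "thesis-check-windows.pkg",
    "linux": "thesis-check-linux.pkg",
    "android": "thesis-check-android.pkg",
}


def select_package(attrs: ClientAttributes) -> str:
    """Pick the package that matches the client's operating system."""
    try:
        return WAREHOUSE[attrs.os_name.lower()]
    except KeyError:
        raise LookupError(f"no micro-service package for OS {attrs.os_name!r}") from None


client = ClientAttributes(device_model="X1", os_name="Linux", os_version="5.4")
print(select_package(client))
```

A fuller version could also key on device model or network conditions, which the embodiment lists as further attributes.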
S22, registering the micro-service module according to the authentication information to generate a configuration file.
The server registers the micro-service module according to the authentication information sent by the user of the client: it generates unique identification information for the micro-service module, establishes an association between that unique identification information and the authentication information, and then generates a configuration file for the micro-service module. The configuration file records information such as the version and running state of the micro-service module.
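The registration step can be sketched as follows, assuming a random UUID as the unique identification information, an in-memory dictionary as the association store and JSON as the configuration file format. None of these choices is stated in the embodiment, and `register_module` and the field names are hypothetical.

```python
import json
import uuid


def register_module(auth_code: str, package: str, version: str, registry: dict) -> str:
    """Assign a unique ID, link it to the auth code, and return a JSON config."""
    module_id = uuid.uuid4().hex          # assumed form of the unique identification
    registry[module_id] = auth_code       # association: module ID -> authentication info
    config = {
        "module_id": module_id,
        "package": package,
        "version": version,
        "state": "registered",            # running state recorded in the config
    }
    return json.dumps(config, indent=2)


registry: dict[str, str] = {}
config_text = register_module("AUTH-1", "thesis-check-linux.pkg", "1.0", registry)
config = json.loads(config_text)
print(config["module_id"], config["state"])
```

Keeping the association on the server side is what would let the server later track and upgrade a deployed module by its ID.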
S23, sending the registered micro-service module to the client for deployment.
In this embodiment, only the registered micro service module is sent to the client by the server for deployment, so that the server can track the micro service module subsequently according to the registration information and the configuration file of the micro service module, and upgrade and maintenance of the micro service module are facilitated.
Fig. 3 is a flowchart illustrating a thesis management method based on big data according to another embodiment of the present invention.
As shown in fig. 3, when the server cannot retrieve related existing documents from the database according to the feature information uploaded by the micro-service module, the related existing documents are retrieved from a third-party server, which specifically includes:
S31, generating connection information, sending the connection information to the client, and sending the connection information and the feature information to the third-party server.
The connection information is a unique secret key generated when the server needs to retrieve related existing documents from the third-party server, and the server simultaneously sends the connection information to the client and the third-party server, so that the client and the third-party server can perform mutual authentication through the connection information in subsequent steps.
S32, the third-party server retrieves related existing documents from a third-party database according to the feature information and, if related existing documents are retrieved, sends the connection information to the client.
In this step, if the third-party server retrieves related existing documents, it sends the connection information it received to the client. If it retrieves none, it feeds the retrieval result back to the server, and the server, on receiving the feedback that the third-party server found no related existing documents, forwards that result to the client.
S33, the client performs matching verification on the two pieces of connection information it has received; if they match, the client establishes a connection with the third-party server, and the third-party server sends the related existing documents to the client.
In this embodiment, when a third-party server is needed to retrieve related existing documents, the client does not have to send any information to the third-party server, which protects the client's local data. After the third-party server has finished retrieving related existing documents, it sends the connection information generated by the server to the client. The client then checks the connection information received from the server against the connection information received from the third-party server. If the two match, the data sent by the third-party server is considered authentic; the client establishes a connection with the third-party server and receives the related existing documents from it, so that the thesis text can subsequently be checked for duplication.
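The generation and matching of connection information can be sketched as below. It assumes the connection information is a random hexadecimal token and that matching is a plain equality check done in constant time; the embodiment only says the key is unique, so the token length and both function names are hypothetical.

```python
import hmac
import secrets


def generate_connection_info() -> str:
    """Server side: create a one-off random key (assumed to be 128 bits, hex-encoded)."""
    return secrets.token_hex(16)


def connection_info_matches(from_server: str, from_third_party: str) -> bool:
    """Client side: compare the two received copies in constant time."""
    return hmac.compare_digest(from_server.encode(), from_third_party.encode())


key = generate_connection_info()
# The server sends `key` to both the client and the third-party server;
# the third-party server later presents its copy to the client.
print(connection_info_matches(key, key))          # genuine third party
print(connection_info_matches(key, key + "x"))    # mismatching copy is rejected
```

A constant-time comparison is a defensive choice added here; it avoids leaking how many leading characters of a forged key were correct.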
Fig. 4 is a flowchart illustrating a thesis management method based on big data according to another embodiment of the present invention.
As shown in fig. 4, the microservice module receives the related existing documents, calculates the similarity between the paper text and the related existing documents to obtain the duplication checking result, and specifically includes:
S41, sequentially receiving the related existing documents sent by the server, wherein the related existing documents are entered into a transmission queue by the server in order for sending.
In this embodiment, after the server or the third-party server retrieves the related existing documents according to the feature information uploaded by the micro service module, a transmission queue is created, the retrieved related existing documents are sequentially entered into the transmission queue, and the related existing documents are sent to the client through the transmission queue.
S42, creating a cache region, and storing the received related existing documents in the cache region.
The micro-service module creates the cache region according to the client's local memory usage, so the size of the cache region may differ between clients with different memory usage.
In addition, when the total size of the received related existing documents reaches a preset threshold, the micro-service module sends first feedback information to the server or third-party server that is sending the documents. On receiving the first feedback information, that server or third-party server pauses sending related existing documents to the client. The preset threshold may be slightly smaller than the maximum capacity of the cache region.
S43, extracting related existing documents from the cache region to perform similarity calculation with the thesis text, while receiving new related existing documents and storing them in the cache region, until the transmission queue has been sent in full.
The micro-service module extracts related existing documents from the cache region one by one and performs similarity calculation with the thesis text. Each time a related existing document is extracted, the space it occupied in the cache region is released and the micro-service module sends second feedback information to the server or third-party server, which then resumes sending the remaining related existing documents in the transmission queue to the client. When the total size of the received related existing documents again reaches the preset threshold, the micro-service module sends the first feedback information again. This process repeats until all related existing documents in the transmission queue of the server or third-party server have been sent.
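The pause and resume exchange in S41 to S43 can be illustrated with a single-process simulation. This is a sketch under simplifying assumptions: sender and receiver run in one loop rather than over a network, document size is measured as string length, and the names `transfer_with_backpressure`, `"pause"` and `"resume"` are hypothetical labels for the first and second feedback information.

```python
from collections import deque


def transfer_with_backpressure(documents, threshold, process):
    """Simulate the sender's queue, the receiver's cache region, and the feedback."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    queue = deque(documents)   # sender's transmission queue
    cache = deque()            # receiver's cache region
    cached_size = 0            # total size currently held in the cache region
    events = []                # feedback messages, in the order they are sent
    results = []
    while queue or cache:
        # Sender transmits until the cache region reaches the preset threshold.
        while queue and cached_size < threshold:
            doc = queue.popleft()
            cache.append(doc)
            cached_size += len(doc)
        if queue:
            events.append("pause")    # first feedback information
        # Receiver extracts one document, processes it, and frees its space.
        doc = cache.popleft()
        cached_size -= len(doc)
        results.append(process(doc))
        if queue:
            events.append("resume")   # second feedback information
    return results, events


results, events = transfer_with_backpressure(
    ["aaaa", "bbbb", "cccc"], threshold=8, process=str.upper
)
print(results, events)
```

In the embodiment `process` would be the similarity calculation against the thesis text; `str.upper` is used here only to keep the simulation self-contained.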
Based on the same inventive concept as the foregoing embodiment, fig. 5 is a schematic diagram of an overall structure of a server according to an embodiment of the present invention.
As shown in fig. 5, the server includes a first receiving module 1, a sending module 2, a second receiving module 3, and a retrieving module 4.
The first receiving module 1 is configured to receive authentication information sent by a client, where the authentication information includes an authentication code and instruction information.
The sending module 2 is used for sending the micro-service module to the client, and the client deploys after receiving the micro-service module.
The second receiving module 3 is configured to receive feature information uploaded by the micro service module, where the feature information is obtained by the micro service module by locally preprocessing the paper data at the client, and the preprocessing the paper data is to extract feature words from a paper text.
And the retrieval module 4 is used for retrieving the related existing documents from the database according to the characteristic information and sending the related existing documents to the microservice module.
The microservice module is used for receiving related existing documents, calculating the similarity between the thesis text and the related existing documents to obtain a duplicate checking result, and sending the duplicate checking result to the client.
Optionally, the sending module 2 further includes an obtaining sub-module, a registering sub-module, and a sending sub-module.
The obtaining submodule is used for obtaining client attribute information and extracting a corresponding micro-service module from a micro-service warehouse according to the attribute information.
And the registration submodule is used for registering the micro-service module according to the authentication information to generate a configuration file.
And the sending submodule is used for sending the registered micro-service module to the client for deployment.
Optionally, the retrieval module 4 is further configured to retrieve related existing documents from a third-party server when no related existing documents are retrieved from the database according to the feature information. The retrieval module 4 further includes a generation sub-module configured to generate connection information, send the connection information to the client, and send the connection information together with the feature information to the third-party server.
The third-party server is configured to retrieve related existing documents from a third-party database according to the feature information and, if related existing documents are retrieved, to send the connection information to the client.
The client is further provided with a matching verification module 5, which is configured to perform matching verification on the two pieces of connection information the client has received. If the two pieces of connection information match, the client establishes a connection with the third-party server, and the third-party server sends the related existing documents to the client.
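The matching verification can be sketched as follows. The patent does not say what form the connection information takes, so a random hexadecimal token is assumed here, and the function names are hypothetical. The client receives one copy from the server and one from the third-party server and connects only if they are equal.

```python
import hmac
import secrets


def generate_connection_info():
    """Server side: create one-time connection information (assumed to be a random token)."""
    return secrets.token_hex(16)


def verify_connection_info(info_from_server, info_from_third_party):
    """Client side: accept the third-party connection only if both copies match.

    hmac.compare_digest performs a constant-time comparison of the two tokens.
    """
    return hmac.compare_digest(info_from_server, info_from_third_party)
```

A constant-time comparison is a design choice of this sketch, not something the patent requires; a plain equality check would implement the same matching rule.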
Optionally, the micro-service module further includes a receiving sub-module, a creating sub-module, and a calculation sub-module.
The receiving sub-module is configured to receive, in sequence, the related existing documents sent by the server; the server places the related existing documents in a transmission queue in order and sends them from that queue.
The creating sub-module is configured to create a cache region and to store the received related existing documents in the cache region.
The calculation sub-module is configured to extract related existing documents from the cache region and calculate their similarity with the thesis text, while newly received related existing documents are stored in the cache region, until the transmission queue has been completely sent.
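The interplay of transmission queue and cache region can be sketched as a simple loop. The cache size and the Jaccard word-overlap measure are assumptions made for illustration; the patent specifies neither. The sketch is single-threaded, so receiving and calculating alternate rather than run concurrently.

```python
from collections import deque


def word_overlap(text_a, text_b):
    """Jaccard similarity over word sets (an illustrative similarity measure)."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def check_duplicates(thesis_text, transmission_queue, cache_size=2):
    """Drain the server's transmission queue through a bounded cache region.

    transmission_queue is a deque of documents in sending order;
    cache_size (assumed, must be at least 1) bounds the client-side cache.
    """
    cache = deque()  # cache region created on the client
    results = []
    while transmission_queue or cache:
        # Receive new documents while the cache region has room.
        while transmission_queue and len(cache) < cache_size:
            cache.append(transmission_queue.popleft())
        # Take the oldest cached document and compare it with the thesis text.
        document = cache.popleft()
        results.append((document, word_overlap(thesis_text, document)))
    return results
```

Because the cache is bounded, the client never holds more than cache_size retrieved documents at once, which is one plausible reading of why the description introduces a cache region.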
The server is configured to execute the foregoing method embodiments; for its implementation principles and technical effects, reference may be made to those embodiments, and they are not described here again.
An embodiment of the present invention further provides a thesis management system based on big data, where the system includes the server and the client described in any of the above embodiments.
The above modules may be one or more integrated circuits configured to implement the above methods, for example one or more application-specific integrated circuits, one or more microprocessors, or one or more field-programmable gate arrays. As another example, when some of the above modules are implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit or another processor that can invoke the program code. As a further example, the modules may be integrated together and implemented as a system on a chip.
In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. The system embodiments described above are merely illustrative. For example, the division into modules or units is only a logical division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modifications, equivalents, and improvements that fall within the spirit and principle of the present invention are intended to be included within its scope of protection.

Claims (9)

1. A thesis management method based on big data, applied to a server, characterized by comprising the following steps:
receiving authentication information sent by a client, wherein the authentication information comprises an authentication code and instruction information;
sending a micro-service module to the client, the client deploying the micro-service module after receiving it;
receiving feature information uploaded by the micro-service module, wherein the feature information is obtained by the micro-service module preprocessing thesis data locally at the client, and the preprocessing of the thesis data is extracting feature words from a thesis text;
retrieving related existing documents from a database according to the feature information, and sending the related existing documents to the micro-service module;
the micro-service module receiving the related existing documents, calculating the similarity between the thesis text and the related existing documents to obtain a duplicate-checking result, and sending the duplicate-checking result to the client.
2. The thesis management method based on big data according to claim 1, wherein sending the micro-service module to the client specifically comprises:
acquiring client attribute information, and extracting a corresponding micro-service module from a micro-service warehouse according to the attribute information;
registering the micro-service module according to the authentication information to generate a configuration file;
and sending the registered micro-service module to the client for deployment.
3. The thesis management method based on big data according to claim 1, wherein, when no related existing documents are retrieved from the database according to the feature information, related existing documents are retrieved from a third-party server, which specifically comprises:
generating connection information, sending the connection information to the client, and sending the connection information and the feature information to the third-party server;
the third-party server retrieving related existing documents from a third-party database according to the feature information and, if related existing documents are retrieved, sending the connection information to the client;
and the client performing matching verification on the two pieces of connection information it has received and, if the two pieces of connection information match, establishing a connection with the third-party server, the third-party server sending the related existing documents to the client.
4. The thesis management method based on big data according to claim 1, wherein the micro-service module receiving the related existing documents and calculating the similarity between the thesis text and the related existing documents to obtain the duplicate-checking result specifically comprises:
receiving, in sequence, the related existing documents sent by the server, the server placing the related existing documents in a transmission queue in order for sending;
creating a cache region, and storing the received related existing documents in the cache region;
and extracting related existing documents from the cache region to calculate their similarity with the thesis text, while receiving new related existing documents and storing them in the cache region, until the transmission queue has been completely sent.
5. A server, characterized in that the server comprises:
a first receiving module, configured to receive authentication information sent by a client, wherein the authentication information comprises an authentication code and instruction information;
a sending module, configured to send a micro-service module to the client, the client deploying the micro-service module after receiving it;
a second receiving module, configured to receive feature information uploaded by the micro-service module, wherein the feature information is obtained by the micro-service module preprocessing thesis data locally at the client, and the preprocessing of the thesis data is extracting feature words from a thesis text;
a retrieval module, configured to retrieve related existing documents from a database according to the feature information and to send the related existing documents to the micro-service module;
the micro-service module being configured to receive the related existing documents, calculate the similarity between the thesis text and the related existing documents to obtain a duplicate-checking result, and send the duplicate-checking result to the client.
6. The server according to claim 5, wherein the sending module further comprises:
an obtaining sub-module, configured to obtain client attribute information and to extract a corresponding micro-service module from a micro-service warehouse according to the attribute information;
a registration sub-module, configured to register the micro-service module according to the authentication information to generate a configuration file;
and a sending sub-module, configured to send the registered micro-service module to the client for deployment.
7. The server according to claim 5, wherein the retrieval module is further configured to retrieve related existing documents from a third-party server when no related existing documents are retrieved from the database according to the feature information, and the retrieval module further comprises a generation sub-module configured to generate connection information, send the connection information to the client, and send the connection information and the feature information to the third-party server;
the third-party server is configured to retrieve related existing documents from a third-party database according to the feature information and, if related existing documents are retrieved, to send the connection information to the client;
the client is further provided with a matching verification module configured to perform matching verification on the two pieces of connection information received, wherein, if the two pieces of connection information match, a connection is established with the third-party server, and the third-party server sends the related existing documents to the client.
8. The server according to claim 5, wherein the micro-service module further comprises:
a receiving sub-module, configured to receive, in sequence, the related existing documents sent by the server, the server placing the related existing documents in a transmission queue in order for sending;
a creating sub-module, configured to create a cache region and to store the received related existing documents in the cache region;
and a calculation sub-module, configured to extract related existing documents from the cache region to calculate their similarity with the thesis text, while receiving new related existing documents and storing them in the cache region, until the transmission queue has been completely sent.
9. A big data based thesis management system, characterized in that said system comprises a server and a client according to any one of claims 5-8.
CN202010122369.5A 2020-02-27 2020-02-27 Thesis management method, server and system based on big data Active CN111353031B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010122369.5A CN111353031B (en) 2020-02-27 2020-02-27 Thesis management method, server and system based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010122369.5A CN111353031B (en) 2020-02-27 2020-02-27 Thesis management method, server and system based on big data

Publications (2)

Publication Number Publication Date
CN111353031A (en) 2020-06-30
CN111353031B (en) 2023-04-14

Family

ID=71195967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010122369.5A Active CN111353031B (en) 2020-02-27 2020-02-27 Thesis management method, server and system based on big data

Country Status (1)

Country Link
CN (1) CN111353031B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5907836A (en) * 1995-07-31 1999-05-25 Kabushiki Kaisha Toshiba Information filtering apparatus for selecting predetermined article from plural articles to present selected article to user, and method therefore
US20060218137A1 (en) * 2000-02-03 2006-09-28 Yasuhiko Inaba Method of and an apparatus for retrieving and delivering documents and a recording media on which a program for retrieving and delivering documents are stored
WO2018096514A1 (en) * 2016-11-28 2018-05-31 Thomson Reuters Global Resources System and method for finding similar documents based on semantic factual similarity
CN107273432A (en) * 2017-05-23 2017-10-20 合肥智权信息科技有限公司 A kind of patent article integration system and method based on big data
CN107885706A (en) * 2017-11-06 2018-04-06 佛山市章扬科技有限公司 A kind of system of data similarity detection
CN108763486A (en) * 2018-05-30 2018-11-06 湖南写邦科技有限公司 Paper duplicate checking method, terminal and storage medium based on terminal
CN110737912A (en) * 2018-09-26 2020-01-31 杨思琦 thesis duplicate checking method based on homomorphic encryption
CN110334325A (en) * 2019-07-16 2019-10-15 同方知网数字出版技术股份有限公司 A kind of full text similarity analysis method compared towards publishing house's strange land resource joint
CN110727762A (en) * 2019-09-17 2020-01-24 东软集团股份有限公司 Method, device, storage medium and electronic equipment for determining similar texts

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
彭建高等: ""大学生学术论文查重系统的设计开发与应用实现"", 《信息技术与信息化》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069496A (en) * 2020-09-10 2020-12-11 杭州锘崴信息科技有限公司 Work updating system, method, device and storage medium for protecting information
CN112069496B (en) * 2020-09-10 2024-04-26 杭州锘崴信息科技有限公司 System, method, device and storage medium for checking new works of protection information

Also Published As

Publication number Publication date
CN111353031B (en) 2023-04-14

Similar Documents

Publication Publication Date Title
US20220019678A1 (en) Method, apparatus, and computer-readable medium for automated construction of data masks
US11361110B2 (en) File verification method, file verification system and file verification server
CN103973692A (en) Automatic collecting system and method for electronic archives based on virtual printer
CN108764902B (en) Method, node and blockchain system for storing data
Rahmatulloh et al. Web services to overcome interoperability in fingerprint-based attendance system
CN110738323A (en) Method and device for establishing machine learning model based on data sharing
CN103985073A (en) Automatic electronic file collection system based on virtual printing and use method thereof
CN107888591A (en) The method and system that a kind of electronic data is saved from damage
CN107085584B (en) Cloud document management method and system based on content and server
CN111353031B (en) Thesis management method, server and system based on big data
CN112202919B (en) Picture ciphertext storage and retrieval method and system under cloud storage environment
CN106712958B (en) Information acquisition method and system, real-name system information acquisition method, system and application
US11822587B2 (en) Server and method for classifying entities of a query
CN109104449A (en) A kind of more Backup Data property held methods of proof under cloud storage environment
CN106598983A (en) Information display method and device
CN116866422A (en) Method, device, equipment and storage medium for pushing sensitive information and desensitizing information in real time
CN109492117A (en) Patent data analysis system
CN114048490A (en) Information processing method and device, equipment and storage medium thereof
US11989266B2 (en) Method for authenticating digital content items with blockchain and writing digital content items data to blockchain
CN113507499B (en) Smart campus dormitory-checking system based on big data
Yang et al. An Efficient and Privacy‐Preserving Biometric Identification Scheme Based on the FITing‐Tree
CN116719778B (en) Technology for generating virtual partition to complete four-way information theme by OFD file on OA system
US7827287B2 (en) Interim execution context identifier
CN116074041A (en) Cross-network-segment visiting processing information acquisition and feedback method
WO2021258189A1 (en) Method for authenticating digital content items with blockchain and writing digital content items data to blockchain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant