Big data-based manuscript propagation system and method
Technical Field
The invention relates to the field of big data, and in particular to a manuscript propagation system and method based on big data.
Background
With the progress of paperless and digitized office work, more and more companies distribute documents to employees as electronic files. Distributing manuscripts as electronic files improves a company's office efficiency and reduces paper use, making office work more environmentally friendly. However, documents are often distributed by group sending, and in group sending some employees who do not need a document also receive it, so that the employees' working storage space is occupied by junk documents.
Disclosure of Invention
The invention aims to provide a manuscript propagation system and method based on big data, so as to solve the above problems in the prior art.
To achieve this purpose, the invention provides the following technical solution:
A big data-based manuscript propagation system comprises an original document to be sent sending module, an original document to be sent acquiring module, a to-be-distributed document generation module and a manuscript monitoring module. The original document to be sent sending module is used by a sender to send an original document to be sent. The original document to be sent acquiring module is used for acquiring the document title, document text, document category and preset identification of the original document to be sent. The to-be-distributed document generation module generates a document to be distributed according to the original document to be sent and the preset identification, and group-sends the document to be distributed, wherein the preset identification is either a full-sending identification or a partial-sending identification. When the preset identification is the partial-sending identification, the manuscript monitoring module counts the access condition of the original document to be sent and judges, according to the access condition, whether to privately send the original document to be sent.
Preferably, the original document to be sent sending module includes a category obtaining module, a category judging module, a receiving object setting module and a receiving object adjusting module. The category obtaining module is used for obtaining the category of the original document to be sent. The category judging module is used for judging whether the sender is sending an original document of this category for the first time. When the sender is sending an original document of this category for the first time, the sender sets the receiving objects of the original document to be sent one by one in the receiving object setting module; otherwise, the receiving object adjusting module operates. The receiving object adjusting module includes an abstract extracting module, a similarity comparing and sorting module and an object adding and deleting module. The abstract extracting module is used for extracting the abstract of the original document to be sent. The similarity comparing and sorting module compares the abstract of the original document to be sent with the abstracts of historical original documents of the same category and sorts the similarities in descending order. The object adding and deleting module selects the receiving objects of the historical original document ranked first as candidate sending objects and pushes them to the sender, and the sender adds receiving objects to and deletes receiving objects from the candidate sending objects to obtain the formal receiving objects.
Preferably, the to-be-distributed document generation module includes a full-sending identification generation module and a partial-sending identification generation module. When the preset identification is the full-sending identification, the document to be distributed is the original document to be sent, and the original document to be sent is group-sent. When the preset identification is the partial-sending identification, the document to be distributed includes the document title and abstract keywords, wherein the document title carries a hyperlink pointing to the original document to be sent, and the abstract keywords are keywords extracted from the abstract of the original document to be sent.
Preferably, the manuscript monitoring module comprises a receiving object address counting module, a temporary relation database establishing module, an address judging module, a temporary relation database management module, a time interval judging module, a temporary relation database judging module, a manuscript private sending module and an informal receiving object access module. The receiving object address counting module is used for counting the MAC address returned by each receiving object when it receives the document to be distributed. The temporary relation database establishing module extracts the MAC addresses of the formal receiving objects from the counting result of the receiving object address counting module and establishes a temporary relation database of the formal receiving objects and their corresponding MAC addresses. When a receiving object clicks the hyperlink of the document title, the address judging module acquires the MAC address of that receiving object and judges whether it belongs to the MAC addresses in the temporary relation database. When the address judging module judges that the MAC address belongs to the MAC addresses in the temporary relation database, the temporary relation database management module allows the receiving object to access the original document to be sent pointed to by the hyperlink and deletes the MAC address and the corresponding formal receiving object from the temporary relation database. The time interval judging module is used for obtaining the time interval between the time node at which a formal receiving object first clicked the hyperlink and the current time node, and for comparing the time interval with an interval threshold. When the time interval is greater than or equal to the interval threshold, the temporary relation database judging module judges whether the temporary relation database still contains any MAC address. When the temporary relation database judging module judges that the temporary relation database still contains a MAC address, the manuscript private sending module obtains the formal receiving object corresponding to that MAC address and privately sends the original document to be sent to that formal receiving object. When the address judging module judges that the MAC address does not belong to the MAC addresses in the temporary relation database, the informal receiving object access module transmits the receiving object corresponding to the MAC address to the sender, and the sender judges whether to grant that receiving object permission to temporarily access the original document to be sent pointed to by the hyperlink.
A manuscript propagation method based on big data comprises the following steps:
a sender sends an original document to be sent; the document title, document text, document category and preset identification of the original document to be sent are acquired; a document to be distributed is generated according to the original document to be sent and the preset identification, and the document to be distributed is group-sent, wherein the preset identification is either a full-sending identification or a partial-sending identification;
and when the preset identification is the partial-sending identification, the access condition of the original document to be sent is counted, and whether to privately send the original document to be sent is judged according to the access condition.
Preferably, the sending of the original document to be sent by the sender includes:
acquiring the category of the original manuscript to be sent,
if the sender sends the original manuscript of the category for the first time, the sender sets receiving objects of the original manuscript to be sent one by one;
if the sender is not sending an original document of this category for the first time, the abstract of the original document to be sent is extracted and compared for similarity with the abstracts of historical original documents of the same category, the similarities are sorted in descending order, the receiving objects of the historical original document ranked first are selected as candidate sending objects and pushed to the sender, and the sender adds receiving objects to and deletes receiving objects from the candidate sending objects to obtain the formal receiving objects.
Preferably, the generating the to-be-distributed document according to the to-be-sent original document and the preset identifier and the mass-sending the to-be-distributed document include the following steps:
when the preset identification is the full-sending identification, the document to be distributed is the original document to be sent, and the original document to be sent is group-sent;
when the preset identification is the partial-sending identification, the document to be distributed comprises the document title and abstract keywords, wherein the document title carries a hyperlink pointing to the original document to be sent, and the abstract keywords are keywords extracted from the abstract of the original document to be sent.
Preferably, the counting the access condition of the original document to be sent, and determining whether to privately send the original document to be sent according to the access condition includes:
when receiving the document to be distributed, each receiving object returns its MAC address; the MAC addresses of the formal receiving objects are extracted from the returned MAC addresses, and a temporary relation database of the formal receiving objects and their corresponding MAC addresses is established;
when a receiving object clicks the hyperlink of the document title, the MAC address of the receiving object is acquired; if it belongs to the MAC addresses in the temporary relation database, the receiving object is allowed to access the original document to be sent pointed to by the hyperlink, and the MAC address and the corresponding formal receiving object are deleted from the temporary relation database;
and the time interval between the time node at which a formal receiving object first clicked the hyperlink and the current time node is acquired; if the time interval is greater than or equal to an interval threshold, whether the temporary relation database still contains any MAC address is judged; if so, the formal receiving object corresponding to each remaining MAC address is acquired, and the original document to be sent is privately sent to that formal receiving object.
Preferably, after each receiving object returns its MAC address on receiving the document to be distributed, the method further includes:
if the MAC address of a receiving object that clicks the hyperlink does not belong to the MAC addresses in the temporary relation database, the receiving object corresponding to the MAC address is transmitted to the sender, and the sender judges whether to grant that receiving object permission to temporarily access the original document to be sent pointed to by the hyperlink.
Preferably, before transmitting the receiving object corresponding to the MAC address to the sender, the method further includes:
and counting the frequency F = P/M with which the receiving object has been granted permission to temporarily access original documents of the same category, wherein P is the number of times the sender has granted the receiving object permission to temporarily access the original document pointed to by a hyperlink, and M is the number of times the receiving object has clicked a hyperlink as an informal receiving object; if the frequency F is greater than or equal to a frequency threshold, the receiving object corresponding to the MAC address is transmitted to the sender; if the frequency F is less than the frequency threshold, the receiving object's access to the hyperlink is rejected.
Compared with the prior art, the invention has the following beneficial effect: different documents to be distributed are generated from the original document to be sent and the preset identification, and when a document needs to be sent to only some of the staff, only its title and abstract keywords are used as the document to be distributed, thereby reducing the occupation of staff storage space by junk documents.
Drawings
Fig. 1 is a schematic block diagram of a document propagation system based on big data according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to Fig. 1, in an embodiment of the present invention, a manuscript propagation system based on big data includes an original document to be sent sending module, an original document to be sent acquiring module, a to-be-distributed document generation module and a manuscript monitoring module. The original document to be sent sending module is used by a sender to send an original document to be sent. The original document to be sent acquiring module is used for acquiring the document title, document text, document category and preset identification of the original document to be sent. The to-be-distributed document generation module generates a document to be distributed according to the original document to be sent and the preset identification, and group-sends the document to be distributed, wherein the preset identification is either a full-sending identification or a partial-sending identification. When the preset identification is the partial-sending identification, the manuscript monitoring module counts the access condition of the original document to be sent and judges, according to the access condition, whether to privately send the original document to be sent.
The original document to be sent sending module comprises a category obtaining module, a category judging module, a receiving object setting module and a receiving object adjusting module. The category obtaining module is used for obtaining the category of the original document to be sent. The category judging module is used for judging whether the sender is sending an original document of this category for the first time. When the sender is sending an original document of this category for the first time, the sender sets the receiving objects of the original document to be sent one by one in the receiving object setting module; otherwise, the receiving object adjusting module operates. The receiving object adjusting module comprises an abstract extracting module, a similarity comparing and sorting module and an object adding and deleting module. The abstract extracting module is used for extracting the abstract of the original document to be sent. The similarity comparing and sorting module compares the abstract of the original document to be sent with the abstracts of historical original documents of the same category and sorts the similarities in descending order. The object adding and deleting module selects the receiving objects of the historical original document ranked first as candidate sending objects and pushes them to the sender, and the sender adds receiving objects to and deletes receiving objects from the candidate sending objects to obtain the formal receiving objects.
The to-be-distributed document generation module comprises a full-sending identification generation module and a partial-sending identification generation module. When the preset identification is the full-sending identification, the document to be distributed is the original document to be sent, and the original document to be sent is group-sent. When the preset identification is the partial-sending identification, the document to be distributed comprises the document title and abstract keywords, wherein the document title carries a hyperlink pointing to the original document to be sent, and the abstract keywords are keywords extracted from the abstract of the original document to be sent.
The manuscript monitoring module comprises a receiving object address counting module, a temporary relation database establishing module, an address judging module, a temporary relation database management module, a time interval judging module, a temporary relation database judging module, a manuscript private sending module and an informal receiving object access module. The receiving object address counting module is used for counting the MAC address returned by each receiving object when it receives the document to be distributed. The temporary relation database establishing module extracts the MAC addresses of the formal receiving objects from the counting result of the receiving object address counting module and establishes a temporary relation database of the formal receiving objects and their corresponding MAC addresses. When a receiving object clicks the hyperlink of the document title, the address judging module acquires the MAC address of that receiving object and judges whether it belongs to the MAC addresses in the temporary relation database. When the address judging module judges that the MAC address belongs to the MAC addresses in the temporary relation database, the temporary relation database management module allows the receiving object to access the original document to be sent pointed to by the hyperlink and deletes the MAC address and the corresponding formal receiving object from the temporary relation database. The time interval judging module is used for obtaining the time interval between the time node at which a formal receiving object first clicked the hyperlink and the current time node, and for comparing the time interval with an interval threshold. When the time interval is greater than or equal to the interval threshold, the temporary relation database judging module judges whether the temporary relation database still contains any MAC address. When the temporary relation database judging module judges that the temporary relation database still contains a MAC address, the manuscript private sending module obtains the formal receiving object corresponding to that MAC address and privately sends the original document to be sent to that formal receiving object. When the address judging module judges that the MAC address does not belong to the MAC addresses in the temporary relation database, the informal receiving object access module transmits the receiving object corresponding to the MAC address to the sender, and the sender judges whether to grant that receiving object permission to temporarily access the original document to be sent pointed to by the hyperlink.
A manuscript propagation method based on big data comprises the following steps:
a sender sends an original document to be sent; the document title, document text, document category and preset identification of the original document to be sent are acquired; a document to be distributed is generated according to the original document to be sent and the preset identification, and the document to be distributed is group-sent, wherein the preset identification is either a full-sending identification or a partial-sending identification; the full-sending identification indicates that the document needs to be sent to all staff members, and the partial-sending identification indicates that the document needs to be sent to only some staff members;
and when the preset identification is the partial-sending identification, the access condition of the original document to be sent is counted, and whether to privately send the original document to be sent is judged according to the access condition.
The sending of the original manuscript to be sent by the sender comprises the following steps:
acquiring the category of the original manuscript to be sent,
if the sender sends the original manuscript of the category for the first time, the sender sets receiving objects of the original manuscript to be sent one by one;
if the sender is not sending an original document of this category for the first time, the abstract of the original document to be sent is extracted and compared for similarity with the abstracts of historical original documents of the same category, the similarities are sorted in descending order, the receiving objects of the historical original document ranked first are selected as candidate sending objects and pushed to the sender, and the sender adds receiving objects to and deletes receiving objects from the candidate sending objects to obtain the formal receiving objects.
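By way of non-limiting illustration only, the receiving object selection steps above may be sketched in Python as follows. The embodiment does not specify how abstract similarity is computed; the word-level Jaccard measure used here is an assumption chosen for brevity. The names `HistoricalDocument`, `candidate_recipients` and `formal_recipients` are hypothetical, and `history` is assumed to hold only the current sender's earlier documents.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class HistoricalDocument:
    category: str
    abstract: str
    recipients: Set[str] = field(default_factory=set)


def jaccard_similarity(abstract_a: str, abstract_b: str) -> float:
    """Word-level Jaccard similarity (an assumed stand-in for the unspecified measure)."""
    words_a = set(abstract_a.lower().split())
    words_b = set(abstract_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def candidate_recipients(
    category: str, abstract: str, history: List[HistoricalDocument]
) -> Optional[Set[str]]:
    """Return the receiving objects of the most similar same-category document.

    Returns None when the sender has no earlier document of this category,
    in which case the sender sets the receiving objects one by one.
    """
    same_category = [doc for doc in history if doc.category == category]
    if not same_category:
        return None
    # Sort similarities in descending order and take the document ranked first.
    ranked = sorted(
        same_category,
        key=lambda doc: jaccard_similarity(abstract, doc.abstract),
        reverse=True,
    )
    return set(ranked[0].recipients)


def formal_recipients(candidates, added=(), removed=()) -> Set[str]:
    """Apply the sender's additions and deletions to the candidate sending objects."""
    return (set(candidates) | set(added)) - set(removed)
```

In this sketch the sender's manual adjustment is reduced to two collections, `added` and `removed`; how the sender actually reviews the pushed candidates is left open by the embodiment.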
The generating of the to-be-distributed document according to the to-be-sent original document and the preset identifier and the mass sending of the to-be-distributed document comprise the following steps:
when the preset identification is the full-sending identification, the document to be distributed is the original document to be sent, and the original document to be sent is group-sent;
when the preset identification is the partial-sending identification, the document to be distributed comprises the document title and abstract keywords, wherein the document title carries a hyperlink pointing to the original document to be sent, and the abstract keywords are keywords extracted from the abstract of the original document to be sent.
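By way of non-limiting illustration only, generation of the document to be distributed may be sketched in Python as follows. The embodiment does not state how the abstract or its keywords are extracted; taking the first sentence as the abstract and the most frequent longer words as keywords are assumptions made only so that the sketch runs. All class and function names, and the `url` field holding the hyperlink target, are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class PresetIdentification(Enum):
    FULL_SENDING = "full"
    PARTIAL_SENDING = "partial"


@dataclass
class OriginalDocument:
    title: str
    text: str
    category: str
    preset: PresetIdentification
    url: str  # hypothetical address at which the original document is stored


@dataclass
class DistributedDocument:
    title: str
    body: Optional[str]  # full text, present only for full sending
    link: Optional[str]  # hyperlink carried by the title, partial sending only
    keywords: List[str]


def extract_abstract(text: str) -> str:
    # Assumption: the first sentence stands in for the abstract.
    return text.split(".")[0].strip()


def extract_keywords(abstract: str, limit: int = 3) -> List[str]:
    # Assumption: the most frequent words longer than three letters are the keywords.
    words = [w.strip(",;:").lower() for w in abstract.split()]
    counts = Counter(w for w in words if len(w) > 3)
    return [word for word, _ in counts.most_common(limit)]


def build_distributed_document(doc: OriginalDocument) -> DistributedDocument:
    """Build the document that is group-sent, depending on the preset identification."""
    if doc.preset is PresetIdentification.FULL_SENDING:
        # Full sending: the document to be distributed is the original document itself.
        return DistributedDocument(doc.title, doc.text, None, [])
    # Partial sending: only the linked title and the abstract keywords are distributed.
    keywords = extract_keywords(extract_abstract(doc.text))
    return DistributedDocument(doc.title, None, doc.url, keywords)
```

Under partial sending the group-sent item carries no document text, which is what keeps the original out of the storage space of staff who do not need it.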
The counting of the access condition of the original document to be sent and the judging of whether to privately send the original document to be sent according to the access condition comprise:
when receiving the document to be distributed, each receiving object returns its MAC address; the MAC addresses of the formal receiving objects are extracted from the returned MAC addresses, and a temporary relation database of the formal receiving objects and their corresponding MAC addresses is established;
when a receiving object clicks the hyperlink of the document title, the MAC address of the receiving object is acquired; if it belongs to the MAC addresses in the temporary relation database, the receiving object is allowed to access the original document to be sent pointed to by the hyperlink, and the MAC address and the corresponding formal receiving object are deleted from the temporary relation database;
and the time interval between the time node at which a formal receiving object first clicked the hyperlink and the current time node is acquired; if the time interval is greater than or equal to an interval threshold, whether the temporary relation database still contains any MAC address is judged; if so, the formal receiving object corresponding to each remaining MAC address is acquired, and the original document to be sent is privately sent to that formal receiving object.
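By way of non-limiting illustration only, the access monitoring steps above may be sketched in Python as follows. The temporary relation database is modeled as an in-memory dictionary from MAC address to formal receiving object, which is an assumption; the embodiment does not prescribe a storage form. The class name `DocumentMonitor`, its method names, and the use of plain numbers of seconds for time nodes are hypothetical.

```python
from typing import Dict, List, Optional


class DocumentMonitor:
    """Tracks which formal receiving objects have not yet accessed the original document."""

    def __init__(self, formal_macs: Dict[str, str], interval_threshold: float):
        # Temporary relation database: MAC address -> formal receiving object.
        self.pending: Dict[str, str] = dict(formal_macs)
        self.interval_threshold = interval_threshold
        self.first_click_time: Optional[float] = None

    def on_click(self, mac: str, now: float) -> bool:
        """Handle a click on the title hyperlink; return True if access is allowed."""
        recipient = self.pending.pop(mac, None)
        if recipient is None:
            # Not in the database: left to the informal receiving object flow.
            return False
        if self.first_click_time is None:
            # Time node of the first access by a formal receiving object.
            self.first_click_time = now
        return True

    def recipients_to_send_privately(self, now: float) -> List[str]:
        """Return the formal receiving objects still in the database after the threshold."""
        if self.first_click_time is None:
            return []
        if now - self.first_click_time < self.interval_threshold:
            return []
        return sorted(self.pending.values())


monitor = DocumentMonitor({"aa:aa": "alice", "bb:bb": "bob"}, interval_threshold=3600)
monitor.on_click("aa:aa", now=0)                         # alice accesses the original
overdue = monitor.recipients_to_send_privately(now=3600) # bob has not clicked in time
```

The sketch follows the steps literally: an entry is deleted on first access, so a later click from the same MAC address no longer matches the database. How repeat access by a formal receiving object is handled is not stated in the embodiment and is left open here.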
After each receiving object returns its MAC address on receiving the document to be distributed, the method further comprises the following step:
if the MAC address of a receiving object that clicks the hyperlink does not belong to the MAC addresses in the temporary relation database, the receiving object corresponding to the MAC address is transmitted to the sender, and the sender judges whether to grant that receiving object permission to temporarily access the original document to be sent pointed to by the hyperlink.
Before transmitting the receiving object corresponding to the MAC address to the sender, the method further includes:
and counting the frequency F = P/M with which the receiving object has been granted permission to temporarily access original documents of the same category, wherein P is the number of times the sender has granted the receiving object permission to temporarily access the original document pointed to by a hyperlink, and M is the number of times the receiving object has clicked a hyperlink as an informal receiving object; if the frequency F is greater than or equal to a frequency threshold, the receiving object corresponding to the MAC address is transmitted to the sender; if the frequency F is less than the frequency threshold, the receiving object's access to the hyperlink is rejected.
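By way of non-limiting illustration only, the frequency check F = P/M may be sketched in Python as follows. The embodiment does not say what happens when M is zero, that is, when the receiving object has never before clicked a hyperlink as an informal receiving object; forwarding such a request to the sender is an assumption of this sketch. The function names and the returned strings are hypothetical.

```python
def grant_frequency(granted: int, informal_clicks: int) -> float:
    """F = P / M, where P is the number of grants and M the number of informal clicks."""
    if informal_clicks <= 0:
        raise ValueError("M must be positive to compute F = P / M")
    return granted / informal_clicks


def route_informal_request(
    granted: int, informal_clicks: int, frequency_threshold: float
) -> str:
    """Decide whether a request from an informal receiving object reaches the sender."""
    if informal_clicks == 0:
        # Assumption: with no history there is nothing to judge, so ask the sender.
        return "forward_to_sender"
    if grant_frequency(granted, informal_clicks) >= frequency_threshold:
        return "forward_to_sender"
    return "reject"


# Granted 3 of 4 earlier requests, threshold 0.5: the sender is asked again.
decision = route_informal_request(granted=3, informal_clicks=4, frequency_threshold=0.5)
```

The effect of the check is that a receiving object whose earlier requests were mostly refused is rejected automatically, so that the sender is not repeatedly interrupted by requests that are unlikely to be granted.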
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.