CN115082947B

CN115082947B - Paper letter quick collecting, sorting and reading system

Info

Publication number: CN115082947B
Application number: CN202210822765.8A
Authority: CN
Inventors: 李振国; 金雷; 刘坤; 王国清
Original assignee: Jiangsu Chuhuai Software Technology Development Co ltd
Current assignee: Jiangsu Chuhuai Software Technology Development Co ltd
Priority date: 2022-07-12
Filing date: 2022-07-12
Publication date: 2023-08-15
Anticipated expiration: 2042-07-12
Also published as: CN115082947A

Abstract

The invention discloses a system for rapidly collecting, sorting and reading paper letters, relating to the technical field of letter content processing. The system comprises: a paper letter scanning processing module, which scans and recognizes each letter material separately with a rapid scanner, automatically transfers the letter content obtained by scanning and recognition to the system input interface, and generates a letter list corresponding to each letter material; a content processing module, which performs comparison and review and extracts elements, performs semantic decomposition to obtain an appeal part and a fact description part corresponding to each letter material, and identifies the civil public opinion hotspot category to which the material belongs; an attention index calculation module, which calculates attention indexes for the letter materials; an importance index calculation module, which calculates importance indexes for the letter materials; and a pushing module, which calculates the comprehensive push attention degree of each letter material and obtains a list of letters to be handled that is pushed to the staff.

Description

Paper letter quick collecting, sorting and reading system

Technical Field

The invention relates to the technical field of letter content processing, in particular to a system for rapidly collecting, sorting and reading paper letters.

Background

Expressing opinions and appeals in the form of letters remains necessary, because a letter directly reflects how much importance the submitter attaches to the matter. For the staff who handle letters, however, the letter materials submitted are often numerous and fall into many categories. Because staff time and energy are limited, and because letter handling has always been held to a high standard of care, a submitter may wait a long time for feedback after submitting letter material. The traditional way of processing letter material requires manual information collection and manual entry, which is cumbersome and error-prone; the time cost is high, and the related work must be completed manually by staff, which wastes time and labor.

For letter processing, when a large amount of letter material is received in the same period, whether the appeals shared by many submitters can be identified quickly, efficiently and accurately, and submitted to the relevant responsible personnel as far as possible, is the key to the quality and efficiency of the work.

Disclosure of Invention

The invention aims to provide a system for rapidly collecting, sorting and reading paper letters so as to solve the problems in the background technology.

In order to solve the above technical problems, the invention provides the following technical solution: a system for rapidly collecting, sorting and reading paper letters, comprising a paper letter scanning processing module, a content processing module, an attention index calculation module, an importance index calculation module and a pushing module;

the paper letter scanning processing module is used for scanning and recognizing each letter material separately with a rapid scanner, automatically transferring the letter content obtained by scanning and recognition to a system input interface, and generating a letter list corresponding to each letter material; the letter material comprises a plurality of paper letters; the categories of paper letters comprise complaints, identity certificates and proxy agent certificates; the paper letters may be handwritten or printed; the source information of each letter material is registered; the source information and the material content obtained by scanning and recognition are summarized and displayed in the letter list; the letter list comprises a list two-dimensional code, in which a list number is stored;

the content processing module is used for comparing and reviewing the letter content obtained after each scan against the corresponding original material and extracting elements, and, once the comparison and review are found to be correct, setting the letter content to enter the system's to-be-handled interface automatically; setting a handling period, and performing semantic decomposition on each profile information part presented in the profile information element column within the handling period, to obtain an appeal part and a fact description part corresponding to each letter material; and identifying, based on the semantic text features of the appeal part and the fact description part of each letter material, the civil public opinion hotspot category to which each letter material belongs;

the attention index calculation module is used for extracting the contents of each letter material presented in the to-be-handled interface in the handling period and calculating attention indexes for each letter material based on the characteristic word distribution situation in the appeal part in each letter material content;

the important index calculation module is used for extracting the contents of each letter material presented in the to-be-handled interface in the handling period and calculating an important index for each letter material based on the characteristic word distribution situation in the fact description part in each letter material content;

the pushing module is used for obtaining comprehensive pushing attention of each letter material according to the attention index and the importance index corresponding to each letter material and the generation time of the list two-dimensional code corresponding to each letter material; and based on the comprehensive pushing attention degree of each letter material, sorting all the letter materials in the to-be-handled interface to obtain a to-be-handled letter list pushed to the staff.

Further, the content processing module comprises an element extraction processing unit, a semantic decomposition unit and a hot spot identification unit;

the element extraction processing unit is used for extracting elements from the letter content automatically transferred into the system input interface and automatically filling each extracted element into its element column; the element columns comprise letter writer information, profile information, the problem area and the system department to which the problem belongs; the content of each element column is compared and checked against the original one by one, and letter content that passes the comparison and check without error is set to enter the system's to-be-handled interface automatically;

the semantic decomposition unit is used for capturing civil news public opinion data in real time from the Internet, including mainstream news websites and new media websites, performing semantic decomposition on the civil news public opinion data, and extracting the keyword sets $X_1, X_2, \ldots, X_n$ corresponding to several categories of civil public opinion hotspots, where $X_1, X_2, \ldots, X_n$ denote the keyword sets corresponding to the 1st, 2nd, …, n-th categories of civil public opinion hotspots respectively; performing semantic decomposition on each profile information part presented in the profile information element column within the handling period, to obtain the appeal part and the fact description part corresponding to each letter material; and extracting keywords from the appeal part and the fact description part of each letter material to obtain the keyword sets $Y_1, Y_2, \ldots, Y_m$, where $Y_1, Y_2, \ldots, Y_m$ denote the keyword sets extracted for the 1st, 2nd, …, m-th letter materials respectively;

the hot spot identification unit is used for calculating, one by one, the similarity between the keyword set of each letter material and the keyword set of each category of civil public opinion hotspot; setting a similarity threshold; selecting, for each letter material, the civil public opinion hotspot categories whose similarity is greater than the similarity threshold and marking the material with those categories, to obtain the civil public opinion hotspot categories to which each letter material belongs; and accumulating the number of category marks for each letter material;

identifying the civil public opinion hotspot category of each letter material lets staff quickly grasp which hotspots the appeals raised by letter submitters in a given period mainly concern; this lays the technical groundwork for subsequently grasping the appeals shared by many submitters, and for subsequently calculating the attention degree of the material provided by each submitter.
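The hotspot identification step described above can be sketched as follows. The text does not say how the similarity between two keyword sets is computed, so this sketch assumes the Jaccard coefficient; the function names `jaccard` and `tag_hotspots`, the sample keywords and the threshold value are all hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed similarity measure: |a & b| / |a | b| (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tag_hotspots(letter_keywords: Dict[str, Set[str]],
                 hotspot_keywords: Dict[str, Set[str]],
                 threshold: float) -> Dict[str, List[str]]:
    """For each letter (keyword set Y), list the hotspot categories (keyword set X)
    whose similarity to the letter exceeds the threshold."""
    return {
        letter_id: [cat for cat, x in hotspot_keywords.items()
                    if jaccard(y, x) > threshold]
        for letter_id, y in letter_keywords.items()
    }


# Hypothetical hotspot keyword sets X_1, X_2 and letter keyword sets Y_1, Y_2.
hotspots = {
    "housing": {"rent", "landlord", "deposit"},
    "transport": {"bus", "fare", "route"},
}
letters = {
    "L001": {"rent", "deposit", "refund"},
    "L002": {"bus", "route", "rent"},
}

tags = tag_hotspots(letters, hotspots, threshold=0.3)
# Number of category marks accumulated for each letter material.
mark_counts = {letter_id: len(cats) for letter_id, cats in tags.items()}
print(tags, mark_counts)
```

A letter may receive several category marks or none; the accumulated mark count corresponds to the final step of the unit.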

Further, the attention index calculation module comprises a labeling area identification processing unit, a first attention index calculation unit, a second attention index calculation unit and a third attention index calculation unit;

the labeling area identification processing unit is used for labeling areas in the scanned letter content corresponding to each letter material and for merging the labeling areas based on their distribution characteristics;

the first attention index calculation unit is used for receiving the data in the labeling area identification processing unit and calculating a first attention index for each letter material;

the second attention index calculation unit is used for receiving the data from the labeling area identification processing unit and calculating a second attention index for each letter material;

the third attention index calculation unit is used for receiving the data from the labeling area identification processing unit and calculating a third attention index for each letter material.

Further, the labeling area identifying and processing unit includes:

capturing in advance all tone feature words or phrases and all sensitive feature words or phrases from big data, and collecting all of these tone feature words or phrases and sensitive feature words or phrases into a feature word library; and setting a degree grade number for each feature word or phrase in the feature word library;

acquiring the typeset text content of the appeal part obtained after each letter material is scanned; examining the content of the materials classified under each category of civil public opinion hotspot; and, based on the feature word library, labeling on the typeset text content of the appeal part the tone feature words or phrases and the sensitive feature words or phrases that appear in the appeal part of each material; each labeled word or phrase corresponds to one first labeling area;

capturing the line interval word number C between adjacent first labeling areas; setting an interval word number threshold; and, if the line interval word number C between two adjacent first labeling areas is smaller than the interval word number threshold, also labeling the unlabeled words between the two adjacent first labeling areas, thereby generating a second labeling area formed by merging the two adjacent first labeling areas with the unlabeled interval between them;

the purpose of labeling areas in the appeal part obtained from each letter material is to identify how psychologically urgent the submitter was when stating the request or complaint; that is, the submitter's emotional instability index is defined by capturing the proportion of the whole text taken up by sensitive words and tone words, so that staff can grasp the situation and are reminded, when giving feedback on the materials, to pay particular attention to such material or to handle it with priority, provided that the handling of other letter materials is not affected.
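The labeling and merging steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `find_first_areas` and `merge_areas`, the sample lexicon with its grade values, and the example text are all hypothetical. The sketch also assumes that the interval is measured in characters, and that a run of three or more closely spaced first labeling areas merges into a single second labeling area, a case the text leaves open.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open character range [start, end)


def find_first_areas(text: str, lexicon: Dict[str, int]) -> List[Span]:
    """Return the sorted spans of every lexicon word or phrase found in text.
    Each occurrence is one first labeling area."""
    spans: List[Span] = []
    for term in lexicon:
        start = text.find(term)
        while start != -1:
            spans.append((start, start + len(term)))
            start = text.find(term, start + 1)
    return sorted(spans)


def merge_areas(first: List[Span], gap_threshold: int) -> Tuple[List[Span], List[Span]]:
    """Group first labeling areas whose gap to the previous area is below the
    threshold. Returns (isolated first areas, merged second areas)."""
    groups: List[List[Span]] = []
    for span in sorted(first):
        if groups and span[0] - max(end for _, end in groups[-1]) < gap_threshold:
            groups[-1].append(span)  # close enough: absorb into current group
        else:
            groups.append([span])    # too far: start a new group
    isolated = [g[0] for g in groups if len(g) == 1]
    merged = [(g[0][0], max(end for _, end in g)) for g in groups if len(g) > 1]
    return isolated, merged


# Hypothetical feature word library: word -> degree grade number.
lexicon = {"strongly": 2, "urgent": 3, "unacceptable": 4}
text = "I strongly demand an urgent reply. Thank you. This is unacceptable!"

first = find_first_areas(text, lexicon)
isolated, merged = merge_areas(first, gap_threshold=15)
print([text[s:e] for s, e in isolated])
print([text[s:e] for s, e in merged])
```

Here "strongly" and "urgent" are 11 characters apart and merge into one second labeling area, while "unacceptable" is far enough away to remain a first labeling area on its own.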

Further, the first attention index calculation unit includes:

receiving first labeling area information and second labeling area information in a labeling area identification processing unit;

calculating a first attention index $\mathrm{Attention}^1$ for each letter material, wherein $Ya_i$ denotes the text character length of the i-th first labeling area in the appeal part of each letter material; $Ya_j$ denotes the text character length of the j-th second labeling area in the appeal part of each letter material; and $A$ denotes the total text length of the appeal part of each letter material.

Further, the second attention index calculation unit includes:

calculating a second attention index $\mathrm{Attention}^2$ for each letter material:

$$\mathrm{Attention}^2 = \sum_i \mathrm{Dgreee}(a_i) + \sum_j \mathrm{avDgreee}(a_j)$$

where $\mathrm{Dgreee}(a_i)$ denotes the degree grade number corresponding to the i-th first labeling area in the appeal part of each letter material, and $\mathrm{avDgreee}(a_j)$ denotes the average degree grade number corresponding to the j-th second labeling area in the appeal part of each letter material.
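As a worked illustration of the second attention index, the sketch below sums the degree grade numbers of the first labeling areas and the average degree grade numbers of the second labeling areas. It assumes that the average for a second labeling area is taken over the grades of the feature words it contains, which the text does not state explicitly; the function name and the sample grades are hypothetical.

```python
from statistics import mean
from typing import List


def second_attention_index(first_area_grades: List[int],
                           second_area_grades: List[List[int]]) -> float:
    """Attention^2 = sum of Dgreee(a_i) over first labeling areas
                   + sum of avDgreee(a_j) over second labeling areas.
    Each inner list holds the grades of the feature words inside one
    second labeling area (assumed basis for the average)."""
    return sum(first_area_grades) + sum(mean(g) for g in second_area_grades)


# One first labeling area of grade 3; two second labeling areas whose
# member words have grades (2, 4) and (1, 5, 3), averaging 3 each.
attention2 = second_attention_index([3], [[2, 4], [1, 5, 3]])
print(attention2)
```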

Further, the third attention index calculation unit includes:

acquiring the typeset text of each letter material before scanning and recognition, and capturing the feature symbol formats that appear before, within and after each first labeling area and each second labeling area of the appeal part in the typeset text; marking and highlighting, one by one, the parts that carry feature symbol formats in the appeal part content obtained after scanning and recognition; wherein the feature symbol formats include exclamation marks, question marks, a font different from that of adjacent text, a font size different from that of adjacent text, a font color different from that of adjacent text, underlining, bold and highlighting;

calculating a third attention index $\mathrm{Attention}^3$ for each material:

$$\mathrm{Attention}^3 = \sum_i \left(R_1 a_i \cdot R_2 a_i\right) + \sum_j \left(R_1 a_j \cdot R_2 a_j\right)$$

where $R_1 a_i$ denotes the number of types of feature symbol formats appearing before, within and after the i-th first labeling area in the appeal part of each letter material; $R_2 a_i$ denotes the total number of feature symbol formats appearing before, within and after the i-th first labeling area in the appeal part of each letter material; $R_1 a_j$ denotes the number of types of feature symbol formats appearing before, within and after the j-th second labeling area in the appeal part of each letter material; and $R_2 a_j$ denotes the total number of feature symbol formats appearing before, within and after the j-th second labeling area in the appeal part of each letter material.
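The third attention index can be illustrated with a short sketch. For each labeling area it multiplies the number of distinct feature symbol format types by the total number of feature symbol formats observed before, within and after the area, then sums over all first and second labeling areas. The function name and the string labels used for the formats are hypothetical.

```python
from typing import List


def third_attention_index(first_area_marks: List[List[str]],
                          second_area_marks: List[List[str]]) -> int:
    """Attention^3 = sum over areas of (number of distinct format types R1)
    times (total number of format occurrences R2).
    Each inner list holds the format marks observed around one area."""
    def score(marks: List[str]) -> int:
        return len(set(marks)) * len(marks)  # R1 * R2 for one area

    return (sum(score(m) for m in first_area_marks)
            + sum(score(m) for m in second_area_marks))


# First areas: one with two exclamation marks and bold text, one with no marks.
first_marks = [["exclamation", "exclamation", "bold"], []]
# Second areas: one that is underlined.
second_marks = [["underline"]]

attention3 = third_attention_index(first_marks, second_marks)
print(attention3)
```

In this example the first area contributes 2 types times 3 occurrences, the unmarked area contributes nothing, and the underlined second area contributes 1.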

Further, the importance index calculation module includes:

extracting the fact description parts of the letter materials in the same category; identifying, decomposing and extracting the semantic elements of each fact description part; the semantic elements include the time of the event, the persons involved in the event, the place of the event, the main conflict of the event, the background of the event and the course of the event; obtaining a semantic element set corresponding to each letter material;

acquiring the similarity between every two semantic element sets within the same category of material; setting a similarity threshold; and gathering together the semantic element sets whose similarity is greater than the similarity threshold, to obtain several semantic element set centers, wherein one semantic element set center comprises several semantic element sets whose similarity is greater than the similarity threshold;

assigning each letter material within the same category of materials to its corresponding semantic element set center; calculating an importance index $\mathrm{import}_e$ for each semantic element set center to which the materials belong, wherein $\mathrm{import}_e$ denotes the importance index of the e-th semantic element set center; $M_e$ denotes the average similarity between the semantic element sets in the e-th semantic element set center; and $K_e$ denotes the total number of semantic element sets in the e-th semantic element set center;

attaching to each semantic element set the importance index value import of its semantic element set center;

in the above process, grouping into semantic element set centers reveals how many submitters within the same category of materials have submitted letters based on the same appeal or the same stated facts; the larger the importance index value import of a semantic element set, the wider the range of submitters concerned; seen from another angle, it also indicates that staff should examine that category of material with priority in subsequent processing and pay closer attention to the problems raised in the corresponding appeals.
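The grouping into semantic element set centers can be sketched as follows. The text specifies neither the similarity measure nor the grouping procedure, so this sketch assumes the Jaccard coefficient and a simple greedy rule in which a set joins the first center whose members are all more similar to it than the threshold. It computes only the two quantities the text defines, $M_e$ and $K_e$, and does not attempt the importance index formula itself. All names and sample data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed similarity measure between two semantic element sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_centers(element_sets: Dict[str, Set[str]], threshold: float) -> List[List[str]]:
    """Greedy grouping (an assumption): a set joins the first center in which
    its similarity to every member exceeds the threshold, else starts a new one."""
    centers: List[List[str]] = []
    for letter_id, elems in element_sets.items():
        for center in centers:
            if all(jaccard(elems, element_sets[m]) > threshold for m in center):
                center.append(letter_id)
                break
        else:
            centers.append([letter_id])
    return centers


def center_stats(center: List[str], element_sets: Dict[str, Set[str]]) -> Tuple[float, int]:
    """Return (M_e, K_e): average pairwise similarity and number of sets.
    Assumption: a center with a single set is given M_e = 1.0."""
    pairs = list(combinations(center, 2))
    if not pairs:
        return 1.0, len(center)
    m_e = sum(jaccard(element_sets[a], element_sets[b]) for a, b in pairs) / len(pairs)
    return m_e, len(center)


element_sets = {
    "A": {"2022-05", "Riverside Estate", "heating outage"},
    "B": {"2022-05", "Riverside Estate", "heating outage", "property company"},
    "C": {"2022-06", "Route 12", "fare increase"},
}
centers = build_centers(element_sets, threshold=0.5)
print(centers, [center_stats(c, element_sets) for c in centers])
```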

Further, the pushing module includes:

acquiring the first attention index $\mathrm{Attention}^1$, the second attention index $\mathrm{Attention}^2$, the third attention index $\mathrm{Attention}^3$ and the importance index import corresponding to each letter material in the different civil public opinion hotspot categories; sorting the letter materials belonging to each civil public opinion hotspot category by the generation time of their list two-dimensional codes, to obtain the time-order serial number corresponding to each material;

calculating the comprehensive push attention degree for each material:

$$F = \mathrm{Attention}^1 + \mathrm{Attention}^2 + \mathrm{Attention}^3 \cdot \mathrm{import}^{st}$$

where $F$ denotes the comprehensive push attention degree and $st$ denotes the time-order serial number corresponding to the material;

sorting the letter materials within each civil public opinion hotspot category by comprehensive push attention degree from largest to smallest, to obtain a list number sequence set for each civil public opinion hotspot category; and pushing the materials to be handled to the staff in the order of the list numbers in the list number sequence set.
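The scoring and ordering performed by the pushing module can be sketched as follows for the materials of a single hotspot category. The formula is read literally with standard operator precedence, so the exponent $st$ applies to import only and the product applies to the third attention index only. The `Letter` record, the function names and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Letter:
    list_number: str    # list number stored in the list two-dimensional code
    attention1: float   # first attention index
    attention2: float   # second attention index
    attention3: float   # third attention index
    importance: float   # importance index "import" of the letter's set center
    st: int             # time-order serial number by code generation time


def push_score(letter: Letter) -> float:
    """F = Attention^1 + Attention^2 + Attention^3 * import^st."""
    return (letter.attention1 + letter.attention2
            + letter.attention3 * letter.importance ** letter.st)


def push_order(letters: List[Letter]) -> List[str]:
    """List numbers sorted by comprehensive push attention degree, largest first."""
    return [l.list_number for l in sorted(letters, key=push_score, reverse=True)]


letters = [
    Letter("B-001", 0.2, 9, 7, 0.5, 1),
    Letter("B-002", 0.1, 4, 2, 0.5, 2),
    Letter("B-003", 0.4, 12, 3, 0.9, 3),
]
print(push_order(letters))
```

Running `push_order` once per hotspot category yields the list number sequence set for that category.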

Compared with the prior art, the invention has the following beneficial effects: by means of artificial intelligence, the invention makes full use of the precision and predictability of modern technology and thereby improves the quality of the work; modern technology is applied scientifically to replace manual work, further improving working efficiency; based on technologies such as image recognition and natural language processing, core artificial intelligence capabilities and model algorithms such as voice recognition, OCR recognition, key element extraction and automatic generation of item profiles are developed in a customized manner, deepening the application of artificial intelligence at the two levels of assisted item handling and letter submission services, further shortening the handling period and improving the precision and standardization of handling; in processing letter materials, the invention computes the relevant indexes for the letter material provided by each submitter, thereby assisting staff in their business processing and comprehensively improving the quality of letter processing work.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:

FIG. 1 is a schematic diagram of a system for rapid collection, sorting and reading of paper letters according to the present invention;

FIG. 2 is a schematic flow chart of a method in the system for rapid collection, sorting and reading of paper letters according to the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to FIG. 1 and FIG. 2, the present invention provides the following technical solution: a system for rapidly collecting, sorting and reading paper letters, comprising a paper letter scanning processing module, a content processing module, an attention index calculation module, an importance index calculation module and a pushing module;

the content processing module comprises an element extraction processing unit, a semantic decomposition unit and a hot spot identification unit;

the semantic decomposition unit is used for capturing civil news public opinion data in real time from the Internet, including mainstream news websites and new media websites, performing semantic decomposition on the civil news public opinion data, and extracting the keyword sets $X_1, X_2, \ldots, X_n$ corresponding to several categories of civil public opinion hotspots, where $X_1, X_2, \ldots, X_n$ denote the keyword sets corresponding to the 1st, 2nd, …, n-th categories of civil public opinion hotspots respectively; performing semantic decomposition on each profile information part presented in the profile information element column within the handling period, to obtain the appeal part and the fact description part corresponding to each letter material; and extracting keywords from the appeal part and the fact description part of each letter material to obtain the keyword sets $Y_1, Y_2, \ldots, Y_m$, where $Y_1, Y_2, \ldots, Y_m$ denote the keyword sets extracted for the 1st, 2nd, …, m-th letter materials respectively;

the attention index calculating module comprises a labeling area identification processing unit, a first attention index calculating unit, a second attention index calculating unit and a third attention index calculating unit;

the labeling area identification processing unit comprises:

wherein the first attention index calculation unit includes:

calculating a first attention index $\mathrm{Attention}^1$ for each letter material, wherein $Ya_i$ denotes the text character length of the i-th first labeling area in the appeal part of each letter material; $Ya_j$ denotes the text character length of the j-th second labeling area in the appeal part of each letter material; and $A$ denotes the total text length of the appeal part of each letter material;

the second attention index calculation unit is used for receiving the data in the labeling area identification processing unit and calculating a second attention index for each letter material;

wherein the second attention index calculation unit includes:

calculating a second attention index $\mathrm{Attention}^2$ for each letter material:

$$\mathrm{Attention}^2 = \sum_i \mathrm{Dgreee}(a_i) + \sum_j \mathrm{avDgreee}(a_j)$$

where $\mathrm{Dgreee}(a_i)$ denotes the degree grade number corresponding to the i-th first labeling area in the appeal part of each letter material, and $\mathrm{avDgreee}(a_j)$ denotes the average degree grade number corresponding to the j-th second labeling area in the appeal part of each letter material;

a third attention index calculation unit for receiving the data in the labeling area identification processing unit and calculating a third attention index for each letter material;

wherein the third attention index calculation unit includes:

calculating a third attention index $\mathrm{Attention}^3$ for each material:

$$\mathrm{Attention}^3 = \sum_i \left(R_1 a_i \cdot R_2 a_i\right) + \sum_j \left(R_1 a_j \cdot R_2 a_j\right)$$

where $R_1 a_i$ denotes the number of types of feature symbol formats appearing before, within and after the i-th first labeling area in the appeal part of each letter material; $R_2 a_i$ denotes the total number of feature symbol formats appearing before, within and after the i-th first labeling area in the appeal part of each letter material; $R_1 a_j$ denotes the number of types of feature symbol formats appearing before, within and after the j-th second labeling area in the appeal part of each letter material; and $R_2 a_j$ denotes the total number of feature symbol formats appearing before, within and after the j-th second labeling area in the appeal part of each letter material;

wherein, the importance index calculation module includes:

the pushing module is used for obtaining comprehensive pushing attention of each letter material according to the attention index and the importance index corresponding to each letter material and the generation time of the list two-dimensional code corresponding to each letter material; based on the comprehensive pushing attention degree of each letter material, all the letter materials in the to-be-handled interface are arranged to obtain a to-be-handled letter list pushed to a worker;

wherein, the push module includes:

calculating the comprehensive push attention degree for each material:

$$F = \mathrm{Attention}^1 + \mathrm{Attention}^2 + \mathrm{Attention}^3 \cdot \mathrm{import}^{st}$$

It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and does not limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or substitute equivalents for some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A system for rapidly collecting, sorting and reading paper letters, characterized in that the system comprises: a paper letter scanning processing module, a content processing module, an attention index calculation module, an importance index calculation module and a pushing module;

the paper letter scanning processing module is used for scanning and recognizing each letter material separately with a rapid scanner, automatically transferring the letter content obtained by scanning and recognition to a system input interface, and generating a letter list corresponding to each letter material; the letter material comprises a plurality of paper letters; the categories of paper letters comprise complaints, identity certificates and proxy agent certificates; the paper letters may be handwritten or printed; the source information of each letter material is registered; the source information and the material content obtained by scanning and recognition are summarized and displayed in the letter list; the letter list comprises a list two-dimensional code, in which a list number is stored;

the content processing module is used for comparing and reviewing the letter content obtained after each scan against the corresponding original material and extracting elements, and, once the comparison and review are found to be correct, setting the letter content to enter the system's to-be-handled interface automatically; setting a handling period, and performing semantic decomposition on each profile information part presented in the profile information element column within the handling period, to obtain an appeal part and a fact description part corresponding to each letter material; and identifying, based on the semantic text features of the appeal part and the fact description part of each letter material, the civil public opinion hotspot category to which each letter material belongs;

the attention index calculation module is used for extracting the content of each letter material presented in the to-be-handled interface in the handling period, and calculating the attention index for each letter material based on the characteristic word distribution condition in the appeal part in each letter material content;

the attention index calculation module comprises a labeling area identification processing unit, a first attention index calculation unit, a second attention index calculation unit and a third attention index calculation unit;

the labeling area identification processing unit comprises:

capturing in advance all tone feature words or phrases and all sensitive feature words or phrases from big data, and collecting all of these tone feature words or phrases and sensitive feature words or phrases into a feature word library; and setting a degree grade number for each feature word or phrase in the feature word library;

acquiring the typeset text content of the appeal part obtained after each letter material is scanned; examining the content of the materials classified under each category of civil public opinion hotspot; and, based on the feature word library, labeling on the typeset text content of the appeal part the tone feature words or phrases and the sensitive feature words or phrases that appear in the appeal part of each material; each labeled word or phrase corresponds to one first labeling area;

capturing the line interval word number C between adjacent first labeling areas; setting an interval word number threshold; and, if the line interval word number C between two adjacent first labeling areas is smaller than the interval word number threshold, also labeling the unlabeled words between the two adjacent first labeling areas, thereby generating a second labeling area formed by merging the two adjacent first labeling areas with the unlabeled interval between them;

the first attention index calculation unit includes: receiving the first labeling area information and the second labeling area information from the labeling area identification processing unit; and calculating a first attention index $\mathrm{Attention}^1$ for each letter material, wherein $Ya_i$ denotes the text character length of the i-th first labeling area in the appeal part of each letter material; $Ya_j$ denotes the text character length of the j-th second labeling area in the appeal part of each letter material; and $A$ denotes the total text length of the appeal part of each letter material;

the second attention index calculation unit includes: receiving the first labeling area information and the second labeling area information from the labeling area identification processing unit; calculating a second attention index $\mathrm{Attention}^{2}$ for each letter material:

$\mathrm{Attention}^{2} = \sum_i \mathrm{Dgreee}(a_i) + \sum_j \mathrm{avDgreee}(a_j)$

wherein $\mathrm{Dgreee}(a_i)$ represents the degree grade number corresponding to the i-th first labeling area in the appeal part of each letter material; $\mathrm{avDgreee}(a_j)$ represents the average degree grade number corresponding to the j-th second labeling area in the appeal part of each letter material;
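As a non-limiting illustration, the second attention index can be computed as sketched below. The function name and input shapes are assumptions: each first labeling area is given by its degree grade number, and each second labeling area by the list of degree grade numbers of the labeled words it contains, whose mean is taken here as the average degree grade number. Whether first labeling areas that were merged into a second labeling area also count in the first sum is not stated in the text and is left to the caller.

```python
from statistics import mean
from typing import List


def second_attention_index(first_area_grades: List[float],
                           second_area_grades: List[List[float]]) -> float:
    """Sum of grades over first areas plus sum of average grades over second areas."""
    first_sum = sum(first_area_grades)
    # Skip empty grade lists so that mean() is never called on an empty sequence.
    second_sum = sum(mean(grades) for grades in second_area_grades if grades)
    return first_sum + second_sum


# Two first areas with grades 2 and 3; two second areas averaging 2 and 4.
print(second_attention_index([2, 3], [[1, 3], [4]]))
```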

the third attention index calculation unit is used for receiving the data in the labeling area identification processing unit and calculating a third attention index for each letter material;

the third attention index calculation unit includes:

acquiring the text layout of each letter material as it was before scanning and recognition, and capturing the characteristic symbol formats that appear before, within and after each first labeling area and each second labeling area of the corresponding appeal part in that text layout; marking and highlighting, one by one, the parts bearing characteristic symbol formats in the appeal part content obtained after scanning and recognition; wherein the characteristic symbol formats comprise exclamation marks, question marks, a font different from that of the adjacent text, a font size different from that of the adjacent text, a font color different from that of the adjacent text, underlining, bold and highlighting;

calculating a third attention index $\mathrm{Attention}^{3}$ for each letter material:

$\mathrm{Attention}^{3} = \sum_i (R_1 a_i \cdot R_2 a_i) + \sum_j (R_1 a_j \cdot R_2 a_j)$

wherein $R_1 a_i$ represents the number of types of characteristic symbol formats appearing before, within and after the i-th first labeling area in the appeal part of each letter material; $R_2 a_i$ represents the total number of characteristic symbol formats appearing before, within and after the i-th first labeling area in the appeal part of each letter material; $R_1 a_j$ represents the number of types of characteristic symbol formats appearing before, within and after the j-th second labeling area in the appeal part of each letter material; $R_2 a_j$ represents the total number of characteristic symbol formats appearing before, within and after the j-th second labeling area in the appeal part of each letter material;
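As a non-limiting illustration, the third attention index can be computed as sketched below. The function name and the input shape are assumptions: each labeling area is represented by the list of characteristic symbol formats found before, within and after it, with one list entry per occurrence. The number of distinct entries gives the number of types and the list length gives the total number.

```python
from typing import List


def third_attention_index(first_area_formats: List[List[str]],
                          second_area_formats: List[List[str]]) -> int:
    """Sum over all areas of (number of format types) * (total number of formats)."""
    total = 0
    for formats in first_area_formats + second_area_formats:
        type_count = len(set(formats))  # number of distinct format types
        occurrence_count = len(formats)  # total number of format occurrences
        total += type_count * occurrence_count
    return total


first = [["exclamation", "exclamation", "bold"]]  # 2 types, 3 occurrences
second = [["underline"], []]                      # 1 type, 1 occurrence; then none
print(third_attention_index(first, second))
```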

the important index calculation module is used for extracting the content of each letter material presented in the to-be-handled interface within the handling period, and calculating the important index for each letter material based on the distribution of feature words in the fact description part of that letter material's content;

the pushing module is used for obtaining comprehensive pushing attention of each letter material according to the attention index and the important index corresponding to each letter material and the generation time of the list two-dimensional code corresponding to each letter material; based on the comprehensive pushing attention degree of each letter material, all the letter materials in the to-be-handled interface are arranged to obtain a to-be-handled letter list pushed to a worker;

the pushing module comprises:

calculating the comprehensive push attention degree for each letter material:

$F = [(\mathrm{Attention}^{1} + \mathrm{Attention}^{2} + \mathrm{Attention}^{3}) \cdot \mathrm{import}]^{st}$

wherein $F$ represents the comprehensive push attention degree and $st$ represents the time sequence number corresponding to each letter material; within each folk public opinion hotspot category, sorting the letter materials by comprehensive push attention degree from largest to smallest to obtain a list number sequence set for each category; and pushing the letter materials to be handled to the staff in the order of the list numbers in the list number sequence sets.
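As a non-limiting illustration, the comprehensive push attention degree and the per-category ordering can be sketched as follows. The `Letter` record, its field names and the helper names are assumptions made for this sketch; the field `important` stands for the important index (`import` is a reserved word in Python). How the time sequence number `st` is derived from the generation time of the list two-dimensional code is not specified here and is taken as given.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Letter:
    list_number: str
    category: str      # folk public opinion hotspot category
    attention1: float
    attention2: float
    attention3: float
    important: float   # important index
    st: float          # time sequence number


def comprehensive_push_attention(letter: Letter) -> float:
    """F = [(Attention1 + Attention2 + Attention3) * import] ^ st."""
    attention_sum = letter.attention1 + letter.attention2 + letter.attention3
    return (attention_sum * letter.important) ** letter.st


def rank_by_category(letters: List[Letter]) -> Dict[str, List[str]]:
    """Return, per category, the list numbers sorted by F from largest to smallest."""
    grouped: Dict[str, List[Letter]] = defaultdict(list)
    for letter in letters:
        grouped[letter.category].append(letter)
    return {
        category: [item.list_number
                   for item in sorted(items, key=comprehensive_push_attention, reverse=True)]
        for category, items in grouped.items()
    }


letters = [
    Letter("B", "housing", 2, 2, 2, 0.5, 1),  # F = (6 * 0.5) ** 1
    Letter("A", "housing", 1, 2, 1, 0.5, 2),  # F = (4 * 0.5) ** 2
    Letter("C", "traffic", 1, 1, 1, 1.0, 1),
]
print(rank_by_category(letters))
```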

2. The rapid paper letter collecting, sorting and reading system according to claim 1, wherein the content processing module comprises an element extraction processing unit, a semantic decomposition unit and a hot spot identification unit;

the element extraction processing unit is used for extracting elements from the letter content automatically transferred into the system input interface and automatically filling each extracted element into its corresponding element column; the element columns comprise letter writer information, profile information, the problem area and the system department to which the problem belongs; the content of each element column is compared with and checked against the original one by one, and letter content that passes the comparison and check without error automatically enters the system to-be-handled interface;

the semantic decomposition unit is used for capturing civil news public opinion data in real time from the internet, including mainstream news websites and new media websites, performing semantic decomposition on the civil news public opinion data, and extracting the keyword sets $\{(X_1), (X_2), \ldots, (X_n)\}$ corresponding to several categories of folk public opinion hotspots; wherein $(X_1), (X_2), \ldots, (X_n)$ respectively represent the keyword sets corresponding to the 1st, 2nd, …, n-th categories of folk public opinion hotspots; performing semantic decomposition on each profile information part presented in the profile information element column within the handling period to obtain the appeal part and the fact description part corresponding to each letter material; extracting keywords from the appeal part and the fact description part of each letter material to obtain the keyword sets $\{(Y_1), (Y_2), \ldots, (Y_m)\}$ corresponding to the letter materials; wherein $(Y_1), (Y_2), \ldots, (Y_m)$ respectively represent the keyword sets extracted for the 1st, 2nd, …, m-th letter materials;

the hot spot identification unit is used for calculating, in turn, the similarity between the keyword set of each letter material and the keyword sets of the several categories of public opinion hotspots, setting a similarity threshold, selecting for each material the public opinion hotspot categories whose similarity exceeds the similarity threshold, category-marking the selected public opinion hotspot categories to obtain the public opinion hotspot categories to which each letter material belongs, and accumulating the number of category marks for each letter material.
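As a non-limiting illustration, the hotspot identification step can be sketched as follows. The claim does not name a similarity measure; Jaccard similarity between keyword sets is used here purely as an assumption, and the function names and sample keywords are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two keyword sets (assumed measure, not named in the claim)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify_hotspots(letter_keywords: Set[str],
                      hotspot_keywords: Dict[str, Set[str]],
                      threshold: float) -> List[str]:
    """Return the hotspot categories whose similarity with the letter exceeds the threshold."""
    return [category
            for category, keywords in hotspot_keywords.items()
            if jaccard(letter_keywords, keywords) > threshold]


hotspots = {
    "housing": {"rent", "demolition", "property"},
    "education": {"school", "tuition"},
}
letter = {"rent", "demolition", "compensation"}
marks = identify_hotspots(letter, hotspots, threshold=0.3)
print(marks, len(marks))  # category marks and the accumulated number of marks
```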

3. The rapid paper letter collecting, sorting and reading system according to claim 1, wherein the important index calculation module comprises:

obtaining the similarity between every two semantic element sets in materials of the same category, setting a similarity threshold, and grouping the semantic element sets whose similarity exceeds the similarity threshold to obtain several semantic element set centers, wherein one semantic element set center contains several semantic element sets whose similarity exceeds the similarity threshold;

wherein $\mathrm{import}_e$ represents the important index of the e-th semantic element set center; $M_e$ represents the average similarity value between the semantic element sets in the e-th semantic element set center; $K_e$ represents the total number of semantic element sets in the e-th semantic element set center;

and attaching to each semantic element set the important index value import of its corresponding semantic element set center.
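As a non-limiting illustration, the two quantities that enter the important index of a semantic element set center, namely the average similarity value M_e and the total number K_e, can be computed as sketched below. The similarity measure (Jaccard), the function names, and the value 0.0 returned for a center holding fewer than two sets are assumptions. The sketch stops short of combining M_e and K_e into the important index itself.

```python
from itertools import combinations
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two semantic element sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def center_statistics(element_sets: List[Set[str]]) -> Tuple[float, int]:
    """Return (M_e, K_e): mean pairwise similarity and number of sets in one center."""
    k_e = len(element_sets)
    pairs = list(combinations(element_sets, 2))
    if not pairs:
        return 0.0, k_e  # assumed convention when no pair exists
    m_e = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return m_e, k_e


center = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}]
m_e, k_e = center_statistics(center)
print(round(m_e, 4), k_e)
```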