TWI769531B - Document confidentiality level management system and method

Info

Publication number: TWI769531B
Application number: TW109132910A
Authority: TW
Inventors: 許瑞愷; 羅文聰
Original assignee: 東海大學
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2022-07-01
Also published as: TW202213145A

Abstract

A document confidentiality level management system includes a plurality of document confidentiality level setting systems and a federated learning system. Each document confidentiality level setting system analyzes the confidentiality level of a document according to an analysis model, provides a first confidentiality level label according to the analysis result, and compares it with a second confidentiality level label provided by a user. If the two labels are the same, the document is encrypted or placed under permission control according to the first confidentiality level label; if they differ, the document is encrypted or placed under permission control according to the second confidentiality level label. Each document confidentiality level setting system encrypts the model parameters of its analysis model and transmits them to the federated learning system. The federated learning system generates an update parameter and transmits it to each document confidentiality level setting system, which uses the update parameter to update its analysis model. Accuracy and confidentiality are thereby improved.

Description

Document confidentiality level management system and method

The present invention relates to techniques for setting the confidentiality level of data documents, and in particular to a system and method that use artificial intelligence to set the confidentiality level of documents.

Traditionally, the confidentiality level of a document has mostly been set manually: a person reviews specific information in the document and determines its confidentiality level on that basis.

With the development of technology, documents once written on paper are increasingly presented as electronic documents. Unlike paper documents, whose content is difficult to modify, add to, or remove, the content of electronic documents can be modified directly with software. Current techniques for setting the confidentiality level of a document merely judge the level from content such as the file extension or keywords, or rely purely on a person reading the document to classify it.

In addition, when the content of a document is added to, reduced, or re-edited, its confidentiality level may change. Electronic documents in particular can be edited very quickly, so the confidentiality level must be analyzed promptly, or even re-determined in real time, which places a heavy burden on classification personnel.

Furthermore, because classification personnel are accustomed to the terms routinely used in their company or unit, they may fail to notice rare or new terms in a document that involve confidential matters, and may therefore misjudge the document's confidentiality level.

In view of this, an object of the present invention is to provide a document confidentiality level management system and method that, after a document is created or edited, classify the document's confidentiality level by artificial intelligence, combine that classification with the user's own judgment of the document's confidentiality level to decide which confidentiality level governs control of the document, and at the same time improve the accuracy of the artificial intelligence's classification.

To achieve the above object, the present invention provides a document confidentiality level management system comprising a plurality of document confidentiality level setting systems and a federated learning system. Each document confidentiality level setting system includes an artificial intelligence analysis module and a data leakage prevention server. The artificial intelligence analysis module receives a document, analyzes its confidentiality level according to an analysis model, and provides a first confidentiality level label according to the analysis result. The data leakage prevention server receives the first confidentiality level label and is connected to a terminal device, which is operated by a user and provides a second confidentiality level label. The data leakage prevention server compares the first confidentiality level label with the second confidentiality level label. If they are the same, the data leakage prevention server marks the document with the first confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the first confidentiality level label. If they differ, the data leakage prevention server marks the document with the second confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the second confidentiality level label. The federated learning system is connected to each document confidentiality level setting system. Each document confidentiality level setting system encrypts a model parameter of its analysis model to form an encrypted parameter and transmits it to the federated learning system. The federated learning system aggregates the encrypted parameters to generate an update parameter and transmits the update parameter to each document confidentiality level setting system, and each artificial intelligence analysis module updates its analysis model according to the update parameter.

To achieve the above object, the present invention further provides a document confidentiality level management method applied in the document confidentiality level management system. The method includes the following steps:

Each document confidentiality level setting system performs the following steps:

A1. A terminal device provides a document;

A2. The artificial intelligence analysis module analyzes the confidentiality level of the document according to an analysis model and, according to the analysis result, provides a first confidentiality level label to the data leakage prevention server;

A3. A user analyzes the confidentiality level of the document and provides a second confidentiality level label to the data leakage prevention server;

A4. The data leakage prevention server compares the first confidentiality level label with the second confidentiality level label. When the two labels are the same, the data leakage prevention server marks the document with the first confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the first confidentiality level label. When the two labels differ, the data leakage prevention server marks the document with the second confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the second confidentiality level label;

A5. A model parameter of the analysis model is encrypted to form an encrypted parameter, which is transmitted to the federated learning system;

The federated learning system performs the following steps:

B1. Receive the encrypted parameters transmitted by the document confidentiality level setting systems;

B2. Aggregate the encrypted parameters and generate an update parameter;

B3. Transmit the update parameter to each document confidentiality level setting system;

Each document confidentiality level setting system further performs the following steps:

A6. Each document confidentiality level setting system receives the update parameter;

A7. Each artificial intelligence analysis module updates its analysis model according to the update parameter.
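
The exchange in steps A5 through A7 can be illustrated with a minimal sketch. The source states only that the federated learning system "aggregates" the encrypted parameters; the weighted element-wise average used here (in the style of federated averaging) is an assumption, and the encryption and decryption of parameters is deliberately left out. The names `Params`, `aggregate`, and `apply_update` are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def aggregate(client_params: List[Params], weights: List[float]) -> Params:
    """Step B2 (sketch): weighted element-wise average of the clients' parameters.

    Assumption: parameters arrive already decrypted or are otherwise summable;
    the encryption scheme is not specified here.
    """
    if not client_params or len(client_params) != len(weights):
        raise ValueError("need exactly one weight per client parameter set")
    total = float(sum(weights))
    update: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        update[name] = [
            sum(w * p[name][i] for p, w in zip(client_params, weights)) / total
            for i in range(length)
        ]
    return update


def apply_update(model: Params, update: Params) -> Params:
    """Step A7 (sketch): overwrite local parameters with the update parameter."""
    return {name: list(update.get(name, values)) for name, values in model.items()}


# Two setting systems with equal weight.
site_a = {"w": [1.0, 2.0], "b": [0.0]}
site_b = {"w": [3.0, 4.0], "b": [1.0]}
update = aggregate([site_a, site_b], weights=[1, 1])
print(update)  # {'w': [2.0, 3.0], 'b': [0.5]}
```

Weighting each site by its number of training documents is one common choice; equal weights are used above only to keep the arithmetic easy to follow.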

The effect of the present invention is that each document confidentiality level setting system has its artificial intelligence analysis module analyze the confidentiality level of a document, has the user analyze the confidentiality level of the same document, and compares the two resulting levels for discrepancies in order to apply permission control to the document. Each document confidentiality level setting system also encrypts the model parameters of its analysis model and transmits them to the federated learning system to obtain an update parameter, then updates its existing analysis model according to the update parameter, improving the prediction accuracy of the artificial intelligence analysis module. Because each system transmits only encrypted model parameters, rather than the documents or the analysis model itself, leakage of confidential documents is effectively avoided and the data security of the documents in each document confidentiality level setting system is improved.

In addition, the present invention is applicable where the amount of training data available to an individual document confidentiality level setting system is insufficient. By making use of parameters from the other document confidentiality level setting systems, it effectively addresses that shortage and improves accuracy.

Having multiple users with similar standards for determining confidentiality levels share the same document confidentiality level setting system avoids the situation in which users whose standards differ widely share a single document confidentiality level setting system and cause the analysis model to become inaccurate.

To describe the present invention more clearly, embodiments are described in detail below with reference to the drawings. Referring to FIG. 1, which is a block diagram of a document confidentiality level setting system according to an embodiment of the present invention, the document confidentiality level setting system includes a data leakage prevention server 10 and an artificial intelligence server 20. The artificial intelligence server 20 includes an artificial intelligence analysis module 30 and an artificial intelligence training module 40.

The data leakage prevention server 10 is connected to at least one terminal device 1 and manages the documents processed by the terminal device 1 in accordance with a data leakage prevention policy. For example, the policy specifies that when a particular document or document format is subject to a document processing action such as being created, opened, shared, accessed, edited, deleted, or otherwise managed, the data leakage prevention server 10 determines whether the action may be executed according to the scope authorized for the terminal device 1 or for its user.

In this embodiment, the data leakage prevention server 10 further includes a data leakage prevention unit 12. The data leakage prevention unit 12 may be, but is not limited to, an application installed on the terminal device 1 and connected to the data leakage prevention server 10. It can upload a document to the data leakage prevention server 10, which then transmits the document to the artificial intelligence analysis module 30 of the artificial intelligence server 20 for analysis of the document's confidentiality level. The data leakage prevention unit 12 manages the documents processed by the terminal device 1 according to the data leakage prevention policy; more specifically, it can monitor the user's document processing operations on the terminal device 1. For example, the policy may specify that when the terminal device 1 opens a document of a given file type, such as a Word file, a PDF file, or another specified file type, or when the content of a document edited on the terminal device 1 involves sensitive words or sentences, the data leakage prevention unit 12 starts monitoring the document processing actions that the terminal device 1 performs on the document and encrypts the document or applies permission control to it according to the content corresponding to the confidentiality level label with which the document is marked. The encryption may be, but is not limited to, hiding, obscuring, or otherwise encrypting specific or sensitive words or sentences in the document so that they cannot be displayed normally on the terminal device or cannot be read by the user. The permission control may be, but is not limited to, restricting the operations or commands the user can perform on the document on the terminal device 1, including but not limited to restricting document processing actions such as opening, sharing, accessing, editing, deleting, copying, and clipping the document or its content.

In practice, there may be multiple data leakage prevention units 12, each installed on a different terminal device 1 to monitor that device. The terminal device 1 may be, but is not limited to, a mobile phone, tablet computer, notebook computer, desktop computer, or industrial computer. The terminal device 1 is connected to a document repository and can access documents from it; the document repository may be storage space on the terminal device 1, a server, or cloud storage, but is not limited thereto.

In this embodiment, the data leakage prevention server 10 further includes an application programming interface 14, a confidentiality level management unit 16, and a permission management unit 18. The application programming interface 14 supports communication and data transmission with the terminal device 1 and the artificial intelligence server 20; for example, it can provide the data leakage prevention unit 12 on the terminal device 1 with updates to user permissions and with settings or changes to document confidentiality levels. The confidentiality level management unit 16 manages the confidentiality level label of each document; for example, it stores a form such as a directory table or index table that records the label of each document. The permission management unit 18 manages each user's permissions to browse, edit, or otherwise process documents of each confidentiality level. Accordingly, the data leakage prevention unit 12 can communicate with the application programming interface 14 to download or read the confidentiality level label of each document from the confidentiality level management unit 16, to upload confidentiality level labels to the confidentiality level management unit 16, and to update the permissions of the user of the terminal device 1 from the permission management unit 18.
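
As a rough illustration of the two controls described above, the following sketch shows a label-based permission check and a simple masking routine that obscures sensitive terms. The policy table `ALLOWED_ACTIONS`, the label names, and both function names are hypothetical; the source does not define concrete permissions per level, and masking is only one of the "hiding or obscuring" options it mentions.

```python
import re
from typing import Dict, Iterable, Set

# Hypothetical policy table: actions permitted at each confidentiality level.
ALLOWED_ACTIONS: Dict[str, Set[str]] = {
    "general": {"open", "edit", "copy", "share", "delete"},
    "confidential": {"open", "edit"},
    "top_secret": {"open"},
}


def is_action_allowed(label: str, action: str) -> bool:
    """Permission control: a label missing from the table permits nothing."""
    return action in ALLOWED_ACTIONS.get(label, set())


def mask_sensitive(text: str, sensitive_terms: Iterable[str], mask_char: str = "*") -> str:
    """Obscure each sensitive term (case-insensitive) so it cannot be read on screen."""
    for term in sensitive_terms:
        if term:  # skip empty strings, which would match everywhere
            text = re.sub(
                re.escape(term),
                lambda m: mask_char * len(m.group()),
                text,
                flags=re.IGNORECASE,
            )
    return text


print(is_action_allowed("confidential", "copy"))         # False
print(mask_sensitive("Mold drawing rev B", ["drawing"]))  # Mold ******* rev B
```

Defaulting unknown labels to an empty permission set is a fail-closed design choice made for this sketch, not something the source prescribes.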

In addition, the terminal device 1 can communicate with the data leakage prevention server 10 through the data leakage prevention unit 12 and transmit or upload the document to the data leakage prevention server 10, which transmits it to the artificial intelligence analysis module 30 for analysis of its confidentiality level. The analysis result of the artificial intelligence analysis module 30 is returned to the data leakage prevention server 10, or is returned to the data leakage prevention server 10 and then passed on to the data leakage prevention unit 12 on the terminal device 1, and the data leakage prevention server 10 or the data leakage prevention unit 12 encrypts the document or applies permission control to it according to the analysis result. In this embodiment, the artificial intelligence analysis module 30 includes an application programming interface 32; once the module has completed the confidentiality level analysis of a document, it can return the resulting confidentiality level label through the application programming interface 32 to the confidentiality level management unit 16 of the data leakage prevention server 10 for storage.

The artificial intelligence training module 40 is connected to the artificial intelligence analysis module 30. It trains the analysis model used by the artificial intelligence analysis module 30, receives the analysis data generated when the artificial intelligence analysis module 30 analyzes the confidentiality level of a document, and retrains the analysis model according to that analysis data.

Referring to FIG. 1 and FIG. 2 together, the document confidentiality level setting method of the present invention includes the following steps:

The terminal device 1 provides a document. The ways in which the terminal device 1 may provide a document include, but are not limited to: the terminal device 1 downloads a document from the document repository, the user creates a document on the terminal device 1, or the user edits and saves a document on the terminal device 1. The data leakage prevention server 10, or the data leakage prevention unit 12 installed on the terminal device 1, monitors the documents stored or executed on the terminal device 1 according to the data leakage prevention policy. For example, when the terminal device 1 accesses from the document repository a document that is controlled under the policy, or when the user operates the terminal device 1 to create a document and wishes to store or move it into the document repository, and the document is of a type monitored under the policy, the data leakage prevention server 10 provides the document to the artificial intelligence analysis module 30, which analyzes the document's confidentiality level according to the analysis model and provides a first confidentiality level label.

In one embodiment, the data leakage prevention unit 12 on the terminal device 1 transmits the document to the data leakage prevention server 10, which then sends it to the artificial intelligence analysis module 30 to analyze and predict the document's confidentiality level. In another embodiment, the artificial intelligence analysis module 30 further includes a model agent 34. The model agent 34 may be, but is not limited to, an application installed on the terminal device 1 that analyzes the confidentiality level of the document. In this way, when the terminal device 1 or the data leakage prevention unit 12 cannot connect to the external data leakage prevention server 10, the model agent 34 local to the terminal device 1 can analyze the document's confidentiality level and provide the corresponding first confidentiality level label. More specifically, the model agent 34 connects to the artificial intelligence server 20, obtains and updates the analysis model from it, and uses the analysis model to analyze the confidentiality level of documents and provide the first confidentiality level label.
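
The fallback to the local model agent when the server cannot be reached might look like the sketch below. It assumes a failed connection surfaces as Python's built-in `ConnectionError`; the function and parameter names are hypothetical, and both classifiers are stand-ins for the server-side module and the on-device agent.

```python
from typing import Callable

# A classifier maps document text to a first confidentiality level label.
Classifier = Callable[[str], str]


def classify_with_fallback(document: str, remote: Classifier, local_agent: Classifier) -> str:
    """Prefer the server-side analysis; fall back to the on-device model agent."""
    try:
        return remote(document)
    except ConnectionError:
        # Server unreachable: use the locally cached copy of the analysis model.
        return local_agent(document)


def unreachable_server(document: str) -> str:
    raise ConnectionError("data leakage prevention server is not reachable")


def local_model_agent(document: str) -> str:
    # Stand-in for the analysis model held by the model agent.
    return "confidential" if "drawing" in document.lower() else "general"


print(classify_with_fallback("Mold drawing rev B", unreachable_server, local_model_agent))
# confidential
```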

In addition, the user also analyzes the confidentiality level of the document and provides a second confidentiality level label to the data leakage prevention server 10, which compares the first confidentiality level label with the second. The user may provide the second confidentiality level label either before or after the artificial intelligence analysis module 30 analyzes the document. In one case, when the user edits or saves a document, the user marks it with the second confidentiality level label, after which the document is submitted to the artificial intelligence analysis module 30 for confidentiality level analysis and the corresponding first confidentiality level label is produced. In another case, the document is first analyzed by the artificial intelligence analysis module 30 to produce the first confidentiality level label and is then submitted to the terminal device 1, where the user analyzes its confidentiality level and provides the second confidentiality level label.

Thereafter, the data leakage prevention server 10 compares the first confidentiality level label with the second confidentiality level label. If they are the same, the document is marked with the first confidentiality level label, and the data leakage prevention server 10 encrypts the document or applies permission control to it according to the content defined by the first confidentiality level label. If they differ, the document is marked with the second confidentiality level label, and the data leakage prevention server 10 encrypts the document or applies permission control to it according to the content defined by the second confidentiality level label. For example, the first and second confidentiality level labels may be classification labels such as, but not limited to, general, restricted, confidential, secret, and top secret (一般, 密, 機密, 極機密, 絕對機密). Alternatively, the first and second confidentiality level labels may point to or represent document confidentiality levels defined by the user, which the data leakage prevention server 10 reads in order to encrypt the document, or to control which actions the user may and may not perform on the document, according to the meaning of the labels.
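
The comparison rule can be written down in a few lines. This is only a sketch of the decision described above; the `Resolution` type and the `labels_agree` flag are hypothetical additions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    applied_label: str   # label the document is marked with
    labels_agree: bool   # whether the model and the user gave the same label


def resolve_label(first_label: str, second_label: str) -> Resolution:
    """Apply the first (model) label when both agree, else the second (user) label."""
    agree = first_label == second_label
    return Resolution(applied_label=first_label if agree else second_label,
                      labels_agree=agree)


print(resolve_label("confidential", "confidential"))
# Resolution(applied_label='confidential', labels_agree=True)
print(resolve_label("general", "secret"))
# Resolution(applied_label='secret', labels_agree=False)
```

In both branches the applied label coincides with the user's label. Keeping the agreement flag is a choice made in this sketch so that disagreements between model and user remain visible to whatever consumes the result.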

In addition, the analysis data generated when the artificial intelligence analysis module 30 analyzes the confidentiality level of a document is transmitted to a training database for storage. The artificial intelligence training module 40 connects to the training database, receives the analysis data, and retrains the analysis model according to it. In one embodiment, as shown in FIG. 1, the artificial intelligence training module 40 includes a data collection and labeling tool 42, a model training tool 44, a model retraining tool 46, and an error verification tool 48.

The data collection and labeling tool 42 generates the training data needed to build the analysis model. For example, the data collection and labeling tool 42 includes a user interface and is connected to the document repository. Through the interface the user can select the documents to be analyzed or used for training; the user can select one or more documents and mark them with second confidentiality level labels. The data collection and labeling tool 42 also includes a suggested-label interface, which is connected to the terminal device and offers a plurality of baseline models. The user can select one of the baseline models through the suggested-label interface; the artificial intelligence analysis module 30 then analyzes the document's confidentiality level according to the selected baseline model and produces at least one suggested confidentiality level label, and the user decides whether to adopt the suggested label as the second confidentiality level label. In other words, a user who does not know how to set or decide the second confidentiality level label of a document can provide it with the help of the suggested-label interface. The baseline model may be a model trained by a document confidentiality level setting system in the same or a similar industry, or a model trained on documents of the same or a similar category. For example, suppose company A is a mold manufacturer whose classified documents relate to mold design drawings. If company B is also a mold manufacturer, or its confidential documents are likewise related or similar to company A's mold design drawings, then the baseline model trained on company A's documents can be applied to company B, which can use it as a reference when assigning second confidentiality level labels to its own documents.

The model training tool 44 receives the training data to build the analysis model. Algorithms usable by the model training tool 44 include, but are not limited to, support vector machine (SVM), multilayer perceptron (MLP), gradient boosting decision tree (GBDT), BERT (Bidirectional Encoder Representations from Transformers), LSTM (Long Short-Term Memory), or a combination thereof. Referring to FIG. 3 and FIG. 4, in one embodiment the model training tool 44 may include, but is not limited to, a word segmentation module, a word-to-vector module, a feature selection module, a dimensionality reduction module, a weighting module, and a classification module.

The word segmentation module may, depending on the language (for example Chinese or English), segment the words and sentences in a document by dictionary or lexicon matching or by word-frequency statistics, but is not limited thereto. The word-to-vector module vectorizes the segmented document into word vectors. The feature selection module may, but is not limited to, use units of one, two, or three characters (unigrams, bigrams, or trigrams) as the feature unit when selecting features from the word vectors. The dimensionality reduction module may use, but is not limited to, the chi-square test or the one-way ANOVA F-value to reduce the feature dimension. The weighting module may use the term frequency-inverse document frequency (TF-IDF) algorithm to assess the importance of each term and compute its weight. The classification module may use, but is not limited to, SVM, MLP, or GBDT to build the analysis model.

The analysis model is then tested on the confidentiality level analysis of documents and a performance report is produced. Based on the report, it is determined whether the accuracy or other metrics of the analysis model meet the user's requirements. If they do, the analysis model is complete; if not, the model-building parameters continue to be adjusted and more training data is added before the model is trained again. The analysis model may be trained by, but is not limited to, splitting the training data and/or analysis data into a training set and a test set at an 80-20 ratio and then training the model through the modules of the model training tool 44 described above. A person of ordinary skill in the art may choose a different training method according to the type of document being analyzed; the above description is not limiting. The model retraining tool 46 can receive the analysis data stored in the training database to retrain the analysis model. The model retraining tool 46 may adopt an architecture similar to that of the model training tool 44, which is not repeated here.
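The weighting and data-splitting steps described above can be illustrated with a minimal sketch. It assumes character n-grams of length one to three as the feature unit, a textbook TF-IDF formula, and a seeded shuffle for the 80-20 split. The function names `char_ngrams`, `tf_idf`, and `split_80_20` are hypothetical and do not come from the patent, which does not disclose an implementation.

```python
import math
import random
from collections import Counter


def char_ngrams(text, n_max=3):
    """Split text into character n-grams of length 1..n_max (spaces removed)."""
    text = text.replace(" ", "")
    return [text[i:i + n]
            for n in range(1, n_max + 1)
            for i in range(len(text) - n + 1)]


def tf_idf(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenized = [char_ngrams(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    n_docs = len(docs)
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks)
        weights.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return weights


def split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and split it into 80% train, 20% test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]
```

A term that occurs in every document receives a weight of zero, so only terms that help distinguish documents contribute to the classifier's input. In practice the weighted features would then be passed to a classifier such as an SVM, which is omitted here.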

The error verification tool 48 stores documents whose first confidentiality level label differs from the second confidentiality level label. In other words, it stores documents for which the confidentiality level predicted by the artificial intelligence analysis module 30 is inconsistent with the confidentiality level given by the user, and holds them for later confirmation by the user or for improving the analysis model. For example, the user may periodically use the error verification tool 48 to review the documents whose first and second confidentiality level labels were inconsistent over a period of time, and confirm which label is correct or more appropriate. In the first case, if the first confidentiality level label is correct or more appropriate, the second confidentiality level label is changed to the first confidentiality level label, and the result is stored in the training database, where it is available for training the analysis model. In the second case, if the second confidentiality level label is correct or more appropriate, the first confidentiality level label is changed to the second confidentiality level label, and the result is fed to the model retraining tool 46, with a weight different from that of the first case, to fine-tune or retrain the analysis model.
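The two review outcomes can be sketched as a small routing function. This is only an illustration: the `ReviewedDoc` type, the destination names, and the numeric weights are assumptions, since the source states only that the two cases are weighted differently.

```python
from dataclasses import dataclass


@dataclass
class ReviewedDoc:
    doc_id: str
    ai_label: str    # first confidentiality level label
    user_label: str  # second confidentiality level label
    correct: str     # reviewer's decision: "ai" or "user"


# Hypothetical sample weights; the source gives no concrete values.
WEIGHT_AI_CORRECT = 1.0
WEIGHT_USER_CORRECT = 2.0


def route_review(doc):
    """Return (destination, final_label, weight) for a reviewed mismatch."""
    if doc.correct == "ai":
        # First case: the AI label stands and the result goes to the training database.
        return ("training_database", doc.ai_label, WEIGHT_AI_CORRECT)
    if doc.correct == "user":
        # Second case: the user label stands and the result goes to the retraining tool.
        return ("retraining_tool", doc.user_label, WEIGHT_USER_CORRECT)
    raise ValueError("correct must be 'ai' or 'user'")
```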

In addition, in one embodiment the model training tool 44 may build the analysis model using deep learning techniques; for example, transfer learning may be used to train the analysis model. Referring to FIG. 5, the preparatory work for building the analysis model includes collecting documents from at least one industry, for example, but not limited to, the mold manufacturing, machinery, electronics, medical, banking, insurance, biotechnology, and pharmaceutical industries. Such documents may be, but are not limited to, customer data, design drawings, process documents, and product formulas. Next, a language model trained on a general corpus is pre-trained with the pre-collected documents of the at least one industry to obtain a pre-trained model suited to that industry.

Afterwards, when the system is to be introduced into a company to set and classify the confidentiality levels of that company's documents, a suitable pre-trained model is selected according to the industry to which the company belongs or the types of documents it wants to classify, and the company's internal documents are added together with related data such as its confidentiality level classification. The confidentiality level classification may be derived from the company's confidential document management rules. The model training tool 44 can thereby learn the confidentiality level features of the company's documents and its classification logic, and fine-tune the pre-trained model to produce an analysis model suitable for setting the confidentiality levels of that company's documents. Moreover, after the document confidentiality level setting system has operated in the company for a period of time, the collected documents whose first and second confidentiality level labels are inconsistent can be used by the model retraining tool 46 to retrain the analysis model. For example, some layers of the analysis model's neural network, such as several lower layers, may be frozen while other layers, such as several higher layers, are fine-tuned. Retraining the analysis model by locking the weight updates of specific layers in this way, or by setting different learning rates for different layers, yields an analysis model better suited to that company's document confidentiality level setting system.
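Layer freezing and per-layer learning rates can be illustrated with a minimal sketch, assuming plain gradient descent over named layers. The function `sgd_step` and the layer names are hypothetical; a learning rate of 0.0 stands in for a frozen layer. A real system would more likely use a deep learning framework, which the source does not specify.

```python
def sgd_step(params, grads, layer_lrs):
    """Apply one gradient-descent step with a separate learning rate per layer.

    params, grads: {layer_name: [float, ...]}
    layer_lrs:     {layer_name: learning_rate}; a missing or 0.0 rate freezes the layer.
    """
    return {
        name: [w - layer_lrs.get(name, 0.0) * g
               for w, g in zip(weights, grads[name])]
        for name, weights in params.items()
    }
```

For example, passing `{"low": 0.0, "high": 0.1}` leaves the weights of the layer named `low` unchanged while the layer named `high` is fine-tuned.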

It is worth noting that, in one embodiment, when the artificial intelligence analysis module 30 analyzes the confidentiality level of the same document again and the first confidentiality level label it provides differs from the second confidentiality level label provided by the user, the artificial intelligence training module 40 compares the earlier and later versions of the document, extracts the differing content, and retrains the analysis model according to that content. For example, when a user edits or modifies a document whose confidentiality level has already been set and the document's confidentiality level changes as a result, the user changes the second confidentiality level label given to the document. If the artificial intelligence analysis module 30 then evaluates the document and gives a first confidentiality level label that differs from the second confidentiality level label, the analysis model used by the module needs to be updated. The artificial intelligence training module 40 therefore compares the earlier and later versions of the document and extracts the differing content, that is, the content the user edited or modified, and retrains the analysis model on that content to improve the accuracy of later confidentiality level settings made with the analysis model.
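Extracting the differing content between two versions of a document can be sketched with Python's standard `difflib` module. This is a minimal illustration that assumes documents are plain text compared line by line; the function name `changed_lines` is hypothetical, and the source does not state how the comparison is performed.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return the lines that were added or rewritten in new_text relative to old_text."""
    old = old_text.splitlines()
    new = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" opcodes cover content that is new in the later version.
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added
```

The returned lines could then serve as the retraining input, labeled with the user's updated confidentiality level.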

Referring to FIG. 6, in one embodiment the data leakage prevention server 10, while monitoring the terminal device 1, may detect a document that requires a confidentiality level. For example, it may detect that the user has operated the terminal device 1 to open, download, or save a document suspected of containing sensitive data or information, or it may detect a file type that the data leakage prevention policy designates for tracking or attention. The data leakage prevention server 10 then calls the artificial intelligence analysis module 30 to analyze the document's confidentiality level. Once the artificial intelligence analysis module 30 provides the corresponding first confidentiality level label, the data leakage prevention server 10 marks the document with the first confidentiality level label, or establishes a data processing link between the document and the first confidentiality level label, and first encrypts the document or applies access control to it according to what the first confidentiality level label defines. In addition, the user operating the document, or another user with supervisory authority, also assesses the document's confidentiality level and provides a second confidentiality level label. The data leakage prevention server 10 then compares the first and second confidentiality level labels. If they are the same, the first confidentiality level label marked on the document is maintained. If they differ, the document is re-marked with the second confidentiality level label and then encrypted or access-controlled according to what the second confidentiality level label defines.

The confidentiality level labels of a document (the first and second confidentiality level labels) may be stored in the document's attributes so that they follow the document as it is stored and transferred. In some embodiments, the labels may instead be stored in the data storage space of, for example, the data leakage prevention server 10, the terminal device 1, or the data leakage prevention unit 12, or in a database. When the confidentiality level label of a document is updated, the corresponding label in each data storage space can be updated synchronously. The data leakage prevention unit 12 can retrieve the document's confidentiality level label from that data storage space or database and encrypt the document or apply access control to it according to the label.

Referring to FIG. 7A to FIG. 7C, in one embodiment the confidentiality level label of a document may be stored in the document's attributes, for example in a "document confidentiality level attribute" (an illustrative name) that includes an AI label field F1 and a user label field F2. Field F1 holds the first confidentiality level label given by the artificial intelligence analysis module 30, and field F2 holds the second confidentiality level label given by the user. FIG. 7A shows the attribute of a document that has not yet undergone confidentiality level analysis, so fields F1 and F2 are both blank. As shown in FIG. 7B, once the artificial intelligence analysis module 30 gives the first confidentiality level label, for example code "A", field F1 is filled with code "A"; once the user gives the second confidentiality level label, for example code "B", field F2 is filled with code "B". The data leakage prevention server 10 then reads fields F1 and F2 and makes a determination based on their contents. In this example, because the first confidentiality level label, code "A", differs from the second confidentiality level label, code "B", the data leakage prevention server 10 changes the content of field F1 to code "B" and clears field F2. Thereafter, the confidentiality level label code "B" marked on the document is the second confidentiality level label given by the user, so the data leakage prevention server 10 or the data leakage prevention unit 12 on the terminal device 1 encrypts the document or applies access control to it according to the code "B" (the second confidentiality level label) retained in the AI label field.

FIG. 8A to FIG. 8C are largely the same as FIG. 7A to FIG. 7C, except that the second confidentiality level label given by the user is the same as the first confidentiality level label given by the artificial intelligence analysis module 30: both are code "A". In FIG. 8C, therefore, the AI label field F1 still holds code "A", the first confidentiality level label given by the artificial intelligence analysis module 30, and the user label field F2 is likewise cleared after the data leakage prevention server 10 compares the two labels. Thereafter, the confidentiality level label code "A" marked on the document is the first confidentiality level label given by the artificial intelligence analysis module 30, so the data leakage prevention server 10 or the data leakage prevention unit 12 on the terminal device 1 encrypts the document or applies access control to it according to the code "A" (the first confidentiality level label) retained in the AI label field. Note that the codes "A" and "B" are for illustration only and are not the contents of actual confidentiality level labels.

Clearing the user label field F2, which stores the second confidentiality level label, has the following benefit. When a document already marked with a confidentiality level label is edited again by the user and its content changes, field F2 is blank, so the data leakage prevention server 10 requires the user to confirm the document's confidentiality level again and give a new second confidentiality level label, which is then compared with the first confidentiality level label given by the artificial intelligence analysis module 30. This prevents an edited document from being evaluated against a stale second confidentiality level label, and allows the confidentiality level of edited documents to be re-evaluated in real time. In other words, clearing the field that stores the second confidentiality level label after the two labels have been compared facilitates re-evaluating the confidentiality level of re-edited documents: the data leakage prevention server 10 can check the state of the user label field to decide whether to submit the document to the user for confidentiality level analysis, so that the user provides an up-to-date second confidentiality level label.
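The handling of fields F1 and F2 across FIG. 7A to FIG. 8C can be sketched as follows. This is a minimal illustration under stated assumptions: the class `ConfidentialityAttr`, the functions `reconcile` and `needs_user_review`, and the `edited` flag are hypothetical names, and the sketch does not perform the encryption or access control itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfidentialityAttr:
    f1_ai: Optional[str] = None    # field F1: label from the AI analysis module
    f2_user: Optional[str] = None  # field F2: label from the user


def reconcile(attr):
    """Compare F1 and F2; on mismatch the user's label replaces F1. F2 is always cleared.

    Returns True if the two labels differed.
    """
    if attr.f1_ai is None or attr.f2_user is None:
        raise ValueError("both labels must be present before comparison")
    mismatch = attr.f1_ai != attr.f2_user
    if mismatch:
        attr.f1_ai = attr.f2_user
    attr.f2_user = None
    return mismatch


def needs_user_review(attr, edited):
    """An edited document with a blank F2 must be re-labeled by the user."""
    return edited and attr.f2_user is None
```

After `reconcile`, field F1 always holds the label that governs encryption or access control, and the blank F2 is what prompts a fresh user label once the document has been edited.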

With the document confidentiality level setting system and method provided by the present invention, after a document is created or edited, its confidentiality level is classified by the pre-trained artificial intelligence analysis module, and the confidentiality level set by the user is also fed back. When the confidentiality level marked by the user differs from the confidentiality level label marked by the artificial intelligence analysis module, the document is added to the data on which the artificial intelligence analysis module is retrained, and the module can be retrained on the differences between the document and its previous version, thereby improving the accuracy of its future automatic classification. In addition, when a document is modified or re-edited, it can be re-examined to determine whether its confidentiality level needs to change, so that the confidentiality level can be set promptly and accurately.

To increase the accuracy with which the document confidentiality level setting system and method set document confidentiality, and referring to FIG. 9, a document confidentiality level management system 100 according to an embodiment of the present invention is provided, which includes a plurality of the document confidentiality level setting systems 110 described above and a federated learning system 120.

The document confidentiality level setting systems 110 are used in a plurality of different units of the same industry category, for example different companies in the same industry. The federated learning system 120 is connected to each document confidentiality level setting system 110. In this embodiment, the federated learning system 120 includes a federated learning server 122, which is connected to the AI server 20 of each document confidentiality level setting system 110. The AI server 20 further includes a model parameter encryption module 50 and a communication module 52. The federated learning server 122 includes a data quality assessment module 122a, a data filtering module 122b, a feature aggregation module 122c, a model parameter update storage unit 122d, a model storage unit 122e, and a model dispatch module 122f.

Referring to FIG. 9 and FIG. 10 together, the document confidentiality level management method of the present invention includes steps performed by each document confidentiality level setting system 110 and steps performed by the federated learning system 120, as follows:

The steps performed by each document confidentiality level setting system 110 are based on the document confidentiality level setting method shown in FIG. 2, and further include encrypting a model parameter of the analysis model into an encrypted parameter and transmitting it to the federated learning system 120. Taking a neural network algorithm as an example, the model parameter may comprise multiple parameters, including gradients, weights, biases, the number of layers of the neural network, the loss function, the convolution kernel size, the learning rate, the momentum, and the number of iterations; in practice, the model parameter includes at least one parameter. In this embodiment, the model parameter encryption module 50 encrypts the model parameters of the analysis model with an encryption algorithm, for example the Advanced Encryption Standard (AES) or the RSA encryption algorithm, but not limited thereto. The communication module 52 of the AI server 20 then transmits the encrypted parameters to the federated learning server 122.

The federated learning system 120 performs the following steps:

Receiving the encrypted parameters transmitted from the document confidentiality level setting systems 110. In this embodiment, the federated learning server 122 receives the encrypted parameters.

Aggregating the encrypted parameters and generating an update parameter.

In this embodiment, the federated learning server 122 processes the received encrypted parameters under homomorphic encryption, including screening and aggregation. The data quality assessment module 122a and the data filtering module 122b screen the encrypted parameters. The data quality assessment module 122a assesses the quality of the received encrypted parameters to decide whether each can be used for aggregation; if one of the parameters within an encrypted parameter (for example the gradient) falls outside a confidence interval, that encrypted parameter is not used for aggregation, so that it does not distort the update parameter. The data filtering module 122b filters the received encrypted parameters to find the features in them that can be used for aggregation. The feature aggregation module 122c aggregates the encrypted parameters processed as above into the update parameter, which is stored in the model storage unit 122e. In one embodiment, the encrypted parameters may additionally be obfuscated.

Transmitting the update parameter to each document confidentiality level setting system 110. In this embodiment, the model dispatch module 122f transmits the update parameter in the model storage unit 122e to the communication module 52 of the AI server 20 of each document confidentiality level setting system 110.
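The screening and aggregation steps above can be sketched in plain Python. This is a simplified illustration that operates on unencrypted values. It assumes each unit submits a gradient vector, that quality is judged by the vector's L2 norm, and that the confidence interval is the mean plus or minus `z` standard deviations. The function name `filter_and_aggregate`, the use of the L2 norm, and the threshold `z=1.5` are all assumptions; the source performs these steps under homomorphic encryption, which is omitted here.

```python
import math
import statistics


def filter_and_aggregate(updates, z=1.5):
    """Drop outlier submissions, then average the remaining vectors element-wise.

    updates: {client_id: [float, ...]}, all vectors of equal length.
    Returns (aggregated_vector, kept_client_ids).
    """
    norms = {cid: math.sqrt(sum(v * v for v in vec))
             for cid, vec in updates.items()}
    values = list(norms.values())
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    # Keep only submissions whose norm lies within mean +/- z * spread.
    kept = [cid for cid, n in norms.items() if abs(n - mean) <= z * spread]
    dim = len(next(iter(updates.values())))
    aggregated = [sum(updates[cid][i] for cid in kept) / len(kept)
                  for i in range(dim)]
    return aggregated, kept
```

With `z` of at least 1, at least one submission always remains, because not every value can lie more than one standard deviation from the mean.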

Each document confidentiality level setting system 110 further performs the following steps:

Each document confidentiality level setting system 110 receives the update parameter.

Each artificial intelligence analysis module 30 updates the analysis model according to the update parameter. In this embodiment, each artificial intelligence analysis module 30 decrypts the update parameter and then combines the decrypted update parameter with the original model parameters of the analysis model, for example by summing or averaging corresponding parameters, to update the analysis model. In this embodiment, after the communication module 52 receives the update parameter, it decides whether to pass the update parameter to the artificial intelligence analysis module 30. If so, the communication module 52 causes the artificial intelligence analysis module 30 to update the analysis model. If the user considers the original analysis model accurate enough, the communication module 52 can be set not to pass the update parameter to the artificial intelligence analysis module 30.

Afterwards, when each document confidentiality level setting system 110 performs the document confidentiality level setting method again, it analyzes the confidentiality level of documents according to the updated analysis model.
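The local update step can be sketched as follows, assuming parameters are held as name-to-value dictionaries, that corresponding parameters are averaged (one of the two options the text mentions), and that an `accept` flag models the communication module's setting. The function name `merge_update` and the flag are hypothetical, and decryption is omitted.

```python
def merge_update(local, update, accept=True):
    """Average each local parameter with the update parameter of the same name.

    local, update: {parameter_name: float}
    accept: False models a unit that chose not to apply update parameters.
    """
    merged = dict(local)
    if not accept:
        return merged
    for name, value in update.items():
        if name in merged:
            merged[name] = (merged[name] + value) / 2
    return merged
```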

The document confidentiality level management system 100 and method described above increase the amount of data available for model training, compensating for the insufficient training data of the document confidentiality level setting system in each unit and thereby improving the accuracy of the artificial intelligence analysis module 30. Notably, each document confidentiality level setting system 110 uploads encrypted model parameters to the federated learning system 120 rather than uploading the analysis model. Even if the encrypted parameters are obtained, they are difficult to crack, and even if they are cracked, the analysis model itself is not obtained. This effectively prevents leakage of each unit's confidential documents and improves the data security of the documents in each document confidentiality level setting system 110.

In addition, when the document confidentiality level setting method is performed with the updated analysis model, the artificial intelligence training module 40 also receives the analysis data produced when the artificial intelligence analysis module 30 analyzes the confidentiality level of documents, and retrains the analysis model according to that analysis data. In other words, after each unit's document confidentiality level setting system 110 retrains the updated analysis model, an analysis model better suited to that unit is produced, further improving the accuracy of the artificial intelligence analysis module 30. Each document confidentiality level setting system 110 may then encrypt the model parameters of the retrained analysis model into the encrypted parameter and transmit it to the federated learning system to be aggregated again into an updated model.

Referring to FIG. 11 and FIG. 12, a document confidentiality level management system 101 according to another embodiment of the present invention is provided. In this embodiment, the document confidentiality level setting systems 110 are divided into a plurality of categories C, which may be, for example, different industry categories such as manufacturing, medical, and finance. Each category includes a plurality of, that is, at least two, document confidentiality level setting systems 110. The document confidentiality level setting systems 110 of each category C are used in a plurality of different units.

In this embodiment, the federated learning system 130 includes a plurality of federated platforms 132 and a federated learning server 134. Each federated platform 132 is connected to the document confidentiality level setting systems 110 of one category, and the federated learning server 134 is connected to the federated platforms 132. Each federated platform 132 includes the data quality assessment module 122a, the data filtering module 122b, the feature aggregation module 122c, and the model parameter update storage unit 122d. The federated learning server 134 includes a model optimization update module 134a as well as the model storage unit 122e and the model dispatch module 122f.

The document confidentiality level management method of this embodiment is shown in FIG. 13. Its steps are largely the same as those of FIG. 10, with the following differences:

Each federated platform 132 receives the encrypted parameters of the document confidentiality level setting systems 110 of its category C.

Each federated platform 132 aggregates the encrypted parameters received from the document confidentiality level setting systems 110 of its category into a relay parameter and transmits it to the federated learning server 134. In this embodiment, the modules of each federated platform 132 perform the same processing as in the previous embodiment, except that the feature aggregation module 122c aggregates the processed encrypted parameters into the relay parameter, which is stored in the model storage unit 122e.

The federated learning server 134 optimizes and updates the relay parameters under homomorphic encryption to generate the update parameter. Because the confidential words or phrases found in documents differ slightly from one category C to another, in this embodiment the model optimization update module 134a optimizes and updates the relay parameters sent by the federated platforms. The optimization and update may consist, for example, of selecting some of the parameters from the relay parameters of similar categories C and aggregating them again, so as to generate update parameters suited to those categories, which are stored in the model storage unit 122e. A single update parameter may be generated, or a plurality of different update parameters may be generated, each suited to a corresponding category C. The model dispatch module 122f transmits the update parameter in the model storage unit 122e to the communication module 52 of the artificial intelligence server 20 of each document confidentiality level setting system 110. If there are a plurality of update parameters, the model dispatch module 122f transmits each update parameter to the document confidentiality level setting systems 110 of the corresponding category C.

The processing that each federated platform 132 applies to the encrypted parameters sent by the document confidentiality level setting systems 110 of its category C may include, under homomorphic encryption, quality assessment, obfuscation, and aggregation into the relay parameter.
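The two-tier flow listed above (per-category aggregation on the federated platforms 132, followed by optimization on the federated learning server 134) can be sketched as follows. This is only an illustration under stated assumptions: parameters are plain float vectors rather than homomorphically encrypted values, aggregation is a simple mean, and the `similar` mapping that decides which categories count as "similar" is hypothetical, since the text does not specify how similarity is determined or which parameters are selected.

```python
from typing import Dict, List

Params = List[float]


def mean_params(param_sets: List[Params]) -> Params:
    """Element-wise mean of several parameter vectors."""
    count = len(param_sets)
    return [sum(column) / count for column in zip(*param_sets)]


def platform_aggregate(system_params: List[Params]) -> Params:
    """Federated platform: aggregate one category's parameters into a relay parameter."""
    return mean_params(system_params)


def server_update(relays: Dict[str, Params],
                  similar: Dict[str, List[str]]) -> Dict[str, Params]:
    """Federated learning server: build one update parameter per category by
    re-aggregating its relay parameter with those of similar categories."""
    updates: Dict[str, Params] = {}
    for category, relay in relays.items():
        group = [relay] + [relays[c] for c in similar.get(category, []) if c in relays]
        updates[category] = mean_params(group)
    return updates


if __name__ == "__main__":
    relays = {
        "manufacturing": platform_aggregate([[1.0, 2.0], [3.0, 4.0]]),
        "medical": platform_aggregate([[4.0, 5.0], [6.0, 7.0]]),
        "financial": platform_aggregate([[10.0, 10.0], [10.0, 10.0]]),
    }
    # Hypothetical similarity map: manufacturing and medical share vocabulary.
    similar = {"manufacturing": ["medical"], "medical": ["manufacturing"]}
    print(server_update(relays, similar))
```

Returning a dictionary keyed by category mirrors the case in which a plurality of update parameters is generated and each is dispatched to the setting systems of its own category; collapsing the dictionary to a single vector would correspond to the single-update-parameter case.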

This embodiment can be applied to aggregate the encrypted parameters of document confidentiality level setting systems 110 in different industry categories to generate the update parameter. This not only increases the amount of training data available to each unit, but also allows the confidential words or phrases of related industry categories to be used in identifying confidential documents. Likewise, leakage of confidential documents from each unit can be effectively avoided, improving the data security of the documents in each document confidentiality level setting system 110.

Referring to FIG. 14, a document confidentiality level management system 102 according to another embodiment of the present invention is provided. This embodiment is based on the aforementioned document confidentiality level management system 100, except that the federated learning system 140 includes a plurality of federated learning servers 142, each federated learning server 142 is connected to the document confidentiality level setting systems 110 of a respective category C, and each category C includes at least two document confidentiality level setting systems 110. The document confidentiality level management method of this embodiment is shown in FIG. 15. Its steps are largely the same as those of FIG. 10, with the following differences:

In the federated learning system 140, each federated learning server 142 receives the encrypted parameters of the document confidentiality level setting systems 110 of its category C and aggregates them into a relay parameter. Two of the federated learning servers 142 further transmit their respective relay parameters to each other, and each aggregates the relay parameter received from the other with the encrypted parameters from the document confidentiality level setting systems 110 of its own category into an update parameter. Each federated learning server 142 then transmits its update parameter to the document confidentiality level setting systems 110 of the corresponding category C.
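The exchange between two federated learning servers 142 can be sketched as below. This is a simplified illustration, not the patented implementation: parameters are unencrypted float vectors, aggregation is an equal-weight mean (the text does not specify weights), and the function names are hypothetical.

```python
from typing import List, Tuple

Params = List[float]


def mean_params(param_sets: List[Params]) -> Params:
    """Element-wise mean of several parameter vectors."""
    count = len(param_sets)
    return [sum(column) / count for column in zip(*param_sets)]


def make_relay(own_system_params: List[Params]) -> Params:
    """A server aggregates the parameters of its own category into a relay parameter."""
    return mean_params(own_system_params)


def make_update(own_system_params: List[Params], peer_relay: Params) -> Params:
    """A server aggregates the peer's relay parameter together with the
    parameters received from the setting systems of its own category."""
    return mean_params(list(own_system_params) + [peer_relay])


def exchange(params_a: List[Params], params_b: List[Params]) -> Tuple[Params, Params]:
    """Two servers swap relay parameters and each derives its own update parameter."""
    relay_a, relay_b = make_relay(params_a), make_relay(params_b)
    return make_update(params_a, relay_b), make_update(params_b, relay_a)


if __name__ == "__main__":
    update_a, update_b = exchange([[1.0], [3.0]], [[5.0], [7.0]])
    print(update_a, update_b)
```

Because each server mixes only one relay vector from its peer with all of its own category's vectors, the resulting update stays weighted toward its own category, which is one plausible reading of why the relay parameter, rather than the peer's raw parameters, is exchanged.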

The document confidentiality level management system 102 of this embodiment can be applied to aggregate the encrypted parameters with the relay parameters of similar categories to generate the update parameter. This not only increases the amount of training data available to each unit, but also allows the confidential words or phrases of related industry categories to be used in identifying confidential documents. Likewise, leakage of confidential documents from each unit can be effectively avoided, improving the data security of the documents in each document confidentiality level setting system 110.

As described above, the document confidentiality level management system and method of the present invention can effectively retrain the analysis models of different document confidentiality level setting systems 110 into more accurate analysis models, overcoming the problem of insufficient training data in a single document confidentiality level setting system 110 and thereby improving the accuracy of judgment.

Because what the document confidentiality level setting system 110 transmits to the federated learning system 120, 130, 140 is the model parameters of the analysis model, and those model parameters are encrypted before transmission, neither the documents nor the analysis model itself is sent out directly. Leakage of confidential documents can therefore be effectively avoided, improving the data security of the documents in each document confidentiality level setting system 110.
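To make the idea of aggregating encrypted parameters concrete, the following toy sketch uses the Paillier cryptosystem, an additively homomorphic scheme, so that an aggregator can sum parameters without decrypting them. The specification names homomorphic encryption but does not identify a particular scheme, key arrangement, or number encoding, so the choice of Paillier, the fixed-point `SCALE`, the key sizes, and the question of who holds the private key are all assumptions here. The key is far too small for real use.

```python
import math
import random
from typing import List, Tuple

SCALE = 10**6  # assumed fixed-point scale for encoding float parameters


def keygen(p: int = 2**31 - 1, q: int = 2**61 - 1) -> Tuple[int, Tuple[int, int]]:
    """Toy Paillier key pair from two Mersenne primes (insecure key size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, value: float) -> int:
    """Encrypt one parameter; negative values wrap around modulo n."""
    n2 = n * n
    m = round(value * SCALE) % n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2


def decrypt(n: int, private_key: Tuple[int, int], ciphertext: int) -> float:
    """Decrypt and map the result back to a signed float."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    m = ((u - 1) // n) * mu % n
    if m > n // 2:
        m -= n
    return m / SCALE


def aggregate_encrypted(n: int, vectors: List[List[int]]) -> List[int]:
    """Multiply ciphertexts element-wise, which adds the underlying plaintexts."""
    n2 = n * n
    total = list(vectors[0])
    for vector in vectors[1:]:
        total = [(a * b) % n2 for a, b in zip(total, vector)]
    return total


if __name__ == "__main__":
    n, private_key = keygen()
    system_a = [encrypt(n, x) for x in [0.25, -1.5]]
    system_b = [encrypt(n, x) for x in [0.75, 0.5]]
    summed = aggregate_encrypted(n, [system_a, system_b])
    print([decrypt(n, private_key, c) / 2 for c in summed])
```

In this sketch the aggregator only ever handles ciphertexts; dividing by the number of contributors happens after decryption by whichever party holds the private key, a role the specification leaves open.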

In addition, the document confidentiality level management systems 100, 101, 102 can also be applied within a single enterprise, with the document confidentiality level setting systems 110 deployed in a plurality of different departments of that enterprise, for example the research and development department, sales department, accounting department, marketing department, design department, and manufacturing department. Users in the same department apply similar criteria for determining confidentiality levels, whereas users in different departments apply different criteria. Each department therefore trains with its own document confidentiality level setting system 110. This avoids the situation in which users from different departments share a single document confidentiality level setting system 110 and differ too greatly in how they determine document confidentiality levels, which would make the analysis model of that single system inaccurate, for example by lowering parameter weights and raising feature disorder. Through the federated learning system 120, the parameters of the analysis models can be aggregated, or a selected portion of the parameters can be aggregated, and then returned to the document confidentiality level setting system 110 of each department, thereby improving the accuracy of each analysis model.

The above describes only feasible embodiments of the present invention. In the document confidentiality level setting system, marking a document with the first confidentiality level label or the second confidentiality level label may be done by, but is not limited to, writing the label in a field of the document's file attributes; marking or recording it in a form such as a directory table or index table; storing it in a data storage space of, for example, the data leakage prevention server, the terminal device, or the data leakage prevention unit; or storing it in a database that the data leakage prevention server can access. When the data leakage prevention server is to perform encryption or permission control on the document, it reads the confidentiality level label corresponding to the document from the aforementioned file attribute field, form, data storage space, or database. All equivalent changes made by applying the description and claims of the present invention shall fall within the patent scope of the present invention.

1: Terminal device
10: Data leakage prevention server
12: Data leakage prevention unit
14: Application programming interface
16: Confidentiality level management unit
18: Permission management unit
20: Artificial intelligence server
30: Artificial intelligence analysis module
32: Application programming interface
34: Model agent program
40: Artificial intelligence training module
42: Data collection and labeling tool
44: Model training tool
46: Model retraining tool
48: Error verification tool
100: Document confidentiality level management system
110: Document confidentiality level setting system
50: Model parameter encryption module
52: Communication module
120: Federated learning system
122: Federated learning server
122a: Data quality assessment module
122b: Data filtering module
122c: Feature aggregation module
122d: Model parameter update storage unit
122e: Model storage unit
122f: Model dispatch module
101: Document confidentiality level management system
130: Federated learning system
132: Federated platform
134: Federated learning server
134a: Model optimization update module
102: Document confidentiality level management system
140: Federated learning system
142: Federated learning server
C: Category

FIG. 1 is a block diagram of a document confidentiality level setting system according to an embodiment of the present invention.
FIG. 2 is a flowchart of a document confidentiality level setting method according to an embodiment of the present invention.
FIG. 3 is an architecture diagram of a model training tool according to an embodiment of the present invention.
FIG. 4 is a training flowchart of an analysis model according to an embodiment of the present invention.
FIG. 5 is a training flowchart of an analysis model according to another embodiment of the present invention.
FIG. 6 is a flowchart of a document confidentiality level setting method according to another embodiment of the present invention.
FIG. 7A to FIG. 7C are schematic diagrams of document confidentiality level label marking according to an embodiment of the present invention.
FIG. 8A to FIG. 8C are schematic diagrams of document confidentiality level label marking according to an embodiment of the present invention.
FIG. 9 is a block diagram of a document confidentiality level management system according to an embodiment of the present invention.
FIG. 10 is a flowchart of a document confidentiality level management method according to an embodiment of the present invention.
FIG. 11 is a block diagram of a document confidentiality level management system according to another embodiment of the present invention.
FIG. 12 is a block diagram of the federated learning system in FIG. 11.
FIG. 13 is a flowchart of the document confidentiality level setting method of the above embodiment of the present invention.
FIG. 14 is a block diagram of a document confidentiality level management system according to another embodiment of the present invention.
FIG. 15 is a flowchart of the document confidentiality level setting method of the above embodiment of the present invention.

100: Document confidentiality level management system

110: Document confidentiality level setting system

20: Artificial intelligence server

30: Artificial intelligence analysis module

40: Artificial intelligence training module

50: Model parameter encryption module

52: Communication module

120: Federated learning system

122: Federated learning server

122a: Data quality assessment module

122b: Data filtering module

122c: Feature aggregation module

122d: Model parameter update storage unit

122e: Model storage unit

122f: Model dispatch module

Claims

A document confidentiality level management system, comprising: a plurality of document confidentiality level setting systems, each of the document confidentiality level setting systems including an artificial intelligence analysis module and a data leakage prevention server, wherein: the artificial intelligence analysis module is used for receiving a document, analyzing the confidentiality level of the document according to an analysis model, and providing a first confidentiality level label according to the analysis result; the data leakage prevention server receives the first confidentiality level label and is connected to a terminal device, the terminal device being operated by a user to provide a second confidentiality level label; the data leakage prevention server compares whether the first confidentiality level label and the second confidentiality level label are the same; if they are the same, the data leakage prevention server marks the document with the first confidentiality level label and performs encryption or permission control on the document according to the content defined by the first confidentiality level label; if they are different, the data leakage prevention server marks the document with the second confidentiality level label and performs encryption or permission control on the document according to the content defined by the second confidentiality level label; and a federated learning system connected to each of the document confidentiality level setting systems, wherein each of the document confidentiality level setting systems encrypts a model parameter of its analysis model to form an encrypted parameter and transmits it to the federated learning system; the federated learning system aggregates the encrypted parameters to generate an update parameter and transmits the update parameter to each of the document confidentiality level setting systems, and each of the artificial intelligence analysis modules updates the analysis model according to the update parameter; wherein each of the document confidentiality level setting systems includes an artificial intelligence training module connected to the artificial intelligence analysis module, for receiving analysis data generated by the artificial intelligence analysis module after it analyzes the confidentiality level of the document, and for retraining the analysis model according to the analysis data; wherein the artificial intelligence training module includes a data collection and labeling tool, an error verification tool, a model training tool, and a retraining tool; the data collection and labeling tool is used to generate the training data required to establish the analysis model; the error verification tool is used to store documents for which the first confidentiality level label and the second confidentiality level label differ; the model training tool is used to establish the analysis model according to the training data; and the retraining tool is used to retrain the analysis model according to the analysis data.

The document confidentiality level management system as claimed in claim 1, wherein the document confidentiality level setting systems are divided into a plurality of categories, each of which includes at least two of the document confidentiality level setting systems; the federated learning system includes a plurality of federated platforms and a federated learning server, each of the federated platforms is connected to the document confidentiality level setting systems of a respective category, and the federated learning server is connected to the federated platforms; each of the federated platforms receives the encrypted parameters of the document confidentiality level setting systems of its category, aggregates them into a relay parameter, and transmits the relay parameter to the federated learning server; and the federated learning server optimizes and updates the relay parameters to generate at least one update parameter.

The document confidentiality level management system as claimed in claim 2, wherein the federated learning server generates a plurality of the update parameters, each of the update parameters corresponds to one of the categories, and each of the update parameters is transmitted to the document confidentiality level setting systems of the corresponding category.

The document confidentiality level management system as claimed in claim 1, wherein the document confidentiality level setting systems are divided into a plurality of categories, and each of the categories includes at least two of the document confidentiality level setting systems; wherein the federated learning system includes a plurality of federated learning servers, each of the federated learning servers receives the encrypted parameters of the document confidentiality level setting systems of a respective category and aggregates them into a relay parameter, and two of the federated learning servers further transmit their respective relay parameters to each other, each aggregating the relay parameter from the other with the encrypted parameters from the document confidentiality level setting systems of its own category into the update parameter, which is then transmitted to the document confidentiality level setting systems of the corresponding category.

The document confidentiality level management system as claimed in claim 1, wherein each of the document confidentiality level setting systems encrypts the model parameters of the retrained analysis model to form the encrypted parameters and transmits them to the federated learning system.

The document confidentiality level management system as claimed in claim 1, wherein the data collection and labeling tool includes a suggested labeling interface; the suggested labeling interface is connected to the terminal device and provides a plurality of reference models; the user selects one of the reference models through the terminal device, and the artificial intelligence analysis module analyzes the confidentiality level of the document according to the selected reference model, so as to provide the second confidentiality level label.

A document confidentiality level management system, comprising: a plurality of document confidentiality level setting systems, each of the document confidentiality level setting systems including an artificial intelligence analysis module and a data leakage prevention server, wherein: the artificial intelligence analysis module is used for receiving a document, analyzing the confidentiality level of the document according to an analysis model, and providing a first confidentiality level label according to the analysis result; the data leakage prevention server receives the first confidentiality level label and is connected to a terminal device, the terminal device being operated by a user to provide a second confidentiality level label; the data leakage prevention server compares whether the first confidentiality level label and the second confidentiality level label are the same; if they are the same, the data leakage prevention server marks the document with the first confidentiality level label and performs encryption or permission control on the document according to the content defined by the first confidentiality level label; if they are different, the data leakage prevention server marks the document with the second confidentiality level label and performs encryption or permission control on the document according to the content defined by the second confidentiality level label; and a federated learning system connected to each of the document confidentiality level setting systems, wherein each of the document confidentiality level setting systems encrypts a model parameter of its analysis model to form an encrypted parameter and transmits it to the federated learning system; the federated learning system aggregates the encrypted parameters to generate an update parameter and transmits the update parameter to each of the document confidentiality level setting systems, and each of the artificial intelligence analysis modules updates the analysis model according to the update parameter; wherein the data leakage prevention server includes a data leakage prevention unit, which is installed on the terminal device and communicates with the data leakage prevention server; the data leakage prevention unit is used to upload the document on the terminal device to the data leakage prevention server, the data leakage prevention server transmits the document to the artificial intelligence analysis module for confidentiality level analysis, and the data leakage prevention unit performs encryption or permission control on the document according to the content defined by the first confidentiality level label or the second confidentiality level label with which the document is marked.

A document confidentiality level management system, comprising: a plurality of document confidentiality level setting systems, each of the document confidentiality level setting systems including an artificial intelligence analysis module and a data leakage prevention server, wherein: the artificial intelligence analysis module is used for receiving a document, analyzing the confidentiality level of the document according to an analysis model, and providing a first confidentiality level label according to the analysis result; the data leakage prevention server receives the first confidentiality level label and is connected to a terminal device, the terminal device being operated by a user to provide a second confidentiality level label; the data leakage prevention server compares whether the first confidentiality level label and the second confidentiality level label are the same; if they are the same, the data leakage prevention server marks the document with the first confidentiality level label and performs encryption or permission control on the document according to the content defined by the first confidentiality level label; if they are different, the data leakage prevention server marks the document with the second confidentiality level label and performs encryption or permission control on the document according to the content defined by the second confidentiality level label; and a federated learning system connected to each of the document confidentiality level setting systems, wherein each of the document confidentiality level setting systems encrypts a model parameter of its analysis model to form an encrypted parameter and transmits it to the federated learning system; the federated learning system aggregates the encrypted parameters to generate an update parameter and transmits the update parameter to each of the document confidentiality level setting systems, and each of the artificial intelligence analysis modules updates the analysis model according to the update parameter; wherein the artificial intelligence analysis module includes a model agent program, which is installed on the terminal device and is used to analyze the confidentiality level of documents on the terminal device according to the analysis model.

A document confidentiality level management method, applied to a document confidentiality level management system, the document confidentiality level management system including a plurality of document confidentiality level setting systems and a federated learning system, each of the document confidentiality level setting systems including a data leakage prevention server and an artificial intelligence analysis module and being connected to the federated learning system, the method comprising the following steps: each of the document confidentiality level setting systems performs the following steps: A1. a terminal device provides a document; A2. the artificial intelligence analysis module analyzes the confidentiality level of the document according to an analysis model and, according to the analysis result, provides a first confidentiality level label to the data leakage prevention server; A3. a user analyzes the confidentiality level of the document and provides a second confidentiality level label to the data leakage prevention server; A4. the data leakage prevention server compares whether the first confidentiality level label and the second confidentiality level label are the same; when the first confidentiality level label is the same as the second confidentiality level label, the data leakage prevention server marks the document with the first confidentiality level label and performs encryption or permission control on the document according to the content defined by the first confidentiality level label; when the first confidentiality level label is different from the second confidentiality level label, the data leakage prevention server marks the document with the second confidentiality level label and performs encryption or permission control on the document according to the content defined by the second confidentiality level label; A5. a model parameter of the analysis model is encrypted to form an encrypted parameter, which is transmitted to the federated learning system; the federated learning system performs the following steps: B1. receiving the encrypted parameters transmitted from the document confidentiality level setting systems; B2. aggregating the encrypted parameters to generate an update parameter; B3. transmitting the update parameter to each of the document confidentiality level setting systems; each of the document confidentiality level setting systems further performs the following steps: A6. each of the document confidentiality level setting systems receives the update parameter; A7. each of the artificial intelligence analysis modules updates the analysis model according to the update parameter; wherein each of the document confidentiality level setting systems includes an artificial intelligence training module, and the method further includes: the artificial intelligence training module receives analysis data generated by the artificial intelligence analysis module after it analyzes the confidentiality level of the document, and retrains the analysis model according to the analysis data; and further comprising the following step: when the artificial intelligence analysis module analyzes the confidentiality level of the same document again and the first confidentiality level label it provides differs from the second confidentiality level label provided by the user, the artificial intelligence training module compares the earlier and later versions of the document, extracts the content that differs between them, and retrains the analysis model according to that differing content.

The document confidentiality level management method as claimed in claim 9, wherein the document confidentiality level setting systems are divided into a plurality of categories, each of which includes at least two of the document confidentiality level setting systems; the federated learning system includes a plurality of federated platforms and a federated learning server, each of the federated platforms is connected to the document confidentiality level setting systems of a respective category, and the federated learning server is connected to the federated platforms; wherein: in step B1, each of the federated platforms receives the encrypted parameters of the document confidentiality level setting systems of its category; in step B2, each of the federated platforms aggregates the received encrypted parameters of the document confidentiality level setting systems of its category into a relay parameter and transmits it to the federated learning server; and the federated learning server optimizes and updates the relay parameters to generate at least one update parameter.

The document confidentiality level management method as claimed in claim 10, wherein in step B2, a plurality of the update parameters are generated, and each of the update parameters corresponds to one of the categories; and in step B3, each of the update parameters is transmitted to the document confidentiality level setting systems of the corresponding category.

The document confidentiality level management method as claimed in claim 9, wherein the document confidentiality level setting systems are divided into a plurality of categories, and each of the categories includes at least two of the document confidentiality level setting systems; wherein the federated learning system includes a plurality of federated learning servers; wherein in step B1, each of the federated learning servers receives the encrypted parameters of the document confidentiality level setting systems of a respective category; wherein in step B2, each of the federated learning servers aggregates the received encrypted parameters of the document confidentiality level setting systems of its category into a relay parameter, and two of the federated learning servers further transmit their respective relay parameters to each other, each aggregating the relay parameter from the other with the encrypted parameters from the document confidentiality level setting systems of its own category into the update parameter; wherein in step B3, each of the federated learning servers transmits its update parameter to the document confidentiality level setting systems of the corresponding category.

The document confidentiality level management method as claimed in claim 9, wherein each of the document confidentiality level setting systems encrypts the model parameters of the analysis model after retraining to form the encrypted parameters and transmits them to the federated learning system.

A document confidentiality level management method, applied to a document confidentiality level management system, the document confidentiality level management system including a plurality of document confidentiality level setting systems and a federated learning system, each of the document confidentiality level setting systems including a data leakage prevention server and an artificial intelligence analysis module and being connected to the federated learning system, the method comprising the following steps: each of the document confidentiality level setting systems performs the following steps: A1. a terminal device provides a document; A2. the artificial intelligence analysis module analyzes the confidentiality level of the document according to an analysis model and provides a first confidentiality level label to the data leakage prevention server according to the analysis result; A3. a user analyzes the confidentiality level of the document and provides a second confidentiality level label to the data leakage prevention server; A4. the data leakage prevention server compares the first confidentiality level label with the second confidentiality level label; when the first confidentiality level label is the same as the second confidentiality level label, the data leakage prevention server marks the document with the first confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the first confidentiality level label; when the first confidentiality level label differs from the second confidentiality level label, the data leakage prevention server marks the document with the second confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the second confidentiality level label; A5. a model parameter of the analysis model is encrypted to form an encrypted parameter, which is transmitted to the federated learning system; the federated learning system performs the following steps: B1. receiving the encrypted parameters transmitted from the document confidentiality level setting systems; B2. aggregating the encrypted parameters to generate an update parameter; B3. transmitting the update parameter to each of the document confidentiality level setting systems; each of the document confidentiality level setting systems further performs the following steps: A6. each of the document confidentiality level setting systems receives the update parameter; A7. each of the artificial intelligence analysis modules updates the analysis model according to the update parameter; wherein, after step A2, the data leakage prevention server marks the document with the first confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the first confidentiality level label; and in step A4, when the second confidentiality level label differs from the first confidentiality level label, the data leakage prevention server changes the marking of the document to the second confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the second confidentiality level label.
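
The provisional marking after step A2 and the re-marking in step A4 can be sketched as below. This is a minimal illustration, not the claimed implementation: the `Document` class, the label strings, and the `protected_as` field (a stand-in for actual encryption or permission control) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    name: str
    label: Optional[str] = None         # confidentiality level label currently marked
    protected_as: Optional[str] = None  # stand-in for encryption / permission control


def apply_label(doc: Document, label: str) -> None:
    """Mark the document and protect it according to what the label defines."""
    doc.label = label
    doc.protected_as = label


def provisional_mark(doc: Document, first_label: str) -> None:
    """After step A2: protect the document at once with the model's label."""
    apply_label(doc, first_label)


def reconcile(doc: Document, first_label: str, second_label: str) -> bool:
    """Step A4: the user's label prevails when it differs from the model's label.

    Returns True when the marking was changed.
    """
    if second_label != first_label:
        apply_label(doc, second_label)
        return True
    return False


doc = Document("budget.xlsx")
provisional_mark(doc, "internal")                 # first label from the analysis model
changed = reconcile(doc, "internal", "secret")    # second label from the user
```

Marking the document immediately after step A2 means it is never left unprotected while the user's label is pending; the comparison in step A4 then only has to act when the two labels disagree.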

A document confidentiality level management method, applied to a document confidentiality level management system, the document confidentiality level management system including a plurality of document confidentiality level setting systems and a federated learning system, each of the document confidentiality level setting systems including a data leakage prevention server and an artificial intelligence analysis module and being connected to the federated learning system, the method comprising the following steps: each of the document confidentiality level setting systems performs the following steps: A1. a terminal device provides a document; A2. the artificial intelligence analysis module analyzes the confidentiality level of the document according to an analysis model and provides a first confidentiality level label to the data leakage prevention server according to the analysis result; A3. a user analyzes the confidentiality level of the document and provides a second confidentiality level label to the data leakage prevention server; A4. the data leakage prevention server compares the first confidentiality level label with the second confidentiality level label; when the first confidentiality level label is the same as the second confidentiality level label, the data leakage prevention server marks the document with the first confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the first confidentiality level label; when the first confidentiality level label differs from the second confidentiality level label, the data leakage prevention server marks the document with the second confidentiality level label and encrypts the document or applies permission control to it according to the content defined by the second confidentiality level label; A5. a model parameter of the analysis model is encrypted to form an encrypted parameter, which is transmitted to the federated learning system; the federated learning system performs the following steps: B1. receiving the encrypted parameters transmitted from the document confidentiality level setting systems; B2. aggregating the encrypted parameters to generate an update parameter; B3. transmitting the update parameter to each of the document confidentiality level setting systems; each of the document confidentiality level setting systems further performs the following steps: A6. each of the document confidentiality level setting systems receives the update parameter; A7. each of the artificial intelligence analysis modules updates the analysis model according to the update parameter; wherein, in step A3, the user selects one benchmark model from among a plurality of benchmark models, the artificial intelligence analysis module analyzes the confidentiality level of the document according to the benchmark model selected by the user, and the user provides the second confidentiality level label according to the analysis result.
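
The benchmark-model selection in step A3 can be illustrated with a small sketch. The two toy classifiers, the registry `BENCHMARKS`, and the label strings are hypothetical stand-ins; the claim does not say what the benchmark models are or how the user arrives at the final label from the analysis result, so the optional override here is an assumption.

```python
from typing import Callable, Dict, Optional

BenchmarkModel = Callable[[str], str]


def keyword_model(text: str) -> str:
    """Toy benchmark: flags documents that mention salaries."""
    return "confidential" if "salary" in text.lower() else "general"


def length_model(text: str) -> str:
    """Toy benchmark: flags long documents."""
    return "confidential" if len(text) > 40 else "general"


# Hypothetical registry of benchmark models the user can choose from.
BENCHMARKS: Dict[str, BenchmarkModel] = {
    "keyword": keyword_model,
    "length": length_model,
}


def second_label(text: str, chosen: str, override: Optional[str] = None) -> str:
    """Step A3: run the user-selected benchmark model, then let the user decide.

    The user either accepts the model's suggestion or supplies a different label.
    """
    if chosen not in BENCHMARKS:
        raise KeyError(f"unknown benchmark model: {chosen}")
    suggestion = BENCHMARKS[chosen](text)
    return override if override is not None else suggestion


accepted = second_label("Salary table 2020", "keyword")
overridden = second_label("Salary table 2020", "length", override="secret")
```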