TWI737073B

TWI737073B - Timing analysis system and method for petition cases

Info

Publication number: TWI737073B
Application number: TW108145116A
Authority: TW
Inventors: 陳碧弘; 陳韋金; 余憲全; 羅瑞麟; 賴彥如
Original assignee: 中華電信股份有限公司
Priority date: 2019-12-10
Filing date: 2019-12-10
Publication date: 2021-08-21
Also published as: TW202123160A

Abstract

This invention provides a time series analysis system and method for petition cases. The system identifies repeated and continuous cases and analyzes whether their reports show regularity. For regular reports, a time series algorithm model predicts the number of cases in future time intervals; for irregular reports, a neural network algorithm model analyzes the probability that case features affect reporting. The analysis results for regular cases and feature cases are presented separately, and the predicted dates and times of incoming reports are listed in order. This simplifies the time series analysis of petition cases, improves the ability to predict the number of reported cases, strengthens the identification of the feature attributes that affect petitions, and improves decision support for the responsible agencies.

Description

Time sequence analysis system and method of petition cases

The present invention relates to a time series analysis system and method for petition cases. More specifically, it is a system and method that gathers historical petition cases and open data and applies statistical analysis and machine learning to analyze the temporal patterns of petition cases.

The citizen service hotlines operated by county and city governments are intended to resolve citizens' problems and inconveniences with the urban environment and public facilities. They offer a single telephone number, website, or app through which citizens can handle non-emergency matters (emergencies fall under the 110 and 119 services), such as complaints, reports, and inquiries about noise, garbage, street lights, parks, and roadside trees that constitute a nuisance or a potential hazard, as well as road maintenance and transportation. The intent is to accept reports from the public immediately, register each reported matter as one case, and assign it to the responsible agency as quickly as possible, without cumbersome internal paperwork such as city appearance inspection reports. After receiving the report, the responsible agency replies with the handling status as soon as possible, which establishes a convenient citizen service.

Because the government's citizen service hotline provides free and convenient consultation, rapid dispatch, and a petition acceptance process, it often attracts some citizens to report repeatedly, file large numbers of reports, or even make malicious false reports for various reasons. As a result, the petitioner or the reported party and the handling agency spend manpower, money, and other resources and face considerable difficulty. Take a community laundry as an example. The irritating smell of cleaning solvent and the pungent smell of detergent filled the community every day, and residents suffered greatly. The neighborhood chief of the affected community called the hotline more than 500 times within 3 years, and the responsible unit visited the site more than 200 times. On some visits the inspectors arrived before the laundry had opened for business, and on others the testing equipment was too limited to find evidence of pollution. The unit therefore came to doubt the petitioner's motives and even blacklisted the petitioner. Only after the responsible unit changed its odor pollutant detection tool was air pollution above the standard detected and the petition shown to be true, an outcome contrary to the goals of the citizen service hotline.

In addition, petitions that are hard to resolve, such as those about illegal construction, illegal parking, homeless people, illegal occupation of roads, and street vendors, involve many violators, high recidivism, and lengthy handling, and some involve the petitioner's private interest. If the petitioner's reports keep arriving and the accepting unit keeps repeating its visits, the hotline risks becoming a tool for harassing specific parties.

For repeated petition cases of the same case type that occur at nearby locations, it would help if, as soon as a petition case is received, the system could automatically determine the case types and historical reporting times of past petitions near the reported location, predict the temporal pattern of the reports, find an effective way to handle them, and select a suitable time interval and detection technique for an on-site inspection so that the problem is found accurately. Repeated reports that present no new facts or evidence could be answered by closing the case. This would improve service efficiency, effectively curb abuse by citizens, save inspection manpower, and raise citizen satisfaction.

It can thus be seen that the conventional approach described above still suffers from poor system usability. It is not a convenient design that can be widely applied, and it is in urgent need of improvement.

In view of the shortcomings of the conventional approach described above, the inventors sought to improve on it and, after years of dedicated research, succeeded in developing the present time series analysis system and method for petition cases.

An object of the present invention is to provide a time series analysis system and method for petition cases that uses statistical analysis and machine learning to improve a smart city's ability to predict the time intervals in which petition cases are reported, to help responsible units analyze the patterns of petition cases in real time and plan dispatch periods in advance, to improve the efficiency of extracting public sentiment information and providing service, and to raise citizen satisfaction.

A second object of the present invention is to strengthen a smart city's response to continuous petition cases. Cases are classified in advance as repeated or single petition cases, the reporting trend is analyzed, and cases are grouped into continuous or occasional petition cases, so that responsible units can adjust inspection priorities and rule out in advance detection techniques that have already been used without effect, making citizen services run more smoothly.

Another object of the present invention is to analyze in advance the number of reports and the handling results for the same case type at nearby locations. Through data mining of petition cases, reported cases that have already been handled or verified can be closed, and the handling procedure can be streamlined for reports that are regular or that clearly consume agency resources, which effectively curbs abuse of resources.

A further object of the present invention is to improve a smart city's ability to predict petition cases. By forecasting the time series of petition reports, the invention helps the responsible government agencies and front-line inspectors keep track of the reporting time intervals predicted by the algorithm and prepare countermeasures in advance. This reduces the public's demand for dispatch services from a preventive standpoint and increases the impact of artificial intelligence prediction technology on the city government, citizens, and society as a whole.

To achieve the above objects, the present invention provides a time series analysis system for petition cases, comprising: an external data collection module that preprocesses data obtained by interfacing with a petition integration system and an open data system, stores the data in a data repository, and then, according to the reporting time and reporting location of a petition case, searches the data repository for historical petition cases within a preset time range and a preset location range to determine whether the petition case is a repeated and continuous petition case at a nearby location; a petition time series exploration module that receives the repeated and continuous petition case and analyzes whether the change in the number of cases over different time units shows regularity; a regular case prediction module that, if regularity exists, builds a model and makes predictions using a time series algorithm, thereby providing time series analysis results for regular cases; and a feature case prediction module that, if no regularity exists, builds a model and makes predictions using a neural network algorithm, thereby providing time series analysis results for feature cases.

To achieve the above objects, the present invention also provides a time series analysis method for petition cases, comprising collection of external data, exploration of the petition time series, prediction of regular cases, and prediction of feature cases.

Collection of external data: the latest petitions are received in real time and external open data is obtained on a schedule. After preprocessing such as data extraction, transformation, cleaning, and de-identification, historical petition cases within a preset time range and location range are searched according to the reporting time and location of the petition, to determine whether the petition is a single petition case or a repeated petition case at a nearby location. For a repeated petition case, it is determined whether the reporting frequency exceeds a preset value. If it does, the case is classified as a continuous petition case; otherwise, it is classified as an occasional petition case. By establishing how repetitive and persistent past petitions were, this step supports prediction of future petition trends.

Exploration of the petition time series: for repeated and continuous petition cases, descriptive statistics are used to profile the case counts, showing how the counts vary and how concentrated or dispersed they are. Time series plots of case counts are drawn for different basic time units such as hour, day, week, month, quarter, or year, to observe whether the case counts vary regularly in each time unit and to analyze the temporal rules of reporting. If the reporting times are regular, a time series algorithm is used for modeling and prediction; otherwise, a neural network algorithm is used. This step classifies cases by whether they exhibit a regular time series pattern, which provides important information for choosing the subsequent solution method.

Prediction of regular cases: using statistical analysis, regular cases are modeled with a time series algorithm based on the historical petition times and case counts. With the date as the primary key and the historical case count as both the input variable and the predicted variable, time series analysis predicts the case counts for a number of future dates. By mining how the case count changes over time, this step predicts the number of reported cases in future time intervals and the trend of the case count.

Prediction of feature cases: using machine learning, irregular cases are modeled with a neural network algorithm based on the features of the case itself and the feature attributes of the collected disaster, geographic, meteorological, environmental quality, and historical petition data for the neighboring area, and the probability that each feature affects the occurrence of a reported case is evaluated. Features are the inputs selected for the neural network analysis; important features help predict how strongly a feature influences case reporting and can improve model accuracy. Features with greater influence are given priority so that the features affecting case reporting are understood with increasing accuracy. The time intervals in which the features appear are listed in order, and the time series prediction results can further be used to support decisions and to schedule inspection dispatch times in advance.

The time series analysis method for petition cases in a smart city gathers the citizen petitions collected by the government petition platform and, from among many trivial, partial, and unrelated messages, effectively analyzes and predicts future reporting time intervals. From the standpoint of solving citizens' problems, it provides the decision support center with decision information of reference value so as to respond to citizens' service needs. This is a key task in learning of citizens' hardships, gauging public opinion, and managing citizen relations.

Applying the time series analysis method for petition cases predicts the time intervals of case reports in advance and helps the responsible government agencies adjust their dispatch strategy. This effectively improves the timeliness and accuracy of personnel scheduling and dispatch and, from a preventive standpoint, reduces citizens' demand for dispatch services.

1‧‧‧Time series analysis system for petition cases

11‧‧‧External data collection module

12‧‧‧Petition time series exploration module

13‧‧‧Regular case prediction module

14‧‧‧Feature case prediction module

2‧‧‧Interface platform

21‧‧‧Petition integration system

22‧‧‧Open data system

23‧‧‧Big data analysis system

3‧‧‧Data repository

31‧‧‧Case database

32‧‧‧Disaster information database

33‧‧‧Geographic information database

34‧‧‧Meteorological information database

35‧‧‧Environmental quality database

S101~S104, S201~S208, S301~S306, S401~S408, S501~S508‧‧‧Steps

Figure 1A is a schematic diagram of the architecture of the time series analysis system for petition cases of the present invention;

Figure 1B is a schematic flow diagram of the time series analysis method for petition cases of the present invention;

Figure 2 is a flow chart of the collection of external data in the time series analysis method for petition cases of the present invention;

Figure 3 is a flow chart of the exploration of the petition time series in the time series analysis method for petition cases of the present invention;

Figure 4 is a flow chart of the prediction of regular cases in the time series analysis method for petition cases of the present invention;

Figure 5 is a flow chart of the prediction of feature cases in the time series analysis method for petition cases of the present invention;

Figures 6A to 6D are time series plots of the number of cases accumulated by hour, day, week, and month in Embodiment 1 of the time series analysis system and method for petition cases of the present invention;

Figure 7 is a time series plot of the predicted number of cases for the next 30 days in Embodiment 1 of the time series analysis system and method for petition cases of the present invention;

Figures 8A to 8D are time series plots of the number of cases accumulated by hour, day, week, and month in Embodiment 2 of the time series analysis system and method for petition cases of the present invention; and

Figure 9 is a bar chart of the gain values showing how features affect whether a case is reported, in Embodiment 2 of the time series analysis system and method for petition cases of the present invention.

Please refer to Figure 1A, a schematic diagram of the system architecture of a time series analysis system for petition cases of the present invention. A petition case time series analysis system 1 is capable of external data collection, petition time series exploration, regular case prediction, and feature case prediction. The petition case time series analysis system 1 includes an external data collection module 11, a petition time series exploration module 12, a regular case prediction module 13, and a feature case prediction module 14.

The external data collection module 11 interfaces externally with the interface platform 2 and receives the latest data from the petition integration system 21 and the open data system 22 in real time. After preprocessing such as extraction, transformation, cleaning, and de-identification, it searches the data repository 3, according to the reporting time and location of the petition case, for historical petition cases within a preset time range (for example, within one year) and a preset location range (for example, within 100 meters), and thereby determines whether the petition case is a single petition case or a repeated petition case at a nearby location. For a repeated petition case, it determines whether the reporting frequency exceeds a preset value (for example, 1 report per month). If it does, the case is classified as a continuous petition case; otherwise, it is classified as an occasional petition case.

The petition time series exploration module 12 uses descriptive statistics to profile the case counts of repeated and continuous petition cases. This includes calculating the total number of cases and the mean, mode, median, maximum, minimum, interquartile range, variance, and standard deviation of the daily case count, in order to understand how concentrated or dispersed the daily counts are. The module plots time series of the case counts of historical and newly reported cases for different basic time units such as hour, day, week, month, quarter, or year, examines the plot for each time unit, and calculates the length of time from one peak to the next peak or from one trough to the next trough. Using basic time units (hour, day, week, month, quarter, or year) or extended time units (such as weekend/weekday or morning/afternoon), it calculates whether the case count varies regularly in that time unit. Case regularity means a rule or cycle in which reported cases are received in a fixed time unit or at regular intervals. If the number of cases that occurred in a given basic or extended time unit exceeds 50% of all cases, future cases are clearly more likely to occur in that time interval than in others, and that time interval has a recurring rule. If the reporting times are regular, a time series algorithm is used for modeling and prediction; otherwise, a neural network algorithm is used.

The regular case prediction module 13 uses statistical analysis. For regular cases, it builds a model with a time series algorithm based on the historical petition times and case counts, mines how the case count changes over time, and predicts future case counts. Combined with a ranking of the time periods in which the petition was reported, it then predicts the dates and times at which reports are likely to be received in the future.

The feature case prediction module 14 uses machine learning. For irregular cases, it builds a model with a neural network algorithm based on features of the current and historical cases, such as case type, location, time, description, severity level, and status, together with features of external open data collected for the neighboring area (disaster and geographic data) and for nearby times (meteorological and environmental quality data). It evaluates the probability that each of these features affects the occurrence of a reported case. Through repeated modeling and evaluation, the features that affect case reporting are understood with increasing accuracy, and the features with greater influence are listed first. The module then predicts the dates and times at which reports are likely to be received in the future, giving the responsible units of the smart city a data basis grounded in statistical analysis and machine learning. The predictions can further support decisions and allow inspection dispatch times to be scheduled in advance.

The interface platform 2 includes the petition integration system 21, the open data system 22, and the big data analysis system 23. Through the petition integration system 21, citizens can call the citizen service hotline, where an operator takes the citizen's statement directly, or they can use the city government petition system, the citizen petition app, the municipal mailbox, municipal forums, and similar channels to enter the people, matters, times, places, and objects involved in the petition, supplemented by uploaded photos, videos, and other supporting evidence. The scope covers suggestions on city governance, reports of violations or deficiencies, and protection of public rights and interests. After a citizen files a report, the petition system tracks the case and provides follow-up services such as dispatch. There may be a plurality of open data systems 22, which provide interfaces for external systems to obtain city-related open data on disasters, geography, meteorology, and environmental quality. The big data analysis system 23 provides an interface for external systems: the interfacing system selects the algorithm to be modeled and the ratio of training set to test set, and imports the file data to be predicted. After repeated modeling and evaluation, the best prediction result is output.

The data repository 3 stores, in the case database 31, the petition data that the external data collection module 11 obtains by interfacing with the petition integration system 21, and stores the data obtained by interfacing with the open data system 22 as follows. Disaster information obtained from a disaster information open system, such as disasters involving roadside trees, signboards, roads, tunnels, bridges, flooding, water conservancy facilities, and basic public infrastructure, is loaded into the disaster information database 32. Geographic information obtained from a geographic information open system, such as map layers, grids, businesses, and disaster prevention and relief data, is loaded into the geographic information database 33. Meteorological observation data obtained from a meteorological data open system, such as the temperature, rainfall, wind direction, wind speed, altitude, and relative humidity at weather stations, is loaded into the meteorological information database 34. Environmental quality monitoring data obtained from an environmental quality open system, such as the air quality index value, index status, and index display color, together with ozone, carbon monoxide, sulfur dioxide, nitric oxide, nitrogen dioxide, and nitrogen oxides, is loaded into the environmental quality database 35. Together these serve as the data sources for time series analysis.

Please refer to Figure 1B, a schematic flow diagram of a time series analysis method for petition cases of the present invention. In step S101, external data is collected, and the process proceeds to step S102. In step S102, exploration of the petition time series is performed on repeated and continuous petition cases, which separates them into regular and irregular petition cases. In step S103, prediction of regular cases is performed on the regular petition cases. In step S104, prediction of feature cases is performed on the irregular petition cases.

Please refer to Figure 2, a flow chart of the collection of external data in a time series analysis method for petition cases of the present invention.

In step S201, the latest petitions are received in real time and external open data is obtained on a schedule, and both are written to the data repository. Specifically, the external data collection module interfaces with the petition integration system, receives the latest petition data in real time, and writes it to the case database. On a schedule, it interfaces with the open data system to obtain disaster, geographic, meteorological, and environmental quality information, and writes them to the disaster information database, the geographic information database, the meteorological information database, and the environmental quality database, respectively.

In step S202, historical petition data within a preset time range and a preset location range is queried according to the reporting time and location of the petition. Specifically, the external data collection module queries the case database for historical petition data within the preset time range and the preset location range, based on the reporting time and location of the petition.
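
The query in step S202 can be sketched as a filter over historical cases. This is a minimal illustration, not the patented implementation: the `Case` record, the function names, and the use of the haversine great-circle distance are assumptions made for the example. The one-year and 100-meter defaults follow the example values given for the external data collection module 11.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Case:
    """Hypothetical petition record; field names are assumptions."""
    case_type: str
    reported_at: datetime
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    earth_radius_m = 6_371_000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def nearby_history(new_case, history, max_days=365, max_meters=100.0):
    """Return historical cases reported within max_days before new_case
    and within max_meters of its location."""
    earliest = new_case.reported_at - timedelta(days=max_days)
    return [
        c for c in history
        if earliest <= c.reported_at < new_case.reported_at
        and haversine_m(new_case.lat, new_case.lon, c.lat, c.lon) <= max_meters
    ]
```

Filtering the returned list by `case_type` would then give the input for the same-case-type check in step S203.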

In step S203, it is checked whether historical petition data of the same case type exists. If yes, the process proceeds to step S204, and the petition case is classified as a repeated petition case; if no, the process proceeds to step S205, and the petition case is classified as a single petition case.

In step S206, it is determined whether the reporting frequency of the petition case exceeds a preset value. The formula for the reporting frequency is: [formula image not reproduced]. If yes, the process proceeds to step S207, and the repeated petition case is classified as a continuous petition case; if no, the process proceeds to step S208, and the repeated petition case is classified as an occasional petition case.
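
The decisions in steps S203 to S208 can be sketched as follows. Because the frequency formula is not available in this text, the sketch assumes that the reporting frequency is the number of same-type historical cases divided by the length of the search window in months, which matches the unit of the example threshold (1 report per month). The function name and parameters are hypothetical.

```python
def classify_petition(same_type_count, window_months, threshold_per_month=1.0):
    """Classify a newly received petition case.

    same_type_count: number of historical cases of the same case type found
        within the preset time and location ranges (step S203).
    window_months: length of the preset time range in months.
    threshold_per_month: preset frequency value (example value from the text).

    ASSUMPTION: frequency = same_type_count / window_months. The source
    formula is not reproduced, so this definition is illustrative only.
    """
    if same_type_count == 0:
        return "single"            # step S205
    frequency = same_type_count / window_months   # step S206 (assumed formula)
    if frequency > threshold_per_month:
        return "continuous"        # step S207
    return "occasional"            # step S208
```

Under this assumed definition, 24 same-type cases in a 12-month window would be classified as continuous, and 6 cases in the same window as occasional.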

Please refer to Figure 3, a flow chart of the exploration of the petition time series in a time series analysis method for petition cases of the present invention.

In step S301, descriptive statistics are calculated for repeated and continuous petition cases. Specifically, the petition time series exploration module takes the cases that the external data collection module has classified as repeated and continuous and calculates descriptive statistics for them, including the mean, mode, median, maximum, minimum, interquartile range, variance, and standard deviation of the case count per time unit (hour, day, week, month, quarter, or year), in order to understand how the case count varies and how concentrated or dispersed it is.
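
The statistics listed in step S301 can be computed with Python's standard `statistics` module. This is a minimal sketch; the function name is hypothetical, and the choice of population (rather than sample) variance and of the default quartile method are assumptions, since the text does not specify them.

```python
import statistics


def describe_case_counts(counts):
    """Descriptive statistics for a list of case counts per time unit."""
    q1, _, q3 = statistics.quantiles(counts, n=4)  # quartile cut points
    return {
        "total": sum(counts),
        "mean": statistics.mean(counts),
        "mode": statistics.mode(counts),
        "median": statistics.median(counts),
        "max": max(counts),
        "min": min(counts),
        "iqr": q3 - q1,                            # interquartile range
        "variance": statistics.pvariance(counts),  # population variance (assumed)
        "stdev": statistics.pstdev(counts),
    }


# Example: daily case counts over eight days (made-up numbers).
summary = describe_case_counts([2, 4, 4, 4, 5, 5, 7, 9])
```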

In step S302, time series charts of the number of cases are drawn for different basic time units, such as hour, day, week, month, quarter, or year; that is, the case counts are arranged in time-unit order to show how the number of petition cases changes.

In step S303, the proportion of the number of cases in each time unit to the total number of cases is calculated, using either basic time units or extended time units.
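
The counting in step S302 and the proportions in step S303 can be sketched as follows. The names `UNIT_KEYS` and `proportions_by_unit` and the sample timestamps are hypothetical; only three basic time units are shown.

```python
from collections import Counter
from datetime import datetime

# Hypothetical key functions mapping a report timestamp to a basic time unit.
UNIT_KEYS = {
    "hour": lambda ts: ts.hour,
    "weekday": lambda ts: ts.weekday(),  # 0 = Monday ... 6 = Sunday
    "month": lambda ts: ts.month,
}

def proportions_by_unit(timestamps, unit):
    """Count cases per time unit (S302) and convert the counts to proportions (S303)."""
    counts = Counter(UNIT_KEYS[unit](ts) for ts in timestamps)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

# Illustrative report timestamps (not taken from the case database).
reports = [
    datetime(2019, 4, 27, 14), datetime(2019, 4, 28, 14),
    datetime(2019, 4, 28, 20), datetime(2019, 4, 30, 9),
]
by_hour = proportions_by_unit(reports, "hour")
by_weekday = proportions_by_unit(reports, "weekday")
```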

In step S304, it is checked whether any time unit accounts for more than 50% of the cases, in order to determine whether the variation in the number of cases is regular. If so, the variation is regular and the flow proceeds to step S305, where a time series algorithm is used for modeling; if not, the variation is irregular and the flow proceeds to step S306, where an artificial neural network algorithm is used for modeling.
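
The branch in step S304 amounts to a threshold test on the proportions. A minimal sketch, with the hypothetical function name `choose_model` and hypothetical return labels:

```python
def choose_model(proportions, threshold=0.5):
    """Return the modeling branch selected in step S304.

    proportions: mapping of time unit -> share of all cases (0.0 to 1.0).
    """
    if any(share > threshold for share in proportions.values()):
        return "time_series"     # S305: variation is regular
    return "neural_network"      # S306: variation is irregular
```

For instance, a weekend share of about 0.67 selects the time series branch, whereas hourly shares that all stay below 0.15 select the neural network branch.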

Please refer to FIG. 4, which is a flowchart of regular case prediction in the time series analysis method for petition cases of the present invention.

In step S401, the historical cases are divided into N equal parts by equal time intervals. Specifically, for the regular cases identified by the petition time series exploration module, the regular case prediction module divides the real-time case and its historical cases into N equal parts by equal time intervals.

In step S402, the last (Nth) part is taken as the test set, and the preceding N-1 parts are taken as the training set.
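
Steps S401 and S402 describe a chronological hold-out split. The sketch below uses the hypothetical name `split_last_part`; when the series length is not a multiple of N, it assumes that the leftover observations stay in the training set, a detail the text does not specify.

```python
def split_last_part(series, n_parts):
    """Split a chronological series into N equal parts and hold out the last part."""
    part_len = len(series) // n_parts  # ASSUMPTION: any remainder stays in training
    if part_len == 0:
        raise ValueError("series is shorter than n_parts")
    return series[:-part_len], series[-part_len:]

# Twelve observations in four parts: the last three form the test set.
train, test = split_last_part(list(range(1, 13)), n_parts=4)
```

Keeping the test set at the end of the series avoids training on observations that come after the ones being predicted.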

In step S403, a time series model is built using the ARIMA (Autoregressive Integrated Moving Average) algorithm. In ARIMA(p, d, q), AR denotes autoregression and p is the number of autoregressive terms; MA denotes moving average and q is the number of moving average terms; and d is the number of times the time series is differenced to make it stationary.
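
To make the role of d concrete, the following sketch shows only the differencing part of ARIMA; it does not fit the autoregressive or moving average terms. The function name `difference` and the sample series are hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'I' in ARIMA)."""
    for _ in range(d):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series

trend = [1, 3, 6, 10, 15]        # a series with an accelerating trend
once = difference(trend, d=1)    # still trending
twice = difference(trend, d=2)   # constant, so d = 2 removes the trend here
```

Each round of differencing shortens the series by one observation.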

In step S404, a forecast performance measure such as RMSE (Root Mean Square Error), MAE (Mean Absolute Error), or MAPE (Mean Absolute Percentage Error) is used to calculate the gap between the predicted and actual values and to evaluate whether the error is within a tolerance interval. If it is not, the flow proceeds to step S405, where the number of autoregressive terms, the number of moving average terms, and the differencing order are adjusted; if it is, the flow proceeds to step S406, where the best parameters obtained from ARIMA modeling are used to forecast the number of cases in future time intervals.
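
The three error measures named in step S404 have standard definitions, sketched below in plain Python. The function names are hypothetical. Since daily case counts often contain zeros, for which the percentage error is undefined, the MAPE sketch assumes that such observations are skipped; the text does not say how they are handled.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    # ASSUMPTION: observations with an actual value of 0 are skipped.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actual values are 0")
    return 100 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)

def within_tolerance(error, tolerance):
    """S404 decision: True leads to S406 (forecast), False to S405 (re-tune p, d, q)."""
    return error <= tolerance
```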

In step S407, the dates and times at which reports are likely to be received in the future time interval are listed and ranked according to the maximum number of cases in the selected time unit.

In step S408, the time series analysis results for regular cases are presented.

The above uses an ARIMA model to forecast the number of cases in the next N time intervals for regular petitions. The future number of reported cases is the prediction target, and the sequence of case counts over time is treated as a random time series. By analyzing factors such as trend and regularity in that series, the correlation or dependence among the case counts in different time intervals is extracted, and a time series algorithm is used for modeling and prediction, so that historical and real-time case counts can be used to predict future case counts.

Embodiment one:

For example, suppose that noise petitions are received continuously at a certain location over the date range 2018-09-01 to 2019-04-30. Over these 8 months (242 days), a total of 136 reports are received; for days without a report, the number of cases is set to 0. The {report date, number of cases} data are, in brief: {2018-09-01, 2}, {2018-09-02, 0}, {2018-09-03, 0}, {2018-09-04, 0}, {2018-09-05, 0}, {2018-09-06, 0}, {2018-09-07, 0}, {2018-09-08, 0}, {2018-09-09, 1}, {2018-09-10, 0}, ..., {2019-04-21, 1}, {2019-04-22, 0}, {2019-04-23, 1}, {2019-04-24, 1}, {2019-04-25, 0}, {2019-04-26, 1}, {2019-04-27, 2}, {2019-04-28, 5}, {2019-04-29, 0}, {2019-04-30, 1}.
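
Setting the number of cases to 0 on days without a report can be sketched as follows. The function name `zero_fill` is hypothetical, and the `sparse` dictionary holds only three of the report dates from this embodiment, so the filled series is a partial illustration rather than the full data set.

```python
from datetime import date, timedelta

def zero_fill(daily_counts, start, end):
    """Return one (date, count) pair per day from start to end, with 0 for missing days."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return [(day, daily_counts.get(day, 0)) for day in days]

# Only days with at least one report are stored (a subset of the data above).
sparse = {date(2018, 9, 1): 2, date(2018, 9, 9): 1, date(2019, 4, 28): 5}
series = zero_fill(sparse, date(2018, 9, 1), date(2019, 4, 30))
```

The resulting series has one entry per calendar day, which is what a time series model with a fixed daily interval expects.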

Taking hour, day, week, and month as time units, time series charts of the cumulative number of cases by hour, day of the month, day of the week, and month are drawn as shown in FIGS. 6A to 6D. The charts show that the cumulative number of cases is highest at 14:00 and 20:00; that the 18th day of the month has the highest cumulative number of cases; that the weekend days (Saturday and Sunday) have markedly more cases than weekdays; and that March has the most cases.

Using basic and extended time units respectively, the probability of occurrence for each time unit is calculated. With the week as the time unit, the percentage of cases on each day of the week is {Monday, 6.62%}, {Tuesday, 5.15%}, {Wednesday, 10.29%}, {Thursday, 6.62%}, {Friday, 4.41%}, {Saturday, 23.53%}, {Sunday, 43.38%}. With weekday and weekend as the time units, the percentages are {weekday, 33.09%} and {weekend, 66.91%}. Because the weekend share of cases exceeds 50%, the variation in the number of cases is regular, so a time series algorithm is chosen for modeling and prediction.
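
The grouping of basic units into extended units can be checked with a short calculation. The per-day counts below are not stated in the text; they are back-computed from the stated percentages of the 136 cases and should be read as an assumption. The names `EXTENDED_UNITS` and `group_shares` are hypothetical.

```python
# ASSUMPTION: per-day counts back-computed from the stated percentages of 136 cases.
weekday_counts = {"Mon": 9, "Tue": 7, "Wed": 14, "Thu": 9, "Fri": 6, "Sat": 32, "Sun": 59}

# Extended time units: each one groups several basic time units.
EXTENDED_UNITS = {
    "weekday": ["Mon", "Tue", "Wed", "Thu", "Fri"],
    "weekend": ["Sat", "Sun"],
}

def group_shares(counts, groups):
    """Percentage of all cases that falls in each extended time unit."""
    total = sum(counts.values())
    return {
        name: round(100 * sum(counts[key] for key in keys) / total, 2)
        for name, keys in groups.items()
    }

shares = group_shares(weekday_counts, EXTENDED_UNITS)
is_regular = any(share > 50 for share in shares.values())
```

No single day of the week exceeds 50%, so the regularity only becomes visible once Saturday and Sunday are combined into the extended unit.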

Using the ARIMA model with its tuned best-fit parameters, the number of cases over the next 30 days is forecast; refer to the forecast time series chart of the number of cases in FIG. 7. The resulting {date, predicted number of cases} values are {2019-05-01, -1.86}, {2019-05-02, 0.48}, {2019-05-03, 0.25}, {2019-05-04, 0.65}, {2019-05-05, 2.02}, {2019-05-06, 0.17}, {2019-05-07, 0.91}, {2019-05-08, -0.06}, {2019-05-09, 0.95}, {2019-05-10, 0.31}, {2019-05-11, 0.87}, {2019-05-12, 1.99}, {2019-05-13, -0.08}, {2019-05-14, 0.98}, {2019-05-15, -0.37}, {2019-05-16, 1.19}, {2019-05-17, 0.15}, {2019-05-18, 0.86}, {2019-05-19, 2.07}, {2019-05-20, -0.15}, {2019-05-21, 1.06}, {2019-05-22, -0.63}, {2019-05-23, 1.29}, {2019-05-24, -0.05}, {2019-05-25, 0.80}, {2019-05-26, 2.10}, {2019-05-27, -0.29}, {2019-05-28, 1.15}, {2019-05-29, -0.85}, {2019-05-30, 1.45}.

The three dates with the highest predicted numbers of cases are selected in order: {2019-05-26, 2.10}, {2019-05-19, 2.07}, and {2019-05-05, 2.02}. Since, with the hour as the time unit, the maximum number of cases falls at 14:00, the top three date-times suitable for dispatch are predicted to be, in order, 14:00 on 2019-05-26, 14:00 on 2019-05-19, or 14:00 on 2019-05-05.
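
The ranking in this embodiment can be reproduced with a short sketch that combines the 30 forecast values with the peak hour. The function name `top_dispatch_times` and its parameters are hypothetical; the forecast values are those listed above for 2019-05-01 through 2019-05-30.

```python
from datetime import date, datetime, time, timedelta

# Forecast values for 2019-05-01 through 2019-05-30, in date order.
forecast_values = [
    -1.86, 0.48, 0.25, 0.65, 2.02, 0.17, 0.91, -0.06, 0.95, 0.31,
    0.87, 1.99, -0.08, 0.98, -0.37, 1.19, 0.15, 0.86, 2.07, -0.15,
    1.06, -0.63, 1.29, -0.05, 0.80, 2.10, -0.29, 1.15, -0.85, 1.45,
]
forecast = {date(2019, 5, 1) + timedelta(days=i): v for i, v in enumerate(forecast_values)}

def top_dispatch_times(forecast, peak_hour, k=3):
    """Rank dates by predicted case count and attach the historical peak hour."""
    ranked_days = sorted(forecast, key=forecast.get, reverse=True)[:k]
    return [datetime.combine(day, time(peak_hour)) for day in ranked_days]

slots = top_dispatch_times(forecast, peak_hour=14)
```

All three selected dates are Sundays, which agrees with the weekend concentration observed in the historical data.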

Please refer to FIG. 5, which is a flowchart of characteristic case prediction in the time series analysis method for petition cases of the present invention.

In step S501, historical cases and characteristics from open data such as disaster, geographic, meteorological, and environmental quality data are queried as predictor variables according to the start and end times and the location range. Specifically, for the irregular cases identified by the petition time series exploration module, the characteristic case prediction module queries historical cases and characteristics from such open data as predictor variables, according to the start and end times and the location range.

In step S502, the last T% of the data is taken as the test set, and the remaining (100-T)% as the training set.

In step S503, an ANN (Artificial Neural Network) algorithm is used to build a model of the influence of the predictor variables on whether a report occurs.

In step S504, a prediction is correct if a report is predicted for a time interval and one actually occurs, or if no report is predicted and none actually occurs; a prediction is wrong if a report is predicted but none occurs, or if no report is predicted but one occurs. A classification matrix is used to calculate whether the accuracy is greater than a preset value. If it is not, the flow proceeds to step S505, where the predictor variables are adjusted by retaining the characteristic variables with better predictive performance and removing those with poorer predictive performance; if it is, the flow proceeds to step S506, where the better predictor variables obtained from ANN modeling are used to predict the probability weight of occurrence of each predictor variable, and the gain value of each characteristic variable is calculated as: gain value = predicted probability of occurrence of the characteristic variable / original probability of occurrence of the characteristic variable.
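
The correctness rule in step S504 corresponds to the usual two-by-two classification (confusion) matrix, from which the accuracy follows directly. A minimal sketch, with hypothetical function names and illustrative data:

```python
def confusion_matrix(actual, predicted):
    """Tally the four outcomes of step S504 for boolean 'report occurred' labels."""
    matrix = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            matrix["tp"] += 1      # report predicted, report occurred
        elif not p and not a:
            matrix["tn"] += 1      # no report predicted, none occurred
        elif p and not a:
            matrix["fp"] += 1      # report predicted, none occurred
        else:
            matrix["fn"] += 1      # no report predicted, one occurred
    return matrix

def accuracy(matrix):
    """Share of correct predictions among all predictions."""
    return (matrix["tp"] + matrix["tn"]) / sum(matrix.values())

# Illustrative labels for five time intervals.
actual = [True, True, False, False, True]
predicted = [True, False, False, True, True]
matrix = confusion_matrix(actual, predicted)
acc = accuracy(matrix)
```

The accuracy is then compared with the preset value to decide whether the predictor variables need to be adjusted.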

In step S507, the dates and times in the future time interval that match each characteristic are listed and ranked in descending order of the characteristics' gain values.

In step S508, the time series analysis results for characteristic cases are presented.

The above uses an artificial neural network model, for irregular petitions, to predict the weights that influence whether a report occurs. A neural network improves learning by adjusting weights; the algorithm itself has no variable selection function, and every variable is assigned its own weight. If the selected characteristics have no effect, or an unstable effect, on the prediction result, the model tends to learn poorly or converge too slowly; highly correlated characteristics can also prevent the model from converging. Variables must therefore be selected carefully when modeling with a neural network.

When modeling and predicting with a neural network, inputting a predictor variable and its value range yields the probability weight of that variable for the presence or absence of a reported case. Gain values are calculated separately for the presence and the absence of a reported case. A gain value greater than 1 means that the predicted probability of occurrence of the variable is higher than its original probability of occurrence; the variable and its value range are then regarded as tending to occur, and the result is presented as a relative gain value.
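
The gain value described above is a simple ratio, sketched below. The function name `gain_value` and the numbers in the example are hypothetical; the original probability is taken as the share of cases that have the given option or value range.

```python
def gain_value(predicted_probability, value_count, total_count):
    """Gain = predicted probability of occurrence / original probability of occurrence."""
    original_probability = value_count / total_count
    return predicted_probability / original_probability

# Illustrative numbers: 25 of 100 cases have the value, and the model predicts 0.5.
gain = gain_value(predicted_probability=0.5, value_count=25, total_count=100)
tends_to_occur = gain > 1
```

A gain of 2.0 here means the model considers the value twice as likely as its historical share would suggest.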

Embodiment two:

For example, suppose that environmental pollution petitions are received continuously in a certain area over the date range 2018-09-01 to 2019-06-30. Over these 10 months (303 days), a total of 2437 reports are received; for days without a report, the number of cases is set to 0. The {report date, number of cases} data are, in brief: {2018-09-01, 4}, {2018-09-02, 7}, {2018-09-03, 5}, {2018-09-04, 3}, {2018-09-05, 7}, {2018-09-06, 6}, {2018-09-07, 7}, {2018-09-08, 10}, {2018-09-09, 12}, {2018-09-10, 4}, ..., {2019-06-21, 3}, {2019-06-22, 6}, {2019-06-23, 12}, {2019-06-24, 9}, {2019-06-25, 10}, {2019-06-26, 10}, {2019-06-27, 7}, {2019-06-28, 9}, {2019-06-29, 5}, {2019-06-30, 7}.

Taking hour, day, week, and month as time units, time series charts of the cumulative number of cases by hour, day, week, and month are drawn as shown in FIGS. 8A to 8D. The cumulative number of cases is highest from 18:00 to 19:00; the cumulative numbers of cases across dates, across weekdays and weekends, and across months show no marked variation.

Using basic time units, the probability of occurrence for each time unit is calculated. With the hour as the time unit, the percentage of cases in each hour of the day is {0, 0.53%}, {1, 0.33%}, {2, 0.90%}, {3, 0.66%}, {4, 0.00%}, {5, 0.00%}, {6, 0.16%}, {7, 0.86%}, {8, 1.89%}, {9, 1.89%}, {10, 1.72%}, {11, 3.04%}, {12, 5.70%}, {13, 8.17%}, {14, 6.98%}, {15, 5.87%}, {16, 5.87%}, {17, 10.34%}, {18, 12.97%}, {19, 13.46%}, {20, 11.57%}, {21, 4.84%}, {22, 1.72%}, {23, 0.53%}. Because the hourly case counts show no clearly concentrated, regular pattern, an artificial neural network algorithm is chosen for modeling and prediction.

Attributes such as main case category, subcategory, administrative district, case status, case level, time of occurrence, time of report, time of closure, month, day, day of the week, hour, temperature, rainfall, wind direction, wind speed, altitude, relative humidity, air quality index value, sulfur dioxide, carbon monoxide, ozone, suspended particulates, fine suspended particulates, nitrogen dioxide, nitrogen oxides, nitric oxide, and businesses are used as input variables. Categorical variables present all of their options, while continuous variables are divided into intervals. A neural network model built on the selected predictor variables is used to predict the probability weight with which each characteristic and its value range influences reporting, and the gain values are calculated.

Taking the predictor variable hour with the value 19 as an example, the gain value is calculated as: gain value for reported cases = predicted probability of occurrence at 19:00 / original probability of occurrence at 19:00, where the original probability of occurrence at 19:00 = number of cases at 19:00 / total number of cases. The resulting {predictor variable: option or value range, reported case present or absent, gain value} entries are, in order: {hour: 19, yes, 5.44}, {hour: 18, yes, 5.19}, {hour: 20, yes, 5.03}, {hour: 23, no, 1.19}, {hour: 0, no, 1.19}, {hour: 5, no, 1.19}, {month: 2, no, 1.14}, {day: 3, no, 1.14}, {day of week: Wednesday, yes, 1.8}, {ozone: 44.965-87.922, yes, 1.56}, {wind speed: 1.835-3.558, yes, 1.54}, {air quality index status: unhealthy for sensitive groups, no, 1.08}, {nitrogen dioxide: 20.144-39.427, yes, 1.53}, {air quality index status: unhealthy for all groups, no, 1.07}, {daily maximum temperature: 13.100-22.144, no, 1.07}, {daily maximum temperature: 28.133-35.000, yes, 1.4}, {wind speed: 0.000-0.836, no, 1.05}, {ozone: 2.600-20.046, no, 1.05}, {relative humidity: 0.344-0.642, yes, 1.3}, {sulfur dioxide: 2.947-5.785, no, 1.04}.

For the presentation of the results, refer to FIG. 9, a bar chart of the gain values with which characteristics influence the presence or absence of reported cases. The central vertical axis shows the characteristic variables and their options or value ranges, the two horizontal axes show the gain values for the presence and absence of reported cases, and the length of each bar shows how strongly the characteristic increases the likelihood of the presence or absence of a reported case. At 19:00, 18:00, and 20:00, the probability of a reported case is higher than the probability of none, so the bars extend toward the right side (reported case present); at 23:00, 0:00, and 5:00, the probability of no reported case is higher, so the bars extend toward the left side (no reported case).

Among the different characteristic variables, the three options or value ranges with the highest calculated gain values are selected in order: {hour, 19}, {day of week, Wednesday}, and {ozone, 44.965~87.922}. Based on the options or value ranges of the hour, day-of-week, and ozone characteristics, the top three date-times suitable for dispatch are found to be, in order, 19:00 on 2019-07-03 (Wednesday), 19:00 on 2019-07-01, or 19:00 on 2019-07-02.
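
The selection of the top three characteristics can be sketched as follows: for each characteristic variable, keep its highest-gain option, then rank the variables by that gain. The function name `top_characteristics` is hypothetical; the entries are the "reported case present" results listed in this embodiment. Mapping the selected characteristics to concrete future dates (for example, matching a forecast ozone range) is not shown.

```python
# (variable, option or value range, gain value) for entries with a reported case.
entries = [
    ("hour", "19", 5.44), ("hour", "18", 5.19), ("hour", "20", 5.03),
    ("day of week", "Wednesday", 1.8), ("ozone", "44.965-87.922", 1.56),
    ("wind speed", "1.835-3.558", 1.54), ("nitrogen dioxide", "20.144-39.427", 1.53),
    ("daily maximum temperature", "28.133-35.000", 1.4),
    ("relative humidity", "0.344-0.642", 1.3),
]

def top_characteristics(entries, k=3):
    """Keep the best option per variable, then return the k variables with the highest gain."""
    best = {}
    for variable, option, gain in entries:
        if variable not in best or gain > best[variable][1]:
            best[variable] = (option, gain)
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [(variable, option) for variable, (option, _) in ranked[:k]]

top3 = top_characteristics(entries)
```

Keeping only one option per variable is what prevents hours 18 and 20 from crowding out the day-of-week and ozone characteristics.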

Compared with conventional techniques, the time series analysis system and method for petition cases provided by the present invention have the following advantages:

1. In the time series analysis system and method for petition cases of the present invention, the collection of external data groups real-time petitions with historical petitions from nearby locations and, based on the repetition and continuity of the reports, identifies repeated and continuous petition cases in order to understand case trends.

2. In the time series analysis system and method for petition cases of the present invention, the exploration of the petition time series uses descriptive statistics and time series charts of the number of cases in different time units to understand how the case counts concentrate or disperse, and to analyze whether the variation in the historical number of cases is regular.

3. In the time series analysis system and method for petition cases of the present invention, the time series prediction of regular cases uses a time series algorithm model to forecast the number of cases in future time intervals, predicts the dates and times at which reports will be received based on the time interval with the most cases, and presents the time series analysis results for regular cases, which is highly practical.

4. In the time series analysis system and method for petition cases of the present invention, the prediction of characteristic cases uses an artificial neural network algorithm model to analyze the probability with which case characteristics influence reporting, ranks the matching dates and time periods according to the characteristics that influence case occurrence and their gain values, and presents the time series analysis results for characteristic cases. This effectively helps the responsible smart city authorities obtain enough information to take precautions, so that safety hazards caused by similar oversights can be minimized.

The above detailed description is a specific description of one feasible embodiment of the present invention. The embodiment is not intended to limit the patent scope of the present invention; any equivalent implementation or modification that does not depart from the technical spirit of the present invention shall be included in the patent scope of this application.

In summary, this application is not only innovative in its technical concept but also provides the above-mentioned improvements over conventional approaches, and it fully satisfies the statutory requirements of novelty and inventive step for an invention patent. The application is therefore filed in accordance with the law, and the Office is respectfully requested to grant this invention patent application in order to encourage invention.

11‧‧‧External data collection module

13‧‧‧Regular case prediction module

2‧‧‧Interface platform

21‧‧‧Petition integration system

22‧‧‧Open data system

23‧‧‧Big data analysis system

3‧‧‧Data repository

31‧‧‧Case database

32‧‧‧Disaster information database

33‧‧‧Geographic information database

34‧‧‧Meteorological information database

35‧‧‧Environmental quality database

Claims

A time series analysis system for petition cases, comprising: an external data collection module, which preprocesses data obtained by interfacing with a petition integration system and an open data system and stores the data in a data repository, then, according to the reporting time and location of a petition case, searches the data repository for historical petition cases within a preset time range and location range, and determines whether the petition case is a repeated and continuous petition case at a nearby location; a petition time series exploration module, which receives the repeated and continuous petition case and analyzes whether the variation in the number of cases in different time units is regular; a regular case prediction module, which, if the variation is regular, performs modeling and prediction using a time series algorithm to provide time series analysis results for regular cases; and a characteristic case prediction module, which, if the variation is not regular, performs modeling and prediction using an artificial neural network (ANN) algorithm to provide time series analysis results for characteristic cases, wherein the modeling and prediction using the ANN algorithm comprises: for the repeated and continuous petition case without regularity, querying historical cases in the data repository and characteristics in the open data system as predictor variables according to a time and location range; for the predictor variables, taking the last T% as a test set and the remaining (100-T)% as a training set; using the ANN algorithm to build a model of the influence of the predictor variables on whether a report occurs; using a classification matrix to calculate whether the accuracy is greater than a preset value; if the accuracy is not greater than the preset value, adjusting the predictor variables by retaining characteristic variables with better predictive performance and removing characteristic variables with poorer predictive performance; if the accuracy is greater than the preset value, using the predictor variables obtained from ANN modeling to predict the probability of occurrence of each characteristic variable and calculating a gain value of each characteristic variable, wherein gain value = predicted probability of occurrence of the characteristic variable / original probability of occurrence of the characteristic variable; ranking, in order of the gain values of the characteristic variables, the dates and times in a future time interval that match the characteristic variables; and providing the time series analysis results for the characteristic cases.

The time series analysis system for petition cases as described in claim 1, wherein the petition time series exploration module, for the repeated and continuous petition case, describes the profile of the number of cases and draws time series charts of the number of cases, thereby analyzing whether the variation in the number of cases in different time units is regular.

The time series analysis system for petition cases as described in claim 1, wherein the regular case prediction module uses the time series algorithm to model iteratively in order to predict the number of cases in a future time interval, and then ranks the dates and times at which reports may be received in the future time interval, to provide the time series analysis results for the regular cases.

The time series analysis system for petition cases as described in the scope of the patent application, wherein the characteristic case prediction module uses the ANN algorithm to model iteratively in order to calculate each characteristic variable and its gain value, thereby providing the time series analysis results for the characteristic cases.

The time series analysis system for petition cases as described in claim 1, which is connected to an interface platform, the interface platform comprising: the petition integration system, which covers citizens' suggestions on municipal administration, reports of violations and deficiencies, and the protection of public rights, and which, after a report is made, handles case management and subsequent dispatch services; the open data system, which contains various open data collected by the government or related fields and provides an interface for connecting to and collecting the data; and a big data analysis system, which lets users select the desired modeling algorithm, the ratio of training set to test set, and the file data to import, and which outputs prediction results after repeated modeling and evaluation.

The time series analysis system for petition cases as described in claim 1, which is connected to the data repository, the data repository comprising: a case database containing the case type, location, time, description, level, and status data of real-time and historical cases; a disaster information database containing disaster information related to the city; a geographic information database containing geographic information related to the city; a meteorological information database containing historical and forecast meteorological information; and an environmental quality database containing historical and forecast environmental quality data.

A time series analysis method for petition cases, comprising: collecting external data, by receiving a petition case from a petition integration system and determining whether the petition case is a repeated and continuous petition case at a nearby location; exploring the petition time series, by analyzing, for the repeated and continuous petition case, whether the variation in the number of cases in different time units is regular; if the variation is regular, performing modeling and prediction using a time series algorithm to provide time series analysis results for regular cases; and if the variation is not regular, performing modeling and prediction using an artificial neural network (ANN) algorithm to provide time series analysis results for characteristic cases, wherein the modeling and prediction using the ANN algorithm comprises: for the repeated and continuous petition case without regularity, querying historical cases in the data repository and characteristics in an open data system as predictor variables according to a time and location range; for the predictor variables, taking the last T% as a test set and the remaining (100-T)% as a training set; using the ANN algorithm to build a model of the influence of the predictor variables on whether a report occurs; using a classification matrix to calculate whether the accuracy is greater than a preset value; if the accuracy is not greater than the preset value, adjusting the predictor variables by retaining characteristic variables with better predictive performance and removing characteristic variables with poorer predictive performance; if the accuracy is greater than the preset value, using the predictor variables obtained from ANN modeling to predict the probability of occurrence of each characteristic variable and calculating a gain value of each characteristic variable, wherein gain value = predicted probability of occurrence of the characteristic variable / original probability of occurrence of the characteristic variable.

The time series analysis method for petition cases as described in claim 7, wherein collecting external data comprises: receiving the latest petitions in real time and obtaining external open data at regular intervals, and writing them into the data repository; querying the data repository for historical petition data within a preset time range and a preset location range according to the reporting time and location of the petition case; checking whether historical petition data of the same case type exist; if so, classifying the petition case as a repeated petition case; if not, classifying the petition case as a single petition case; calculating whether the reporting frequency of the petition case is greater than a preset value; if it is greater than the preset value, classifying the repeated petition case as a continuous petition case; and if it is not greater than the preset value, classifying the repeated petition case as an occasional petition case.

The time series analysis method for petition cases according to claim 7, wherein exploring the petition time series includes: for the repeated and continuous petition case, calculating statistical values of the number of cases in different time units, including the mean, mode, median, maximum, minimum, interquartile range, variance or standard deviation, to grasp the central tendency or dispersion of the number of repeated and continuous petition cases; plotting a time series chart of the number of cases according to different basic time units, with the number of cases arranged in time-unit order, to understand how the number of repeated and continuous petition cases changes; calculating, using the basic time unit or an extended time unit, the proportion of the number of cases in each time unit to the total number of cases; checking whether any time unit accounts for more than 50% of the number of cases, so as to analyze whether the number of repeated and continuous petition cases exhibits regularity; if regularity exists, performing modeling with the time series algorithm; and if no regularity exists, performing modeling with the ANN algorithm.
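The 50% regularity test in this claim amounts to a proportion check per time unit. The sketch below assumes the time unit is supplied as a function mapping a timestamp to a key (here the weekday, with 0 meaning Monday); the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Callable, Dict, Hashable, Iterable


def unit_proportions(
    timestamps: Iterable[datetime],
    unit_of: Callable[[datetime], Hashable],
) -> Dict[Hashable, float]:
    """Share of all cases that falls into each time unit."""
    counts = Counter(unit_of(t) for t in timestamps)
    total = sum(counts.values())
    return {unit: count / total for unit, count in counts.items()}


def has_regularity(proportions: Dict[Hashable, float], threshold: float = 0.5) -> bool:
    """True when a single time unit holds more than `threshold` of the cases."""
    return any(share > threshold for share in proportions.values())


def choose_model(timestamps: Iterable[datetime],
                 unit_of: Callable[[datetime], Hashable]) -> str:
    """Route regular cases to the time series branch, the rest to the ANN branch."""
    regular = has_regularity(unit_proportions(timestamps, unit_of))
    return "time-series" if regular else "ann"
```

For example, `choose_model(reports, lambda t: t.weekday())` groups by day of week, while `lambda t: t.hour` would group by hour of day as an alternative basic time unit.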

The time series analysis method for petition cases according to claim 7, wherein performing modeling and prediction with the time series algorithm includes: dividing the historical cases in the data repository into N equal parts according to the same time interval; taking the last (Nth) part as the test set and the first N-1 parts as the training set; using the autoregressive integrated moving average (ARIMA) algorithm to build a time series model; using a prediction model performance evaluation method to evaluate whether the error between the predicted result and the actual result is within a tolerance interval; if the error is not within the tolerance interval, adjusting the number of autoregressive terms, the number of moving average terms and the order of differencing; if the error is within the tolerance interval, using the best parameters obtained by the ARIMA algorithm modeling to predict the number of cases in a future time interval; according to the time units selected as having the largest number of cases, listing in order the dates and times in the future time interval at which a notification may be received; and providing the time series analysis result of regular cases.
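The split and the tolerance check around the ARIMA fit can be sketched as follows. The model fit itself is omitted, since it would normally come from a statistics library. The claim names no specific error metric, so the use of mean absolute percentage error (MAPE) here is an assumption, as are the function names, the default tolerance and the choice to drop the oldest leftover observations so that the N parts are equal.

```python
from typing import List, Sequence, Tuple


def split_last_of_n(series: Sequence[float], n_parts: int) -> Tuple[List[float], List[float]]:
    """Cut a time-ordered series into n_parts equal slices; the last is the test set."""
    if n_parts < 2 or len(series) < n_parts:
        raise ValueError("need at least 2 parts and one observation per part")
    size = len(series) // n_parts
    # Assumption: discard the oldest leftover observations to keep the parts equal.
    usable = list(series[len(series) - size * n_parts:])
    return usable[:-size], usable[-size:]


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, skipping periods with zero actual cases."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate against")
    return 100 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def within_tolerance(actual: Sequence[float], predicted: Sequence[float],
                     tolerance_percent: float = 20.0) -> bool:
    """True when the forecast error is inside the assumed tolerance interval."""
    return mape(actual, predicted) <= tolerance_percent
```

In the loop described by the claim, a `False` result from `within_tolerance` would trigger another fit with a different combination of autoregressive terms, differencing order and moving average terms, and a `True` result would fix those parameters for the forecast of the future time interval.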