TW201911812A

TW201911812A - Obstacle positioning system and maintenance and operation method of video streaming service for the maintenance and operation unit to select the required maintenance information according to the timeliness or the accuracy requirement by inputting plural characteristic parameter sets

Info

Publication number: TW201911812A
Application number: TW106126675A
Authority: TW
Inventors: 王嚴毅; 詹志嘉; 黃志盟; 楊宜澤; 李昀潔
Original assignee: 中華電信股份有限公司
Priority date: 2017-08-08
Filing date: 2017-08-08
Publication date: 2019-03-16
Also published as: TWI662809B

Abstract

The invention discloses a fault (obstacle) positioning system and a maintenance and operation method for a video streaming service. When the area of a new fault event needs to be located, a model that has been trained and optimized in advance automatically generates binary fault-area location information. By inputting multiple feature parameter sets, the maintenance and operation unit can select the maintenance information it needs according to its timeliness or accuracy requirement and carry out equipment maintenance and repair. In addition, based on a comparison between current and past statistics of the fault areas, maintenance priority decision information is output so that maintenance resources are allocated to the greatest benefit.

Description

Fault (obstacle) positioning system and maintenance and operation method for a video streaming service

The present invention relates to a fault location system and a maintenance and operation (O&M) method for video streaming services, and in particular to one that combines multiple data sources held by an Internet Service Provider (ISP), such as equipment quality measurements, customer trouble reports, repair history records, and manual tests, to build classification models over a high-dimensional set of parameters.

As broadband services have become more widespread and the number of subscribers has grown substantially, the network structure and the number of network elements operated by ISPs have grown with them, making faults harder to handle.

At the same time, to simplify the management and dispatch of maintenance staff, many ISPs are moving toward a flatter maintenance organization. Being able to identify the fault area quickly and dispatch the appropriate people and material to repair it is therefore a pressing organizational need for ISPs.

Machine learning is receiving growing attention from ISPs. It was first used widely in e-commerce for recommendation engines and advertising, and in recent years it has also become popular in fields such as medical science and weather forecasting.

As large-scale data applications become commonplace, ISPs must move from monitoring and managing end-to-end network quality through sampling tests or purchased monitoring modules to network-wide management and monitoring.

Most current trouble-reporting systems still rely on customer service staff, so that the detailed cause of a fault can be asked about and recorded. The drawback is that when the report reason code is selected manually, a wrong or imprecise code (for example, "unknown" or "other") is sometimes chosen. In such cases the original free-text description of the trouble report can assist classification and compensate for the limited information in the reason code entered by customer service staff.

In general, ISP network services mostly use a mesh topology. Video streaming services, however, usually adopt a tree topology to reduce the volume of transmitted data and the interval between feedback reports.

Because of this tree topology, the data path from the video server to each subscriber's terminal device is fixed. When a subscriber has a problem with the video streaming service, the fault point generally lies on equipment along that path. For the fault-point analysis of a single trouble report, it has therefore been possible to apply a linear regression model to the equipment and related data on the fixed path and obtain a roughly reliable fault-point determination model.

However, meeting requirements for more accurate fault location and priority decisions means drawing on many more data sources, and an ordinary linear regression model can then no longer meet the ISP's O&M needs. A nonlinear model that can handle high-dimensional parameters is required.

In view of the shortcomings of the conventional approaches described above, the inventors sought to improve on them and, after years of dedicated research, developed the fault location system and O&M method for video streaming services of the present invention.

To achieve the above object, the present invention provides a fault location system and O&M method for video streaming services. Using the various kinds of integrated network management information maintained by the ISP, it extracts a high-dimensional set of features and trains models to find the best model after optimization. That model is then used to produce predicted classification information for each subsequent fault, based on the data available for the affected customer at that time. The invention can also provide a priority decision plan for adjusting the maintenance of the video streaming service, improving the return on limited maintenance resources.

Classification models built on several combinations of high-dimensional features are established from data sources maintained by the ISP itself, such as equipment and line quality measurements, customer trouble reports, repair history records, and manual tests. Whenever a new fault event occurs, the previously trained nonlinear regression models, which can handle a large number of features, are applied. Depending on the combination of feature data available at that moment, the maintainer selects, according to whether timeliness or accuracy matters more, the model variant whose predicted fault area will be acted on. The system also produces a plan that helps adjust maintenance priority decisions. Fault areas are defined in two classes: the client side (including the in-home loop and terminal equipment) and the ISP side (ISP equipment, including the links between devices). After analysis by the method of the invention, a fault is assigned to one of these two parts, and the ISP applies the result according to how its maintenance organization assigns repair work.

A fault location system for video streaming services comprises the following modules. A data source module collects the multiple types of information sources needed to determine the fault location. It comprises: a service quality management unit, which holds application-layer service quality data, namely the picture-quality level and numerical quality indicators of each video terminal device; a trouble report management unit, which holds trouble-report data for the video service, namely the report reason, the report description text, and the test code from manual pre-testing; a loop quality diagnosis management unit, which holds physical-layer loop quality test data and records of non-standard subscriber-loop construction methods, namely estimated electrical characteristics of the line, records of the ISP central-office equipment closest to the subscriber, and records of special construction methods that strongly affect the usable loop distance, such as line coupling (bundling) and hybrid fiber-copper (G.fast); and a broadband network monitoring unit, which holds the make and model of each ISP node device, the device alarm codes, and the meaning of each alarm code.

A feature extraction module extracts each type of information needed for fault location according to the characteristics of its source system, forming the input feature group for the subsequent machine learning analysis module. It extracts features from each unit of the data source module and comprises: a service quality management feature extraction unit, which obtains a recent binary flag indicating whether the subscriber uses 4K or higher picture quality, video streaming service quality indicators, and the probability of a video streaming service complaint; a trouble report management feature extraction unit, which obtains the video streaming trouble-report reason code, the free-text report notes, and the manual-diagnosis pre-test code, that is, the fault cause code entered by a diagnostic specialist after a preliminary manual test; a loop quality diagnosis management feature extraction unit, which obtains the make and model of the Digital Subscriber Line Access Multiplexer (DSLAM), the DSLAM firmware version, the voice-band attenuation, the upstream signal-to-noise ratio (SNR), the downstream SNR, periodic quality monitoring values at the subscriber side and the ISP side, and a binary flag indicating whether a special construction method such as line coupling (bundling) or hybrid fiber-copper (G.fast) is used; and a broadband network monitoring feature extraction unit, which obtains the model of the video streaming set-top box or home gateway, its upstream and downstream rates, the central-office equipment type, a quantified alarm indicator for the central-office equipment, the term frequency of alarm types, and an alarm severity indicator.

A machine learning training and implementation module receives the features produced by the feature extraction module, preprocesses them further, and trains models by machine learning to obtain optimized model parameters. It contains a training unit and an implementation unit. The training unit comprises: a training target establishment unit, which defines the target that the prediction model is to predict and serves as the reference for computing the loss function and optimizing during training; a category and missing-value preprocessing training unit, which preprocesses the features of the training data by expanding each categorical feature into binary indicator features, replacing missing values in numerical features with the mean, and adding a binary missing-value indicator feature for those features that have missing values; a text note fault analysis training unit, which, for each pending fault event in the training data, applies an automatic word segmentation tool and logistic regression to the verbatim notes to compute the probability that the fault-related term-frequency combination in the text indicates a client-side or an ISP-side fault, and passes that probability on as one of the input features of the subsequent model; a high-dimensional feature multi-combination establishment training unit, which builds one or more high-dimensional feature sets for each pending fault event in the training data; and an optimized fault-point classification model building module, which uses a nonlinear Gradient Boosting Decision Tree (GBDT) as the main prediction model, takes as input each of the high-dimensional feature combinations produced by the multi-combination establishment training unit, and finds the best model parameters for the training data by minimizing a loss function, so that in actual use the fault area of each new case can be predicted.
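By way of illustration only, the category and missing-value preprocessing described above could be sketched as follows. The column names (`dslam_model`, `downstream_snr`, `attenuation`) and the function `preprocess` are hypothetical and do not come from the specification; the sketch only shows one-hot expansion, mean imputation, and the added missing-value indicator.

```python
from statistics import mean

def preprocess(rows, categorical, numeric):
    """One-hot encode categorical columns; mean-impute numeric columns and flag gaps."""
    # Category levels seen in the data, sorted so the feature layout is stable.
    levels = {c: sorted({r[c] for r in rows if r.get(c) is not None}) for c in categorical}
    # Column means computed over the values that are present.
    means = {c: mean(r[c] for r in rows if r.get(c) is not None) for c in numeric}
    # Only numeric columns that actually contain gaps get a missing-value indicator.
    has_gap = {c for c in numeric if any(r.get(c) is None for r in rows)}

    out = []
    for r in rows:
        f = {}
        for c in categorical:
            for lv in levels[c]:
                f[f"{c}={lv}"] = 1 if r.get(c) == lv else 0
        for c in numeric:
            v = r.get(c)
            f[c] = means[c] if v is None else v
            if c in has_gap:
                f[f"{c}_missing"] = 1 if v is None else 0
        out.append(f)
    return out

# Hypothetical example records (values are illustrative only).
rows = [
    {"dslam_model": "A", "downstream_snr": 30.0, "attenuation": 12.0},
    {"dslam_model": "B", "downstream_snr": None, "attenuation": 14.0},
    {"dslam_model": "A", "downstream_snr": 20.0, "attenuation": 16.0},
]
features = preprocess(rows, categorical=["dslam_model"],
                      numeric=["downstream_snr", "attenuation"])
```

In practice the levels and means learned on the training data would presumably be stored and reused when real cases are preprocessed, so that training and prediction share one feature layout.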

The implementation unit of the machine learning training and implementation module performs prediction on actual real-time data. It comprises: a category and missing-value preprocessing unit, which preprocesses the features of the actual data to be predicted by expanding each categorical feature into binary indicator features, replacing missing values in numerical features with the mean, and adding a binary missing-value indicator feature for those features that have missing values; a text note fault analysis unit, which, for each pending fault event in the actual data to be predicted, applies an automatic word segmentation tool and logistic regression to the verbatim notes to compute the probability that the fault-related term-frequency combination in the text indicates a client-side or an ISP-side fault, and passes that probability on as one of the input features of the subsequent model; a high-dimensional feature multi-combination establishment unit, which builds one or more high-dimensional feature sets for each pending fault event in the actual data to be predicted; and a fault-point classification prediction output unit, which uses the parameters of the best optimized GBDT model, retrained monthly, to compute for each new case the probability of each fault area.
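As a rough sketch of the text note fault analysis described above, the following fits a logistic regression on term-frequency features and returns the probability that a note points to an ISP-side fault. It rests on several assumptions that are not in the specification: whitespace splitting stands in for the automatic word segmentation tool, the example notes and labels are invented, and training uses plain batch gradient descent.

```python
import math
from collections import Counter

def tokenize(note):
    # Stand-in for an automatic word-segmentation tool (assumption): split on whitespace.
    return note.lower().split()

def train_logistic(notes, labels, epochs=500, lr=0.5):
    """Fit logistic regression on term-frequency features by batch gradient descent."""
    vocab = sorted({t for n in notes for t in tokenize(n)})
    w = {t: 0.0 for t in vocab}
    b = 0.0
    tfs = [Counter(tokenize(n)) for n in notes]
    for _ in range(epochs):
        gw = {t: 0.0 for t in vocab}
        gb = 0.0
        for tf, y in zip(tfs, labels):
            z = b + sum(w[t] * c for t, c in tf.items())
            p = 1.0 / (1.0 + math.exp(-z))
            for t, c in tf.items():
                gw[t] += (p - y) * c  # gradient of the log-loss for this term
            gb += p - y
        for t in vocab:
            w[t] -= lr * gw[t] / len(notes)
        b -= lr * gb / len(notes)
    return w, b

def isp_side_probability(note, w, b):
    """Probability that the note describes an ISP-side fault; unseen terms are ignored."""
    tf = Counter(tokenize(note))
    z = b + sum(w.get(t, 0.0) * c for t, c in tf.items())
    return 1.0 / (1.0 + math.exp(-z))

# Invented example notes; label 0 = client side, 1 = ISP side.
notes = [
    "set-top box frozen after power outage",
    "remote control not responding set-top box",
    "whole street reports picture mosaic",
    "picture mosaic on all channels since dslam alarm",
]
labels = [0, 0, 1, 1]
w, b = train_logistic(notes, labels)
p_isp = isp_side_probability("picture mosaic again", w, b)
p_client = isp_side_probability("set-top box frozen", w, b)
```

The resulting probability would then be appended to the case's feature vector as one more numeric input to the main classifier.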

A maintenance operation information output module evaluates, for the different input feature sets, the error rate of the models optimized in the training phase, and produces integrated O&M information and a maintenance priority plan for use by O&M staff and managers respectively.

The maintenance operation information output module comprises: a model performance indicator establishment unit, which uses a weighted error rate to establish a reference calculation for judging how good a prediction model is; a model error rate calculation unit, which applies the weighted error rate formula of the model performance indicator establishment unit to compute the prediction error rate of the model for each feature set after optimization in the training phase; an O&M information output unit, which combines the data of customers with pending faults, the fault-area determination results, and the reference model error rates, so that O&M staff can choose the corresponding repair suggestions according to whether timeliness or accuracy takes priority; and a maintenance priority plan output unit, which compares current fault-area statistics with the averages over the past month and produces a plan for repair priorities between the client side and the ISP side within the organization, so that areas with more recent faults are handled and repaired first and maintenance resources are used to the greatest benefit.
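One possible reading of the maintenance priority comparison is sketched below: each fault area's current count is divided by its daily average over the past month, and areas are ranked by that ratio. The ratio rule, the function name, and the counts are assumptions made for illustration; the specification only states that current statistics are compared with past-month averages.

```python
from statistics import mean

def maintenance_priority(today_counts, past_month_daily_counts):
    """Rank fault areas by how far today's count exceeds the past-month daily average."""
    ratios = {}
    for area, today in today_counts.items():
        baseline = mean(past_month_daily_counts[area])
        if baseline > 0:
            ratios[area] = today / baseline
        else:
            # No history of faults: any fault today counts as an unbounded increase.
            ratios[area] = float("inf") if today > 0 else 0.0
    order = sorted(ratios, key=ratios.get, reverse=True)
    return order, ratios

# Illustrative counts only: faults attributed to each side today and per day last month.
today = {"client": 40, "isp": 30}
history = {"client": [40] * 30, "isp": [20] * 30}
order, ratios = maintenance_priority(today, history)
```

Here the ISP side is ranked first even though it has fewer faults in absolute terms, because its count has risen relative to its own recent baseline.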

A fault location maintenance and operation method for video streaming services comprises: Step 1, the feature extraction module extracts from the data source module the high-dimensional features of each type used for prediction; Step 2, the machine learning training and implementation module first trains an optimized prediction model on the training data and then, in actual use, estimates from the feature data of the customer complaint under test the client-side and ISP-side fault probabilities for each feature set; Step 3, the maintenance operation information output module generates the information for choosing how to carry out O&M work and for priority decisions. The processing flow of the machine learning training and implementation module in Step 2 comprises: Step 1, determine whether a training model is to be generated; if so, perform the first training run, starting with establishing the training target; if not, proceed to category and missing-value preprocessing; Step 2, once the training target has been established, perform category and missing-value preprocessing training; Step 3, text note fault analysis training; Step 4, high-dimensional feature multi-combination establishment training; Step 5, build the optimized fault-point classification model and return to category and missing-value preprocessing; Step 6, text note fault analysis; Step 7, high-dimensional feature multi-combination establishment; Step 8, fault-point classification prediction output; Step 9, determine whether another subscriber is to be calculated; if so, return to the decision on whether to generate a training model; if not, end.

The O&M procedure in Step 3 comprises: Step 1, design the model performance indicator, which is not changed after the initial design; Step 2, compute the model error rate from the test data held out in the most recent training phase; Step 3, produce the O&M information to be provided to O&M staff; Step 4, produce the maintenance priority plan information to be provided to managers; Step 5, determine whether another subscriber is awaiting prediction; if so, return to Step 2 and compute the model error rate from the test data held out in the most recent training phase; if not, end.
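The weighted error rate formula itself is not given in this part of the text. A minimal sketch, assuming the weight depends only on the true class of each held-out case, is shown below together with a simple selection between feature sets by accuracy or timeliness. The class weights, the candidate names, and the figures in `candidates` are hypothetical.

```python
def weighted_error_rate(y_true, y_pred, class_weight):
    """Share of total class weight that falls on misclassified held-out cases."""
    total = sum(class_weight[y] for y in y_true)
    wrong = sum(class_weight[y] for y, p in zip(y_true, y_pred) if y != p)
    return wrong / total

def choose_feature_set(candidates, prefer):
    """Pick the feature set with the lowest error, or the one that is available soonest."""
    key = "ready_after_min" if prefer == "timeliness" else "error"
    return min(candidates, key=lambda name: candidates[name][key])

# Held-out test labels and predictions (illustrative only).
y_true = ["client", "client", "isp", "isp"]
y_pred = ["client", "isp", "isp", "isp"]
err = weighted_error_rate(y_true, y_pred, {"client": 1.0, "isp": 2.0})

# Hypothetical feature sets: a quick one and a fuller one that needs more data.
candidates = {
    "fast_set": {"error": 0.20, "ready_after_min": 5},
    "full_set": {"error": 0.12, "ready_after_min": 60},
}
by_accuracy = choose_feature_set(candidates, "accuracy")
by_timeliness = choose_feature_set(candidates, "timeliness")
```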

Compared with conventional techniques, the fault location system and O&M method for video streaming services provided by the invention have the following advantages:

1. It can build prediction models over high-dimensional features of the video streaming service.

2. It can quickly classify the fault area of a video streaming service into one of two classes.

3. It lets maintainers choose the suggested maintenance area according to whether timeliness or accuracy takes priority.

4. It can provide a priority decision plan that dynamically adjusts video streaming maintenance, maximizing the benefit of limited maintenance resources.

110‧‧‧Data source module

111‧‧‧Service quality management unit

112‧‧‧Trouble report management unit

113‧‧‧Loop quality diagnosis management unit

114‧‧‧Broadband network monitoring unit

120‧‧‧Feature extraction module

121‧‧‧Service quality management feature extraction unit

122‧‧‧Trouble report management feature extraction unit

123‧‧‧Loop quality diagnosis management feature extraction unit

124‧‧‧Broadband network monitoring feature extraction unit

130‧‧‧Machine learning training and implementation module

131‧‧‧Training unit

1311‧‧‧Training target establishment unit

1312‧‧‧Category and missing-value preprocessing training unit

1313‧‧‧Text note fault analysis training unit

1314‧‧‧High-dimensional feature multi-combination establishment training unit

1315‧‧‧Optimized fault-point classification model building module

132‧‧‧Implementation unit

1321‧‧‧Category and missing-value preprocessing unit

1322‧‧‧Text note fault analysis unit

1323‧‧‧High-dimensional feature multi-combination establishment unit

1324‧‧‧Fault-point classification prediction output unit

140‧‧‧Maintenance operation information output module

141‧‧‧Model performance indicator establishment unit

142‧‧‧Model error rate calculation unit

143‧‧‧O&M information output unit

144‧‧‧Maintenance priority plan output unit

S310~S330‧‧‧Overall process flow

S410~S440‧‧‧Processing flow of the machine learning training and implementation module

S510~S550‧‧‧O&M procedure flow

The technical content, objects, and effects of the invention can be further understood from the detailed description and the accompanying drawings, in which: FIG. 1 is an architecture diagram of the fault location system and O&M method for video streaming services of the invention; FIG. 2 is an architecture diagram of the machine learning training and implementation module; FIG. 3 is an architecture diagram of the maintenance operation information output module; FIG. 4 is a flowchart of the fault location system and O&M method; FIG. 5 is a flowchart of the processing performed by the machine learning training and implementation module; and FIG. 6 is a flowchart of the O&M procedure.

To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

The invention is further described below with reference to the drawings. FIG. 1 is an architecture diagram of the fault location system and O&M method for video streaming services. The system includes a data source module 110, which collects the multiple types of information sources needed to determine the fault location. It comprises: a service quality management unit 111, which holds application-layer service quality data, namely the picture-quality level (for example HD, 4K, or 8K) and numerical quality indicators of each video terminal device; a trouble report management unit 112, which holds trouble-report data for the video service, namely the report reason, the report description text, and the test code from manual pre-testing; a loop quality diagnosis management unit 113, which holds physical-layer loop quality test data and records of non-standard subscriber-loop construction methods, namely estimated electrical characteristics of the line, records of the ISP central-office equipment closest to the subscriber, and records of special construction methods that strongly affect the usable loop distance, for example the line coupling (bundling) method, which applies Ohm's law and can greatly extend the usable distance, and the hybrid fiber-copper (G.fast) method proposed by the International Telecommunication Union (ITU), which raises the rate but greatly shortens the usable distance; and a broadband network monitoring unit 114, which holds the make and model of each ISP node device, the device alarm codes, and the meaning of each alarm code.

The feature extraction module 120 extracts each type of information needed for fault location according to the characteristics of its source system, forming the input feature group for the subsequent machine learning analysis module. It extracts features from each unit of the data source module and comprises: a service quality management feature extraction unit 121, which obtains a recent flag indicating whether the subscriber uses 4K or higher picture quality (for example, True means the subscriber has 4K or higher picture quality and False means otherwise), video streaming service quality indicators, and the probability of a video streaming service complaint; a trouble report management feature extraction unit 122, which obtains the video streaming trouble-report reason code, the free-text report notes, and the manual-diagnosis pre-test code, that is, the fault cause code entered by a diagnostic specialist after a preliminary manual test; a loop quality diagnosis management feature extraction unit 123, which obtains the make and model of the Digital Subscriber Line Access Multiplexer (DSLAM), the DSLAM firmware version, the voice-band attenuation, the upstream signal-to-noise ratio (SNR), the downstream SNR, periodic quality monitoring values of the subscriber side and of the ISP-side equipment closest to that subscriber, and a binary flag indicating whether a special construction method such as line coupling (bundling) or hybrid fiber-copper (G.fast) is used; because a special construction method strongly affects the relationship between fault determination and distance, its use must be recorded in advance so that it can be included in the subsequent training models; and a broadband network monitoring feature extraction unit 124, which obtains the model of the video streaming set-top box or home gateway, its upstream and downstream rates, the central-office equipment type, a quantified alarm indicator for the central-office equipment, the term frequency of alarm types, and an alarm severity indicator.
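As a small illustration of the alarm-type term frequency feature mentioned above, the sketch below counts how often each alarm code appears among the alarms collected for a case and normalizes by the total. The alarm codes (`LOS`, `CRC`, `LOF`) and the choice to normalize rather than use raw counts are assumptions made for the example.

```python
from collections import Counter

def alarm_term_frequency(alarm_codes, vocabulary):
    """Relative frequency of each known alarm code among the alarms seen for one case."""
    counts = Counter(alarm_codes)
    total = len(alarm_codes)
    # Codes in the vocabulary that never occurred get a frequency of 0.0.
    return {code: (counts[code] / total if total else 0.0) for code in vocabulary}

# Hypothetical alarm log for the central-office equipment on one subscriber's path.
alarms = ["LOS", "LOS", "CRC", "LOS"]
tf = alarm_term_frequency(alarms, vocabulary=["CRC", "LOF", "LOS"])
```

Fixing the vocabulary up front keeps the feature vector the same length for every case, which a downstream classifier would need.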

In summary, the data source module 110 serves as the source of analysis data. After the feature extraction module 120 has extracted each type of feature used for prediction, the machine learning training and implementation module 130 first trains an optimized prediction model on the training data set and then, in actual use, estimates the corresponding predicted probabilities for each feature set from the actual data under test. Finally, the maintenance operation information output module 140 produces the integrated information needed for O&M work and priority decisions.

The data sources gather the various types of information needed to determine the obstacle position. They are the management and diagnostic systems that an ISP uses when providing video services; they are distinguished only logically and may physically be built on the same host cluster or within the same system.

Feature value extraction takes the various types of information required for obstacle positioning and extracts them according to the characteristics of their respective source systems, forming the input feature parameter group of the subsequent machine learning analysis module. Every extraction process includes an independence screening step: a correlation test must show no significant positive correlation between input data sources, or, where the null hypothesis of the test is that the samples do not differ, the p-value must fall below the 0.05 significance level so that the null hypothesis can be rejected. Only sources that pass are included in the feature value set.
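The independence screening step can be illustrated with a small sketch. This is a simplification under stated assumptions, not the screening procedure of the patent: it uses a fixed cutoff on the Pearson coefficient in place of a formal significance test, and the cutoff `max_corr`, the function names, and the sample columns are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(vx * vy)

def screen_features(columns, max_corr=0.8):
    """Greedily keep columns that are not strongly positively correlated
    with any column already kept (max_corr is an assumed cutoff)."""
    kept = []
    for name, values in columns.items():
        if all(pearson(values, columns[k]) <= max_corr for k in kept):
            kept.append(name)
    return kept

# Hypothetical sample data: downstream SNR moves in lockstep with upstream SNR.
columns = {
    "upstream_snr": [30.0, 28.0, 25.0, 31.0, 22.0],
    "downstream_snr": [31.0, 29.0, 26.0, 32.0, 23.0],
    "alarm_count": [0.0, 3.0, 1.0, 0.0, 2.0],
}
selected = screen_features(columns)
```

In this sketch the redundant column is dropped and the other two are kept; a production version would replace the fixed cutoff with the significance test the text describes.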

Please refer to FIG. 2, which is an architecture diagram of the machine learning training and implementation module of the obstacle positioning system and maintenance and operation method for video streaming services of the present invention. The machine learning training and implementation module 130 receives the feature values produced by the feature value extraction module 120, preprocesses the data further, and then trains with machine learning to obtain optimized model parameters. It includes a training unit 131 and an implementation unit 132. The training unit 131 in turn includes the following. The training target establishing unit 1311 establishes the judgment target of the estimation model, which serves as the reference for computing the loss function and for optimization during model training. The category and missing value preprocessing training unit 1312 preprocesses the feature values of the training data: categorical feature values are expanded into binary indicator feature values, and a missing numerical feature value is replaced by the mean, with a binary missing-indicator feature value added for each feature that has some missing entries. The text note obstacle analysis training unit 1313 takes, for each pending obstacle event in the training data, the verbatim notes of its text description and uses an automatic word segmentation tool together with logistic regression to compute in advance the probability that the combination of obstacle-related term frequencies in the description indicates a client-side or an ISP-side obstacle; this probability becomes one of the input feature values of the subsequent model. The high-dimensional feature value multiple-combination establishing training unit 1314 produces one or more high-dimensional feature value sets for each pending obstacle event in the training data. The optimized obstacle point classification model establishing module 1315 uses the nonlinear Gradient Boosting Decision Tree (GBDT) as the main estimation model; it takes the various high-dimensional feature value combinations produced by unit 1314 and, by minimizing the loss function, finds the best model parameters for the training data, which are then used in actual operation to estimate each newly added obstacle area case awaiting determination.

The implementation unit 132 of the machine learning training and implementation module 130 performs estimation on actual real-time data. It includes the following. The category and missing value preprocessing unit 1321 preprocesses the feature values of the actual data to be estimated: categorical feature values are expanded into binary indicator feature values, and a missing numerical feature value is replaced by the mean, with a binary missing-indicator feature value added for each feature that has some missing entries. The text note obstacle analysis unit 1322 takes, for each pending obstacle event in the actual data to be estimated, the verbatim notes of its text description and uses an automatic word segmentation tool together with logistic regression to compute in advance the probability that the combination of obstacle-related term frequencies in the description indicates a client-side or an ISP-side obstacle; this probability becomes one of the input feature values of the subsequent model. The high-dimensional feature value multiple-combination establishing unit 1323 produces one or more high-dimensional feature value sets for each pending obstacle event in the actual data to be estimated. The obstacle point classification estimation output unit 1324 uses the optimized GBDT model parameters, which are retrained and updated monthly, to compute the obstacle area probability value for each newly added case awaiting determination.

Please refer to FIG. 3, which is an architecture diagram of the maintenance operation information output module of the obstacle positioning system and maintenance and operation method for video streaming services of the present invention. The maintenance operation information output module 140 evaluates, for the different input feature value sets, the estimation error rate of the obstacle area estimation model optimized in the training stage, and produces integrated maintenance information and a maintenance priority scheme for use by maintenance personnel and managers respectively.

The maintenance operation information output module 140 includes the following. The model performance indicator establishing unit 141 uses a weighted error rate evaluation method to establish a baseline calculation model for judging how good a prediction model is. The model error rate calculation unit 142 applies the weighted error rate formula of the model performance indicator establishing unit to compute the error rate of the obstacle area estimated by the model corresponding to each type of feature value set after optimization in the machine learning training stage. The maintenance information output unit 143 integrates and outputs the customer data of the pending obstacle, the obstacle area determination result, and the reference model error rate, so that maintenance personnel can choose the corresponding repair recommendation according to whether timeliness or correctness takes priority. The maintenance priority scheme output unit 144 compares the statistical averages of obstacle areas over the period from the present back through the past month and produces the organization's maintenance priority decision scheme for the client side and the ISP side, so that places with more recent faults are handled and repaired first, maximizing the benefit of maintenance resources.

Please refer to FIG. 4, which is a flowchart of the obstacle positioning system and maintenance and operation method for video streaming services of the present invention. It includes: Step 1, S310: the feature value extraction module extracts from the data source module the high-dimensional feature values of each type used for estimation. Step 2, S320: the machine learning training and implementation module first trains an optimized estimation model on the training data and then, in subsequent operation, estimates from the feature values of actual customer complaint cases the client-side and ISP-side obstacle probabilities under each type of feature value set. Step 3, S330: finally, the maintenance operation information output module generates the information for choosing the maintenance operation mode and for priority decisions.

For the processing flow of the machine learning training and implementation module in step 2 (S320), please refer to FIG. 5. The flow first decides from the execution time whether the model needs to be retrained. For example, with a one-month operating cycle, the training unit retrains the model once every month (steps 1 to 5 below); during normal operation within the month no retraining is needed, and the implementation unit carries out the computation in sequence (step 1 and steps 6 to 9 below). The flow includes: Step 1, S410: determine whether a training model is to be generated; if yes, training is carried out first, beginning with S420 training target establishment; if no, proceed to S430 category and missing value preprocessing. Step 2: after S420 training target establishment, proceed to S421 category and missing value preprocessing training. Step 3, S422: text note obstacle analysis training. Step 4, S423: high-dimensional feature value multiple-combination establishment training. Step 5, S424: optimized obstacle point classification model establishment, then return to S430 category and missing value preprocessing. Step 6, S431: text note obstacle analysis. Step 7, S432: high-dimensional feature value multiple-combination establishment. Step 8, S433: obstacle point classification estimation output. Step 9, S440: determine whether to compute the next user; if yes, return to S410; if no, end at S450.

As the steps above show, S420 (training target establishment) in step 1 is needed because, to build the obstacle point determination model, we must first give the machine learning algorithm a target against which training accuracy is judged. For example, we use as the training target all historical records of video streaming service obstacle handling for cases repaired within the most recent month. Based on the obstacle cause finally reported by the maintenance personnel and the repair report data, each obstacle area is labeled as one of two classes: a client-side obstacle, where the obstacle point is closer to the user end, or an ISP-side obstacle, where the obstacle point is closer to the aggregation and core network. The obstacle point determination problem is thereby turned into a binary classification problem. The obstacle handling records of that month are sorted by time; the first 70% are used for model training and optimization, and the last 30% serve as test data for the subsequent model performance indicator.
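The labeling and the time-ordered 70/30 split described above can be sketched as follows. The record fields `repaired_on` and `final_area`, the function names, and the sample records are hypothetical; the numeric labels follow the convention used later in the description (1 for ISP side, -1 for client side).

```python
from datetime import date

def label(record):
    """Binary target: 1 for an ISP-side obstacle, -1 for a client-side one."""
    return 1 if record["final_area"] == "ISP" else -1

def time_ordered_split(records, train_fraction=0.7):
    """Sort by repair date, then take the earliest part for training
    and the most recent part for testing."""
    ordered = sorted(records, key=lambda r: r["repaired_on"])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Hypothetical one-month history, deliberately out of order.
records = [
    {"repaired_on": date(2017, 7, d), "final_area": "ISP" if d % 3 == 0 else "client"}
    for d in (12, 3, 27, 8, 19, 1, 30, 15, 22, 5)
]
train, test = time_ordered_split(records)
train_labels = [label(r) for r in train]
```

Splitting by time rather than at random keeps the test cases strictly later than the training cases, which mirrors how the model is used on new reports.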

In S421 (category and missing value preprocessing training) of step 2, any non-numeric categorical feature value among the collected feature values is expanded into binary indicator feature values; for example, (1,0) can represent male and (0,1) female. If a numerical feature value is missing, it is replaced by the mean, and a binary missing-indicator feature value is added for each feature that has some missing entries.
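A minimal sketch of these two preprocessing rules follows. The function names, the category list, and the sample values are assumptions made for illustration only.

```python
def one_hot(value, categories):
    """Expand a categorical value into binary indicator features."""
    return [1 if value == c else 0 for c in categories]

def impute_with_indicator(values):
    """Replace missing entries (None) with the mean of the observed ones
    and return a parallel binary missing-indicator column."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    filled = [mean if v is None else v for v in values]
    missing = [1 if v is None else 0 for v in values]
    return filled, missing

# Hypothetical DSLAM model names and SNR readings.
dslam_models = ["model_a", "model_b", "model_c"]
encoded = one_hot("model_b", dslam_models)
snr_filled, snr_missing = impute_with_indicator([30.0, None, 24.0])
```

Keeping the indicator column lets the downstream model distinguish a genuinely average reading from one that was filled in.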

S422 (text note obstacle analysis training) of step 3 addresses the fact that the report description notes are free-form Chinese strings and cannot be used directly. We propose a preprocessing method that converts a text note into a real value representing how strongly it is associated with an ISP-side or a client-side obstacle area. First, a Chinese word segmentation tool such as the Jieba application software segments each text note and the Chinese description of each report reason code, and the result is expressed as term frequencies, for example (Internet access failure, 2) or (remote control failure, 3). The term frequencies after segmentation are then used as the feature vector to train a linear logistic regression classification model. The trained model can estimate, for a segmented text note, the probability that the obstacle point is an ISP-side or a client-side obstacle, and this probability estimate becomes one of the input feature values of the subsequent obstacle point prediction model. To avoid overfitting, separate independent training data are collected for this text note analysis.
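The term-frequency and logistic regression step can be sketched in plain Python. This is an illustrative sketch, not the patent's implementation: the notes are assumed to be already segmented (in practice a segmenter such as Jieba would produce the tokens), the English tokens and training pairs are made up, and the learning rate and epoch count are arbitrary choices.

```python
from collections import Counter
from math import exp

def term_frequency(tokens, vocabulary):
    """Count how often each vocabulary term occurs in a segmented note."""
    counts = Counter(tokens)
    return [counts[t] for t in vocabulary]

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; y is 1 (ISP side) or 0 (client side)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def isp_probability(tokens, vocabulary, w, b):
    """Estimated probability that a segmented note describes an ISP-side obstacle."""
    x = term_frequency(tokens, vocabulary)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical, pre-segmented notes with their known obstacle side.
notes = [
    (["remote", "broken", "remote"], 0),
    (["set_top_box", "no_power"], 0),
    (["remote", "no_power"], 0),
    (["area", "outage", "many_users"], 1),
    (["outage", "many_users", "outage"], 1),
    (["area", "outage"], 1),
]
vocabulary = sorted({t for tokens, _ in notes for t in tokens})
X = [term_frequency(tokens, vocabulary) for tokens, _ in notes]
y = [side for _, side in notes]
w, b = train_logistic(X, y)
p_outage = isp_probability(["area", "outage"], vocabulary, w, b)
p_remote = isp_probability(["remote", "broken"], vocabulary, w, b)
```

The single probability returned by `isp_probability` is what would be passed on as one input feature of the later obstacle point model.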

S423 (high-dimensional feature value multiple-combination establishment training) of step 4 builds the high-dimensional feature value sets used to train the model. High-dimensional here means that all feature parameters, numerical and categorical, comprise more than 300 feature components in total once fully expanded. To meet maintenance needs with different characteristics, the features are further divided into several combination types, as follows:

Feature value input set of type one: includes all feature values extracted by the four extraction units, namely the service quality management system feature value extraction unit, the obstacle report management system feature value extraction unit, the loop quality diagnosis management system feature value extraction unit, and the broadband network monitoring system feature value extraction unit. Because its parameters are complete, this type of feature value set is more accurate. However, the manual diagnosis pre-test code in the obstacle report management system feature value extraction unit can only be obtained after a work order has been dispatched and a manual test performed, so its timeliness is lower.

Feature value input set of type two: the same feature values produced by the units as in type one, except that the manual diagnosis pre-test code from the obstacle report management system feature value extraction unit is removed. Because this type of feature value set contains no manual diagnostic test, its timeliness is higher; but since manual diagnosis improves the accuracy of the determination, type one is more accurate than type two.
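The relationship between the two input sets can be shown with a short sketch: type two is type one minus the manually obtained code. The feature names and values below are hypothetical.

```python
def build_feature_sets(features, manual_keys=("manual_pretest_code",)):
    """Return the type-one set (all features) and the type-two set
    (all features except those that require a manual test)."""
    type_one = dict(features)
    type_two = {k: v for k, v in features.items() if k not in manual_keys}
    return type_one, type_two

# Hypothetical feature values for one reported case.
case = {
    "is_4k_user": 1,
    "upstream_snr": 28.5,
    "alarm_severity": 2,
    "text_note_isp_probability": 0.81,
    "manual_pretest_code": 7,
}
type_one, type_two = build_feature_sets(case)
```

A type-two vector can therefore be assembled as soon as the automated sources respond, while the type-one vector has to wait for the manual pre-test result.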

S424 (optimized obstacle point classification model establishment) of step 5 builds a usable high-dimensional feature value classification model. The Gradient Boosting Decision Tree (GBDT) is the selected prediction model, and each of the multiple types of feature value input sets from the high-dimensional feature value multiple-combination establishing training unit is used to train its own optimized model.

The gradient boosting decision tree is a common classification algorithm in machine learning. Compared with many common machine learning methods, it has the advantages of requiring no feature scaling and of learning non-linear feature combinations on its own. This module uses gradient boosting to build decision tree models one after another, optimizing the defined loss function, and finally outputs the resulting set of best decision trees.

Once the various types of feature value sets produced by the high-dimensional feature value multiple-combination establishment training have been input and training is complete, the GBDT algorithm outputs T decision trees, whose prediction function is denoted f_T. In later operation, given the full set of relevant feature values for a new customer report case, with feature vector x_test, formula (1) is used to evaluate the probability that the obstacle point of that case is an ISP-side obstacle (y=1) or a client-side obstacle (y=-1).

In the implementation unit, S430 (category and missing value preprocessing), which follows step 5, together with S431 (text note obstacle analysis) and S432 (high-dimensional feature value multiple-combination establishment), is functionally identical to S421 (category and missing value preprocessing training), S422 (text note obstacle analysis training), and S423 (high-dimensional feature value multiple-combination establishment training) in the training unit respectively. The only difference is that the implementation unit processes the feature value data of real pending cases rather than training data.

After the implementation unit has carried out category and missing value preprocessing, text note obstacle analysis, and high-dimensional feature value multiple-combination establishment in sequence, the obstacle point classification estimation output applies the GBDT prediction model, whose prediction function is denoted f_T, to the feature value vector x_test of each type for the actual case, again using formula (1) to evaluate and output the probability that the obstacle point is an ISP-side obstacle (y=1) or a client-side obstacle (y=-1).

For the flow of the maintenance operation in step 3 (S330), please refer to FIG. 6. After the estimation model has computed the predicted values for the obstacle area, the maintenance operation information output module produces the final integrated maintenance information and the maintenance priority decision information. The flow includes: Step 1, S510: design the model performance indicator evaluation, which is not changed again once first designed. Step 2, S520: compute the model error rate on the test data reserved from the most recent training stage. Step 3, S530: produce the maintenance information to be provided to maintenance personnel. Step 4, S540: produce the maintenance priority scheme information to be provided to managers. Step 5, S550: determine whether another user is waiting for estimation; if yes, return to step 2 (S520) and compute the model error rate on the test data reserved from the most recent training stage; if no, end.

In S510 of step 1, the model performance indicator evaluation is designed once and not changed afterwards. The model performance indicator evaluation unit assesses the obstacle area estimates produced by the optimized model for each type of feature parameter set fed into the machine learning training stage, and expresses the model performance as a quantified error rate. Model performance is verified here with a weighted error rate, whose formula is as follows.

Here Err is the weighted error rate and w_i is the weight representing the severity of an estimation error. Based on experience of maintenance difficulty, we set the client-side obstacle weight to 1 by default and the ISP-side obstacle weight to a value greater than 1, because ISP-side obstacles affect more users and an estimation error there causes a larger loss. [p_i ≠ y_i] is an indicator function: its value is 1 if the class p_i predicted by the model differs from the actual obstacle point y_i, and 0 otherwise.
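The formula itself is not reproduced in this text. Read together with the definitions above and with the worked example that follows, it appears to take the form below; this is a reconstruction, and the symbol N (the number of test cases) is introduced here only for illustration.

```latex
\mathrm{Err} \;=\; \frac{\sum_{i=1}^{N} w_i \,\mathbb{1}[\,p_i \neq y_i\,]}{\sum_{i=1}^{N} w_i} \tag{2}
```

With all weights equal to 1 this reduces to the ordinary (unweighted) misclassification rate.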

For example, suppose a set of training test data contains 4 obstacle cases whose actual fault areas are, in order, {client side, client side, ISP side, client side}, and the model estimates the fault areas as {client side, client side, ISP side, ISP side}. Only the last case is estimated incorrectly, so the unweighted error rate is 25%. If the client-side weight is set to 1 and the ISP-side weight to 5, the weighted error rate is: Err = (0+0+0+5)/(1+1+1+5) = 62.5%  Formula (3)
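The arithmetic of this worked example can be checked with a short sketch. The function name is hypothetical, and the per-case weights are passed in explicitly, exactly as they appear in the example; how each weight is derived from a case is left to the caller.

```python
def weighted_error_rate(predicted, actual, weights):
    """Sum of the weights of misclassified cases divided by the sum of all weights."""
    wrong = sum(w for p, y, w in zip(predicted, actual, weights) if p != y)
    return wrong / sum(weights)

actual = ["client", "client", "ISP", "client"]
predicted = ["client", "client", "ISP", "ISP"]

# Per-case weights taken term by term from the worked example: 1, 1, 1, 5.
weighted = weighted_error_rate(predicted, actual, [1, 1, 1, 5])

# With every weight equal to 1 the result is the plain misclassification rate.
unweighted = weighted_error_rate(predicted, actual, [1, 1, 1, 1])
```

Here `weighted` evaluates to 0.625 and `unweighted` to 0.25, matching the 62.5% and 25% figures in the text.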

In S520 of step 2, the model error rate is computed on the test data reserved from the most recent training stage. The model error rate calculation unit applies the weighted error rate formula (2) established for the model performance indicator above to compute the error rate of the estimation model corresponding to each type of feature value set after optimization in the training stage. The historical report data of obstacles repaired in the previous month are sorted by time; the first 70% serve as model training and optimization data, and the last 30% are reserved here for the model error rate calculation unit to compute the error rate of the estimation model. Every subsequent monthly performance evaluation value is computed in the same way. Using the 30% of test data reserved from the most recent training stage, the formula of the model performance indicator establishing unit yields the estimation model error rate for each of the different input feature value sets. An example of the output is as follows:

Feature value input set of type one: error rate 2.13%.

Feature value input set of type two: error rate 2.26%.

The type one and type two feature value sets are as described under the high-dimensional feature value multiple-combination establishment training: the former emphasizes correctness and the latter emphasizes timeliness.

In S530 of step 3, the maintenance information to be provided to maintenance personnel is produced. This integrated maintenance information includes the basic data of the subscriber number whose obstacle area is awaiting determination, the estimated obstacle area, and the estimation error rate of each type as computed by the model error rate calculation unit. An example of the output is shown below:

Type one and type two differ in the time needed to produce the estimate and in the input feature values, as described under the high-dimensional feature value multiple-combination establishment training: the former emphasizes correctness and the latter emphasizes timeliness. Under normal conditions the maintenance unit can choose the more accurate type one estimated area for repair. When a very fast repair is needed, for example when the customer is an important national defense or public livelihood customer or has signed a strict SLA (Service Level Agreement) contract, the maintenance unit can go to the area suggested by type two first, before the type one estimate has been computed.
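The selection rule just described can be sketched as a small function. The function name, the `urgent` flag, and the representation of an estimate as a plain string are assumptions made for illustration.

```python
def choose_estimate(type_one, type_two, urgent):
    """Prefer the type-one estimate; fall back to type two only when the
    case is urgent and the type-one estimate is not available yet (None)."""
    if type_one is not None:
        return "type_one", type_one
    if urgent and type_two is not None:
        return "type_two", type_two
    return None  # keep waiting for the type-one estimate

# An urgent SLA customer whose manual pre-test has not finished yet.
early = choose_estimate(type_one=None, type_two="ISP", urgent=True)
# An ordinary customer in the same situation waits for type one.
waiting = choose_estimate(type_one=None, type_two="ISP", urgent=False)
# Once the type-one estimate exists it is always preferred.
final = choose_estimate(type_one="client", type_two="ISP", urgent=True)
```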

In S540 of step 4, the maintenance priority scheme information to be provided to managers is produced. This decision information includes quantified values recommending an increase or decrease in maintenance priority for the client side and for the ISP side. For example, suppose an ISP has, in each of six service areas, two groups of maintenance personnel and equipment, one maintaining client-side equipment and one maintaining ISP-side equipment. Taking the client side first, let A and B denote the average estimated number of client-side obstacle cases over the most recent 30 days and over the whole of the previous month respectively, and let T (T>0 and T<1) be the threshold for deciding whether the relative difference between A and B is too large. If |(A-B)/B| < T, the absolute relative difference between the recent 30-day average and the previous month's average is below T; the number of cases has changed little, and the maintenance priority is 0. If |(A-B)/B| >= T, the absolute relative difference is T or more; the maintenance priority is 1 when obstacles have increased over the most recent 30 days (A>B) and -1 when they have decreased (A<B). The ISP-side maintenance priority value is computed in the same way, except that its threshold T may be set differently from the client-side threshold according to the weights of the model performance weighted error rate in formula (2). For example, when the ISP-side weight is 5 times the client-side weight, meaning the ISP side matters more, the ISP-side threshold T can be set to 1/5 of the client-side threshold. The overall maintenance priority for all areas is the direct sum of each area's maintenance priority values. A higher value means more handling priority is needed in the near term, which suits assigning higher-grade repair equipment and more experienced maintenance personnel, or transferring them from areas with surplus personnel and equipment. An example of the output of this maintenance priority scheme output unit is shown below:

In this example, for the Taichung service area the recommendation is to lower the client-side maintenance priority and raise the ISP-side maintenance priority, while in the Hsinchu service area both can stay as they are. Across all areas as a whole, the client-side maintenance priority should be raised and the ISP-side maintenance priority lowered, or some of the surplus repair equipment and senior maintenance personnel on the ISP side should be transferred to the client side.
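The priority rule based on A, B and the threshold T can be sketched as follows. The area names, the averages, and the threshold values are hypothetical figures chosen for illustration and are not taken from the example output of the patent.

```python
def maintenance_priority(recent_avg, previous_avg, threshold):
    """Return 0 when |(A-B)/B| < T, otherwise 1 if A > B and -1 if A < B."""
    if previous_avg == 0:
        raise ValueError("previous-month average B must be non-zero")
    change = (recent_avg - previous_avg) / previous_avg
    if abs(change) < threshold:
        return 0
    return 1 if recent_avg > previous_avg else -1

# Hypothetical (A, B) pairs per service area; the same figures are reused
# for both sides only to show the effect of the tighter ISP-side threshold.
averages = {"area_1": (120.0, 100.0), "area_2": (85.0, 100.0), "area_3": (103.0, 100.0)}

client_threshold = 0.1                 # assumed client-side T
isp_threshold = client_threshold / 5   # ISP side weighted 5 times as heavily

client_priority = {k: maintenance_priority(a, b, client_threshold) for k, (a, b) in averages.items()}
isp_priority = {k: maintenance_priority(a, b, isp_threshold) for k, (a, b) in averages.items()}

# The overall value is the direct sum over all areas.
client_total = sum(client_priority.values())
isp_total = sum(isp_priority.values())
```

With these figures the 3% rise in `area_3` is ignored on the client side but flagged on the ISP side, because the ISP-side threshold is five times tighter.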

The above detailed description specifically describes one feasible embodiment of the present invention. The embodiment is not intended to limit the patent scope of the present invention, and any equivalent implementation or modification that does not depart from the spirit of the art of the present invention shall be included within the patent scope of this case.

In summary, this case is not only genuinely innovative in its technical concept but also offers the multiple effects described above, which conventional methods cannot match. It fully satisfies the statutory patentability requirements of novelty and inventive step. The application is therefore filed in accordance with the law, and the Office is respectfully requested to approve this invention patent application so as to encourage invention.

Claims

An obstacle positioning system for video streaming services, comprising: a data source module, which gathers a plurality of information sources for determining obstacle position; a feature value extraction module, which extracts the various types of information required for obstacle positioning according to the characteristics of their respective source systems to form an input feature parameter group for a subsequent machine learning analysis module, and processes the feature values of each unit in the data source module; a machine learning training and implementation module, which receives the feature values output by the feature value extraction module, preprocesses the data further, and trains with machine learning to obtain optimized model parameters, and which further comprises a training unit and an implementation unit; and a maintenance operation information output module, which establishes an evaluation method that takes into account the estimation error severity weights of both client-side and ISP-side obstacle areas, evaluates, for different input feature value sets, the estimation error rate of the obstacle area estimation model optimized in the training stage of the machine learning module, and produces integrated maintenance information suited to different user characteristics together with a maintenance priority scheme based on quantified maintenance priority values, for use by maintenance personnel and managers respectively.

The obstacle positioning system for a video streaming service according to claim 1, wherein the data source module further comprises: a service quality management unit, which contains video quality levels and application-layer service quality data; an obstacle report management unit, which contains obstacle report information related to the video service; a loop quality diagnosis management unit, which contains physical-layer loop quality test data and records of special subscriber-loop construction methods; and a broadband network monitoring unit, which contains the brand and model data of the ISP's node equipment, equipment alarm codes, and alarm content information.

The obstacle positioning system for a video streaming service according to claim 1, wherein the feature value extraction module further comprises: a service quality management feature value extraction unit, which obtains a binary flag indicating whether the user is a high-quality (4K or higher) user, recent video streaming service quality indicators, and the probability of a video streaming service complaint; an obstacle report management feature value extraction unit, which obtains the report reason code of the video streaming service, the report description text note, and the manual diagnosis pre-test code; a loop quality diagnosis management feature value extraction unit, which obtains the Digital Subscriber Line Access Multiplexer (DSLAM) brand and model, the DSLAM firmware version, the voice and audio band attenuation value, the uplink signal-to-noise ratio (SNR), the downlink SNR, the periodic quality monitoring values of the client side and the ISP side, and a binary flag indicating whether a special construction method is used; and a broadband network monitoring feature value extraction unit, which obtains the model of the video streaming set-top box or home multi-function gateway, the downstream rate of the video streaming set-top box or home multi-function gateway, the terminal equipment type, quantified index values of central office equipment alarms, the term frequency of alarm types, and alarm severity index values.

The obstacle positioning system for a video streaming service according to claim 1, wherein the training unit of the machine learning training and implementation module further comprises: a training target establishing unit, which establishes the judgment target of the estimation model as the benchmark for computing the loss function and for optimization during model training; a category and missing value preprocessing training unit, which preprocesses the feature values of the training data, including expanding categorical feature values into binary indicator features, replacing missing numerical feature values with the mean, and adding a binary missing-indicator feature value for partially missing feature values; a text note obstacle analysis training unit, which, for each obstacle event to be processed in the training data, applies an automatic word segmentation tool and logistic regression analysis to the verbatim text note to first calculate the probability that the obstacle-related term frequency combination of the text description belongs to a client-side or an ISP-side obstacle, and uses that probability as one of the feature values input to the subsequent model; a high-dimensional feature value multiple-combination establishing training unit, which builds one or more high-dimensional feature value sets for each obstacle event to be processed in the training data; and an optimized obstacle point classification model establishing unit, which uses a nonlinear Gradient Boosting Decision Tree (GBDT) as the main estimation model and, after receiving the various high-dimensional feature value sets produced by the high-dimensional feature value multiple-combination establishing training unit, finds the optimal model parameters for the training data through an optimization process that minimizes the loss function, for use in estimating the obstacle area of each new obstacle event to be judged.
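The GBDT training named in this claim can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes depth-1 trees (stumps), a log loss, and a fixed learning rate, and the names `train_gbdt`, `fit_stump`, and `predict_client_probability`, the two toy features, and the label convention (1 = client side, 0 = ISP side) are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree to the residuals by least squares."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best, best_err = (j, t, lv, rv), err
    return best


def train_gbdt(X, y, rounds=20, learning_rate=0.5):
    """Assumed labels: 1 = client-side obstacle, 0 = ISP-side obstacle."""
    prior = min(max(sum(y) / len(y), 1e-6), 1 - 1e-6)
    base = math.log(prior / (1 - prior))
    scores = [base] * len(X)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the log loss with respect to the score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature can be split any further
            break
        j, t, lv, rv = stump
        stumps.append(stump)
        scores = [s + learning_rate * (lv if row[j] <= t else rv)
                  for row, s in zip(X, scores)]
    return base, learning_rate, stumps


def predict_client_probability(model, row):
    base, learning_rate, stumps = model
    score = base + sum(learning_rate * (lv if row[j] <= t else rv)
                       for j, t, lv, rv in stumps)
    return sigmoid(score)


# Toy rows (hypothetical): [downlink SNR, text-note client-side probability].
X = [[30.0, 0.9], [28.0, 0.8], [31.0, 0.7], [12.0, 0.2], [10.0, 0.1], [14.0, 0.3]]
y = [1, 1, 1, 0, 0, 0]
model = train_gbdt(X, y)
```

Each round fits a small tree to the gradient of the loss and adds it to the running score, which is the sense in which the optimization minimizes the loss function. A production system would more likely use an established GBDT library with deeper trees.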

The obstacle positioning system for a video streaming service according to claim 1, wherein the implementation unit of the machine learning training and implementation module is responsible for estimation on actual real-time data and comprises: a category and missing value preprocessing unit, which preprocesses the feature values of the actual data to be estimated, including expanding categorical feature values into binary indicator feature values, replacing missing numerical feature values with the mean, and adding a binary missing-indicator feature value for partially missing feature values; a text note obstacle analysis unit, which, for each obstacle event to be processed in the actual data to be estimated, applies an automatic word segmentation tool and logistic regression analysis to the verbatim text note to first calculate the probability that the obstacle-related term frequency combination of the text description belongs to a client-side or an ISP-side obstacle, and uses that probability as one of the feature values input to the subsequent model; a high-dimensional feature value multiple-combination establishing unit, which builds one or more high-dimensional feature value sets for each obstacle event to be processed in the actual data to be estimated; and an obstacle point classification estimation output unit, which uses the optimized GBDT model parameters obtained from the monthly update training to calculate, for each new obstacle event to be processed, the probability value used to judge its obstacle area.
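The category and missing value preprocessing described in this claim can be sketched as follows. This is an illustration under assumptions: the function names `fit_preprocessor` and `transform` and the field names `dslam_brand`, `downlink_snr`, and `uplink_snr` are hypothetical, and the sketch assumes that the means and category lists are learned from training data and then reused on the actual data to be estimated.

```python
def fit_preprocessor(rows, categorical, numeric):
    """Learn category lists, numeric means, and which numeric features had gaps."""
    categories = {
        c: sorted({r[c] for r in rows if r.get(c) is not None}) for c in categorical
    }
    means, flagged = {}, set()
    for n in numeric:
        values = [r[n] for r in rows if r.get(n) is not None]
        means[n] = sum(values) / len(values) if values else 0.0
        if len(values) < len(rows):
            flagged.add(n)  # partially missing: gets a binary missing indicator
    return categories, means, flagged


def transform(row, categories, means, flagged):
    features = {}
    # Expand each categorical value into binary indicator features.
    for c, values in categories.items():
        for v in values:
            features[f"{c}={v}"] = 1 if row.get(c) == v else 0
    # Replace missing numeric values with the training mean.
    for n, mean in means.items():
        value = row.get(n)
        features[n] = mean if value is None else value
        if n in flagged:
            features[f"{n}_missing"] = 1 if value is None else 0
    return features


training_rows = [
    {"dslam_brand": "A", "downlink_snr": 10.0, "uplink_snr": 6.0},
    {"dslam_brand": "B", "downlink_snr": None, "uplink_snr": 8.0},
    {"dslam_brand": "A", "downlink_snr": 20.0, "uplink_snr": 7.0},
]
prep = fit_preprocessor(training_rows, ["dslam_brand"], ["downlink_snr", "uplink_snr"])
encoded = transform(training_rows[1], *prep)
```

Keeping the missing indicator alongside the imputed mean lets the downstream model distinguish a measured average value from a filled-in one.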

The obstacle positioning system for a video streaming service according to claim 1, wherein the maintenance operation information output module further comprises: a model performance indicator establishing unit, which establishes, by weighted error rate evaluation, the benchmark for judging the quality of the prediction model; a model error rate calculation unit, which uses the weighted error rate formula of the model performance indicator establishing unit to calculate the error rate of the estimation model corresponding to each type of feature value set after the machine learning training phase has been optimized; a maintenance information output unit, which integrates and outputs the data of the customer to be processed, the obstacle area judgment result, and the reference model error rate, so as to provide maintenance personnel with maintenance recommendation information that prioritizes either timeliness or accuracy; and a maintenance priority plan output unit, which, based on statistical averages of the obstacle areas over the most recent 30 days and over past months, compiles and outputs a maintenance priority decision plan for the client side and the ISP side, so that areas with more faults can be repaired first and maintenance resources yield the greatest benefit.
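A weighted error rate of the kind this claim refers to could take the following form. The exact formula is not given in this section, so the definition below (misjudged cases weighted by the severity assigned to their true obstacle area, divided by the total weight) is an assumption, as are the function name `weighted_error_rate`, the labels, and the weight values.

```python
def weighted_error_rate(y_true, y_pred, weights):
    """weights maps each true obstacle area to the assumed cost of misjudging it."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total if total else 0.0


# Hypothetical severities: misjudging an ISP-side obstacle costs twice as much.
weights = {"client": 1.0, "isp": 2.0}
y_true = ["client", "client", "isp", "isp"]
y_pred = ["client", "isp", "isp", "client"]
rate = weighted_error_rate(y_true, y_pred, weights)
```

Computing this rate once per feature value set gives maintenance personnel a comparable figure when choosing between a set that is available sooner and a set that is more accurate.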

The obstacle positioning system for a video streaming service according to claim 2, wherein the application-layer service quality data are the video quality level and the numerical quality indicators of each video terminal device.

The obstacle positioning system for a video streaming service according to claim 2, wherein the obstacle report related information is the report content, the description text, and the test code entered after manual pre-testing.

The obstacle positioning system for a video streaming service according to claim 2, wherein the loop quality test data are records of special construction methods such as line bonding (bundling) and hybrid fiber-copper (G.fast), estimated electrical characteristics of the line, and periodic records from the ISP switching office equipment closest to the user end.

The obstacle positioning system for a video streaming service according to claim 3, wherein the manual diagnosis pre-test code is the obstacle reason code entered by professional diagnostic personnel after a preliminary diagnosis.

An obstacle positioning maintenance and operation method for a video streaming service, comprising: step 1: the feature value extraction module extracts multiple types of high-dimensional estimation feature values from the data source module; step 2: through processing by the machine learning training and implementation module, an optimized prediction model is trained with training data and, based on the feature value data of actual customer complaint cases, the probabilities of a client-side obstacle and an ISP-side obstacle are estimated under multiple types of feature values; and step 3: finally, the maintenance operation information output module generates the operation mode selection and priority decision information.

The obstacle positioning maintenance and operation method for a video streaming service according to claim 11, wherein the process of the machine learning training and implementation module comprises: step 1: determining whether a training model is to be generated; if yes, model training is performed first and the training target is established; if no, category and missing value preprocessing is performed; step 2: once the training target is established, performing category and missing value preprocessing training; step 3: text note obstacle analysis training; step 4: high-dimensional feature value multiple-combination establishment training; step 5: establishing the optimized obstacle point classification model, and returning to category and missing value preprocessing; step 6: text note obstacle analysis; step 7: high-dimensional feature value multiple-combination establishment; step 8: obstacle point classification estimation output; and step 9: determining whether to calculate the next user; if yes, returning to the determination of whether a training model is to be generated; if no, the process ends.
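The text note obstacle analysis in steps 3 and 6 can be sketched as a logistic regression over term frequencies. This is a simplified illustration: whitespace splitting stands in for the automatic word segmentation tool, the model is trained by plain stochastic gradient descent, and the function names, the sample notes, and the label convention (1 = client side, 0 = ISP side) are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(note):
    # Stand-in for an automatic word segmentation tool: split on whitespace.
    return Counter(note.lower().split())


def train_text_model(notes, labels, epochs=200, learning_rate=0.5):
    """Assumed labels: 1 = client-side obstacle, 0 = ISP-side obstacle."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for note, label in zip(notes, labels):
            tf = term_frequencies(note)
            z = bias + sum(weights.get(w, 0.0) * c for w, c in tf.items())
            error = label - 1.0 / (1.0 + math.exp(-z))
            # Gradient step on the log loss for this single note.
            bias += learning_rate * error
            for w, c in tf.items():
                weights[w] = weights.get(w, 0.0) + learning_rate * error * c
    return weights, bias


def client_probability(model, note):
    weights, bias = model
    tf = term_frequencies(note)
    z = bias + sum(weights.get(w, 0.0) * c for w, c in tf.items())
    return 1.0 / (1.0 + math.exp(-z))


notes = [
    "set-top box reboot fixed",
    "modem reboot at home",
    "dslam port alarm",
    "central office dslam fault",
]
labels = [1, 1, 0, 0]
text_model = train_text_model(notes, labels)
```

The probability returned by `client_probability` is what would be passed on as one feature value to the obstacle point classification model.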

The obstacle positioning maintenance and operation method for a video streaming service according to claim 11, wherein the process for the operation mode comprises: step 1: designing the model performance evaluation indicator, which is not changed after the initial design is completed; step 2: calculating the model error rate from the reserved test data of the latest training phase; step 3: outputting maintenance information for maintenance personnel; step 4: outputting maintenance priority plan information for management personnel; and step 5: determining whether there is a next user to be estimated; if yes, returning to step 2 to calculate the model error rate from the reserved test data of the latest training phase; if no, the process ends.
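The maintenance priority plan information output in step 4 can be sketched as a ranking of obstacle areas by how far their recent average exceeds their past average. The ratio used for ranking is an assumption, since this section does not state how the two statistics are compared; the function name `maintenance_priority` and the figures are hypothetical.

```python
def maintenance_priority(recent_avg, past_avg):
    """Rank areas by recent-to-past ratio of average obstacle counts, highest first."""
    ratios = {}
    for area, recent in recent_avg.items():
        past = past_avg.get(area, 0.0)
        # An area with no past obstacles is treated as the most urgent.
        ratios[area] = recent / past if past > 0 else float("inf")
    return sorted(ratios, key=lambda area: ratios[area], reverse=True)


# Hypothetical daily averages: last 30 days versus past months.
recent = {"client": 12.0, "isp": 30.0}
past = {"client": 10.0, "isp": 15.0}
plan = maintenance_priority(recent, past)
```

Here the ISP side has doubled relative to its past average while the client side has risen by a fifth, so the ISP side is listed first.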