TW202226047A - Method and system of image analysis for infant care - Google Patents

Method and system of image analysis for infant care

Info

Publication number
TW202226047A
Authority
TW
Taiwan
Prior art keywords
model
mentioned
detection performance
baby care
model parameters
Prior art date
Application number
TW109146451A
Other languages
Chinese (zh)
Other versions
TWI768625B (en)
Inventor
洪上智
劉建宏
蔣岳珉
呂子杰
黎和欣
李仁貴
江正傑
Original Assignee
財團法人工業技術研究院
家恩股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 財團法人工業技術研究院 and 家恩股份有限公司
Priority to TW109146451A
Application granted
Publication of TWI768625B
Publication of TW202226047A

Landscapes

  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

A method of image analysis for infant care is provided. The method includes: training, by a managing device, a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model, and the managing device transmits the front-stage model and the back-stage model; receiving, by a smart front-end device, the front-stage model and inputting at least one image into the front-stage model, de-identifying the at least one image to generate at least one de-identified image, and transmitting the at least one de-identified image; and receiving, by a back-end server, the back-stage model and the at least one de-identified image, and inputting the at least one de-identified image into the back-stage model to determine whether there is an abnormal event in the at least one de-identified image.

Description

Method and system of image analysis for infant care

The present disclosure relates to a method and system for image analysis, and in particular to a method and system for infant care image analysis.

The risk of sudden infant death is highest within the first three months after birth, and its causes are varied. For example, regurgitation of milk (with an incidence of up to 15%) can cause choking and suffocation, threatening the infant's life. Caregivers, however, cannot watch over the infant at every moment. Real-time monitoring images combined with AI image analysis to detect abnormal events can therefore effectively reduce the risk of sudden infant death.

Many kinds of abnormal events can occur during infant care. Performing AI abnormal event detection on the front-end device requires considerable computing power, which often limits detection accuracy.

Another solution is to use the AI image analysis provided by a back-end platform to reduce the computation required of the front-end device. However, detecting abnormal events on the back-end platform requires transmitting real-time images to it, which exposes images of the infant and creates a risk of personal data privacy leakage.

Traditional techniques for hiding personal information in images, such as mosaicking and blurring, all rely on destructive operations on the original image. The result of such destructive operations is that the back-end platform can no longer perform image-based AI analysis.

Therefore, a method and system for infant care image analysis is needed to improve on the above problems.

The following disclosure is exemplary only and is not intended to be limiting in any way. In addition to the described aspects, embodiments, and features, other aspects, embodiments, and features will be apparent by reference to the drawings and the following detailed description. That is, the following disclosure is provided to introduce the concepts, highlights, benefits, and the novel and non-obvious technical advantages described herein. Selected, but not all, embodiments are described in further detail below. Accordingly, the following disclosure is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.

Accordingly, a main purpose of the present disclosure is to provide a method and system for infant care image analysis that addresses the above shortcomings.

The present disclosure provides a method of infant care image analysis, comprising: training, by a manager, a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model, and transmitting, by the manager, the front-stage model and the back-stage model; receiving, by a smart front end, the front-stage model, inputting at least one image into the front-stage model, de-identifying the at least one image to generate at least one de-identified image, and transmitting, by the smart front end, the at least one de-identified image; and receiving, by a back-end server, the back-stage model and the at least one de-identified image, and inputting the at least one de-identified image into the back-stage model to determine whether an abnormal event is present in the at least one de-identified image.

In one embodiment, the two-stage model is composed of N1 hidden layers, a pooling layer, and N2 hidden layers connected in series in that order.

In one embodiment, the front-stage model includes the N1 hidden layers and the pooling layer.

In one embodiment, the back-stage model includes the N2 hidden layers.

In one embodiment, training the two-stage model by the manager includes the following steps: step (a): generating a set of model parameters; step (b): generating the two-stage model according to the set of model parameters; step (c): detecting abnormal events in a plurality of training images with the two-stage model, and recognizing faces in the plurality of training images; step (d): determining, according to the detection result and the recognition result, whether an abnormal event detection performance of the two-stage model is greater than a first threshold and whether a face detection performance is less than a second threshold; step (e): when the abnormal event detection performance is not greater than the first threshold, or the face detection performance is not less than the second threshold, generating another set of model parameters according to the set of model parameters; and replacing the set of model parameters with the other set of model parameters and repeating steps (b) to (e) until the abnormal event detection performance is greater than the first threshold and the face detection performance is less than the second threshold.

In one embodiment, the manager generates the other set of model parameters from the set of model parameters using a heuristic algorithm, wherein the heuristic algorithm is a genetic algorithm, a particle swarm algorithm, or a simulated annealing algorithm.

In one embodiment, the manager estimates the abnormal event detection performance and the face detection performance using mean Average Precision (mAP).

In one embodiment, the set of model parameters and the other set of model parameters include at least: the ratio of the numbers of hidden layers N1 and N2, the depth of each hidden layer, the convolution kernel size of each hidden layer, the activation function of each hidden layer, and the kernel size of the pooling layer.

In one embodiment, the ratio of the numbers of hidden layers N1 and N2 is 1:7, 1:3, or 3:5.

In one embodiment, the two-stage model is a deep neural network (DNN) model.

The present disclosure also provides a system for infant care image analysis, comprising: a manager that trains a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model, and transmits the front-stage model and the back-stage model; a smart front end that receives the front-stage model, inputs at least one image into the front-stage model, de-identifies the at least one image to generate at least one de-identified image, and transmits the de-identified image; and a back-end server that receives the back-stage model and the de-identified image, and inputs the at least one de-identified image into the back-stage model to determine whether an abnormal event is present in the at least one de-identified image.

Aspects of the present disclosure are described more fully hereinafter with reference to the accompanying drawings. This disclosure may, however, be embodied in many different forms and should not be construed as limited to any specific structure or function presented throughout this disclosure. Rather, these aspects are provided so that this disclosure will be thorough and complete and will fully convey its scope to those skilled in the art. Based on the teachings herein, one skilled in the art should appreciate that the scope of the present disclosure is intended to cover any aspect disclosed herein, whether implemented alone or in combination with any other aspect of the present disclosure. For example, an apparatus may be implemented, or a method practiced, using any number of the aspects set forth herein. In addition, the scope of the present disclosure is intended to cover an apparatus or method implemented using other structure, functionality, or structure and functionality in addition to the various aspects of the disclosure set forth herein. It should be understood that any aspect disclosed herein may be embodied by one or more elements of a claim.

The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Furthermore, like numerals refer to like elements throughout the figures, and unless otherwise specified in the description, the articles "a" and "the" include plural references.

It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, there are no intervening elements present. Other words used to describe relationships between elements should be interpreted in a similar fashion (e.g., "between" versus "directly between," "adjacent" versus "directly adjacent," etc.).

FIG. 1 is a schematic diagram of an infant care image analysis system 100 according to an embodiment of the present disclosure. The infant care image analysis system 100 includes at least one smart front end 110, a manager 120, and a back-end server 130.

The smart front end 110, the manager 120, and the back-end server 130 are independent devices; they may be located in different places and physically separated, and they can be connected to one another through a network. The manager 120 mainly integrates the smart front end 110 and the back-end server 130 to provide an AI image analysis service. The manager 120 includes a neural network training system for training and generating a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model. The manager 120 can then transmit the front-stage model to the smart front end 110 and the back-stage model to the back-end server 130.

The smart front end 110 is disposed on an infant care basin 160 and may include a camera 112, a data processor 116, and a bracket 118. The camera 112 is mounted on the infant care basin 160 through the bracket 118. The data processor 116 receives at least one image of the infant 114 captured by the camera 112 and inputs the at least one image into the front-stage model received by the smart front end 110, so as to de-identify the at least one image and generate at least one de-identified image 150. The smart front end 110 then transmits the at least one de-identified image 150 to the back-end server 130.

The back-end server 130 receives the at least one de-identified image 150 transmitted by the smart front end 110 and the back-stage model transmitted by the manager 120. The back-end server 130 inputs the at least one de-identified image into the back-stage model to determine whether an abnormal event is present in the at least one de-identified image 150.

The smart front end 110, the manager 120, and the back-end server 130 range from small handheld devices (e.g., mobile phones or portable computers) to mainframe systems (e.g., mainframe computers). Examples of portable computers include personal digital assistants (PDAs) and notebook computers. The network may include, but is not limited to, one or more local area networks (LANs) and/or wide area networks (WANs).

It should be understood that the smart front end 110, the manager 120, and the back-end server 130 shown in FIG. 1 are an example architecture of the infant care image analysis system 100. Each element shown in FIG. 1 may be implemented by any type of computing device, such as the computing device 600 described with reference to FIG. 6.

FIG. 2 is a flowchart of a method 200 of infant care image analysis according to an embodiment of the present disclosure. The method may be performed in the infant care image analysis system 100 shown in FIG. 1.

In step S205, the manager 120 trains and generates a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model, and transmits the front-stage model to the smart front end 110 and the back-stage model to the back-end server 130. In one embodiment, the two-stage model is a deep neural network (DNN) model composed of N1 hidden layers, a pooling layer, and N2 hidden layers connected in series in that order.

Next, in step S210, the smart front end 110 receives at least one image of the infant 114 captured by the camera 112 as well as the front-stage model, inputs the at least one image into the front-stage model to de-identify it and generate at least one de-identified image 150, and transmits the at least one de-identified image to the back-end server 130, wherein the front-stage model includes the N1 hidden layers and the pooling layer.

In step S215, the back-end server 130 receives the de-identified image 150 and the back-stage model, and inputs the at least one de-identified image 150 into the back-stage model to determine whether an abnormal event is present in the at least one de-identified image 150, wherein the back-stage model includes the N2 hidden layers.
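To make the division of work between steps S210 and S215 concrete, the following is a minimal sketch in Python (using PyTorch) of how the smart front end and the back-end server might each run their half of an already trained and split two-stage model. The label set, the use of an argmax classifier head, and the function names are assumptions added for illustration; the patent does not prescribe a particular framework, output format, or transport protocol, and sending the de-identified tensor over the network is left out of the sketch.

```python
import torch

# Assumed label set: "normal" plus the abnormal events named in the description.
EVENT_LABELS = ["normal", "eyes_open", "milk_regurgitation", "cyanosis", "jaundice"]

def smart_front_end_step(front_stage: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Step S210 on the smart front end: de-identify one captured image.

    `image` is a single CHW tensor from the camera; the pooled feature map
    returned here is the "de-identified image" transmitted to the server.
    """
    front_stage.eval()
    with torch.no_grad():
        return front_stage(image.unsqueeze(0))  # shape: (1, C', H', W')

def back_end_server_step(back_stage: torch.nn.Module, de_identified: torch.Tensor) -> str:
    """Step S215 on the back-end server: decide whether an abnormal event is present."""
    back_stage.eval()
    with torch.no_grad():
        logits = back_stage(de_identified)
    label = EVENT_LABELS[int(logits.argmax(dim=1))]
    return label  # any label other than "normal" could trigger the alert described below
```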

In one embodiment, when the back-end server 130 determines that an abnormal event is present in at least one de-identified image 150, the back-end server 130 may send an alert message to the smart front end 110 to notify the user operating the smart front end 110, and the smart front end 110 may alert the user through a related user interface (e.g., a light-emitting diode (LED), a liquid crystal display (LCD), a microphone, a buzzer, or Bluetooth streaming).

FIG. 3 is a structural diagram of a two-stage model 300 according to an embodiment of the present disclosure.

As shown in the figure, the two-stage model 300 is composed of N1 hidden layers 310, a pooling layer 320, and N2 hidden layers 330 connected in series in that order. Each hidden layer may consist of one convolution layer and one rectified linear unit (ReLU) activation layer. The pooling layer may be a max pooling layer. The convolution kernel size of a hidden layer is k×k, and the kernel size of the pooling layer is p×p.

The front-stage model includes the N1 hidden layers 310 and the pooling layer 320, and the back-stage model includes the N2 hidden layers 330. An original image is first input into the N1 hidden layers 310 and the pooling layer 320. The pooling layer 320 outputs a de-identified image 150; that is, the image output by the pooling layer 320 is no longer identifiable. The N2 hidden layers 330 then determine whether an abnormal event is present in the de-identified image. In one embodiment, abnormal events occurring in an infant may include eye opening, milk regurgitation, cyanosis, jaundice, and the like.
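As a concrete reading of FIG. 3, the sketch below builds the serial stack of N1 convolution-plus-ReLU hidden layers, a max pooling layer, and N2 convolution-plus-ReLU hidden layers in Python with PyTorch, splitting it at the pooling layer into the front-stage and back-stage models. The default argument values, the single shared channel depth c, and the classifier head appended to the back stage are assumptions made so the example is self-contained; the patent only fixes the families of values (the N1:N2 ratio, kernel size k, depth C, and pooling size p) explored during training.

```python
import torch.nn as nn

def conv_relu(in_ch: int, out_ch: int, k: int) -> nn.Sequential:
    """One hidden layer as described for FIG. 3: a convolution followed by ReLU."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=k // 2),
                         nn.ReLU())

def build_two_stage_model(n1: int = 3, n2: int = 5, c: int = 32, k: int = 3,
                          p: int = 2, num_events: int = 5):
    """Returns (front_stage, back_stage); the defaults follow the 3:5 ratio example."""
    front = [conv_relu(3, c, k)] + [conv_relu(c, c, k) for _ in range(n1 - 1)]
    front.append(nn.MaxPool2d(kernel_size=p))        # output here is the de-identified image
    front_stage = nn.Sequential(*front)               # deployed on the smart front end 110

    back = [conv_relu(c, c, k) for _ in range(n2)]
    back += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # assumed classifier head for the
             nn.Linear(c, num_events)]                # abnormal event decision
    back_stage = nn.Sequential(*back)                 # deployed on the back-end server 130
    return front_stage, back_stage
```

In this sketch the manager 120 would call build_two_stage_model with whatever parameters it has selected and then ship front_stage and back_stage to the smart front end and the back-end server, respectively.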

The following describes in detail how the manager 120 trains the two-stage model in step S205 of FIG. 2. Note that, as used herein, the term "training" identifies objects used for training the two-stage model. Thus, a training image is an image used to train the two-stage model.

FIG. 4 is a schematic diagram 400 of the manager 120 training the two-stage model according to an embodiment of the present disclosure, wherein the manager 120 may include at least a model parameter selector, a model trainer, an abnormal event detector, and a face detector.

In block 405, the manager 120 may first perform initialization, after which the model parameter selector 410 generates a set of model parameters, wherein the set of model parameters is generated randomly and includes at least: the ratio of the numbers of hidden layers N1 and N2, the depth of each hidden layer, the convolution kernel size of each hidden layer, the activation function of each hidden layer, and the kernel size of the pooling layer.

Next, the set of model parameters is input into the model trainer 415 to generate a two-stage model 420, which includes a front-stage model and a back-stage model, wherein the front-stage model is composed of the N1 hidden layers and the pooling layer and the back-stage model is composed of the N2 hidden layers. Then, a plurality of training images are input into the two-stage model 420. The front-stage model generates a plurality of de-identified training images and outputs them to the face detector 435, which attempts to recognize the faces corresponding to the de-identified training images. The back-stage model receives the de-identified training images and outputs a plurality of abnormal de-identified training images to the abnormal event detector 425, which identifies the kind of abnormal event in each abnormal de-identified training image; the results are then input into the abnormal event detection performance estimator 430 for evaluation.
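The data flow of FIG. 4 for a single candidate model can be summarized in a short sketch: every training image passes through the front stage, the de-identified result is handed both to the face detector 435 and on through the back stage to the abnormal event detector 425, and the two lists of predictions are what the performance estimators 430 and 440 later score. The detectors are left abstract here as callables supplied by the caller, since the patent does not fix how they are implemented.

```python
import torch
from typing import Callable, List, Tuple

def collect_predictions(front_stage: torch.nn.Module,
                        back_stage: torch.nn.Module,
                        training_images: List[torch.Tensor],
                        face_detector: Callable,
                        event_detector: Callable) -> Tuple[list, list]:
    """Runs the FIG. 4 pipeline for one candidate two-stage model.

    Returns (event_predictions, face_predictions); both lists are later scored,
    e.g. with mAP, against the ground-truth lookup table held by the manager.
    """
    event_predictions, face_predictions = [], []
    with torch.no_grad():
        for image in training_images:
            de_identified = front_stage(image.unsqueeze(0))        # de-identified training image
            face_predictions.append(face_detector(de_identified))   # should fail if hiding works
            event_predictions.append(event_detector(back_stage(de_identified)))
    return event_predictions, face_predictions
```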

The manager 120 holds, for example, a lookup table of the abnormal event and the face corresponding to each training image, so that the abnormal event detection performance estimator 430 can estimate the abnormal event detection performance, i.e., the accuracy of abnormal event recognition, and the face detection performance estimator 440 can estimate the face detection performance, i.e., the accuracy of face recognition on the de-identified images. Finally, in block 445, the manager 120 determines whether the performances estimated by the abnormal event detection performance estimator 430 and the face detection performance estimator 440 satisfy the target condition (the abnormal event detection performance is greater than a first threshold and the face detection performance is less than a second threshold).

When the abnormal event detection performance and the face detection performance satisfy the target condition, the manager ends the training in block 450.

When the abnormal event detection performance and the face detection performance do not satisfy the target condition, the manager outputs the results back to block 410, so that the model parameter selector can generate a new set of model parameters based on the current set of model parameters.

FIG. 5 is a flowchart 500 of the manager 120 training the two-stage model according to an embodiment of the present disclosure, which further details the process of FIG. 4.

In step S505, the manager 120 generates a set of model parameters, wherein the set of model parameters is produced by random initialization and includes at least: the ratio of the numbers of hidden layers N1 and N2, the depth of each hidden layer, the convolution kernel size of each hidden layer, the activation function of each hidden layer, and the kernel size of the pooling layer.

In step S510, the manager 120 generates a two-stage model according to the set of model parameters. Next, in step S515, the manager 120 detects abnormal events in a plurality of training images with the two-stage model and recognizes the faces in the plurality of training images.

Then, in step S520, the manager 120 determines, according to the detection result and the recognition result, whether the abnormal event detection performance of the two-stage model is greater than a first threshold and whether the face detection performance is less than a second threshold. In more detail, the abnormal event detection performance should be as high as possible, indicating that the back-stage model detects abnormal events well, while the face detection performance should be as low as possible, indicating that a face cannot be recognized from the de-identified images produced by the front-stage model. In one embodiment, the manager estimates the abnormal event detection performance and the face detection performance using mean Average Precision (mAP). In another embodiment, the first threshold and the second threshold are specified manually in advance.
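A minimal sketch of the acceptance test in step S520, assuming the two mAP scores have already been produced by the performance estimators and are available as plain floats between 0 and 1 (the patent does not fix how mAP is computed, and the threshold values below are placeholders for the manually specified thresholds):

```python
def meets_target(event_map: float, face_map: float,
                 first_threshold: float = 0.8, second_threshold: float = 0.1) -> bool:
    """Step S520: accept the model only if abnormal events are detected well
    (event mAP above the first threshold) while faces can no longer be recognized
    from the de-identified images (face mAP below the second threshold)."""
    return event_map > first_threshold and face_map < second_threshold
```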

When the abnormal event detection performance is not greater than the first threshold, or the face detection performance is not less than the second threshold ("No" in step S520), in step S525 the manager 120 generates another set of model parameters according to the current set of model parameters, wherein the other set of model parameters includes at least: the ratio of the numbers of hidden layers N1 and N2, the depth of each hidden layer, the convolution kernel size of each hidden layer, the activation function of each hidden layer, and the kernel size of the pooling layer. In one embodiment, the manager 120 generates the other set of model parameters from the current set using a heuristic algorithm, such as a genetic algorithm, a particle swarm algorithm, or a simulated annealing algorithm.

After the manager 120 generates the other set of model parameters according to the current set, it replaces the current set with the new set and repeats steps S510 to S525 until the abnormal event detection performance is greater than the first threshold and the face detection performance is less than the second threshold. The two-stage model trained through steps S510 to S525 is then split into a front-stage model and a back-stage model, which are further transmitted to the smart front end 110 and the back-end server 130, respectively, for execution.

In one embodiment, to be suitable for a combinatorial optimization solver, the model parameters are designed as a finite set of combinations. For example, the ratio of the numbers of hidden layers N1 and N2 is one of (1:7, 1:3, 3:5), the convolution kernel size (K) of each hidden layer is one of (3×3, 4×4, 5×5, 6×6), the hidden layer depth (C) is one of (4, 16, 32, 64, 128), and the pooling layer kernel size is one of (2×2, 3×3, 4×4), where the model parameters may be expressed by the following formula:

[Formula image 02_image001, not reproduced in this text extraction.]
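The formula image above is not reproduced in this text, but the finite combination it stands for can be written out from the options just listed. The sketch below encodes that search space, draws a random initial parameter set (step (a)/S505), and derives a neighboring set by changing one option, which is one possible move for step (e)/S525 in the spirit of simulated annealing; a genetic or particle swarm variant would instead recombine several candidate sets. The dictionary keys and helper names are illustrative choices, and the per-layer activation function listed earlier among the model parameters is omitted here for brevity.

```python
import copy
import random

# Finite options for each model parameter, as enumerated above.
PARAM_SPACE = {
    "ratio_n1_n2": ["1:7", "1:3", "3:5"],    # ratio of hidden layer counts N1:N2
    "kernel_size_k": [3, 4, 5, 6],           # k x k convolution kernels
    "hidden_depth_c": [4, 16, 32, 64, 128],  # channel depth of the hidden layers
    "pool_size_p": [2, 3, 4],                # p x p pooling kernel
}

def random_parameters(rng: random.Random) -> dict:
    """Step (a)/S505: random initialization within the finite combination."""
    return {name: rng.choice(options) for name, options in PARAM_SPACE.items()}

def neighboring_parameters(current: dict, rng: random.Random) -> dict:
    """Step (e)/S525: change one randomly chosen parameter to a different allowed value."""
    candidate = copy.deepcopy(current)
    name = rng.choice(list(PARAM_SPACE))
    candidate[name] = rng.choice([v for v in PARAM_SPACE[name] if v != current[name]])
    return candidate
```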

As described above, the infant care image analysis method and system of the present disclosure divide image analysis into multiple stages and can support both local and cloud architectures. The front-stage model produces a non-destructive, privacy-preserving image, and the back-stage model then determines whether an abnormal event is present in the privacy-preserving image. In other words, the method and system of the present disclosure use a deep neural network (DNN), whose processing changes the characteristics of the original image, so that processing the original image with the front-stage model serves as a means of hiding personal information. In addition, the present disclosure proposes a DNN model splitting method that quantifies the degree to which the front-stage model hides personal information and produces split-point recommendations, which effectively reduces the computation required of the front-end computing device and avoids the risk of personal data privacy leakage during computation on the back-end platform.

For the described embodiments of the present disclosure, an exemplary operating environment in which the embodiments may be implemented is described below. Referring specifically to FIG. 6, an exemplary operating environment for implementing embodiments of the present disclosure is shown and may be generally regarded as a computing device 600. The computing device 600 is merely one example of a suitable computing environment and is not intended to suggest any limitation on the scope of use or functionality of the disclosure. Nor should the computing device 600 be interpreted as having any dependency or requirement relating to any one or combination of the illustrated elements.

The present disclosure may be implemented in computer code or machine-usable instructions, including computer-executable instructions of program modules executed by a computer or other machine, such as a personal digital assistant or other portable device. Generally, program modules include routines, programs, objects, components, data structures, and so on, and refer to code that performs particular tasks or implements particular abstract data types. The disclosure may be practiced in a variety of system configurations, including portable devices, consumer electronics, general-purpose computers, more specialized computing devices, and the like. The disclosure may also be practiced in distributed computing environments where tasks are performed by devices linked through a communications network.

Referring to FIG. 6, the computing device 600 includes a bus 610 that directly or indirectly couples the following devices: a memory 612, one or more processors 614, one or more display elements 616, input/output (I/O) ports 618, input/output (I/O) elements 620, and an illustrative power supply 622. The bus 610 represents what may be one or more buses (such as an address bus, a data bus, or a combination thereof). Although the blocks of FIG. 6 are shown with lines for the sake of brevity, in reality the boundaries of the elements are not so clear; for example, a presentation element of a display device may be regarded as an I/O element, and a processor may have memory.

The computing device 600 generally includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by the computing device 600 and include both volatile and non-volatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media include both volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic disks, magnetic disk storage, or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the computing device 600. Computer storage media do not themselves include signals.

Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio-frequency, infrared, and other wireless media. Combinations of the above are also included within the scope of computer-readable media.

The memory 612 includes computer storage media in the form of volatile and non-volatile memory. The memory may be removable, non-removable, or a combination of the two. Exemplary hardware devices include solid-state memory, hard disk drives, optical disk drives, and the like. The computing device 600 includes one or more processors that read data from entities such as the memory 612 or the I/O elements 620. The display elements 616 present data indications to a user or other device. Exemplary display elements include a display device, a speaker, a printing element, a vibrating element, and so on.

The I/O ports 618 allow the computing device 600 to be logically connected to other devices including the I/O elements 620, some of which may be built in. Exemplary elements include a microphone, a joystick, a game pad, a satellite dish, a scanner, a printer, a wireless device, and so on. The I/O elements 620 may provide a natural user interface for processing gestures, voice, or other physiological inputs generated by a user. In some instances, these inputs may be transmitted to an appropriate network element for further processing. The computing device 600 may be equipped with depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, and combinations of these, to detect and recognize objects. In addition, the computing device 600 may be equipped with sensors (e.g., radar, LiDAR) that periodically sense the surrounding environment within a sensing range and generate sensor information representing the relationship between the device and its surroundings. Furthermore, the computing device 600 may be equipped with an accelerometer or gyroscope that detects motion, and the output of the accelerometer or gyroscope may be provided to the computing device 600 for display.

In addition, the processor 614 of the computing device 600 may execute the programs and instructions in the memory 612 to perform the actions and steps described in the above embodiments, or other descriptions in this specification.

Any specific order or hierarchy of steps in the processes disclosed herein is purely exemplary. Based on design preferences, it should be understood that any specific order or hierarchy of steps in the processes may be rearranged within the scope of this disclosure. The accompanying method claims present elements of the various steps in a sample order and are therefore not to be limited to the specific order or hierarchy presented.

The use of ordinal terms such as "first," "second," and "third" to modify elements in the claims does not by itself imply any priority, precedence, or order among the elements, or the order in which the steps of a method are performed; they are used merely as labels to distinguish elements having the same name (but different ordinal terms).

Although the present disclosure has been described above by way of example embodiments, they are not intended to limit the present disclosure. Those skilled in the art may make changes and modifications without departing from the spirit and scope of the present disclosure, and the scope of protection shall therefore be defined by the appended claims.

100: infant care image analysis system
110: smart front end
112: camera
114: infant
116: data processor
118: bracket
120: manager
130: back-end server
150: de-identified image
160: infant care basin
200: method
S205, S210, S215: steps
300: two-stage model
310: hidden layers
320: pooling layer
330: hidden layers
400: schematic diagram
405, 445, 450: blocks
410: model parameter selector
415: model trainer
420: two-stage model
425: abnormal event detector
430: abnormal event detection performance estimator
435: face detector
440: face detection performance estimator
500: method
S505, S510, S515, S520, S525: steps
600: computing device
610: bus
612: memory
614: processor
616: display element
618: I/O port
620: I/O element
622: power supply

FIG. 1 is a schematic diagram of the environment of a system for infant care image analysis according to an embodiment of the present disclosure.
FIG. 2 is a flowchart of a method of infant care image analysis according to an embodiment of the present disclosure.
FIG. 3 is a structural diagram of a two-stage model according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of a manager training the two-stage model according to an embodiment of the present disclosure.
FIG. 5 is a flowchart of the manager training the two-stage model according to an embodiment of the present disclosure.
FIG. 6 shows an exemplary operating environment for implementing embodiments of the present disclosure.

200: method

S205, S210, S215: steps

Claims (20)

1. A method of infant care image analysis, comprising: training, by a manager, a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model, and transmitting, by the manager, the front-stage model and the back-stage model; receiving, by a smart front end, the front-stage model, inputting at least one image into the front-stage model, de-identifying the at least one image to generate at least one de-identified image, and transmitting, by the smart front end, the at least one de-identified image; and receiving, by a back-end server, the back-stage model and the at least one de-identified image, and inputting the at least one de-identified image into the back-stage model to determine whether an abnormal event is present in the at least one de-identified image.

2. The method of infant care image analysis as claimed in claim 1, wherein the two-stage model is composed of N1 hidden layers, a pooling layer, and N2 hidden layers connected in series in that order.

3. The method of infant care image analysis as claimed in claim 1, wherein the front-stage model includes the N1 hidden layers and the pooling layer.

4. The method of infant care image analysis as claimed in claim 1, wherein the back-stage model includes the N2 hidden layers.

5. The method of infant care image analysis as claimed in claim 1, wherein training the two-stage model by the manager comprises the following steps: step (a): generating a set of model parameters; step (b): generating the two-stage model according to the set of model parameters; step (c): detecting abnormal events in a plurality of training images with the two-stage model, and recognizing faces in the plurality of training images; step (d): determining, according to the detection result and the recognition result, whether an abnormal event detection performance of the two-stage model is greater than a first threshold and whether a face detection performance is less than a second threshold; step (e): when the abnormal event detection performance is not greater than the first threshold, or the face detection performance is not less than the second threshold, generating another set of model parameters according to the set of model parameters; and replacing the set of model parameters with the other set of model parameters and repeating steps (b) to (e) until the abnormal event detection performance is greater than the first threshold and the face detection performance is less than the second threshold.

6. The method of infant care image analysis as claimed in claim 1, wherein the manager generates the other set of model parameters from the set of model parameters using a heuristic algorithm, wherein the heuristic algorithm is a genetic algorithm, a particle swarm algorithm, or a simulated annealing algorithm.

7. The method of infant care image analysis as claimed in claim 1, wherein the manager estimates the abnormal event detection performance and the face detection performance using mean Average Precision (mAP).

8. The method of infant care image analysis as claimed in claim 1, wherein the set of model parameters and the other set of model parameters include at least: the ratio of the numbers of hidden layers N1 and N2, the depth of each hidden layer, the convolution kernel size of each hidden layer, the activation function of each hidden layer, and the kernel size of the pooling layer.

9. The method of infant care image analysis as claimed in claim 1, wherein the ratio of the numbers of hidden layers N1 and N2 is 1:7, 1:3, or 3:5.

10. The method of infant care image analysis as claimed in claim 1, wherein the two-stage model is a deep neural network (DNN) model.

11. A system for infant care image analysis, comprising: a manager that trains a two-stage model, wherein the two-stage model includes a front-stage model and a back-stage model, and transmits the front-stage model and the back-stage model; a smart front end that receives the front-stage model, inputs at least one image into the front-stage model, de-identifies the at least one image to generate at least one de-identified image, and transmits the de-identified image; and a back-end server that receives the back-stage model and the de-identified image, and inputs the at least one de-identified image into the back-stage model to determine whether an abnormal event is present in the at least one de-identified image.

12. The system for infant care image analysis as claimed in claim 11, wherein the two-stage model is composed of N1 hidden layers, a pooling layer, and N2 hidden layers connected in series in that order.

13. The system for infant care image analysis as claimed in claim 12, wherein the front-stage model includes the N1 hidden layers and the pooling layer.

14. The system for infant care image analysis as claimed in claim 12, wherein the back-stage model includes the N2 hidden layers.

15. The system for infant care image analysis as claimed in claim 11, wherein the manager training the two-stage model comprises the following steps: step (a): generating a set of model parameters; step (b): generating the two-stage model according to the set of model parameters; step (c): training the two-stage model to detect abnormal events in a plurality of training images and to recognize faces in the plurality of training images; step (d): determining, according to the detection result and the recognition result, whether an abnormal event detection performance of the two-stage model is greater than a first threshold and whether a face detection performance is less than a second threshold; step (e): when the abnormal event detection performance is not greater than the first threshold, or the face detection performance is not less than the second threshold, generating another set of model parameters according to the set of model parameters; and replacing the set of model parameters with the other set of model parameters and repeating steps (b) to (e) until the abnormal event detection performance is greater than the first threshold and the face detection performance is less than the second threshold.

16. The system for infant care image analysis as claimed in claim 15, wherein the manager generates the other set of model parameters from the set of model parameters using a heuristic algorithm, wherein the heuristic algorithm is a genetic algorithm, a particle swarm algorithm, or a simulated annealing algorithm.

17. The system for infant care image analysis as claimed in claim 15, wherein the manager estimates the abnormal event detection performance and the face detection performance using mean Average Precision (mAP).

18. The system for infant care image analysis as claimed in claim 15, wherein the set of model parameters and the other set of model parameters include at least: the ratio of the numbers of hidden layers N1 and N2, the depth of each hidden layer, the convolution kernel size of each hidden layer, the activation function of each hidden layer, and the kernel size of the pooling layer.

19. The system for infant care image analysis as claimed in claim 15, wherein the ratio of the numbers of hidden layers N1 and N2 is 1:7, 1:3, or 3:5.

20. The system for infant care image analysis as claimed in claim 11, wherein the two-stage model is a deep neural network (DNN) model.
TW109146451A 2020-12-28 2020-12-28 Method and system of image analysis for infant care TWI768625B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109146451A TWI768625B (en) 2020-12-28 2020-12-28 Method and system of image analysis for infant care

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109146451A TWI768625B (en) 2020-12-28 2020-12-28 Method and system of image analysis for infant care

Publications (2)

Publication Number Publication Date
TWI768625B (en) 2022-06-21
TW202226047A (en) 2022-07-01

Family

ID=83103949

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109146451A TWI768625B (en) 2020-12-28 2020-12-28 Method and system of image analysis for infant care

Country Status (1)

Country Link
TW (1) TWI768625B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL252657A0 (en) * 2017-06-04 2017-08-31 De Identification Ltd System and method for image de-identification
EP3471060B1 (en) * 2017-10-16 2020-07-08 Nokia Technologies Oy Apparatus and methods for determining and providing anonymized content within images

Also Published As

Publication number Publication date
TWI768625B (en) 2022-06-21

Similar Documents

Publication Publication Date Title
US11710034B2 (en) Misuse index for explainable artificial intelligence in computing environments
US8831362B1 (en) Estimating age using multiple classifiers
US8799269B2 (en) Optimizing map/reduce searches by using synthetic events
KR101634701B1 (en) Facial recognition using social networking information
EP3531370B1 (en) Method for correcting image by device and device therefor
US11250324B2 (en) Intelligent recognition and alert methods and systems
CN108475326B (en) Method for providing service associated with image, storage medium, and electronic device
US20190220697A1 (en) Automated localized machine learning training
KR101656819B1 (en) Feature-extraction-based image scoring
Nirjon et al. Auditeur: A mobile-cloud service platform for acoustic event detection on smartphones
WO2022213465A1 (en) Neural network-based image recognition method and apparatus, electronic device, and medium
CN114365156A (en) Transfer learning for neural networks
WO2021051497A1 (en) Pulmonary tuberculosis determination method and apparatus, computer device, and storage medium
US20160110356A1 (en) Hash table construction for utilization in recognition of target object in image
EP4073978B1 (en) Intelligent conversion of internet domain names to vector embeddings
WO2022193973A1 (en) Image processing method and apparatus, electronic device, computer readable storage medium, and computer program product
WO2022095640A1 (en) Method for reconstructing tree-shaped tissue in image, and device and storage medium
CN107003736A (en) For the method and apparatus for the status data for obtaining instruction user state
TWI768625B (en) Method and system of image analysis for infant care
JP2023500037A (en) System, method, and program for facilitating small-shot temporal action localization
US20200178840A1 (en) Method and device for marking adventitious sounds
CN115729347A (en) Multimodal sensor fusion for content recognition in human interface applications
Jadon et al. An assistive model for the visually impaired integrating the domains of iot, blockchain and deep learning
US10817246B2 (en) Deactivating a display of a smart display device based on a sound-based mechanism
Vincent et al. Hospital-specific template matching for benchmarking performance in a diverse multihospital system