JP7227385B2

JP7227385B2 - Neural network training and eye open/close state detection method, apparatus and equipment

Info

Publication number: JP7227385B2
Application number: JP2021541183A
Authority: JP
Inventors: ワン，フェイ; キャン，チェン
Original assignee: ベイジンセンスタイムテクノロジーディベロップメントカンパニーリミテッド
Priority date: 2019-02-28
Filing date: 2019-11-13
Publication date: 2023-02-21
Anticipated expiration: 2039-11-13
Also published as: WO2020173135A1; JP2022517398A; KR20210113621A; CN111626087A

Description

Cross-Reference to Related Applications
The present disclosure claims priority to Chinese Patent Application No. 201910153463.4, entitled "Method, apparatus and device for neural network training and eye open/closed state detection", filed with the China Patent Office on February 28, 2019, the entire contents of which are incorporated herein by reference.

The present disclosure relates to computer vision technology, and in particular to a neural network training method, a neural network training apparatus, an eye open/closed state detection method, an eye open/closed state detection apparatus, an intelligent driving control method, an intelligent driving control apparatus, an electronic device, a computer-readable storage medium, and a computer program.

Eye open/closed state detection means detecting whether an eye is open or closed. It can be used in fields such as fatigue monitoring, biometric recognition, and facial expression recognition. For example, in driver assistance technology, the eye open/closed state of the driver needs to be detected, and whether the driver is driving while fatigued is determined from the detection results, so that fatigued driving can be monitored. Detecting the eye open/closed state accurately and avoiding misjudgment as far as possible helps improve driving safety.

Embodiments of the present disclosure provide technical solutions for neural network training, eye open/closed state detection, and intelligent driving control.

One aspect of the embodiments of the present disclosure provides a neural network training method, including: performing, via a neural network for eye open/closed detection that is to be trained, eye open/closed state detection processing on each of a plurality of eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and outputting eye open/closed state detection results; and determining, based on eye open/closed labeling information of the eye images and the detection results output by the neural network, a loss corresponding to each of the at least two eye open/closed detection training tasks, and adjusting network parameters of the neural network based on the losses corresponding to the at least two training tasks, wherein the eye images included in different image sets are at least partially different.

Another aspect of the embodiments of the present disclosure provides an eye open/closed state detection method, including: acquiring an image to be processed; and performing, via a neural network, eye open/closed state detection processing on the image to be processed and outputting an eye open/closed state detection result, wherein the neural network is obtained by training with the neural network training method described in the above embodiment.

Another aspect of the embodiments of the present disclosure provides an intelligent driving control method, including: acquiring an image to be processed that is collected by an imaging device mounted on a vehicle; performing, via a neural network, eye open/closed state detection processing on the image to be processed and outputting an eye open/closed state detection result; determining a fatigue state of a subject based at least on the eye open/closed state detection results for the same subject in a plurality of time-sequential images to be processed; and generating and outputting a command according to the fatigue state of the subject, wherein the neural network is trained with the neural network training method described in the above embodiment.

Another aspect of the embodiments of the present disclosure provides a neural network training apparatus, including: a neural network for eye open/closed detection that is to be trained, configured to perform eye open/closed state detection processing on each of a plurality of eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks and to output eye open/closed state detection results; and an adjustment module configured to determine, based on eye open/closed labeling information of the eye images and the detection results output by the neural network, a loss corresponding to each of the at least two eye open/closed detection training tasks, and to adjust network parameters of the neural network based on those losses, wherein the eye images included in different image sets are at least partially different.

Another aspect of the embodiments of the present disclosure provides an eye open/closed state detection apparatus, including: an acquisition module configured to acquire an image to be processed; and a neural network configured to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result, wherein the neural network is trained with the neural network training apparatus described in the above embodiment.

Another aspect of the embodiments of the present disclosure provides an intelligent driving control apparatus, including: an acquisition module configured to acquire an image to be processed that is collected by an imaging device mounted on a vehicle; a neural network configured to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result; a fatigue state determination module configured to determine a fatigue state of a subject based at least on the eye open/closed state detection results for the same subject in a plurality of time-sequential images to be processed; and a command module configured to generate and output a command according to the fatigue state of the subject, wherein the neural network is trained with the neural network training apparatus described in the above embodiment.

Another aspect of the embodiments of the present disclosure provides an electronic device, including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, wherein any method embodiment of the present disclosure is implemented when the computer program is executed.

Another aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements any method embodiment of the present disclosure.

Another aspect of the embodiments of the present disclosure provides a computer program including computer instructions that, when run on a processor of a device, implement any method embodiment of the present disclosure.

In the course of implementing the embodiments of the present disclosure, the inventors found that a neural network trained conventionally on a single task, that is, on the image set of that task, achieves relatively good eye open/closed detection accuracy in the scene corresponding to the task, but its accuracy is hard to guarantee in other scenes that do not correspond to the task. If images collected in different scenes are simply merged into one image set for training, without distinguishing which scene each image comes from or which training task it corresponds to, the distribution of the image subset (batch) fed to the network from this one set in each iteration cannot be controlled: a batch may contain many images of one scene and few or even none of another, and the batches used in different iterations do not share the same distribution. In other words, the batch distribution in each iteration is too random, no loss is computed per training task, and the training process cannot steer the network to learn the ability required by each training task. A network trained in this way therefore cannot guarantee eye open/closed detection accuracy in the different scenes corresponding to the different tasks.
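The uncontrolled batch composition described above can be illustrated with a small sketch. It assumes a hypothetical pool of 95 indoor and 5 outdoor images that are tagged only by scene; the function name `pooled_batches`, the pool sizes, and the batch size are illustrative assumptions and do not come from the disclosure.

```python
import random
from collections import Counter

def pooled_batches(pool, batch_size, iterations, seed=0):
    """Draw batches from one undifferentiated pool and count scenes per batch."""
    rng = random.Random(seed)
    # Each batch is sampled without regard to scene, so its mix is left to chance.
    return [Counter(rng.sample(pool, batch_size)) for _ in range(iterations)]

# Hypothetical pool: scene tags stand in for the images themselves.
pool = ["indoor"] * 95 + ["outdoor"] * 5

for counts in pooled_batches(pool, batch_size=8, iterations=5):
    # The per-scene counts differ from batch to batch, and the rare
    # scene may be missing from a batch altogether.
    print(dict(counts))
```

Because nothing ties a batch to a scene or task, the number of outdoor images in a batch is whatever the random draw yields.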

With the neural network training method and apparatus, the eye open/closed state detection method and apparatus, the intelligent driving control method and apparatus, the electronic device, the computer-readable storage medium, and the computer program of the present disclosure, a corresponding image set is determined for each of a plurality of different eye open/closed detection tasks, the eye images for one training iteration of the neural network are drawn from the plurality of image sets, the loss of the neural network with respect to the eye open/closed detection results of each training task in that iteration is determined from the eye images drawn from the respective image sets, and the network parameters of the neural network are adjusted based on these losses. In this way, the subset of eye images supplied to the neural network in every training iteration contains eye images corresponding to each training task, and a loss is computed for each training task, so during training the network can learn eye open/closed detection ability for every training task, taking the different training tasks into account. The trained neural network can thus improve, at the same time, the eye open/closed detection accuracy for eye images in each of the scenes corresponding to the training tasks. This helps improve the universality and generalization of technical solutions that accurately detect the eye open/closed state in different scenes based on this neural network, and better meets practical multi-scene application needs.
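The iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: a tiny logistic regression stands in for the neural network, each eye image is replaced by a two-element feature vector, the per-task losses are combined by a plain sum, and the names `predict_open`, `task_loss_and_grad`, and `train_step` are hypothetical.

```python
import math
import random

def predict_open(weights, features):
    """Stand-in model (assumption): logistic regression giving P(eye open)."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def task_loss_and_grad(weights, batch):
    """Mean cross-entropy loss of one task's batch and its gradient."""
    loss, grad = 0.0, [0.0] * len(weights)
    for features, label in batch:  # label: 1 = open, 0 = closed
        p = predict_open(weights, features)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        loss -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
        for i, x in enumerate(features):
            grad[i] += (p - label) * x
    n = len(batch)
    return loss / n, [g / n for g in grad]

def train_step(weights, image_sets, per_task, lr, rng):
    """One iteration: draw images from every task's set, one loss per task."""
    losses, total_grad = {}, [0.0] * len(weights)
    for task, samples in image_sets.items():
        batch = rng.sample(samples, per_task)  # every task is represented
        losses[task], grad = task_loss_and_grad(weights, batch)
        # Assumption: per-task losses are combined by summing their gradients.
        total_grad = [t + g for t, g in zip(total_grad, grad)]
    new_weights = [w - lr * g for w, g in zip(weights, total_grad)]
    return new_weights, losses

# Made-up data: (features, label) pairs per task; features = [bias, openness cue].
image_sets = {
    "indoor": [([1.0, 0.9], 1), ([1.0, -0.8], 0), ([1.0, 0.7], 1), ([1.0, -0.6], 0)],
    "outdoor": [([1.0, 0.5], 1), ([1.0, -0.9], 0), ([1.0, 0.8], 1), ([1.0, -0.4], 0)],
}
rng = random.Random(0)
weights = [0.0, 0.0]
for _ in range(200):
    weights, losses = train_step(weights, image_sets, per_task=2, lr=0.5, rng=rng)
```

Sampling a fixed number of images from each task's set is what guarantees that every iteration contains every task; how the per-task losses are weighted against each other is a design choice left open here.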

The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.

The drawings, which form part of the specification, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.

The present disclosure can be understood more clearly from the following detailed description with reference to the drawings.

A flowchart of one embodiment of the neural network training method of the present disclosure.
A flowchart of one embodiment of the eye open/closed state detection method of the present disclosure.
A flowchart of one embodiment of the eye open/closed state detection method of the present disclosure.
A flowchart of one embodiment of the intelligent driving control method of the present disclosure.
A schematic structural diagram of one embodiment of the neural network training apparatus of the present disclosure.
A schematic structural diagram of one embodiment of the eye open/closed state detection apparatus of the present disclosure.
A schematic structural diagram of one embodiment of the intelligent driving control apparatus of the present disclosure.
A block diagram of an exemplary device according to an embodiment of the present disclosure.

Various exemplary embodiments of the present disclosure are described in detail below with reference to the drawings. Unless otherwise specified, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.

It should also be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn to actual scale.

The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of this specification.

Note that similar reference numerals and letters denote similar items in the drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.

The embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate together with many other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with such electronic devices include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.

Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures, and the like, which perform particular tasks or implement particular abstract data types. The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on storage media of local or remote computing systems that include storage devices.

Exemplary Embodiments
FIG. 1 shows a flowchart of one embodiment of the neural network training method of the present disclosure. As shown in FIG. 1, the method of this embodiment includes steps S100 and S110. Each step in FIG. 1 is described in detail below.

S100: Via the neural network for eye open/closed detection that is to be trained, perform eye open/closed state detection processing on each of a plurality of eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and output eye open/closed state detection results.

In an optional example, once trained, the neural network for eye open/closed detection of the present disclosure can be used to perform eye open/closed state detection on an image to be processed and to output the detection result for that image. For example, for one image to be processed, the neural network outputs two probability values. One indicates the probability that the subject's eyes in the image are open; the larger this value, the closer the eyes are to the open state. The other indicates the probability that the subject's eyes in the image are closed; the larger this value, the closer the eyes are to the closed state. The two probability values may sum to 1.

In an optional example, the neural network of the present disclosure may be a convolutional neural network. It may include, but is not limited to, convolutional layers, ReLU (Rectified Linear Unit) layers (also called activation layers), pooling layers, fully connected layers, and a layer for classification (for example, binary classification). The more layers the neural network contains, the deeper the network. The present disclosure does not limit the specific structure of the neural network.
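As a minimal sketch of how the layer types named above can be chained, the following pure-Python forward pass applies one convolution, a ReLU, a 2x2 max pooling, a fully connected layer, and a softmax classification layer that yields two probabilities. The single-channel input, the layer sizes, all weights, and the function names are illustrative assumptions; the disclosure does not fix a network structure.

```python
import math

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of one channel."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(len(image[0]) - kw + 1)]
            for r in range(len(image) - kh + 1)]

def relu(fmap):
    """Activation layer: clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2(fmap):
    """2x2 max pooling with stride 2 (a trailing odd row/column is dropped)."""
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]

def fully_connected(vector, weights, bias):
    """One output score per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Classification layer: normalize scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, fc_weights, fc_bias):
    """Return [p_open, p_closed] for one single-channel eye image."""
    pooled = max_pool2(relu(conv2d(image, kernel)))
    flat = [v for row in pooled for v in row]
    return softmax(fully_connected(flat, fc_weights, fc_bias))

# Made-up 5x5 image and untrained, hand-picked weights.
image = [[float((r * 5 + c) % 3) for c in range(5)] for r in range(5)]
kernel = [[1.0, -1.0], [0.5, 0.0]]
fc_weights = [[0.1, 0.2, -0.1, 0.0], [-0.2, 0.1, 0.3, 0.1]]
fc_bias = [0.0, 0.1]
p_open, p_closed = forward(image, kernel, fc_weights, fc_bias)
```

With the softmax as the final layer, the two outputs are non-negative and sum to 1, which matches the two-probability output described for the trained network.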

In an optional example, the process of training the neural network in the present disclosure involves at least two eye open/closed detection training tasks, each of which belongs to the overall training task of enabling the neural network to detect the eye open/closed state. The training targets of different eye open/closed detection training tasks are not exactly the same. That is, the present disclosure divides the overall training task of the neural network into a plurality of training tasks, where each training task corresponds to one training target and different training tasks correspond to different training targets.

In an optional example, the at least two eye open/closed detection training tasks of the present disclosure may include at least two of the following: an eye open/closed detection task for the case where an attachment is worn on the eye; an eye open/closed detection task for the case where no attachment is worn on the eye; an eye open/closed detection task in an indoor environment; an eye open/closed detection task in an outdoor environment; an eye open/closed detection task for the case where an attachment is worn on the eye and there is a spot on the attachment; and an eye open/closed detection task for the case where an attachment is worn on the eye and there is no spot on the attachment. The attachment may be eyeglasses, a transparent plastic sheet, or the like. The spot may be a spot formed on the attachment by light reflected from the attachment. Eyeglasses in the present disclosure generally refer to eyeglasses through whose lenses the wearer's eyes can be seen.

Optionally, the eye open/closed detection task for the case where an attachment is worn on the eye may be an eye open/closed detection task for eyes with glasses. This task can realize at least one of eye open/closed detection with glasses indoors and eye open/closed detection with glasses outdoors.

Optionally, the eye open/closed detection task for the case where no attachment is worn on the eye may be an eye open/closed detection task for eyes without glasses. This task can realize at least one of eye open/closed detection without glasses indoors and eye open/closed detection without glasses outdoors.

Optionally, the eye open/closed detection task in an indoor environment can realize at least one of: indoor eye open/closed detection without glasses, indoor eye open/closed detection with glasses that reflect light, and indoor eye open/closed detection with glasses that do not reflect light.

Optionally, the eye open/closed detection task in an outdoor environment can realize at least one of: outdoor eye open/closed detection without glasses, outdoor eye open/closed detection with glasses that reflect light, and outdoor eye open/closed detection with glasses that do not reflect light.

Optionally, the eye open/closed detection task for the case where an attachment is worn on the eye and there is a spot on the attachment may be an eye open/closed detection task for eyes with glasses that reflect light. This task can realize at least one of indoor eye open/closed detection with glasses that reflect light and outdoor eye open/closed detection with glasses that reflect light.

Optionally, the eye open/closed detection task for the case where an attachment is worn on the eye and there is no spot on the attachment may be an eye open/closed detection task for eyes with glasses that do not reflect light. This task can realize at least one of indoor eye open/closed detection with glasses that do not reflect light and outdoor eye open/closed detection with glasses that do not reflect light.

As can be seen from the above, different eye open/closed detection training tasks of the present disclosure may overlap. For example, the eye open/closed detection task for eyes with glasses may overlap with each of the indoor task, the outdoor task, the task for an attachment with a spot, and the task for an attachment without a spot. The overlaps among the six eye open/closed detection training tasks listed above are not described one by one here. In addition, the present disclosure does not limit the number of eye open/closed detection training tasks, which can be determined according to actual needs, nor does it limit the specific form of any eye open/closed detection training task.

Optionally, as shown in FIG. 2, the at least two eye open/closed detection training tasks of the present disclosure may include the following three eye open/closed detection training tasks.

Eye open/closed detection training task a: the eye open/closed detection task in an indoor environment.

Eye open/closed detection training task b: the eye open/closed detection task in an outdoor environment.

Eye open/closed detection training task c: the eye open/closed detection task for the case where an attachment is worn on the eye and there is a spot on the attachment.

Training tasks a and b do not overlap; training tasks a and c may overlap; and training tasks b and c may overlap.

In an optional example, each of the at least two eye open/closed detection training tasks of the present disclosure has a corresponding image set; for example, training tasks a, b, and c in FIG. 2 each have a corresponding image set. Each image set usually includes a plurality of eye images. The eye images included in different image sets are at least partially different; that is, for any one image set, at least some of its eye images are not in the other image sets. Optionally, the eye images included in different image sets may overlap.

Optionally, the image sets corresponding to the six eye open/closed detection training tasks listed above may be, respectively: a set of eye images in which an attachment is worn on the eye; a set of eye images in which no attachment is worn on the eye; a set of eye images collected in an indoor environment; a set of eye images collected in an outdoor environment; a set of eye images in which an attachment is worn on the eye and there is a spot on the attachment; and a set of eye images in which an attachment is worn on the eye and there is no spot on the attachment.

所望により、目に装着物が装着されている目画像セットのうちの全ての目画像は眼鏡をかけている目画像であってもよく、例えば、この目画像セットは、室内環境で収集した眼鏡をかけている目画像及び室外環境で収集した眼鏡をかけている目画像を含んでもよい。 If desired, all eye images in the set of eye images with wearables on the eyes may be eye images with eyeglasses, for example, the eye image set may be eyeglasses collected in an indoor environment. It may also include images of eyes wearing glasses and images of eyes wearing glasses collected in an outdoor environment.

Optionally, all eye images in the eye image set in which the eyes wear no attachment may be eye images without eyeglasses; for example, this set may include eye images without eyeglasses collected in an indoor environment and eye images without eyeglasses collected in an outdoor environment.

所望により、室内環境で収集した目画像セットは室内環境で収集した眼鏡をかけていない目画像、及び室内環境で収集した眼鏡をかけている目画像を含んでもよい。 If desired, the set of eye images collected in an indoor environment may include eye images without eyeglasses collected in an indoor environment and eye images with eyeglasses collected in an indoor environment.

所望により、室外環境で収集した目画像セットは室外環境で収集した眼鏡をかけていない目画像、及び室外環境で収集した眼鏡をかけている目画像を含んでもよい。 If desired, the set of eye images collected in the outdoor environment may include eye images without eyeglasses collected in the outdoor environment and eye images with eyeglasses collected in the outdoor environment.

Optionally, all eye images in the eye image set in which the eyes wear an attachment and the attachment has spots may be eye images with eyeglasses that have spots on them. For example, this set may include eye images with spotted eyeglasses collected in an indoor environment and eye images with spotted eyeglasses collected in an outdoor environment.

Optionally, all eye images in the eye image set in which the eyes wear an attachment and the attachment has no spots may be eye images with eyeglasses that have no spots on them. For example, this set may include eye images with spot-free eyeglasses collected in an indoor environment and eye images with spot-free eyeglasses collected in an outdoor environment.

選択可能な一例において、本開示に含まれる画像セットは本開示に含まれる目開閉検出トレーニングタスクによって決定される。例えば、本開示は上記６つの目開閉検出トレーニングタスクのうちの少なくとも２つを含むと、本開示はこの少なくとも２つの目開閉検出トレーニングタスクのそれぞれに対応する目画像セットを含むことになる。 In one alternative example, the image set included in this disclosure is determined by an eye open/close detection training task included in this disclosure. For example, if the present disclosure includes at least two of the six eye open/closed detection training tasks described above, the present disclosure will include eye image sets corresponding to each of the at least two eye open/closed detection training tasks.

In one optional example, the eye images used in the neural network training process of the present disclosure may be called eye image samples, and the image content of an eye image sample typically includes an eye. The eye image samples of the present disclosure are typically monocular; that is, the image content of an eye image sample includes one eye rather than both eyes. Optionally, the eye image samples may all be based on the eye of one particular side, for example the left eye. Of course, the present disclosure does not exclude eye image samples based on both eyes, or eye image samples based on the eye of either side.

選択可能な一例において、本開示の目画像は通常、撮影装置により撮影した目を含む画像から切り取った目画像ブロックであってもよい。例えば、本開示における目画像を形成する過程は、撮影装置により撮影した画像に対して目の検出を行い、画像における目の部分を決定し、そして、検出された目の部分を画像から切り取り、所望により、本開示は切り取った画像ブロックに対してズームおよび／または画像コンテンツのマッピング（たとえば、右目画像ブロックは、画像コンテンツのマッピングを通じて左目画像ブロックに変換される）などの処理をし、目開閉検出用ニューラルネットワークをトレーニングするための目画像を形成することを含み得る。当然、本開示における目画像は撮影装置により撮影した、目を含む完全な画像を目画像とする可能性を除外しない。また、本開示における目画像は対応するトレーニングサンプルセットにおける目画像であってもよい。 In one selectable example, the eye images of the present disclosure may generally be eye image blocks cropped from an image containing the eye taken by an imager. For example, the process of forming an eye image in the present disclosure includes performing eye detection on an image captured by an imaging device, determining eye portions in the image, and cropping the detected eye portions from the image, Optionally, the present disclosure performs operations such as zooming and/or image content mapping on the cropped image blocks (e.g., right-eye image blocks are transformed into left-eye image blocks through image content mapping) and eye opening and closing. It may include forming an eye image for training a detection neural network. Of course, the eye image in the present disclosure does not exclude the possibility that the eye image is a complete image including the eye captured by the imaging device. Also, the eye images in this disclosure may be the eye images in the corresponding training sample set.
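The cropping, zooming, and right-to-left mapping steps described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes the "image content mapping" is a horizontal flip, uses nearest-neighbour zoom for simplicity, and the function names `crop_eye`, `mirror_to_left`, and `resize_nearest` are hypothetical.

```python
import numpy as np

def crop_eye(image, box):
    """Cut the eye block out of an H x W (x C) image; box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return image[top:bottom, left:right]

def mirror_to_left(eye_block):
    """Flip a right-eye block horizontally so it resembles a left-eye block (assumed mapping)."""
    return eye_block[:, ::-1]

def resize_nearest(eye_block, out_h, out_w):
    """Nearest-neighbour zoom to the input size expected by the network."""
    in_h, in_w = eye_block.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return eye_block[rows][:, cols]

# Toy 6 x 8 "photo" whose pixel values equal their flat index.
image = np.arange(48).reshape(6, 8)
block = crop_eye(image, (2, 1, 6, 5))        # 4 x 4 eye block
left_like = mirror_to_left(block)            # treat the block as a right eye
sample = resize_nearest(left_like, 8, 8)     # zoom to an assumed 8 x 8 network input
```

A production pipeline would more likely use an interpolating resize from an image library; nearest-neighbour is used here only to keep the sketch dependency-free beyond NumPy.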

選択可能な一例において、本開示における目開閉検出用ニューラルネットワークをトレーニングするための目画像は、通常、ラベリング情報を有し、かつこのラベリング情報は目画像における目開閉状態を表すことができる。つまり、ラベリング情報は目画像における目が開眼状態にあるか、また閉眼状態にあるかを表すことができる。１つの選択可能な例において、目画像のラベリング情報が１であることは、この目画像における目が開眼状態にあることを表し、目画像のラベリング情報が０であることは、この目画像における目が閉眼状態にあることを表す。 In one optional example, the eye images for training the eye open/closed detection neural network in this disclosure typically have labeling information, and the labeling information can represent the eye open/closed state in the eye image. That is, the labeling information can indicate whether the eyes in the eye image are open or closed. In one possible example, an eye image labeling information of 1 indicates that the eye in this eye image is in an open-eye state, and an eye image labeling information of 0 indicates that Indicates that the eyes are closed.

In one optional example, the present disclosure typically obtains a corresponding number of eye images from the eye image set of each of the different training tasks. For example, in FIG. 2, a corresponding number of eye images is obtained from the image set of eye open/closed detection training task a and provided to the eye open/closed detection neural network to be trained; a corresponding number of eye images is obtained from the image set of training task b and provided to that network; and a corresponding number of eye images is obtained from the image set of training task c and provided to that network.

In one optional example, the present disclosure can obtain a corresponding number of eye images from the eye image set of each training task according to a ratio of image counts preset for the different training tasks. The preset batch size is usually also taken into account when obtaining the eye images. For example, if the preset image-count ratio for eye open/closed detection training tasks a, b, and c is 1:1:1 and the preset batch size is 600, the present disclosure can obtain 200 eye images from the eye image set of task a, 200 eye images from the eye image set of task b, and 200 eye images from the eye image set of task c.

Optionally, if the number of eye images in the eye image set of some eye open/closed detection training task falls short of its corresponding number (for example, fewer than 200), additional eye images can be obtained from the eye image sets of the other training tasks so that the batch size is still reached. For example, if the eye image set of task c contains only 100 eye images while the eye image sets of tasks a and b each contain more than 250, then 250 eye images can be obtained from the set of task a, 250 from the set of task b, and 100 from the set of task c, for a total of 600 eye images. This increases the flexibility of obtaining eye images.
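The ratio-based allocation with shortfall redistribution can be sketched as below. The function name `allocate_batch` and the even redistribution policy are assumptions made for illustration; the disclosure does not fix how the shortfall is spread over the other tasks.

```python
def allocate_batch(set_sizes, ratios, batch_size):
    """Return how many eye images to draw from each task's image set.

    set_sizes: number of available images per task
    ratios:    preset image-count ratio per task (e.g. [1, 1, 1])
    """
    total_ratio = sum(ratios)
    # First pass: each task gets its ratio share, capped by what its set holds.
    counts = [min(size, batch_size * r // total_ratio)
              for size, r in zip(set_sizes, ratios)]
    remaining = batch_size - sum(counts)
    # Second pass: spread any shortfall evenly over tasks that still have images.
    while remaining > 0:
        open_tasks = [i for i, size in enumerate(set_sizes) if counts[i] < size]
        if not open_tasks:
            break  # every set is exhausted; the batch stays short
        share = max(1, remaining // len(open_tasks))
        for i in open_tasks:
            extra = min(share, set_sizes[i] - counts[i], remaining)
            counts[i] += extra
            remaining -= extra
            if remaining == 0:
                break
    return counts

balanced = allocate_batch([1000, 1000, 1000], [1, 1, 1], 600)
short_c = allocate_batch([300, 300, 100], [1, 1, 1], 600)
```

With the numbers from the text, `balanced` reproduces the 200/200/200 split and `short_c` reproduces the 250/250/100 split.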

In addition, the present disclosure may set the numbers randomly when obtaining a corresponding number of eye images from the eye image set of each training task. The present disclosure does not limit the specific way in which the corresponding number of eye images is obtained from each set. Furthermore, when obtaining eye images from an eye image set, eye images whose labeling information indicates an unknown open/closed state should be avoided, which helps improve the detection accuracy of the eye open/closed detection neural network.

In one optional example, the present disclosure can provide the obtained eye images in sequence to the eye open/closed detection neural network to be trained, and that network performs eye open/closed state detection processing on each input eye image. The network thus outputs the eye open/closed state detection result for each eye image in turn. For example, an eye image input to the network passes through convolutional layer processing, fully connected layer processing, and classification layer processing in that order, after which the network outputs two probability values. Both probability values lie in the range 0 to 1, and their sum is 1. One probability value corresponds to the open-eye state: the closer it is to 1, the closer the eye in the eye image is to the open-eye state. The other probability value corresponds to the closed-eye state: the closer it is to 1, the closer the eye in the eye image is to the closed-eye state.
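The property that the two outputs lie in 0 to 1 and sum to 1 is what a softmax classification layer produces. The following minimal sketch assumes such a layer and assumes the class order [closed, open], chosen to match the labeling convention 0 = closed, 1 = open; the logit values are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Hypothetical scores from the last fully connected layer for two eye images.
logits = np.array([[2.0, -1.0],    # image 0: score for closed is higher
                   [-0.5, 1.5]])   # image 1: score for open is higher
probs = softmax(logits)            # each row is [P(closed), P(open)]
predicted = probs.argmax(axis=1)   # 0 = closed, 1 = open
```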

Ｓ１１０、目画像の目開閉のラベリング情報及び上記ニューラルネットワークから出力された目開閉状態の検出結果に基づき、上記少なくとも２つの目開閉検出トレーニングタスクのそれぞれに対応する損失をそれぞれ決定し、少なくとも２つの目開閉検出トレーニングタスクのそれぞれに対応する損失に基づいてニューラルネットワークのネットワークパラメータを調整する。 S110, based on the eye open/closed labeling information of the eye image and the eye open/closed state detection result output from the neural network, respectively determine a loss corresponding to each of the at least two eye open/closed detection training tasks; Adjust the network parameters of the neural network based on the losses corresponding to each of the eye open/close detection training tasks.

選択可能な一例において、本開示は各目開閉検出トレーニングタスクのそれぞれに対応する損失を決定し、全てのトレーニングタスクのそれぞれに対応する損失に基づいて総合損失を決定し、この総合損失を利用してニューラルネットワークのネットワークパラメータを調整すべきである。本開示におけるネットワークパラメータは畳み込みカーネルパラメータおよび／または行列の重みなどを含んでもよいが、これらに限定されない。本開示はネットワークパラメータに含まれる具体的な内容を限定しない。 In one optional example, the present disclosure determines a loss corresponding to each eye open/close detection training task, determines a combined loss based on the loss corresponding to each of all training tasks, and utilizes this combined loss. should be used to adjust the network parameters of the neural network. Network parameters in this disclosure may include, but are not limited to, convolution kernel parameters and/or matrix weights. This disclosure does not limit the specific content included in network parameters.

In one optional example, for any eye open/closed detection training task, the present disclosure can determine the loss corresponding to that training task based on the included angle between the maximum probability value in the eye open/closed state detection result output by the neural network for each of the multiple eye images in the image set of that task, and the boundary surface corresponding to the labeling information of the corresponding eye image in that image set. Optionally, based on the eye open/closed labeling information of the eye images and the eye open/closed state detection results output by the neural network, the present disclosure can use an A-softmax (angular normalized exponential) loss function to determine the loss corresponding to each of the different eye open/closed detection training tasks, determine a combined loss (for example, the sum of the losses) from the per-task losses, and adjust the network parameters of the neural network by stochastic gradient descent. For example, the present disclosure can use the A-softmax loss function to calculate the loss corresponding to each eye open/closed detection training task, perform backpropagation based on the sum of the losses of all eye open/closed detection training tasks, and update the network parameters of the eye open/closed detection neural network to be trained by gradient descent on the loss.

As can be seen from the above, in the process of training the neural network, all eye images provided to the neural network in each training iteration can form one eye image subset, and this subset contains eye images corresponding to each training task. Because the present disclosure calculates a loss for each training task, the neural network can learn eye open/closed detection for each training task during training, and the learning takes the different training tasks into account. The trained neural network can therefore improve the accuracy of eye open/closed detection for eye images in each of the multiple scenes corresponding to the multiple training tasks at the same time. This helps improve the universality and generalization of the technical solution that accurately detects the eye open/closed state in different scenes based on this neural network, and helps better meet practical multi-scene application needs.
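How one mixed batch yields one loss per training task and then a combined loss can be sketched as follows. For brevity the per-task loss here is ordinary cross-entropy, used as a stand-in and not the A-softmax loss named in the disclosure; the function names, task identifiers, and probability values are hypothetical.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-probability of the labelled class (stand-in for A-softmax)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))

def total_loss(probs, labels, task_ids):
    """Compute one loss per training task, then sum them into the combined loss."""
    per_task = {}
    for task in np.unique(task_ids):
        mask = task_ids == task                      # images belonging to this task
        per_task[str(task)] = cross_entropy(probs[mask], labels[mask])
    return sum(per_task.values()), per_task

# One mixed batch: rows are [P(closed), P(open)], labels are 0 = closed, 1 = open.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.3, 0.7]])
labels = np.array([0, 1, 1, 1])
task_ids = np.array(["a", "a", "b", "b"])
combined, per_task = total_loss(probs, labels, task_ids)
```

Summing per-task means, rather than averaging over the whole batch, keeps a task with few images in the batch from being outweighed by a task with many.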

The A-softmax loss function in the present disclosure can be expressed by formula (1) below.

Formula (1) [equation image not reproduced in this text]

In formula (1), L_ang represents the loss corresponding to one training task; N represents the number of eye images of that training task; ||*|| represents the modulus of *; x_i represents the i-th eye image corresponding to that training task; y_i represents the labeling value of the i-th eye image corresponding to that training task; and m is a constant whose minimum value is typically no less than a predetermined value, for example no less than 2+√3. The angle term of formula (1) represents, for the i-th eye image, the included angle between the maximum probability value in the eye open/closed state detection result output by the neural network and the boundary surface corresponding to the labeling value. The corresponding product term represents the product of m and that included angle.
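For orientation, the A-softmax loss as commonly published in the face-recognition literature is written out below using the symbols defined above. That formula (1) of the present disclosure matches this form term for term is an assumption, since the equation image is not available in this text; the angle symbols $\theta_{y_i,i}$ and $\theta_{j,i}$ are likewise taken from the published form, not from this document.

```latex
% Commonly published A-softmax form (assumed to correspond to formula (1)).
% \theta_{y_i,i}: angle associated with the labelled class y_i for sample i
% \theta_{j,i}:   angle associated with another class j for sample i
% Valid as written for \theta_{y_i,i} \in [0, \pi/m].
L_{ang} = \frac{1}{N} \sum_{i=1}^{N}
  -\log\left(
    \frac{e^{\|x_i\| \cos\left(m\,\theta_{y_i,i}\right)}}
         {e^{\|x_i\| \cos\left(m\,\theta_{y_i,i}\right)}
          + \sum_{j \neq y_i} e^{\|x_i\| \cos\left(\theta_{j,i}\right)}}
  \right)
```

In this form, multiplying the labelled-class angle by m makes the target class harder to satisfy, which pushes the open-eye and closed-eye classes further apart in angle.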

In one optional example, the training process ends when the training of the eye open/closed detection neural network to be trained reaches a predetermined iteration condition. The predetermined iteration condition may include: the difference between the eye open/closed state detection results output by the network for the eye images and the labeling information of those eye images satisfies a predetermined difference requirement. When the difference satisfies that requirement, the current training of the neural network is completed successfully. The predetermined iteration condition may also include: the number of eye images used to train the network reaches a predetermined number requirement. If the number of eye images used reaches the predetermined number requirement but the difference does not satisfy the predetermined difference requirement, the current training of the neural network is not successful. A successfully trained neural network can be used for eye open/closed state detection processing.
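The two stopping conditions and the resulting success or failure can be sketched as a small decision function. The name `training_outcome`, the scalar "difference" per iteration, and the threshold values are hypothetical; the disclosure does not specify how the difference is measured.

```python
def training_outcome(differences, images_per_step, max_difference, max_images):
    """Walk through per-iteration differences and report how training ends.

    Returns (status, step): "success" when the difference requirement is met,
    "failed" when the image budget is used up first, "running" otherwise.
    """
    used = 0
    for step, diff in enumerate(differences, start=1):
        used += images_per_step
        if diff <= max_difference:
            return "success", step      # difference requirement satisfied
        if used >= max_images:
            return "failed", step       # image-count requirement reached first
    return "running", len(differences)

ok = training_outcome([0.5, 0.3, 0.05], images_per_step=600,
                      max_difference=0.1, max_images=6000)
bad = training_outcome([0.5, 0.4], images_per_step=600,
                       max_difference=0.1, max_images=1200)
```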

The present disclosure forms a combined loss from the losses of the different training tasks and uses the combined loss to adjust the network parameters of the eye open/closed detection neural network, so that during training the neural network can learn eye open/closed detection for each training task, and the learning takes the different training tasks into account. The trained neural network can therefore improve the accuracy of eye open/closed detection for eye images in each of the multiple scenes corresponding to the multiple training tasks at the same time. This helps improve the universality and generalization of the technical solution that accurately detects the eye open/closed state in different scenes based on this neural network, and helps better meet practical multi-scene application needs.

図３は本開示の目開閉状態の検出方法の一実施形態のフローチャートを示す。 FIG. 3 shows a flow chart of one embodiment of the eye open/closed state detection method of the present disclosure.

図３に示すように、この実施例の方法はステップ：Ｓ３００及びＳ３１０を含む。以下に、図３における各ステップをそれぞれ詳しく説明する。 As shown in FIG. 3, the method of this embodiment includes steps: S300 and S310. Each step in FIG. 3 will be described in detail below.

Ｓ３００、被処理画像を取得する。 S300, an image to be processed is obtained.

選択可能な一例において、本開示の被処理画像は、静止的な画像または写真など画像であってもよく、または動的ビデオのビデオフレーム、例えば、移動物体上に設定された撮影装置によって撮影されたビデオのビデオフレームであってもよく、別の例では、固定位置に設定された撮影装置によって撮影されたビデオのビデオフレームであってもよい。上記移動物体は、車両、ロボット、またはロボットアームであってもよい。上記固定位置はデスクまたは壁であってもよい。本開示は、移動物体および固定位置の具体化される形式を限定しない。 In one selectable example, the processed images of the present disclosure may be images such as still images or photographs, or video frames of dynamic video, e.g., captured by an imager set on a moving object. In another example, it may be a video frame of a video captured by a capture device set in a fixed position. The moving object may be a vehicle, robot, or robotic arm. The fixed position may be a desk or a wall. This disclosure does not limit the embodied forms of moving objects and fixed locations.

選択可能な一例において、本開示は被処理画像を取得した後、被処理画像における目の位置領域を検出することができる。例えば、顔検出または顔のキーポイント検出方法などにより、被処理画像の目のバウンディングボックスを決定することができる。その後、本開示は目のバウンディングボックスに基づいて目の領域の画像を被処理画から切り取り、切り取った目画像ブロックがニューラルネットワークに提供される。当然、切り取った目画像ブロックは一定の前処理をされた後にニューラルネットワークに提供され得る。例えば、切り取った目画像ブロックに対してズーム処理を行い、ズーム処理された目画像ブロックの大きさをニューラルネットワークに入力された画像の寸法要求を満足させる。別の例では、対象者の両眼の目画像ブロックを切り取った後、所定側の目画像ブロックに対してマッピング処理を行い、対象者の２つの同一側の目画像ブロックを形成させる。所望により、２つの同一側の目画像ブロックに対してもズーム処理を行なうことができる。本開示は被処理画像から目画像ブロックを切り取るための具体的な実現方法を限定せず、切り取った目画像ブロックに対して前処理を行なうための具体的な実現方法も限定しない。 In one optional example, the present disclosure can detect eye location regions in the processed image after acquiring the processed image. For example, a bounding box for the eyes of the processed image can be determined, such as by face detection or face keypoint detection methods. The present disclosure then crops the image of the eye region from the processed image based on the eye bounding box, and the cropped eye image block is provided to the neural network. Of course, the clipped eye image blocks can be provided to the neural network after undergoing certain preprocessing. For example, zooming is performed on the cropped eye image block, and the size of the zoomed eye image block satisfies the size requirements of the input image to the neural network. In another example, after clipping the eye image blocks for both eyes of the subject, the mapping process is performed on the eye image blocks on a given side to form two eye image blocks on the same side of the subject. If desired, zooming can also be performed on two same-side eye image blocks. This disclosure does not limit a specific implementation method for cropping an eye image block from a processed image, nor does it limit a specific implementation method for performing pre-processing on the cropped eye image block.
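One way to obtain an eye bounding box from face keypoints is sketched below. It assumes the keypoint detector returns (x, y) pixel coordinates around one eye; the function name `eye_box_from_keypoints` and the 25% margin are hypothetical choices, not values from the disclosure.

```python
import numpy as np

def eye_box_from_keypoints(points, image_shape, margin=0.25):
    """Return (left, top, right, bottom) enclosing the eye keypoints plus a margin.

    points:      iterable of (x, y) eye keypoints in pixel coordinates
    image_shape: (height, width, ...) of the image to be processed
    """
    pts = np.asarray(points, dtype=float)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    pad_x = (x_max - x_min) * margin
    pad_y = (y_max - y_min) * margin
    height, width = image_shape[:2]
    # Clamp the padded box to the image so the crop never leaves the frame.
    left = max(0, int(np.floor(x_min - pad_x)))
    top = max(0, int(np.floor(y_min - pad_y)))
    right = min(width, int(np.ceil(x_max + pad_x)))
    bottom = min(height, int(np.ceil(y_max + pad_y)))
    return left, top, right, bottom

frame = np.zeros((100, 100), dtype=np.uint8)          # stand-in for a video frame
box = eye_box_from_keypoints([(10, 20), (30, 20), (20, 16), (20, 24)], frame.shape)
left, top, right, bottom = box
eye_block = frame[top:bottom, left:right]             # block handed to preprocessing
```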

Ｓ３１０、ニューラルネットワークを介して、上記被処理画像に対して、目開閉状態の検出処理を行い、目開閉状態の検出結果を出力する。本開示におけるニューラルネットワークは本開示におけるニューラルネットワークのトレーニング方法の実施形態を利用して成功にトレーニングして得たものである。 In step S310, an eye open/closed state detection process is performed on the image to be processed via a neural network, and the detection result of the eye open/closed state is output. The neural network in the present disclosure has been successfully trained using embodiments of the neural network training method in the present disclosure.

選択可能な一例において、入力された目画像ブロックに対して本開示におけるニューラルネットワークから出力された目開閉状態の検出結果は少なくとも１つの確率値、例えば、目が開眼状態にあることを示す確率値及び目が閉眼状態にあることを示す確率値であってもよい。この２つの確率値の範囲はともに０～１であり、同一の目画像ブロックに対する２つの確率値の和は１である。目が開眼状態にあることを示す確率値の大きさが１に近いほど、目画像ブロックにおける目が開眼状態に近いことを表す。目が閉眼状態にあることを示す確率値の大きさが１に近いほど、目画像ブロックにおける目が閉眼状態に近いことを表す。 In one selectable example, the eye open/closed state detection result output from the neural network in the present disclosure for the input eye image block is at least one probability value, such as a probability value indicating that the eyes are in the open state. and a probability value indicating that the eyes are closed. The two probability values both range from 0 to 1, and the sum of the two probability values for the same eye image block is one. The closer to 1 the probability value indicating that the eye is in the open state, the closer the eye in the eye image block is to the open state. The closer the magnitude of the probability value indicating that the eye is in the closed state is to 1, the closer the eye in the eye image block is to the closed eye state.

In one optional example, the present disclosure can further evaluate the time-series eye open/closed state detection results output by the neural network to determine the eye action of the subject across multiple time-series images to be processed, for example blinking rapidly, opening one eye while closing the other, or squinting.
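As one illustration of evaluating a time series of detection results, the sketch below counts blinks from per-frame open-eye probabilities. The rule used (a short closed run followed by reopening counts as a blink) and the parameters `threshold` and `max_blink_frames` are assumptions for this example.

```python
def count_blinks(open_probs, threshold=0.5, max_blink_frames=5):
    """Count short closed-eye runs that end with the eye reopening."""
    blinks = 0
    closed_run = 0
    for p in open_probs:
        if p < threshold:
            closed_run += 1              # eye judged closed in this frame
        else:
            if 0 < closed_run <= max_blink_frames:
                blinks += 1              # short closure followed by reopening
            closed_run = 0
    return blinks

two_blinks = count_blinks([0.9, 0.2, 0.1, 0.9, 0.9, 0.1, 0.8])
long_closure = count_blinks([0.9] + [0.1] * 10 + [0.9])
```

A closure longer than `max_blink_frames` is deliberately not counted as a blink, since it is more likely a prolonged eye closure than a blink.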

In one optional example, based on the time-series eye open/closed state detection results output by the neural network together with the states of other facial organs of the subject, the present disclosure can determine the facial expression of the subject across multiple time-series images to be processed, for example smiling, laughing, crying, or sadness.

In one optional example, the present disclosure can further evaluate the time-series eye open/closed state detection results output by the neural network to determine the fatigue state of the subject across multiple time-series images to be processed, for example mild fatigue, dozing, or deep sleep.

In one optional example, the present disclosure can further evaluate the time-series eye open/closed state detection results output by the neural network to determine the eye action of the subject across multiple time-series images to be processed, and can then determine, based at least on that eye action, the interaction control information expressed by the subject in those images.

選択可能な一例において、本開示によって決定される目の動作、表情、疲労状態及び対話制御情報は様々な用途として利用することができる。例えば、対象者の所定の目の動作および／または表情を使用して、ライブ／中継中の所定の特殊効果をトリガーするか、または対応する人間とコンピュータの相互作用などを実現して、用途の実現方法を多様にすることに有利である。別の例では、インテリジェント運転技術において、運転手の疲労状態をリアルタイムに検出することにより、疲労運転の現象の防止に有利である。本開示はニューラルネットワークから出力された目開閉状態の検出結果の具体的な応用を限定しない。 In one alternative example, eye movements, facial expressions, fatigue status, and interaction control information determined by this disclosure can be used for a variety of purposes. For example, predetermined eye movements and/or facial expressions of a subject may be used to trigger predetermined special effects during live/broadcast, or to achieve corresponding human-computer interactions, etc. It is advantageous to diversify the implementation method. For another example, in intelligent driving technology, real-time detection of the driver's fatigue state is advantageous in preventing the phenomenon of fatigue driving. The present disclosure does not limit the specific application of the eye open/closed state detection result output from the neural network.

図４は本開示のインテリジェント運転制御方法の一実施形態のフローチャートを示す。本開示のインテリジェント運転制御方法は自動運転環境に適用することができ、巡航運転環境にも適用することができる。本開示はインテリジェント運転制御方法の適用環境を限定しない。 FIG. 4 shows a flowchart of one embodiment of the intelligent driving control method of the present disclosure. The intelligent driving control method of the present disclosure can be applied in the autonomous driving environment and can also be applied in the cruise driving environment. This disclosure does not limit the application environment of the intelligent driving control method.

図４に示すように、この実施例の方法はステップ：Ｓ４００、Ｓ４１０、Ｓ４２０及びＳ４３０を含む。以下に図４における各ステップを詳しく説明する。 As shown in FIG. 4, the method of this embodiment includes steps: S400, S410, S420 and S430. Each step in FIG. 4 will be described in detail below.

Ｓ４００、車両に搭載される撮影装置により収集された被処理画像を取得する。本ステップの具体的な実現方法は上記方法の実施形態における図３のＳ３００に関する説明を参照されたく、ここでその詳細を省略する。 S400, obtaining an image to be processed, which is captured by an imaging device mounted on a vehicle. For the specific implementation method of this step, please refer to the description of S300 in FIG. 3 in the above method embodiment, and the details thereof are omitted here.

Ｓ４１０、ニューラルネットワークを介して、上記被処理画像に対して、目開閉状態の検出処理を行い、目開閉状態の検出結果を出力する。本実施例のニューラルネットワークは上記ニューラルネットワークのトレーニング方法の実施形態を利用して成功にトレーニングして得たものである。本ステップの具体的な実現方法は上記方法の実施形態における図３のＳ３１０に関する説明を参照されたく、ここでその詳細を省略する。 In step S410, an eye open/closed state detection process is performed on the image to be processed via a neural network, and the detection result of the eye open/closed state is output. The neural network of this example was successfully trained using the above embodiment of the neural network training method. For the specific implementation method of this step, please refer to the description of S310 in FIG. 3 in the embodiment of the above method, and the details thereof are omitted here.

Ｓ４２０、少なくとも時系列の複数の被処理画像における同一の対象者の目開閉状態の検出結果に基づいて対象者の疲労状態を決定する。 S420, the subject's fatigue state is determined at least based on the detection result of the same subject's eye open/closed state in a plurality of time-series processed images.

In one optional example, the subject in the present disclosure is typically the driver of a vehicle. Based on multiple time-series eye open/closed state detection results belonging to the same subject, the present disclosure can determine indicator parameters such as the number of times the subject (for example, the driver) blinks per unit time, the duration of each eye closure, or the duration of each eye opening, and can then evaluate those indicator parameters against predetermined indicator requirements to determine whether the subject (for example, the driver) is in a fatigue state. The fatigue state in the present disclosure may include fatigue states of various degrees, for example a mild fatigue state, a moderate fatigue state, or a severe fatigue state. The present disclosure does not limit the specific way in which the fatigue state of the subject is determined.
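The indicator parameters mentioned above can be derived from per-frame closed/open flags as in the following sketch. The thresholds of 0.5 s, 1.0 s, and 2.0 s and the rule of grading by the longest closure are hypothetical; the disclosure leaves the indicator requirements open.

```python
def closed_eye_runs(closed_flags, fps):
    """Durations in seconds of each run of consecutive closed-eye frames."""
    runs, run = [], 0
    for closed in closed_flags:
        if closed:
            run += 1
        elif run:
            runs.append(run / fps)
            run = 0
    if run:
        runs.append(run / fps)       # sequence ended while the eye was still closed
    return runs

def fatigue_level(closed_flags, fps, mild_s=0.5, moderate_s=1.0, severe_s=2.0):
    """Grade fatigue by the longest single eye closure (assumed indicator)."""
    longest = max(closed_eye_runs(closed_flags, fps), default=0.0)
    if longest >= severe_s:
        return "severe"
    if longest >= moderate_s:
        return "moderate"
    if longest >= mild_s:
        return "mild"
    return "none"

# 10 frames open, 12 frames closed (1.2 s at 10 fps), 10 frames open.
flags = [False] * 10 + [True] * 12 + [False] * 10
runs = closed_eye_runs(flags, fps=10)
level = fatigue_level(flags, fps=10)
```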

S430: Generate and output a command according to the fatigue state of the subject.

In an optional example, the command generated according to the fatigue state of the subject may include at least one of: a command to switch to an intelligent driving state, a voice warning command for fatigued driving, a vibration wake-up command, and a command to report dangerous driving information. The present disclosure does not limit the specific form of the command.
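One possible way to organize such commands is a lookup from fatigue level to a list of commands. The sketch below is purely illustrative: the disclosure does not specify which command corresponds to which fatigue level, so the mapping, the `Command` enum, and the `commands_for` function are all assumptions.

```python
from enum import Enum


class Command(Enum):
    SWITCH_TO_INTELLIGENT_DRIVING = "switch_to_intelligent_driving"
    VOICE_WARNING = "voice_warning"
    VIBRATION_WAKE_UP = "vibration_wake_up"
    REPORT_DANGEROUS_DRIVING = "report_dangerous_driving"


# Hypothetical mapping; the disclosure leaves the choice of commands open.
COMMANDS_BY_LEVEL = {
    "none": [],
    "mild": [Command.VOICE_WARNING],
    "moderate": [Command.VOICE_WARNING, Command.VIBRATION_WAKE_UP],
    "severe": [
        Command.VIBRATION_WAKE_UP,
        Command.SWITCH_TO_INTELLIGENT_DRIVING,
        Command.REPORT_DANGEROUS_DRIVING,
    ],
}


def commands_for(level):
    """Return the commands to output for a fatigue level (empty if unknown)."""
    return list(COMMANDS_BY_LEVEL.get(level, []))
```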

A neural network trained with the neural network training method of the present disclosure helps improve the accuracy of the network's eye open/closed state detection results. Determining the fatigue state from the eye open/closed state detection results output by this neural network therefore helps improve the accuracy of fatigue state detection, and generating commands according to the detected fatigue state helps avoid fatigued driving and thus improves driving safety.

FIG. 5 is a schematic structural diagram of an embodiment of the neural network training apparatus of the present disclosure. The apparatus shown in FIG. 5 includes a neural network 500 for eye open/closed detection that is to be trained, and an adjustment module 510. Optionally, the apparatus may further include an input module 520.

The neural network 500 for eye open/closed detection to be trained is used to perform eye open/closed state detection on each of a plurality of eye images in the image set corresponding to each of at least two eye open/closed detection training tasks, and to output the eye open/closed state detection results. The eye images contained in different image sets are at least partially different.

In an optional example, once trained, the neural network 500 for eye open/closed detection can be used to detect the eye open/closed state in an image to be processed and to output the detection result for that image. For example, for one image to be processed, the neural network 500 outputs two probability values. One represents the probability that the subject's eye in the image is open; the larger this value, the closer the eye is to the open state. The other represents the probability that the subject's eye in the image is closed; the larger this value, the closer the eye is to the closed state. The sum of the two probability values may be 1.

In an optional example, the neural network 500 in the present disclosure may be a convolutional neural network. The neural network 500 may include, but is not limited to, convolutional layers, ReLU layers (also called activation layers), pooling layers, fully connected layers, and a classification layer (e.g., for binary classification). The more layers the neural network 500 contains, the deeper the network. The present disclosure does not limit the specific structure of the neural network 500.

In an optional example, the process of training the neural network 500 involves at least two eye open/closed detection training tasks, each of which belongs to the overall training task of enabling the neural network to detect the eye open/closed state. The training targets corresponding to different eye open/closed detection training tasks are not exactly the same. In other words, the present disclosure can divide the overall training task of the neural network 500 into a plurality of training tasks, where each training task corresponds to one training target and different training tasks correspond to different training targets.

In an optional example, the at least two eye open/closed detection training tasks of the present disclosure may include at least two of the following: an eye open/closed detection task for the case where an attachment is worn over the eye; an eye open/closed detection task for the case where no attachment is worn over the eye; an eye open/closed detection task in an indoor environment; an eye open/closed detection task in an outdoor environment; an eye open/closed detection task for the case where an attachment is worn over the eye and there is a spot on the attachment; and an eye open/closed detection task for the case where an attachment is worn over the eye and there is no spot on the attachment. The attachment may be eyeglasses, a transparent plastic sheet, or the like. The spot may be a spot formed on the attachment by light reflected from the attachment. For details of the tasks listed above, refer to the description of the method embodiment above; the details are not repeated here.

In an optional example, each of the at least two eye open/closed detection training tasks of the present disclosure has a corresponding image set, and each image set typically contains a plurality of eye images. The eye images contained in different image sets are at least partially different. That is, for any one image set, at least some of its eye images do not appear in the other image sets. Optionally, the eye images contained in different image sets may overlap.

Optionally, the image sets corresponding to the six eye open/closed detection training tasks listed above may be, respectively: a set of eye images in which an attachment is worn over the eye; a set of eye images in which no attachment is worn over the eye; a set of eye images collected in an indoor environment; a set of eye images collected in an outdoor environment; a set of eye images in which an attachment is worn over the eye and there is a spot on the attachment; and a set of eye images in which an attachment is worn over the eye and there is no spot on the attachment. For details of the image sets listed above, refer to the description of the method embodiment above; the details are not repeated here.

In an optional example, an eye image in the present disclosure may typically be an eye image block cropped from an image that was captured by an imaging device and contains an eye. For the process of forming eye images in the present disclosure, refer to the description of the method embodiment above; the details are not repeated here.

In an optional example, the eye images used to train the neural network 500 for eye open/closed detection typically carry labeling information, and this labeling information indicates the eye open/closed state in the eye image. Optionally, the labeling information may also indicate that the eye in the eye image is in an unknown open/closed state. However, the eye images used to train the neural network 500 typically do not include eye images labeled as being in an unknown open/closed state. This helps avoid the influence of such images on the neural network 500 and helps improve its detection accuracy.

The input module 520 is used to acquire a corresponding number of eye images from the different image sets and provide them to the neural network 500 for eye open/closed detection to be trained. For example, for the different eye open/closed detection training tasks, the input module 520 acquires a corresponding number of eye images from each image set according to an image-count ratio preset for those tasks, and provides them to the neural network 500. In acquiring eye images, the input module 520 typically also takes a preset batch size into account. For example, if the preset image-count ratio for eye open/closed detection training tasks a, b, and c is 1:1:1 and the preset batch size is 600, the input module 520 can acquire 200 eye images from the eye image set corresponding to task a, 200 eye images from the eye image set corresponding to task b, and 200 eye images from the eye image set corresponding to task c.

Optionally, if the number of eye images in the eye image set corresponding to a given eye open/closed detection training task falls short of its quota (e.g., fewer than 200), the input module 520 can acquire additional eye images from the eye image sets corresponding to the other training tasks so that the batch size is still reached. For example, if the eye image set corresponding to task c contains only 100 eye images, and the eye image sets corresponding to tasks a and b each contain more than 250 eye images, the input module 520 can acquire 250 eye images from the set for task a, 250 eye images from the set for task b, and 100 eye images from the set for task c. The input module 520 thereby acquires a total of 600 eye images.
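The ratio-based quota and the shortfall handling described above can be sketched as follows. This is one possible allocation rule under stated assumptions, not the rule of the disclosure: the function name `allocate_batch` is hypothetical, and spreading the shortfall evenly over the tasks that still have spare images is an assumption chosen to reproduce the 250/250/100 example.

```python
def allocate_batch(set_sizes, ratios, batch_size):
    """Return {task: number of eye images to draw} for one batch.

    set_sizes: {task: images available}; ratios: {task: preset ratio weight}.
    """
    total_ratio = sum(ratios.values())
    # Initial quota from the preset ratio, capped by what each set can supply.
    counts = {
        task: min(set_sizes[task], batch_size * ratios[task] // total_ratio)
        for task in ratios
    }
    remaining = batch_size - sum(counts.values())
    # Spread any shortfall over the tasks that still have spare images.
    while remaining > 0:
        spare = [t for t in counts if set_sizes[t] > counts[t]]
        if not spare:
            break  # not enough images overall to fill the batch
        share = max(1, remaining // len(spare))
        for task in spare:
            extra = min(share, set_sizes[task] - counts[task], remaining)
            counts[task] += extra
            remaining -= extra
            if remaining == 0:
                break
    return counts
```

With sets of 300, 300, and 100 images, a 1:1:1 ratio, and a batch size of 600, this yields 250, 250, and 100 images, matching the example in the text.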

In addition, the input module 520 may use random selection to acquire the corresponding number of eye images from the eye image set corresponding to each training task. The present disclosure does not limit the specific implementation by which the input module 520 acquires the corresponding number of eye images from the eye image sets corresponding to the different training tasks. Furthermore, when acquiring eye images from an eye image set, the input module 520 should avoid acquiring eye images whose labeling information indicates an unknown open/closed state, which helps improve the detection accuracy of the neural network for eye open/closed detection.
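A minimal sketch of random selection that skips images labeled as unknown might look like the following. The `(image_id, label)` tuple layout, the label strings, and the function name `sample_labeled` are assumptions made for illustration only.

```python
import random


def sample_labeled(image_set, count, rng=None):
    """Randomly pick up to `count` images whose label is open or closed.

    image_set: list of (image_id, label) tuples, where label is one of
    "open", "closed", or "unknown" (hypothetical label strings).
    """
    rng = rng or random.Random()
    usable = [item for item in image_set if item[1] in ("open", "closed")]
    return rng.sample(usable, min(count, len(usable)))
```

Passing a seeded `random.Random` instance makes the selection reproducible, which can be convenient when debugging a training run.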

In an optional example, the input module 520 provides the acquired eye images in sequence to the neural network 500 for eye open/closed detection to be trained, and the neural network 500 performs eye open/closed state detection on each input eye image, so that it outputs the eye open/closed state detection result for each eye image in turn. For example, an eye image input to the neural network 500 passes in sequence through convolutional layer processing, fully connected layer processing, and classification layer processing, after which the neural network 500 outputs two probability values. Both probability values lie in the range 0 to 1, and their sum is 1. One probability value corresponds to the open state; the closer it is to 1, the closer the eye in the eye image is to the open state. The other probability value corresponds to the closed state; the closer it is to 1, the closer the eye in the eye image is to the closed state.
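Two outputs that each lie between 0 and 1 and sum to 1 are commonly obtained by applying a softmax to two raw scores. The sketch below assumes such a softmax classification layer; the disclosure does not name the function used, and `open_closed_probabilities` and its arguments are hypothetical.

```python
import math


def open_closed_probabilities(score_open, score_closed):
    """Convert two raw scores into (p_open, p_closed) with p_open + p_closed = 1."""
    # Subtract the larger score first for numerical stability.
    m = max(score_open, score_closed)
    e_open = math.exp(score_open - m)
    e_closed = math.exp(score_closed - m)
    total = e_open + e_closed
    return e_open / total, e_closed / total
```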

The adjustment module 510 is used to determine the loss corresponding to each of the at least two eye open/closed detection training tasks based on the eye open/closed labeling information of the eye images and the eye open/closed state detection results output by the neural network 500, and to adjust the network parameters of the neural network 500 based on the losses corresponding to the at least two eye open/closed detection training tasks.

In an optional example, the adjustment module 510 should determine the loss corresponding to each eye open/closed detection training task and determine a total loss based on the losses corresponding to all of the training tasks. The adjustment module 510 uses this total loss to adjust the network parameters of the neural network. The network parameters in the present disclosure may include, but are not limited to, convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content of the network parameters.

In an optional example, for any eye open/closed detection training task, the adjustment module 510 can determine the loss corresponding to that training task based on the included angle between the maximum probability value in the eye open/closed state detection result that the neural network outputs for each of the plurality of eye images in the image set corresponding to that training task, and the boundary surface corresponding to the labeling information of the corresponding eye image in that image set.

Optionally, based on the eye open/closed labeling information of the eye images and the eye open/closed state detection results output by the neural network, the adjustment module 510 uses an A-softmax (angular softmax, i.e., a normalized exponential function with an angular margin) loss function to determine the loss corresponding to each of the different eye open/closed detection training tasks, and determines a total loss (e.g., the sum of the losses) based on the losses corresponding to the different training tasks. The adjustment module 510 can then adjust the network parameters of the neural network using stochastic gradient descent. For example, the adjustment module 510 can compute the loss corresponding to each eye open/closed detection training task using the A-softmax loss function, perform backpropagation based on the sum of the losses corresponding to all of the training tasks, and thereby update the network parameters of the neural network 500 to be trained in the direction that decreases the loss.
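The structure of the total loss (one loss per training task, then their sum) can be illustrated independently of the particular loss function. In the sketch below, plain cross-entropy is used as a stand-in and is not the A-softmax loss named in the text; averaging within each task is likewise an assumption, and the names `cross_entropy` and `total_loss` are hypothetical.

```python
import math
from collections import defaultdict


def cross_entropy(prob_open, label):
    # Stand-in loss: plain cross-entropy, NOT the A-softmax loss of the text.
    p = prob_open if label == "open" else 1.0 - prob_open
    return -math.log(max(p, 1e-12))


def total_loss(samples):
    """samples: iterable of (task, prob_open, label) for one batch.

    Returns (total, per_task), where per_task maps each task to its mean loss
    and total is the sum of the per-task losses.
    """
    per_task = defaultdict(list)
    for task, prob_open, label in samples:
        per_task[task].append(cross_entropy(prob_open, label))
    task_losses = {t: sum(v) / len(v) for t, v in per_task.items()}
    return sum(task_losses.values()), task_losses
```

In an actual training loop, the total would be computed with an automatic-differentiation framework so that backpropagation and a stochastic gradient descent step can update the network parameters.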

In an optional example, when the training of the neural network 500 for eye open/closed detection reaches a predetermined iteration condition, the adjustment module 510 can end the current training process. The predetermined iteration condition in the present disclosure may include that the difference between the eye open/closed state detection results output by the neural network 500 for the eye images and the labeling information of those eye images satisfies a predetermined difference requirement. If the difference satisfies the predetermined difference requirement, the current training of the neural network 500 has been completed successfully.

Optionally, the predetermined iteration condition used by the adjustment module 510 may also include that the number of eye images used to train the neural network for eye open/closed detection has reached a predetermined number requirement. If the number of eye images used has reached the predetermined number requirement but the difference does not satisfy the predetermined difference requirement, the current training of the neural network 500 has not succeeded. A successfully trained neural network 500 can be used for eye open/closed state detection.
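The two iteration conditions above amount to a small decision rule, sketched below. The function name `training_status`, its parameters, and the returned strings are hypothetical, and treating the difference as a single number compared with a threshold is an assumption.

```python
def training_status(difference, max_difference, images_used, max_images):
    """Decide whether training should stop, and with what outcome."""
    if difference <= max_difference:
        return "success"    # difference requirement met: training succeeded
    if images_used >= max_images:
        return "failed"     # image budget exhausted without meeting it
    return "continue"       # keep training
```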

FIG. 6 is a schematic structural diagram of an embodiment of the eye open/closed state detection apparatus of the present disclosure. As shown in FIG. 6, the apparatus of this embodiment includes an acquisition module 600 and a neural network 610. Optionally, the eye open/closed state detection apparatus may further include a determination module 620.

The acquisition module 600 is used to acquire an image to be processed.

In an optional example, the image to be processed acquired by the acquisition module 600 may be a still image such as a picture or photograph, or may be a video frame of a video, for example, a video frame of a video captured by an imaging device mounted on a moving object or, as another example, a video frame of a video captured by an imaging device installed at a fixed position. The moving object may be a vehicle, a robot, or a robotic arm. The fixed position may be a desk or a wall.

In an optional example, after acquiring the image to be processed, the acquisition module 600 can detect the region where the eye is located in the image. For example, the acquisition module 600 can determine a bounding box of the eye in the image to be processed by face detection, face keypoint detection, or a similar method. The acquisition module 600 then crops the eye region from the image to be processed based on the eye bounding box, and the cropped eye image block is provided to the neural network 610. The acquisition module 600 may also perform certain preprocessing on the cropped eye image block before providing it to the neural network 610. For example, the acquisition module 600 scales the cropped eye image block so that the size of the scaled block satisfies the input image size requirement of the neural network. As another example, after cropping eye image blocks for both of the subject's eyes, the acquisition module 600 performs mapping processing on the eye image block of a predetermined side, so that two eye image blocks of the same side are formed for the subject. Optionally, the acquisition module 600 may further scale the two same-side eye image blocks. The present disclosure does not limit the specific implementation by which the acquisition module 600 crops eye image blocks from the image to be processed, nor the specific implementation by which the acquisition module 600 preprocesses the cropped eye image blocks.
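The cropping and preprocessing steps can be sketched on a plain nested-list image as follows. This is an illustration under stated assumptions: the bounding-box convention is hypothetical, nearest-neighbor scaling is only one possible scaling method, and interpreting the "mapping processing" as a horizontal mirror of one eye block is an assumption rather than something the text states.

```python
def crop(image, box):
    """image: list of pixel rows; box: (left, top, right, bottom), exclusive ends."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def mirror(block):
    """Flip a block horizontally (assumed reading of the 'mapping processing')."""
    return [row[::-1] for row in block]


def resize_nearest(block, out_h, out_w):
    """Scale a block to out_h x out_w using nearest-neighbor sampling."""
    in_h, in_w = len(block), len(block[0])
    return [
        [block[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

A typical use would crop both eye blocks, mirror one of them so that both look like the same-side eye, and then scale each block to the input size expected by the network.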

The neural network 610 is used to perform eye open/closed state detection on the image to be processed and to output the eye open/closed state detection result.

In an optional example, the eye open/closed state detection result that the neural network 610 outputs for an input eye image block may be at least one probability value, for example, a probability value indicating that the eye is open and a probability value indicating that the eye is closed. Both probability values lie in the range 0 to 1, and the sum of the two probability values for the same eye image block is 1. The closer the probability value indicating the open state is to 1, the closer the eye in the eye image block is to the open state. The closer the probability value indicating the closed state is to 1, the closer the eye in the eye image block is to the closed state.

The determination module 620 is used to determine the subject's eye movement and/or facial expression and/or fatigue state and/or interaction control information based at least on the eye open/closed state detection results for the same subject in a plurality of time-sequenced images to be processed.

In an optional example, the subject's eye movement is, for example, rapid blinking, opening one eye while closing the other, or squinting. The subject's facial expression is, for example, smiling, laughing, crying, or sadness. The subject's fatigue state is, for example, mild fatigue, dozing, or deep sleep. The interaction control information expressed by the subject is, for example, confirmation or refusal.

FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control apparatus of the present disclosure. The apparatus shown in FIG. 7 mainly includes an acquisition module 600, a neural network 610, a fatigue state determination module 700, and a command module 710.

The acquisition module 600 is used to acquire an image to be processed that is collected by an imaging device mounted on a vehicle.

For the operations specifically performed by the acquisition module 600 and the neural network 610, refer to the description of the apparatus embodiment above; the details are not repeated here.

The fatigue state determination module 700 is used to determine the fatigue state of the subject based at least on the eye open/closed state detection results for the same subject in a plurality of time-sequenced images to be processed.

In an optional example, the subject in the present disclosure is typically a driver. Based on a plurality of time-sequenced eye open/closed state detection results belonging to the same subject, the fatigue state determination module 700 can determine indicator parameters for that subject (e.g., the driver), such as the number of blinks per unit time, the duration of each eye closure, or the duration of each eye opening. The fatigue state determination module 700 then evaluates these indicator parameters against predetermined indicator requirements and can thereby determine whether the subject (e.g., the driver) is in a fatigue state. The fatigue state in the present disclosure may include fatigue states of various degrees, for example, mild fatigue, moderate fatigue, or severe fatigue. The present disclosure does not limit the specific implementation by which the fatigue state determination module 700 determines the fatigue state of the subject.

The command module 710 is used to generate and output a command according to the fatigue state of the subject.

In an optional example, the command generated by the command module 710 according to the fatigue state of the subject may include at least one of: a command to switch to an intelligent driving state, a voice warning command for fatigued driving, a vibration wake-up command, and a command to report dangerous driving information. The present disclosure does not limit the specific form of the command.

A neural network 610 trained with the neural network training method of the present disclosure helps improve the accuracy of the network's eye open/closed state detection results. Because the fatigue state determination module 700 determines the fatigue state from the eye open/closed state detection results output by this neural network 610, the accuracy of fatigue state detection is improved. The command module 710 in turn generates commands according to the detected fatigue state, which helps avoid fatigued driving and thus improves driving safety.

Exemplary Equipment
FIG. 8 is a block diagram of exemplary equipment according to an embodiment of the present disclosure. The device 800 may be a control system/electronic system mounted in a vehicle, a mobile terminal (e.g., a smartphone), a personal computer (PC, e.g., a desktop or notebook computer), a tablet computer, a server, or the like. In FIG. 8, the device 800 includes one or more processors, a communication unit, and the like. The one or more processors may be one or more central processing units (CPUs) 801 and/or one or more acceleration units 813, and the acceleration unit 813 may be a graphics processing unit (GPU) or the like. The processor can perform various appropriate actions and processing based on executable instructions stored in read-only memory (ROM) 802 or executable instructions loaded from the storage unit 808 into random access memory (RAM) 803. The communication unit 812 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (InfiniBand) network card. The processor communicates with the ROM 802 and/or the RAM 803 to execute the executable instructions, is connected to the communication unit 812 via the bus 804, and communicates with other target devices via the communication unit 812, thereby completing the corresponding steps of the present disclosure.

For the operations performed by each of the above instructions, refer to the related description in the method embodiments above; the details are omitted here. The RAM 803 can also store various programs and data necessary for the operation of the device. The CPU 801, the ROM 802, and the RAM 803 are connected to one another via the bus 804.

When the RAM 803 is present, the ROM 802 is an optional module. The RAM 803 stores executable instructions, or executable instructions are written into the ROM 802 at run time, and the executable instructions cause the central processing unit 801 to perform the steps included in the methods described above. An input/output (I/O) interface 805 is also connected to the bus 804. The communication unit 812 may be arranged as an integrated unit, or may be configured with multiple sub-modules (e.g., multiple IB network cards), each connected to the bus.

The following components are connected to the I/O interface 805: an input unit 806 including a keyboard, a mouse, and the like; an output unit 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage unit 808 including a hard disk; and a communication unit 809 including a network interface card such as a LAN card or a modem. The communication unit 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed in the storage unit 808 as needed.

Note that the architecture shown in FIG. 8 is only one optional implementation. In practice, the number and types of the components in FIG. 8 can be selected, removed, added, or replaced according to actual needs. Components with different functions may be arranged separately or in an integrated manner. For example, the acceleration unit 813 and the CPU 801 may be arranged separately. In another example, the acceleration unit 813 may be integrated into the CPU 801. The communication unit may be arranged separately, or may be integrated into the CPU 801 or the acceleration unit 813. All of these alternative embodiments fall within the scope of protection of the present disclosure.

In particular, according to embodiments of the present disclosure, the processes described with reference to the flowcharts can be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program embodied on a machine-readable medium. The computer program includes program code for performing the steps shown in the flowcharts, and the program code may include instructions corresponding to the steps of the methods according to the present disclosure.

In such embodiments, the computer program may be downloaded and installed from a network via the communication unit 809, and/or installed from the removable medium 811. When the computer program is executed by the central processing unit (CPU) 801, the instructions described in the present disclosure for implementing the corresponding steps above are executed.

In one or more optional embodiments, the embodiments of the present disclosure further provide a computer program product for storing computer-readable instructions which, when executed, cause a computer to perform the neural network training method, the eye open/closed state detection method, or the intelligent driving control method described in any of the above embodiments.

The computer program product may be embodied in hardware, software, or a combination thereof. In one optional example, the computer program product is embodied as a computer storage medium. In another optional example, the computer program product is embodied as a software product, such as a Software Development Kit (SDK).

In one or more optional embodiments, the embodiments of the present disclosure further provide another eye open/closed state detection method, intelligent driving control method, and neural network training method, together with the corresponding apparatuses, electronic devices, computer storage media, computer programs, and computer program products. The method includes: a first device sending to a second device a neural network training instruction, an eye open/closed state detection instruction, or an intelligent driving control instruction, the instruction causing the second device to perform the neural network training method, the eye open/closed state detection method, or the intelligent driving control method of any of the possible embodiments above; and the first device receiving the neural network training result, the eye open/closed state detection result, or the intelligent driving control result sent from the second device.

In some embodiments, the neural network training instruction, eye open/closed state detection instruction, or intelligent driving control instruction may specifically be a call instruction. By issuing the call, the first device can cause the second device to perform the neural network training operation, the eye open/closed state detection operation, or the intelligent driving control operation. Correspondingly, in response to receiving the call instruction, the second device can perform the steps and/or flows of any embodiment of the neural network training method, the eye open/closed state detection method, or the intelligent driving control method described above.
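
The call-instruction exchange described above might look like the following sketch, in which a first device sends an instruction and a second device runs the matching method and returns its result. The instruction names, the `SecondDevice` and `FirstDevice` classes, and the placeholder handlers are all hypothetical, and the transport between the devices is reduced to a direct method call for brevity.

```python
class SecondDevice:
    """Hypothetical second device: runs the method named by a call instruction."""

    def __init__(self):
        # Placeholder handlers; real ones would run the methods of the disclosure.
        self._handlers = {
            "train_neural_network": lambda payload: {"status": "trained"},
            "detect_eye_state": lambda payload: {"eye_state": "unknown"},
            "intelligent_driving_control": lambda payload: {"command": "none"},
        }

    def handle(self, instruction, payload=None):
        """Look up the handler for the instruction and return its result."""
        handler = self._handlers.get(instruction)
        if handler is None:
            raise ValueError(f"unsupported instruction: {instruction}")
        return handler(payload)


class FirstDevice:
    """Hypothetical first device: sends an instruction and receives the result."""

    def __init__(self, second_device):
        self.second_device = second_device

    def call(self, instruction, payload=None):
        # In practice this would cross a network; here it is a direct call.
        return self.second_device.handle(instruction, payload)
```

A dictionary of handlers keeps the three instruction types symmetric, which mirrors how the text treats training, detection, and driving control as interchangeable alternatives.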

It should be understood that terms such as "first" and "second" in the embodiments of the present disclosure are used only for distinction and should not be construed as limiting the embodiments of the present disclosure. It should also be understood that, in the present disclosure, "a plurality" can refer to two or more, and "at least one" can refer to one, two, or more. It should further be understood that any component, data, or structure mentioned in the present disclosure is generally to be understood as one or more, unless explicitly limited or the context indicates otherwise. In addition, the description of the various embodiments in the present disclosure emphasizes the differences between them; for parts that are the same or similar, the embodiments may be referred to one another, and for brevity these parts are not repeated.

The methods, apparatuses, electronic devices, and computer-readable storage media of the present disclosure may be implemented in many ways. For example, they can be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The order of the method steps given above is for illustration only, and the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise stated. Furthermore, in some embodiments, the present disclosure may be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Accordingly, the present disclosure also covers a recording medium storing a program for executing the methods according to the present disclosure.

The description of the present disclosure is presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the disclosure to the forms disclosed. Various modifications and variations will be apparent to those skilled in the art. The embodiments were chosen and described in order to better explain the principles and practical applications of the present disclosure, and to enable those skilled in the art to understand the embodiments of the present disclosure and design embodiments with various modifications suited to particular uses.

Claims

1. A method of training a neural network, comprising:
performing, via an eye open/closed detection neural network to be trained, eye open/closed state detection processing on each of a plurality of eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and outputting eye open/closed state detection results; and
determining, based on eye open/closed labeling information of the eye images and the eye open/closed state detection results output from the neural network, a loss corresponding to each of the at least two eye open/closed detection training tasks, and adjusting network parameters of the neural network based on the losses respectively corresponding to the at least two eye open/closed detection training tasks,
wherein the eye images included in different image sets are at least partially different,
the at least two eye open/closed detection training tasks include at least two of: an eye open/closed detection task for when a wearable object is worn on the eye; an eye open/closed detection task for when no wearable object is worn on the eye; an eye open/closed detection task in an indoor environment; an eye open/closed detection task in an outdoor environment; an eye open/closed detection task for when a wearable object is worn on the eye and there is a spot on the wearable object; and an eye open/closed detection task for when a wearable object is worn on the eye and there is no spot on the wearable object,
the image sets respectively corresponding to the at least two eye open/closed detection training tasks include at least two of: an eye image set in which a wearable object is worn on the eye; an eye image set in which no wearable object is worn on the eye; an eye image set collected in an indoor environment; an eye image set collected in an outdoor environment; an eye image set in which a wearable object is worn on the eye and there is a spot on the wearable object; and an eye image set in which a wearable object is worn on the eye and there is no spot on the wearable object, and
adjusting the network parameters of the neural network based on the losses respectively corresponding to the at least two eye open/closed detection training tasks comprises:
determining a combined loss of the at least two eye open/closed detection training tasks based on the losses respectively corresponding to the at least two eye open/closed detection training tasks; and adjusting the network parameters of the neural network based on the combined loss.

2. The method of claim 1, wherein performing, via the eye open/closed detection neural network to be trained, eye open/closed state detection processing on each of the plurality of eye images in the image sets respectively corresponding to the at least two eye open/closed detection training tasks, and outputting the eye open/closed state detection results, comprises:
obtaining, for the different eye open/closed detection training tasks, a corresponding number of eye images from each of the different image sets, according to a proportion of image counts preset for the different eye open/closed detection training tasks; and
performing, via the eye open/closed detection neural network to be trained, eye open/closed state detection processing on each of the corresponding number of eye images, and outputting the eye open/closed state detection result corresponding to each eye image.

3. The method of claim 1 or 2, wherein determining, based on the eye open/closed labeling information of the eye images and the eye open/closed state detection results output from the neural network, the loss corresponding to each of the at least two eye open/closed detection training tasks comprises:
for any eye open/closed detection training task, determining the loss corresponding to the training task based on an included angle between the maximum probability value among the eye open/closed state detection results output from the neural network for each of a plurality of eye images in the image set corresponding to the training task, and a boundary surface corresponding to the labeling information of the corresponding eye image in the image set.

4. A method for detecting an eye open/closed state, comprising:
acquiring an image to be processed; and
performing eye open/closed state detection processing on the image to be processed via a neural network, and outputting an eye open/closed state detection result,
wherein the neural network is trained by the method according to any one of claims 1 to 3.

5. The method of claim 4, further comprising: determining eye movement and/or facial expression and/or fatigue state and/or dialogue control information of a subject based at least on the eye open/closed state detection results of the same subject in a plurality of time-series images to be processed.

6. An intelligent driving control method, comprising:
acquiring an image to be processed collected by an imaging device mounted on a vehicle;
performing eye open/closed state detection processing on the image to be processed via a neural network, and outputting an eye open/closed state detection result;
determining a fatigue state of a subject based at least on the eye open/closed state detection results of the same subject in a plurality of time-series images to be processed; and
generating and outputting a command according to the fatigue state of the subject,
wherein the neural network is trained by the method according to any one of claims 1 to 3.

7. A neural network training apparatus, comprising:
an eye open/closed detection neural network to be trained, used to perform eye open/closed state detection processing on each of a plurality of eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and to output eye open/closed state detection results; and
an adjustment module used to determine, based on eye open/closed labeling information of the eye images and the eye open/closed state detection results output from the neural network, a loss corresponding to each of the at least two eye open/closed detection training tasks, and to adjust network parameters of the neural network based on the losses respectively corresponding to the at least two eye open/closed detection training tasks,
wherein the eye images included in different image sets are at least partially different,
the at least two eye open/closed detection training tasks include at least two of: an eye open/closed detection task for when a wearable object is worn on the eye; an eye open/closed detection task for when no wearable object is worn on the eye; an eye open/closed detection task in an indoor environment; an eye open/closed detection task in an outdoor environment; an eye open/closed detection task for when a wearable object is worn on the eye and there is a spot on the wearable object; and an eye open/closed detection task for when a wearable object is worn on the eye and there is no spot on the wearable object,
the image sets respectively corresponding to the at least two eye open/closed detection training tasks include at least two of: an eye image set in which a wearable object is worn on the eye; an eye image set in which no wearable object is worn on the eye; an eye image set collected in an indoor environment; an eye image set collected in an outdoor environment; an eye image set in which a wearable object is worn on the eye and there is a spot on the wearable object; and an eye image set in which a wearable object is worn on the eye and there is no spot on the wearable object, and
adjusting the network parameters of the neural network based on the losses respectively corresponding to the at least two eye open/closed detection training tasks comprises:
determining a combined loss of the at least two eye open/closed detection training tasks based on the losses respectively corresponding to the at least two eye open/closed detection training tasks; and adjusting the network parameters of the neural network based on the combined loss.

8. An eye open/closed state detection apparatus, comprising:
an acquisition module used to acquire an image to be processed; and
a neural network used to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result,
wherein the neural network is trained by the apparatus according to claim 7.

9. An intelligent driving control apparatus, comprising:
an acquisition module used to acquire an image to be processed collected by an imaging device mounted on a vehicle;
a neural network used to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result;
a fatigue state determination module used to determine a fatigue state of a subject based at least on the eye open/closed state detection results of the same subject in a plurality of time-series images to be processed; and
a command module used to generate and output a command according to the fatigue state of the subject,
wherein the neural network is trained by the apparatus according to claim 7.

10. An electronic device, comprising:
a memory for storing a computer program; and
a processor for executing the computer program stored in the memory, wherein the method according to any one of claims 1 to 6 is implemented when the computer program is executed.

11. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 6.

12. A computer program product comprising computer instructions which, when executed in a processor of a device, implement the method according to any one of claims 1 to 6.