JP2021192224A

JP2021192224A - Method and device, electronic device, computer-readable storage medium, and computer program for detecting pedestrian

Info

Publication number: JP2021192224A
Application number: JP2021054113A
Authority: JP
Inventors: ヂァン・シァンシィン; Shangxin Zhang
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2020-06-10
Filing date: 2021-03-26
Publication date: 2021-12-16
Anticipated expiration: 2041-03-26
Also published as: CN111695491A; KR20210042859A; JP7269979B2; CN111695491B

Abstract

To provide a method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program for detecting a pedestrian, relating to the field of image processing for pedestrian detection. SOLUTION: A specific embodiment of the method comprises the steps of: acquiring a target image; obtaining a first trimmed image by trimming the target image; obtaining a feature map by extracting features of the first trimmed image; obtaining a second trimmed image by trimming the feature map; and obtaining a pedestrian detection result by identifying pedestrians in the second trimmed image. The embodiment reduces the amount of image computation and the use of computing resources, and can be applied more conveniently to mobile terminals with low-end hardware. SELECTED DRAWING: Figure 2

Description

Embodiments of the present invention relate to the field of computer technology, in particular to the field of image processing, and specifically to a method and an apparatus for detecting pedestrians, an electronic device, a computer-readable storage medium, and a computer program.

With the continued improvement in the accuracy of pedestrian detection algorithms, and with the pressing need for low-computation pedestrian detection on the trip computer side, more and more pedestrian detection algorithms are being deployed on mobile terminals. These mobile terminals have low-end hardware and limited computing power. At present, the pedestrian detection algorithms that perform well use deep learning, but deep learning algorithms require substantial computing resources and often cannot run on mobile terminals with low-end hardware.

Embodiments of the present invention propose a method and an apparatus for detecting pedestrians, an electronic device, a computer-readable storage medium, and a computer program.

A first aspect relates to a method for detecting pedestrians, comprising: acquiring a target image; obtaining a first trimmed image by trimming the target image; obtaining a feature map by extracting features of the first trimmed image; obtaining a second trimmed image by trimming the feature map; and obtaining a pedestrian detection result by identifying pedestrians in the second trimmed image.

A second aspect relates to an apparatus for detecting pedestrians, comprising: an image acquisition unit configured to acquire a target image; a first trimming unit configured to obtain a first trimmed image by trimming the target image; a feature extraction unit configured to obtain a feature map by extracting features of the first trimmed image; a second trimming unit configured to obtain a second trimmed image by trimming the feature map; and a pedestrian detection unit configured to obtain a pedestrian detection result by identifying pedestrians in the second trimmed image.

In a third aspect, an embodiment of the present invention relates to an electronic device comprising one or more processors and a storage device storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method described in any embodiment of the first aspect.

In a fourth aspect, an embodiment of the present invention relates to a computer-readable medium storing a computer program which, when executed by a processor, implements the method described in any embodiment of the first aspect.

In a fifth aspect, an embodiment of the present invention relates to a computer program which, when executed by a processor, implements the method described in any embodiment of the first aspect.

The technique of the present invention addresses the problem that existing pedestrian detection algorithms require a large amount of computation. Trimming the target image reduces the amount of image computation and the use of computing resources, so that the technique can be applied more conveniently to mobile terminals with low-end hardware.

It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will be readily understood from the following description.

Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the drawings below.

An exemplary system architecture diagram to which an embodiment of the present invention can be applied.
A flowchart of one embodiment of the method for detecting pedestrians according to the present invention.
A schematic diagram showing an application scenario according to the first embodiment shown in FIG. 2.
A flowchart of one embodiment of the method for detecting pedestrians according to the present invention.
A schematic structural diagram of one embodiment of the apparatus for detecting pedestrians according to the present invention.
A schematic structural diagram of the computer system of an electronic device suitable for implementing embodiments of the present invention.

The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. Note also that, for ease of description, the drawings show only the parts related to the relevant invention.

Note that, provided there is no conflict, the embodiments of the present invention and the features of those embodiments may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.

FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method or the apparatus for detecting pedestrians of the present invention may be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103 and an image acquisition device 104. The image acquisition device 104 and the terminal devices 101, 102, 103 are connected via a network, which may include various connection types, such as wired or wireless communication links or fiber optic cables.

The terminal devices 101, 102, 103 can receive, via the network, images captured by the image acquisition device 104 and process those images. The image acquisition device 104 may be mounted on the terminal devices 101, 102, 103.

Note that the terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices that have a display and support image processing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.

The image acquisition device 104 may be any of various devices for capturing images, such as a camera or a video camera. The image acquisition device 104 can send the captured images to the terminal devices 101, 102, 103 in real time.

Note that the method for detecting pedestrians according to the embodiments of the present invention may be executed by the terminal devices 101, 102, 103 or by the image acquisition device 104. Correspondingly, the apparatus for detecting pedestrians may generally be provided in the terminal devices 101, 102, 103 or in the image acquisition device 104.

It should be understood that the numbers of terminal devices and image acquisition devices in FIG. 1 are merely illustrative. Any number of terminal devices and image acquisition devices may be provided as needed.

Continuing with FIG. 2, a flow 200 of one embodiment of the method for detecting pedestrians according to the present invention is shown. The method of this embodiment includes the following steps.

In step 201, a target image is acquired.

In this embodiment, the execution body of the method for detecting pedestrians (for example, the terminal device 101, 102, 103 or the image acquisition device 104 shown in FIG. 1) may acquire the target image through a wired or wireless connection. Here, the target image may be any image that includes pedestrians.

In step 202, a first trimmed image is obtained by trimming the target image.

After obtaining the target image, the execution body may trim it to obtain a first trimmed image. Specifically, the execution body may trim away a portion of the upper region of the target image to obtain the first trimmed image. Alternatively, the execution body may trim the target image to a preset size. It should be understood that the portion trimmed away need not contain valid pedestrian information; for example, it may contain pedestrians walking on a footbridge. The first trimmed image contains the valid pedestrian information.

In step 203, a feature map is obtained by extracting features of the first trimmed image.

After obtaining the first trimmed image, the execution body may extract its features to obtain a feature map. Specifically, the execution body may apply any of various feature extraction algorithms to the first trimmed image to obtain the feature map. The feature extraction algorithm may include, for example, a convolutional neural network. The feature map may contain valid pedestrian information such as the contour, position, and center of each pedestrian.

In step 204, a second trimmed image is obtained by trimming the feature map.

After obtaining the feature map, the execution body may trim the feature map to obtain a second trimmed image. It should be understood that, after the feature extraction of step 203, the feature map contains part of the pedestrian information, namely the information needed for pedestrian detection. On this basis, the feature map can be trimmed further to reduce the amount of computation even more, yielding the second trimmed image. Here, the execution body may trim away a portion of the upper region of the feature map, or it may trim the feature map to a preset size.

In step 205, a pedestrian detection result is obtained by identifying pedestrians in the second trimmed image.

After obtaining the second trimmed image, the execution body performs pedestrian detection on it to obtain a pedestrian detection result. Specifically, the execution body may process the second trimmed image with any of various pedestrian detection algorithms to obtain the detection result. The pedestrian detection algorithm may include various neural networks.

The method for detecting pedestrians according to the above embodiment of the present invention extracts pedestrian features while trimming the target image multiple times, which reduces the amount of computation while maintaining the accuracy of pedestrian detection.
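The five steps of flow 200 can be sketched as a short pipeline. The sketch below is only an illustration under stated assumptions: the image is a plain list of pixel rows, `extract_features` uses 2x2 average pooling as a stand-in for a convolutional network, `identify_pedestrians` is a simple threshold rather than a real detector, and all function names and the 0.25 ratios are hypothetical choices, not taken from the patent.

```python
def crop_top(rows, ratio):
    """Drop the top `ratio` of the rows of an image or feature map."""
    return rows[int(len(rows) * ratio):]


def extract_features(image):
    """Stand-in for a CNN (assumption): 2x2 average pooling."""
    h = len(image) // 2 * 2
    w = len(image[0]) // 2 * 2
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def identify_pedestrians(feature_map, threshold=0.5):
    """Stand-in detector (assumption): cells whose response exceeds threshold."""
    return [
        (r, c)
        for r, row in enumerate(feature_map)
        for c, value in enumerate(row)
        if value > threshold
    ]


def detect_pedestrians(target_image, image_ratio=0.25, feature_ratio=0.25):
    first_trimmed = crop_top(target_image, image_ratio)     # step 202
    feature_map = extract_features(first_trimmed)           # step 203
    second_trimmed = crop_top(feature_map, feature_ratio)   # step 204
    return identify_pedestrians(second_trimmed)             # step 205


# Step 201: a toy 16x8 "target image" with one bright 2x2 blob near the bottom.
image = [[0.0] * 8 for _ in range(16)]
for y in (12, 13):
    for x in (4, 5):
        image[y][x] = 1.0

detections = detect_pedestrians(image)
print(detections)  # [(3, 2)]
```

The reported coordinates are relative to the second trimmed image, so a real implementation would need to add the trimmed offsets back before mapping a detection onto the original frame.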

Continuing with FIG. 3, a flow 300 of another embodiment of the method for detecting pedestrians according to the present invention is shown. As shown in FIG. 3, the method of this embodiment includes the following steps.

In step 301, an image captured by a drive recorder mounted on a vehicle is used as the target image.

The execution body may acquire an image captured by a drive recorder mounted on the vehicle and use it as the target image. The image may contain information about the vehicle's driving environment. It should be understood that the drive recorder is mounted relatively low, that the captured image may include footbridge information, and that there may be pedestrians on that footbridge. Those pedestrians do not affect the travel of the vehicle, so they need not be detected.

In step 302, a first trimmed image is obtained by trimming away a region of a preset ratio at the top of the target image.

In this embodiment, the execution body may obtain the first trimmed image by trimming away a region of a preset ratio at the top of the target image. The preset ratio may be determined by the content of the image. Specifically, the execution body may identify each object in the target image and trim away objects located in the region beyond a preset distance from the vehicle. For example, the execution body may trim away the region more than 50 meters ahead of the vehicle; this region lies at the top of the target image. In some specific applications, the preset ratio may be 1/4.
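As a minimal sketch of step 302, trimming a preset ratio from the top of an image amounts to dropping the first rows. The function name `crop_top_ratio`, the list-of-rows image representation, and the choice to round the row count down are assumptions made for illustration; only the 1/4 ratio comes from the text.

```python
from fractions import Fraction


def crop_top_ratio(image, ratio=Fraction(1, 4)):
    """Return `image` (a list of rows) without its top `ratio` of rows.

    The 1/4 default follows the example ratio in the description; the
    number of rows to drop is rounded down (an assumption of this sketch).
    """
    if not 0 <= ratio < 1:
        raise ValueError("ratio must be in [0, 1)")
    rows_to_drop = int(len(image) * ratio)
    return image[rows_to_drop:]


# Toy 8-row frame whose pixel values encode their own position.
frame = [[y * 10 + x for x in range(4)] for y in range(8)]
first_trimmed = crop_top_ratio(frame)
print(len(frame), "->", len(first_trimmed))  # 8 -> 6
```

Slicing leaves the input frame untouched, which is convenient if the full frame is still needed elsewhere, for example for display.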

In step 303, features of the first trimmed image are extracted, and trimming is performed during the extraction process, to obtain a feature map.

In this embodiment, after obtaining the first trimmed image, the execution body extracts features of the first trimmed image and performs trimming during the extraction process to obtain the feature map. Specifically, the execution body may apply a feature extraction algorithm to the first trimmed image to obtain an intermediate feature map, and then trim that intermediate feature map. Next, the execution body applies the feature extraction algorithm again to the trimmed intermediate feature map to obtain a new intermediate feature map, and trims it again to obtain the feature map. It should be understood that the execution body may obtain the feature map by extracting features of the first trimmed image multiple times, and may also obtain it by trimming the resulting feature maps multiple times. This ensures that valid pedestrian features are not trimmed away.

In some optional implementations of this embodiment, the execution body may obtain the feature map through the following step, not shown in FIG. 3: performing at least two convolution operations on the first trimmed image and, after at least one of the convolution operations, performing at least one trimming operation on the resulting feature map, thereby obtaining the feature map.

In this embodiment, the execution body may use at least two convolutional layers to extract features from the first trimmed image, performing at least two convolution operations on the first trimmed image with those layers. It should be understood that each convolution operation yields an intermediate feature map. After at least one convolution operation, at least one trimming operation is performed on the resulting intermediate feature map. Specifically, the execution body may perform two convolution operations and then trim the resulting intermediate feature map once, for example by trimming away the top 1/9 of the intermediate feature map. Alternatively, the execution body may perform one convolution operation and then trim away the top 1/18 of the intermediate feature map. It then performs a convolution operation on the trimmed intermediate feature map to obtain a new intermediate feature map, and trims that map again, that is, trims away the top 1/18 of the intermediate feature map once more.
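The second variant above (convolve, trim the top 1/18, convolve again, trim the top 1/18 again) can be sketched as follows. This is an illustration only: the zero-padded 3x3 convolution, the box-blur kernel, the 36x6 input, and the function names are all assumptions, since the patent does not specify the layers.

```python
def conv3x3_same(fmap, kernel):
    """3x3 convolution with zero padding, so height and width are preserved."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * fmap[yy][xx]
            out[y][x] = acc
    return out


def crop_top(fmap, numerator, denominator):
    """Drop the top numerator/denominator of the rows, rounded down."""
    return fmap[len(fmap) * numerator // denominator:]


def extract_with_trimming(first_trimmed, kernels, crop=(1, 18)):
    """Alternate convolution and top trimming, recording the height each time."""
    fmap = first_trimmed
    heights = [len(fmap)]
    for kernel in kernels:
        fmap = conv3x3_same(fmap, kernel)   # yields an intermediate feature map
        fmap = crop_top(fmap, *crop)        # trim its upper region
        heights.append(len(fmap))
    return fmap, heights


box_blur = [[1 / 9] * 3 for _ in range(3)]          # hypothetical kernel
first_trimmed = [[1.0] * 6 for _ in range(36)]      # toy 36x6 input
feature_map, heights = extract_with_trimming(first_trimmed, [box_blur, box_blur])
print(heights)  # [36, 34, 33]
```

Because the second convolution runs on 34 rows instead of 36, every layer after a trim does proportionally less work, which is the saving the description is after.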

Because the center of a pedestrian is generally located in the lower-middle part of the target image, and because it is the pedestrian features that are extracted, the upper region of the resulting intermediate feature map can be trimmed away. This ensures that valid pedestrian features are not trimmed away.

In step 304, a second trimmed image is obtained by trimming the feature map.

After obtaining the feature map, the execution body may trim it. The feature map contains the center position information of the pedestrians, so trimming the feature map again at this point does not affect the pedestrian detection result. To reduce the amount of computation, a portion of the upper region of the feature map can be trimmed away. In some specific applications, that portion may be 1/4 of the feature map.

In some optional implementations of this embodiment, the size of the second trimmed image is at least 1/2 of the size of the target image.

In this implementation, the size of the second trimmed image is kept at 1/2 or more of the size of the target image in order to maintain the accuracy of pedestrian detection while reducing its computational cost.
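The example ratios given in this embodiment can be checked against the 1/2 bound with a little arithmetic. The sketch below assumes, as an interpretation not stated in the text, that "size" means the proportion of the original image height that is kept, that each ratio applies to whatever height remains at that point, and that any downsampling inside feature extraction is ignored. The function name `retained_fraction` is hypothetical.

```python
from fractions import Fraction


def retained_fraction(crop_ratios):
    """Proportion of the original height left after applying each trim in turn."""
    kept = Fraction(1)
    for ratio in crop_ratios:
        kept *= 1 - ratio
    return kept


# Image trim 1/4, one trim of 1/9 during extraction, feature-map trim 1/4.
single = retained_fraction([Fraction(1, 4), Fraction(1, 9), Fraction(1, 4)])

# Same, but with two trims of 1/18 during extraction instead of one of 1/9.
double = retained_fraction(
    [Fraction(1, 4), Fraction(1, 18), Fraction(1, 18), Fraction(1, 4)]
)

print(single, double)  # 1/2 289/576
```

Under these assumptions the first combination lands exactly on the 1/2 bound, and the second keeps slightly more (289/576 is about 0.502), so both example settings satisfy the constraint.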

In step 305, a pedestrian detection result is obtained by identifying pedestrians in the second trimmed image.

Continuing with FIG. 4, which is a schematic diagram showing an application scenario of the method for detecting pedestrians according to this embodiment. In the application scenario of FIG. 4, a drive recorder 401 is mounted on an autonomous vehicle, and the drive recorder 401 can send the captured images to the on-board computer 402 of the autonomous vehicle. The on-board computer 402 can obtain a pedestrian detection result by performing pedestrian detection according to the above embodiment.

The method for detecting pedestrians according to the above embodiment of the present invention trims the images captured by the drive recorder multiple times while extracting, as far as possible, the features that carry valid pedestrian information. It can therefore maintain the accuracy of pedestrian detection while reducing its computational cost.

With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present invention provides an embodiment of an apparatus for detecting pedestrians. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.

As shown in FIG. 5, the apparatus 500 for detecting pedestrians according to this embodiment includes an image acquisition unit 501, a first trimming unit 502, a feature extraction unit 503, a second trimming unit 504, and a pedestrian detection unit 505.

The image acquisition unit 501 is configured to acquire a target image.

The first trimming unit 502 is configured to obtain a first trimmed image by trimming the target image.

The feature extraction unit 503 is configured to obtain a feature map by extracting features of the first trimmed image.

The second trimming unit 504 is configured to obtain a second trimmed image by trimming the feature map.

The pedestrian detection unit 505 is configured to obtain a pedestrian detection result by identifying pedestrians in the second trimmed image.

In some optional implementations of this embodiment, the feature extraction unit 503 is further configured to extract features of the first trimmed image and to perform trimming during the extraction process, thereby obtaining the feature map.

In some optional implementations of this embodiment, the feature extraction unit 503 is further configured to perform at least two convolution operations on the first trimmed image and, after at least one of the convolution operations, to perform at least one trimming operation on the resulting feature map, thereby obtaining the feature map.

In some optional implementations of this embodiment, the image acquisition unit 501 is further configured to use an image captured by a drive recorder mounted on a vehicle as the target image.

In some optional implementations of this embodiment, the first trimming unit 502 is further configured to obtain the first trimmed image by trimming away a region of a preset ratio at the top of the target image.

In some optional implementations of this embodiment, the size of the second trimmed image is at least 1/2 of the size of the target image.

It should be understood that units 501 to 505 of the apparatus 500 for detecting pedestrians correspond to the respective steps of the method described with reference to FIG. 2. The operations and features described above for the method for detecting pedestrians therefore apply equally to the apparatus 500 and the units it contains, and are not repeated here.
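One way to picture the correspondence between units 501 to 505 and the method steps is a small container whose fields are the five units and whose `run` method calls them in order. This is a hypothetical sketch, not the patent's implementation: the class name, field names, and the toy callables in the usage example are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PedestrianDetectionApparatus:
    """Five pluggable units, named after units 501 to 505 of apparatus 500."""

    acquire_image: Callable[[], Any]         # image acquisition unit 501
    first_trim: Callable[[Any], Any]         # first trimming unit 502
    extract_features: Callable[[Any], Any]   # feature extraction unit 503
    second_trim: Callable[[Any], Any]        # second trimming unit 504
    detect: Callable[[Any], Any]             # pedestrian detection unit 505

    def run(self) -> Any:
        target_image = self.acquire_image()
        first_trimmed = self.first_trim(target_image)
        feature_map = self.extract_features(first_trimmed)
        second_trimmed = self.second_trim(feature_map)
        return self.detect(second_trimmed)


# Toy units operating on a flat list, purely to show the data flow.
apparatus = PedestrianDetectionApparatus(
    acquire_image=lambda: list(range(8)),
    first_trim=lambda image: image[2:],
    extract_features=lambda image: [value * 2 for value in image],
    second_trim=lambda fmap: fmap[1:],
    detect=lambda fmap: [value for value in fmap if value > 10],
)
result = apparatus.run()
print(result)  # [12, 14]
```

Keeping each unit as a separate callable mirrors the optional implementations above, where an individual unit is reconfigured (for example, the feature extraction unit trimming during extraction) without changing the others.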

According to embodiments of the present invention, the present invention further provides an electronic device and a readable storage medium.

FIG. 6 is a block diagram of an electronic device for the method for detecting pedestrians according to an embodiment of the present invention. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, mobile phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the invention described and/or claimed herein.

As shown in FIG. 6, the electronic device includes at least one processor 601, a memory 602, and interfaces for connecting the components, including a high-speed interface and a low-speed interface. The components are interconnected using different buses 603 and may be mounted on a common motherboard or in other manners as needed. The processor can process instructions executed within the electronic device, including instructions stored in or on the memory for displaying graphical information of a GUI on an external input/output device (for example, a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses 603 may be used together with multiple memories, as needed. Similarly, multiple electronic devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). FIG. 6 takes one processor 601 as an example.

The memory 602 is a non-transitory computer-readable storage medium according to the present invention. The memory 602 stores instructions executable by at least one processor, so that the at least one processor executes the method for detecting a pedestrian according to the present invention. The non-transitory computer-readable storage medium of the present invention stores computer instructions, and the computer instructions are used to cause a computer to execute the method for detecting a pedestrian according to the present invention.

As a non-transitory computer-readable storage medium, the memory 602 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/units corresponding to the method for detecting a pedestrian in the embodiments of the present invention (for example, the image acquisition unit 501, the first trimming unit 502, the feature extraction unit 503, the second trimming unit 504, and the pedestrian detection unit 505 shown in FIG. 5). By running the non-transitory software programs, instructions, and modules stored in the memory 602, the processor 601 executes the various functional applications and data processing of the server, that is, implements the method for detecting a pedestrian in the method embodiments described above.
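The way units 501 to 505 might chain together as program modules can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the specification: the function names, the trimming ratios, the 2x2 average pooling that stands in for a convolutional feature extractor, and the threshold test that stands in for a real pedestrian detector.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def acquire_image(height: int = 8, width: int = 8) -> Image:
    """Unit 501 (hypothetical): a synthetic gradient stands in for a camera frame."""
    return [[float(r * width + c) for c in range(width)] for r in range(height)]


def trim_top(image: Image, ratio: float) -> Image:
    """Units 502 and 504 (hypothetical): drop the top `ratio` of the rows."""
    skip = int(len(image) * ratio)
    return image[skip:]


def extract_features(image: Image) -> Image:
    """Unit 503 (hypothetical): 2x2 average pooling as a stand-in for a CNN."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            total = (image[r][c] + image[r][c + 1]
                     + image[r + 1][c] + image[r + 1][c + 1])
            row.append(total / 4.0)
        pooled.append(row)
    return pooled


def detect_pedestrians(feature_map: Image, threshold: float) -> List[Tuple[int, int]]:
    """Unit 505 (hypothetical): report feature-map cells above a threshold."""
    return [(r, c)
            for r, row in enumerate(feature_map)
            for c, value in enumerate(row)
            if value > threshold]


def run_pipeline(image: Image,
                 image_ratio: float = 0.25,
                 feature_ratio: float = 0.5,
                 threshold: float = 40.0) -> List[Tuple[int, int]]:
    first_trimmed = trim_top(image, image_ratio)           # trim the input image
    feature_map = extract_features(first_trimmed)          # extract features
    second_trimmed = trim_top(feature_map, feature_ratio)  # trim the feature map
    return detect_pedestrians(second_trimmed, threshold)   # detect on what is left


if __name__ == "__main__":
    print(run_pipeline(acquire_image()))
```

Each trimming step shrinks the data handed to the next stage, which is where the reduction in computation comes from. The ratios here are purely illustrative and are not tuned to any particular camera or network.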

The memory 602 can include a program storage area and a data storage area. The program storage area can store an operating system and an application program required for at least one function, and the data storage area can store data created through use of the electronic device that executes the method for detecting a pedestrian. The memory 602 can include high-speed random access memory and can also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 602 can optionally include memory located remotely from the processor 601, and this remote memory can be connected over a network to the electronic device that executes the method for detecting a pedestrian. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device that executes the method for detecting a pedestrian can further include an input device 604 and an output device 605. The processor 601, the memory 602, the input device 604, and the output device 605 can be connected by a bus 603 or in other ways; FIG. 6 takes connection by the bus 603 as an example.

The input device 604 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device that executes the method for detecting a pedestrian. Examples of such input devices include a touch screen, keypad, mouse, trackpad, touchpad, pointing stick, one or more mouse buttons, trackball, and joystick. The output device 605 can include a display device, an auxiliary lighting device (for example, an LED), a haptic feedback device (for example, a vibration motor), and the like. The display device can include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some embodiments, the display device may be a touch screen.

Various embodiments of the systems and techniques described herein can be implemented in digital electronic circuit systems, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These embodiments can include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and can transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

These computer programs (also called programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented using high-level procedural and/or object-oriented programming languages and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, a magnetic disk, optical disk, memory, or programmable logic device (PLD)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide interaction with a user, the systems and techniques described herein can be implemented on a computer that has a display device for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) and a keyboard and pointing device (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user. For example, the feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or haptic feedback), and input from the user may be received in any form (including acoustic input, speech input, or tactile input).

The systems and techniques described herein can be implemented in a computing system that includes back-end components (for example, a data server), in a computing system that includes middleware components (for example, an application server), in a computing system that includes front-end components (for example, a user computer with a graphical user interface or web browser through which the user can interact with embodiments of the systems and techniques described herein), or in a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

A computer system can include clients and servers. A client and a server are generally remote from each other and usually interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.

It should be understood that steps can be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be executed in parallel, sequentially, or in a different order. No limitation is imposed herein as long as the desired result of the technical solution disclosed in the present invention is achieved.

The specific embodiments above do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention should be included in the scope of protection of the present invention.

Claims

1. A method for detecting a pedestrian, comprising:
acquiring a target image;
obtaining a first trimmed image by trimming the target image;
obtaining a feature map by extracting features of the first trimmed image;
obtaining a second trimmed image by trimming the feature map; and
obtaining a pedestrian detection result by identifying pedestrians in the second trimmed image.

2. The method according to claim 1, wherein obtaining the feature map by extracting features of the first trimmed image comprises obtaining the feature map by extracting features of the first trimmed image and trimming the first trimmed image during the extraction process.

3. The method according to claim 2, wherein obtaining the feature map by extracting features of the first trimmed image and trimming the first trimmed image during the extraction process comprises:
performing at least two convolution operations on the first trimmed image; and
obtaining the feature map by performing at least one trimming process on the resulting feature map after at least one of the convolution operations.

4. The method according to claim 1, wherein acquiring the target image comprises using an image collected by a drive recorder mounted on a vehicle as the target image.

5. The method according to claim 1, wherein obtaining the first trimmed image by trimming the target image comprises obtaining the first trimmed image by trimming a region of a preset ratio in the upper part of the target image.

6. The method according to claim 1, wherein the size of the second trimmed image is ½ or more of the size of the target image.

7. A device for detecting a pedestrian, comprising:
an image acquisition unit configured to acquire a target image;
a first trimming unit configured to obtain a first trimmed image by trimming the target image;
a feature extraction unit configured to obtain a feature map by extracting features of the first trimmed image;
a second trimming unit configured to obtain a second trimmed image by trimming the feature map; and
a pedestrian detection unit configured to obtain a pedestrian detection result by identifying pedestrians in the second trimmed image.

8. The device according to claim 7, wherein the feature extraction unit is further configured to obtain the feature map by extracting features of the first trimmed image and trimming the first trimmed image during the extraction process.

9. The device according to claim 8, wherein the feature extraction unit is further configured to:
perform at least two convolution operations on the first trimmed image; and
obtain the feature map by performing at least one trimming process on the resulting feature map after at least one of the convolution operations.

10. The device according to claim 7, wherein the image acquisition unit is further configured to use an image collected by a drive recorder mounted on a vehicle as the target image.

11. The device according to claim 7, wherein the first trimming unit is further configured to obtain the first trimmed image by trimming a region of a preset ratio in the upper part of the target image.

12. The device according to claim 7, wherein the size of the second trimmed image is ½ or more of the size of the target image.

13. An electronic device, comprising:
one or more processors; and
a storage device in which one or more programs are stored,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 6.

14. A computer-readable medium storing a computer program, wherein the method according to any one of claims 1 to 6 is implemented when the computer program is executed by a processor.

15. A computer program that implements the method according to any one of claims 1 to 6 when executed by a processor.