JP2024512447A

JP2024512447A - Data generation method, device and electronic equipment

Info

Publication number: JP2024512447A
Application number: JP2023556723A
Authority: JP
Inventors: Tao Wu (呉 涛)
Original assignee: Qingdao Pico Technology Co Ltd
Current assignee: Qingdao Pico Technology Co Ltd
Priority date: 2021-04-21
Filing date: 2022-03-25
Publication date: 2024-03-19
Also published as: US20230410386A1; CN113269782B; WO2022222689A1; EP4290452A1; KR20230142769A; CN113269782A

Abstract

The present application discloses a data generation method, an apparatus, and an electronic device. The method includes: acquiring first image data, the first image data representing the real environment in which a user is located; acquiring category information and plane information of a target object, the target object being an object in the first image data and the plane information including information on an outer surface of the target object; acquiring second image data, the second image data containing a virtual object; and mixing the first image data and the second image data based on the category information and the plane information to generate target image data, the target image data containing both the target object and the virtual object. [Selected drawing] FIG. 1

Description

The present application claims priority to Chinese patent application No. 202110431972.6, entitled "Data generation method, apparatus and electronic device" and filed on April 21, 2021, the entire contents of which are incorporated herein by reference.

The present application relates to the field of mixed reality technology, and more particularly to a data generation method, an apparatus, and an electronic device.

At present, mixed reality (MR) technology is widely applied in fields such as scientific visualization, medical training, engineering design, remote office work, and personal entertainment. With this technology, a user can interact with virtual objects in a scene generated by mixing real-environment content with virtual content, and can thereby better appreciate the interest of certain key data in the real environment.

However, the mixed reality data generated by current electronic devices is generally coarse. For example, such devices recognize only large surfaces in the real environment, such as floors, ceilings, and walls, and superimpose virtual objects based on that information alone. The resulting scene lacks fineness, which degrades the user experience.

One objective of the embodiments of the present application is to provide a new technical solution for generating mixed reality data, so as to make electronic devices more enjoyable to use.

According to a first aspect of the present application, a data generation method is provided, the method including:
acquiring first image data, the first image data representing the real environment in which a user is located;
acquiring category information and plane information of a target object, the target object being an object in the first image data, and the plane information including information on an outer surface of the target object;
acquiring second image data, the second image data containing a virtual object; and
mixing the first image data and the second image data based on the category information and the plane information to generate target image data, the target image data containing the target object and the virtual object.

In some embodiments, the step of mixing the first image data and the second image data based on the category information and the plane information to generate the target image data includes: determining, based on the category information, a relative positional relationship between the virtual object in the second image data and the target object in the first image data; and rendering the virtual object at a predetermined position on the target object, based on the plane information and the relative positional relationship, to obtain the target image data.

In some embodiments, the step of acquiring the category information and the plane information of the target object includes: inputting the first image data into a target image segmentation model to obtain mask information of the target object; and acquiring the category information and the plane information based on the mask information.

In some embodiments, the step of acquiring the category information based on the mask information includes: inputting the mask information into a target category recognition model to obtain the category information.

In some embodiments, the step of acquiring the plane information based on the mask information includes: acquiring, based on the mask information, a target image block corresponding to the target object in the first image data; acquiring, based on the target image block, target position information of key points of the target object in a world coordinate system, the key points including corner points of the target object; and acquiring the plane information based on the target position information and a predetermined plane fitting algorithm, the plane information including, for each plane of the target object, center point coordinates and a plane normal vector.

In some embodiments, the method is applied to an electronic device, and the step of acquiring, based on the target image block, the target position information of the key points of the target object in the world coordinate system includes: detecting, based on the target image block, first position information of the key points in the first image data; acquiring pose information of the electronic device at a first time, the first time including the current time, and second position information of the key points in third image data acquired at a second time earlier than the first time; and acquiring the target position information based on the first position information, the pose information, and the second position information.
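As a non-limiting illustration of the step above, the world-coordinate position of a key point can be estimated by intersecting the viewing rays of the same key point in two frames. The sketch below is an assumption, not the procedure of the present application: it supposes a pinhole camera with hypothetical intrinsics (`fx`, `fy`, `cx`, `cy`), supposes that the device pose (camera center and camera-to-world rotation) is available for both the first time and the second time, and uses midpoint triangulation. All function and parameter names are hypothetical.

```python
def pixel_to_ray(pixel, intrinsics, rotation):
    """Back-project a pixel to a viewing-ray direction in world coordinates.

    intrinsics: (fx, fy, cx, cy) of an assumed pinhole camera.
    rotation:   3x3 camera-to-world rotation matrix (nested lists, row-major).
    """
    u, v = pixel
    fx, fy, cx, cy = intrinsics
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    return tuple(sum(rotation[i][j] * d_cam[j] for j in range(3)) for i in range(3))


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(center1, dir1, center2, dir2):
    """Return the midpoint of the shortest segment joining two viewing rays."""
    w = tuple(p - q for p, q in zip(center1, center2))
    a, b, c = dot(dir1, dir1), dot(dir1, dir2), dot(dir2, dir2)
    d, e = dot(dir1, w), dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; the key point cannot be triangulated")
    s = (b * e - c * d) / denom  # parameter of the closest point along ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point along ray 2
    p1 = tuple(o + s * k for o, k in zip(center1, dir1))
    p2 = tuple(o + t * k for o, k in zip(center2, dir2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Hypothetical example: the same corner point observed at the first (current)
# time and at the earlier second time, from two positions 1 m apart along x.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = (100.0, 100.0, 50.0, 50.0)

ray_now = pixel_to_ray((50.0, 50.0), INTRINSICS, IDENTITY)
ray_earlier = pixel_to_ray((30.0, 50.0), INTRINSICS, IDENTITY)
corner_world = triangulate_midpoint((0.0, 0.0, 0.0), ray_now,
                                    (1.0, 0.0, 0.0), ray_earlier)
```

In this example `corner_world` comes out at approximately (0, 0, 5), that is, 5 m in front of the first camera position. A practical system would also have to handle lens distortion and measurement noise, which this sketch ignores.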

In some embodiments, the target image segmentation model and the target category recognition model are obtained by training through the following steps: acquiring sample data, the sample data containing sample objects in predetermined scenes; and jointly training an initial image segmentation model and an initial category recognition model based on the sample data to obtain the target image segmentation model and the target category recognition model.

In some embodiments, after the target image data is obtained, the method further includes displaying the target image data.

According to a second aspect of the present application, a data generation apparatus is provided, the apparatus including:
a first image data acquisition module configured to acquire first image data, the first image data representing the real environment in which a user is located;
an information acquisition module configured to acquire category information and plane information of a target object, the target object being an object in the first image data, and the plane information including information on an outer surface of the target object;
a second image data acquisition module configured to acquire second image data, the second image data containing a virtual object; and
a target image data generation module configured to mix the first image data and the second image data based on the category information and the plane information to generate target image data, the target image data containing the target object and the virtual object.

According to a third aspect of the present application, an electronic device is provided. The electronic device includes the apparatus according to the second aspect of the present application; or
the electronic device includes a memory that stores executable instructions, and a processor that, under control of the instructions, operates the electronic device to perform the method according to the first aspect of the present application.

A beneficial effect of the present application is as follows. According to the embodiments of the present application, the electronic device acquires first image data representing the real environment in which the user is located, acquires plane information and category information of a target object in the first image data, and then acquires second image data containing a virtual object. Based on the plane information and the category information, the first image data and the second image data can be mixed to obtain target image data that contains both the target object and the virtual object. Because the method of this embodiment recognizes the outer-surface information and the category information of the target object, the electronic device can, when constructing mixed reality data, accurately combine the target object with the virtual object contained in the virtual environment on the basis of that category information and plane information. This improves the fineness of the constructed target image data, which in turn improves the user experience and makes the electronic device more enjoyable to use.

Other features and advantages of the present application will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present application and, together with the description, serve to explain the principles of the present application.
FIG. 1 is a schematic flowchart of the data generation method provided in an embodiment of the present application.
FIG. 2 is a principle block diagram of the data generation apparatus provided in an embodiment of the present application.
FIG. 3 is a schematic hardware configuration diagram of the electronic device provided in an embodiment of the present application.

Various exemplary embodiments of the present application are described in detail below with reference to the accompanying drawings. Unless otherwise specified, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present application.

The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present application or its application or use.

Techniques, methods, and apparatuses known to those skilled in the art may not be discussed in detail but, where appropriate, should be regarded as part of the specification.

In all examples shown and discussed herein, any specific value should be construed as merely illustrative and not as limiting. Other examples of the exemplary embodiments may therefore have different values.

Note that similar reference numerals and letters denote similar items in the following drawings. Once an item is defined in one drawing, it therefore need not be discussed further in subsequent drawings.

Current electronic devices, when generating mixed reality data, often recognize only large surfaces in the real environment and cannot recognize objects or object types. For example, after capturing image data of the real environment, the electronic device does not know that one surface in the image data belongs to a table and another to a chair. As a result, a mixed reality scene obtained by combining virtual content with that image data looks coarse. For instance, the electronic device cannot accurately determine relative positional relationships, such as which of a real object in the real world and a virtual object in the virtual world is above the other, and merely superimposes the virtual object at some position in the real image environment. Existing methods for generating mixed reality data therefore lack fineness, which can degrade the user experience.

To solve the above problem, an embodiment of the present application provides a data generation method. Refer to FIG. 1, which is a schematic flowchart of the data generation method provided in an embodiment of the present application. The method can be applied to an electronic device so that the device generates mixed reality data of high fineness and displays it for the user to view, thereby improving the user experience.

In this embodiment, the electronic device that implements the method may include, for example, a display device comprising a display screen and at least two image capture devices for capturing real-environment information. In a specific implementation, each image capture device may be a monochrome camera with a capture range of about 153° × 120° × 167° (H × V × D), a resolution of at least 640 × 480, and a frame rate of at least 30 Hz. Cameras with other configurations may of course be used as needed. Note, however, that the wider the capture range, the greater the optical distortion of the camera, which may affect the accuracy of the final data. In a specific implementation, the electronic device may be, for example, a VR device, an AR device, or an MR device.

As shown in FIG. 1, the method of this embodiment may include steps S1100 to S1400, which are described in detail below.

In step S1100, first image data is acquired, the first image data representing the real environment in which the user is located.

Specifically, the first image data may be data that reflects the real environment in which the user is located, that is, the real physical environment. Depending on the scene in which the user is located, the image data may contain various physical objects in the real environment, such as sofas, dining tables, trees, buildings, automobiles, and roads.

In this embodiment, the first image data may be generated by capturing data of the real environment in which the user is located with the at least two image capture devices provided on the electronic device. Of course, in a specific implementation and according to actual needs, the first image data may instead be generated by a device other than the electronic device. For example, an image capture device installed separately in the user's environment may capture the first image data and supply it to the electronic device over a connection established with the electronic device. This embodiment does not specifically limit the manner in which the first image data is acquired.

In step S1200, category information and plane information of a target object are acquired, where the target object is an object in the first image data and the plane information includes information on an outer surface of the target object.

In this embodiment, the target object may be one or more objects in the first image data that correspond to physical objects in the real environment, for example objects corresponding to a table, a chair, or a sofa in the real environment.

The plane information of the target object may be information on the outer surfaces of the target object, specifically information representing attributes such as the position and dimensions of each outer surface. For example, the information may be the center coordinate data of a given outer surface of the target object together with the normal vector of that outer surface, which jointly represent the position and dimensions of the outer surface.

The category information of the target object may be information indicating the object type to which the target object belongs. For example, if the target object is a sofa, its category information may be "furniture" or simply "sofa". In a specific implementation, the category information can be set as needed: it may identify the broad class to which the object belongs or the narrower class to which it belongs. The category information may also be expressed with an identifier of the object's type, for example "0" for furniture and "1" for sofa. Further details are omitted here.
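By way of illustration only, category identifiers of this kind could be represented as an enumeration. In the sketch below, the values 0 and 1 follow the example in the text, while the names `Category`, `BROAD_CLASS`, and `to_broad_class` and the broad/narrow lookup are hypothetical.

```python
from enum import IntEnum


class Category(IntEnum):
    """Object-type identifiers; the values 0 and 1 follow the example in the text."""
    FURNITURE = 0  # broad class
    SOFA = 1       # narrower class


# Hypothetical lookup from a narrower class to the broad class it belongs to.
BROAD_CLASS = {Category.SOFA: Category.FURNITURE}


def to_broad_class(category):
    """Return the broad class of a category, or the category itself if it is already broad."""
    return BROAD_CLASS.get(category, category)


sofa_id = int(Category.SOFA)
sofa_broad = to_broad_class(Category.SOFA)
```

Here `sofa_id` is the integer identifier 1 and `sofa_broad` is `Category.FURNITURE`, so the same object can be reported at either granularity.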

In one embodiment, the step of acquiring the category information and the plane information of the target object includes: inputting the first image data into a target image segmentation model to obtain mask information of the target object; and acquiring the category information and the plane information based on the mask information.

In this embodiment, the step of acquiring the category information based on the mask information includes: inputting the mask information into a target category recognition model to obtain the category information.

In the field of digital image processing, mask information may specifically be information used to occlude all or part of an image to be processed, so as to control the region or the course of the image processing. In a specific implementation, the mask may be a two-dimensional matrix array or a multi-valued image used to extract the region of interest, that is, the region the user cares about, from the image to be processed. For example, multiplying the image to be processed by the mask sets the image values outside the region of interest to 0 and leaves the image values inside the region of interest unchanged.
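The multiplication described above can be sketched as follows, assuming a single-channel image and a binary mask of the same shape, both stored as nested lists. The function name `apply_mask` and the sample values are hypothetical.

```python
def apply_mask(image, mask):
    """Multiply a grayscale image by a binary mask, element by element."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [[pixel * keep for pixel, keep in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
mask = [[0, 1, 1],   # 1 marks the region of interest
        [0, 1, 1],
        [0, 0, 0]]

masked = apply_mask(image, mask)
```

After the call, `masked` is `[[0, 20, 30], [0, 50, 60], [0, 0, 0]]`: pixels inside the region of interest keep their values and all other pixels become 0.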

In this embodiment, specifically, the mask information of the target object is obtained with a pre-trained target image segmentation model. Based on that mask information, the category information of the target object is then recognized with a pre-trained target category recognition model, and the plane information of the target object is computed. The following first describes how the target image segmentation model and the target category recognition model are trained.

In this embodiment, the target image segmentation model is a model for separating an object from its carrier. For example, it separates the target object from its carrier image so that the target object can be used in the subsequent processing that combines the virtual and the real. In a specific implementation, the target image segmentation model may be a convolutional neural network model, for example a model based on the Mask R-CNN network structure, and is not specifically limited here.

The target category recognition model is a model that recognizes, from input mask information, the category to which the object corresponding to that mask information belongs. For example, if the target object is a sofa, inputting the mask information of the target object into the target category recognition model may yield the category "furniture" or, more specifically, "sofa". In a specific implementation, the target category recognition model may likewise be a convolutional neural network model. Its structure is not described further here.

In this embodiment, the target image segmentation model and the target category recognition model are obtained by training through the following steps: acquiring sample data, the sample data containing sample objects in predetermined scenes; and jointly training an initial image segmentation model and an initial category recognition model based on the sample data to obtain the target image segmentation model and the target category recognition model.

In a specific implementation, environment image data from different scenes can be acquired in advance as sample data. For example, environment image data from 128 kinds of predetermined scenes can be acquired and the objects in each piece of environment image data annotated manually, yielding the sample data for training the target image segmentation model and the target category recognition model. The initial image segmentation model and the initial category recognition model, which correspond respectively to the target image segmentation model and the target category recognition model, are then jointly trained on the sample data to obtain the two target models.

In one embodiment, the step of jointly training the initial image segmentation model and the initial category recognition model based on the sample data to obtain the target image segmentation model and the target category recognition model includes: inputting the sample data into the initial image segmentation model to obtain sample mask information of the sample objects; inputting the sample mask information into the initial category recognition model to obtain sample category information of the sample objects; and, during training, adjusting the parameters of the initial image segmentation model and the initial category recognition model to obtain a target image segmentation model and a target category recognition model that satisfy a predetermined convergence condition.

Specifically, after the sample data is acquired, it is input into the initial image segmentation model to obtain the sample mask information of the sample objects. The initial category recognition model then processes the sample mask information to obtain the sample category information of the sample objects. During joint training, loss functions corresponding to the two models are designed, and the parameters of each model are adjusted continuously until a target image segmentation model and a target category recognition model that satisfy the predetermined convergence condition are obtained. The predetermined convergence condition may be, for example, that the error of the recognition results of the two models does not exceed a predetermined threshold. The details of model training are described extensively in the prior art and are omitted here.
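One common way to express such joint training, given here only as an assumption since the text does not specify the loss functions, is to minimize a weighted sum of a segmentation loss and a classification loss. In the formula below, $x$ is a sample image, $m$ its annotated mask, $c$ its annotated category, $f$ the segmentation model with parameters $\theta_{\mathrm{seg}}$, $g$ the category recognition model with parameters $\theta_{\mathrm{cls}}$, and $\lambda > 0$ a hypothetical weighting factor.

```latex
\mathcal{L}(\theta_{\mathrm{seg}}, \theta_{\mathrm{cls}})
  = \mathcal{L}_{\mathrm{seg}}\bigl(f_{\theta_{\mathrm{seg}}}(x),\, m\bigr)
  + \lambda\, \mathcal{L}_{\mathrm{cls}}\bigl(g_{\theta_{\mathrm{cls}}}\bigl(f_{\theta_{\mathrm{seg}}}(x)\bigr),\, c\bigr)
```

Because the classification term depends on the output of $f$, its gradient also reaches $\theta_{\mathrm{seg}}$, which is what makes the training joint rather than sequential. Under this formulation, the predetermined convergence condition could be taken as the error falling to or below a chosen threshold.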

以上、ターゲット画像分離モデルとターゲットカテゴリ認識モデルとをどのようにトレーニングして取得するかについて説明したが、具体的に実施される場合、当該ターゲット画像分離モデルに基づいて第１画像データにおけるターゲットオブジェクトのマスク情報を認識して得、当該マスク情報に基づいて、ターゲットオブジェクトのカテゴリ情報を取得する過程において、当該マスク情報に基づいて、ターゲットオブジェクトの平面情報を取得することもできる。以下、該平面情報をどのように取得するかについて詳細に説明する。 The above has explained how to train and acquire the target image separation model and the target category recognition model, but when it is specifically implemented, the target object in the first image data is determined based on the target image separation model. In the process of obtaining the category information of the target object based on the mask information, it is also possible to obtain the plane information of the target object based on the mask information. Hereinafter, how to acquire the plane information will be explained in detail.

In one embodiment, the step of obtaining the plane information based on the mask information includes: obtaining, based on the mask information, a target image block corresponding to the target object in the first image data; obtaining, based on the target image block, target position information of key points of the target object in a world coordinate system, the key points including corner points of the target object; and obtaining the plane information based on the target position information and a predetermined plane fitting algorithm, the plane information including center point coordinates and a plane normal vector for each plane of the target object.

The target image block is the image block made up of the pixels that constitute the target object in the first image data.
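One way to picture the target image block is as a crop of the first image data restricted to the masked pixels. The following is a minimal sketch under stated assumptions, not the method of the application: the image and mask are plain 2D lists of equal size, the mask holds 0/1 flags, and the function name `extract_target_block` is hypothetical.

```python
def extract_target_block(image, mask):
    """Crop the mask's bounding box and blank the pixels outside the object.

    image: 2D list of pixel values; mask: 2D list of 0/1 flags, same size.
    Returns (block, (top, left)), where (top, left) locates the block
    inside the original image.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no object pixels")
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    # Keep object pixels, zero everything else inside the bounding box.
    block = [
        [image[r][c] if mask[r][c] else 0 for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]
    return block, (top, left)
```

Returning the block's offset keeps it possible to map key points detected inside the block back to coordinates in the first image data.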

Specifically, to recognize the outer-surface information of the target object accurately and thereby improve the fineness of the target image data to be obtained, in this embodiment, after the target image block corresponding to the target object in the first image data is obtained, the target position information of each key point constituting the target object, for example each corner point, can be detected, that is, the three-dimensional position coordinates of each key point in the real-world coordinate system. A predetermined plane fitting algorithm is then used to fit each outer surface of the target object, yielding the plane information.

Note that the predetermined plane fitting algorithm may be, for example, a least-squares plane fitting algorithm or another algorithm; it is not particularly limited here.
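As an illustration of what a least-squares plane fit over key points can look like, here is a minimal sketch. It is an assumption, not the algorithm of the application: it regresses along whichever coordinate axis gives the best-conditioned fit (a common shortcut that avoids an eigenvalue solver), and the name `fit_plane` is hypothetical. It returns the two quantities the plane information is said to contain: a center point and a plane normal vector.

```python
from math import sqrt


def fit_plane(points):
    """Least-squares plane through 3D points; returns (center, unit_normal)."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    # Accumulate the entries of the (unnormalized) covariance matrix.
    xx = xy = xz = yy = yz = zz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        xx += dx * dx
        xy += dx * dy
        xz += dx * dz
        yy += dy * dy
        yz += dy * dz
        zz += dz * dz
    # Regress along the axis whose 2x2 sub-determinant is largest.
    det_x = yy * zz - yz * yz
    det_y = xx * zz - xz * xz
    det_z = xx * yy - xy * xy
    det_max = max(det_x, det_y, det_z)
    if det_max <= 0.0:
        raise ValueError("points are collinear; no unique plane")
    if det_max == det_x:
        normal = (det_x, xz * yz - xy * zz, xy * yz - xz * yy)
    elif det_max == det_y:
        normal = (xz * yz - xy * zz, det_y, xy * xz - yz * xx)
    else:
        normal = (xy * yz - xz * yy, xy * xz - yz * xx, det_z)
    length = sqrt(sum(c * c for c in normal))
    return (cx, cy, cz), tuple(c / length for c in normal)
```

For example, the four corner points of one face of a box would be passed as `points`, and the call would be repeated once per outer surface. The sign of the returned normal is not fixed by the fit, so a caller that needs outward-facing normals would have to orient it separately.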

In one embodiment, when obtaining the target position information of the key points of the target object in the world coordinate system based on the target image block, the electronic device: detects, based on the target image block, first position information of the key points in the first image data; obtains position and orientation information of the electronic device at a first time, the first time including the current time, and second position information of the key points in third image data acquired at a second time earlier than the first time; and obtains the target position information based on the first position information, the position and orientation information, and the second position information.

The first position information may be two-dimensional coordinate data of the key points of the target object in the first image data. The position and orientation information of the electronic device can be calculated from the system parameters of the image acquisition device of the electronic device; the details are omitted here.

The second position information may be two-dimensional coordinate data of the key points of the target object in image data collected at a historical time before the current time, that is, in a historical image frame.

In a specific implementation, the position trajectory of each key point at the first time is predicted from its second position information at the second time, and the first position information is corrected based on that trajectory. Finally, the target position information of the key point in the world coordinate system, that is, its three-dimensional coordinate data, is obtained from the first position information and the position and orientation information of the electronic device.
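The text does not spell out how two-dimensional key point positions and the device pose are turned into three-dimensional coordinates. One common possibility, shown here purely as a hedged sketch, is to back-project the key point from two frames into world-space rays and take the midpoint of their closest approach. Everything below is an assumption for illustration: a pinhole camera with intrinsics `fx, fy, cx, cy`, a known pose (rotation matrix and position) for both frames, and the hypothetical function names `backproject` and `triangulate_midpoint`.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def backproject(pixel, fx, fy, cx, cy, rotation, position):
    """World-space ray (origin, direction) through a pixel.

    rotation: 3x3 camera-to-world matrix (list of rows); position: camera
    center in world coordinates. Pinhole model, no lens distortion.
    """
    u, v = pixel
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    d_world = tuple(dot(rotation[i], d_cam) for i in range(3))
    return tuple(position), d_world


def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two rays."""
    w = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth is not observable")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))
```

In this sketch the first ray would come from the corrected first position information and the pose at the first time, and the second ray from the second position information in the historical frame. A device with two cameras could equally feed one ray per camera.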

After step S1200, step S1300 is executed to obtain second image data, which is data including a virtual object.

The virtual object may be an object that does not exist in the real environment in which the user is located, that is, virtual content, such as animals, plants, or buildings in a virtual world; it is not particularly limited here.

Note that in this embodiment, the first image data including the target object and the second image data including the virtual object may each be two-dimensional or three-dimensional data; this embodiment does not particularly limit them.

In step S1400, the first image data and the second image data are mixed based on the category information and the plane information to generate target image data, which is data including the target object and the virtual object.

Specifically, once the steps above have yielded the plane information and category information of the target object in the first image data reflecting the user's real environment, and the second image data containing the virtual object to be mixed, the target object is segmented from the first image data based on the plane information and the category information and mixed with the virtual object in the second image data. This yields target image data that contains both the target object from the real environment and the virtual object from the virtual environment.

In one embodiment, the step of mixing the first image data and the second image data based on the category information and the plane information to generate the target image data includes: determining, based on the category information, a relative positional relationship between the virtual object in the second image data and the target object in the first image data; and rendering, based on the plane information and the relative positional relationship, the virtual object at a predetermined position on the target object to obtain the target image data.
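The two sub-steps above can be sketched as a lookup followed by a small geometric computation. This is only an illustration under stated assumptions: the category names, the `PLACEMENT_RULES` table, the choice of the world y-axis as "up", and the function name `choose_anchor` are all hypothetical and do not come from the application. Each plane is a dict with the `center` and `normal` that the plane information is said to contain.

```python
UP = (0.0, 1.0, 0.0)  # assumed world "up" axis

# Hypothetical mapping from category to relative positional relationship.
PLACEMENT_RULES = {"table": "on_top", "floor": "on_top", "wall": "attached"}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def choose_anchor(category, planes, clearance=0.0):
    """Pick the plane and 3D position at which to render the virtual object.

    planes: list of {"center": (x, y, z), "normal": (nx, ny, nz)} dicts.
    Returns (relation, position, normal).
    """
    relation = PLACEMENT_RULES[category]
    if relation == "on_top":
        # Most upward-facing surface, e.g. a table top.
        plane = max(planes, key=lambda p: dot(p["normal"], UP))
    else:
        # Most vertical surface, e.g. a wall face.
        plane = min(planes, key=lambda p: abs(dot(p["normal"], UP)))
    # Offset from the plane center along its normal by the clearance.
    position = tuple(
        c + clearance * n for c, n in zip(plane["center"], plane["normal"])
    )
    return relation, position, plane["normal"]
```

A renderer would then draw the virtual object at `position`, oriented along the returned normal, so that for instance a virtual cat sits on a table top rather than intersecting it.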

After the target image data mixing the target object and the virtual object has been obtained through the above processing, the method further includes displaying the target image data.

Specifically, to make it easier for the user to interact with the virtual object in the virtual environment by way of the target object in the real environment, the electronic device can display the target image data on its display screen after obtaining it. The device can further capture the content of the user's interaction with the virtual object based on the displayed target image data. For example, if the virtual object is a cat, the user can interact with the virtual cat and save the corresponding interaction video.

To further increase the user's enjoyment of the electronic device, the device may also include a network module. Once connected to the Internet through the network module, the electronic device can save interaction data, such as image data and/or video data, of the user interacting with the virtual object in the target image data, and provide that data to other users, for example for the user's friends to view; the detailed procedure is omitted here. Of course, this is only one example of applying the method provided in this embodiment. In a specific implementation, the method can also be applied to scenarios such as wall stickers, online social networking, virtual remote offices, personal games, and advertising, which are not described further here.

In summary, according to the data generation method provided in this embodiment, the electronic device acquires first image data representing the real environment in which the user is located, obtains the plane information and category information of the target object in the first image data, and then acquires second image data including a virtual object. Based on the plane information and the category information, it mixes the first image data and the second image data to obtain target image data that contains both the target object and the virtual object. By recognizing the outer-surface information and category information of the target object, the method lets the electronic device, when constructing mixed reality data, combine the target object accurately with the virtual object in the virtual environment based on the category information and the plane information. This improves the fineness of the constructed target image data and, in turn, the user experience.

Corresponding to the above method embodiment, this embodiment further provides a data generation apparatus 2000 applicable to an electronic device, as shown in FIG. 2. Specifically, the apparatus may include a first image data acquisition module 2100, an information acquisition module 2200, a second image data acquisition module 2300, and a target image data generation module 2400.

The first image data acquisition module 2100 is used to acquire first image data, the first image data being data representing the real environment in which the user is located.

The information acquisition module 2200 is used to obtain category information and plane information of a target object, the target object being an object in the first image data and the plane information including information on the outer surface of the target object.

In one embodiment, when obtaining the category information and plane information of the target object, the information acquisition module 2200 is used to: input the first image data into a target image segmentation model to obtain mask information of the target object; and obtain the category information and the plane information based on the mask information.

In one embodiment, when obtaining the category information based on the mask information, the information acquisition module 2200 is used to input the mask information into a target category recognition model to obtain the category information.

In one embodiment, when obtaining the plane information based on the mask information, the information acquisition module 2200 is used to: obtain, based on the mask information, a target image block corresponding to the target object in the first image data; obtain, based on the target image block, target position information of key points of the target object in a world coordinate system; and obtain the plane information based on the target position information and a predetermined plane fitting algorithm, the plane information including center point coordinates and a plane normal vector for each plane of the target object, wherein the key points include corner points of the target object.

In one embodiment, the apparatus 2000 is applied to an electronic device, and when obtaining the target position information of the key points of the target object in the world coordinate system based on the target image block, the information acquisition module 2200 is used to: detect, based on the target image block, first position information of the key points in the first image data; obtain position and orientation information of the electronic device at a first time, the first time including the current time, and second position information of the key points in third image data acquired at a second time earlier than the first time; and obtain the target position information based on the first position information, the position and orientation information, and the second position information.

The second image data acquisition module 2300 is used to acquire second image data, the second image data being data including a virtual object.

The target image data generation module 2400 is used to mix the first image data and the second image data based on the category information and the plane information to generate target image data, the target image data being data including the target object and the virtual object.

In one embodiment, when mixing the first image data and the second image data based on the category information and the plane information to generate the target image data, the target image data generation module 2400 is used to: determine, based on the category information, a relative positional relationship between the virtual object in the second image data and the target object in the first image data; and render, based on the plane information and the relative positional relationship, the virtual object at a predetermined position on the target object to obtain the target image data.

In one embodiment, the apparatus 2000 further includes a display module used to display the target image data after the target image data is obtained.
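The module layout described for apparatus 2000 can be pictured as four pluggable stages run in sequence. The sketch below is an illustration only: the class name `DataGenerator`, its field names, and the use of plain callables are assumptions, chosen so that each field lines up with one numbered module.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class DataGenerator:
    """Hypothetical composition of the four modules of apparatus 2000."""

    acquire_first_image: Callable[[], Any]  # module 2100
    acquire_object_info: Callable[[Any], Tuple[Any, Any]]  # module 2200
    acquire_second_image: Callable[[], Any]  # module 2300
    generate_target_image: Callable[[Any, Any, Any, Any], Any]  # module 2400

    def run(self) -> Any:
        first = self.acquire_first_image()
        # Category and plane information are both derived from the first image.
        category, planes = self.acquire_object_info(first)
        second = self.acquire_second_image()
        return self.generate_target_image(first, second, category, planes)
```

Keeping each stage behind a callable makes it possible to swap, say, the segmentation-based information stage for a stub when testing the mixing stage on its own.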

Corresponding to the above method embodiment, this embodiment further provides an electronic device, which may include the data generation apparatus 2000 according to any embodiment of the present application, for implementing the data generation method of any embodiment of the present application.

As shown in FIG. 3, the electronic device 3000 further includes a memory 3100 that stores executable instructions, and a processor 3200 that operates the electronic device under control of the instructions so as to execute the data generation method of any embodiment of the present application.

Each module of the apparatus 2000 may be implemented by the processor 3200 executing the instructions to perform the method according to any embodiment of the present application.

In a specific implementation, the electronic device 3000 may include, for example, a display device comprising a display screen and at least two image acquisition devices for collecting real-environment information. Each image acquisition device may be a monochrome camera with a capture range of about 153° × 120° × 167° (H × V × D), a resolution of at least 640 × 480, and a frame rate of at least 30 Hz. Cameras of other configurations may of course be used as needed, but the wider the capture range, the greater the camera's optical distortion, which may affect the accuracy of the final data. The electronic device may be, for example, a VR device, an AR device, or an MR device.

The present application may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present application.

The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used here, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to the respective computing/processing devices, or to an external computer or external storage device via a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for carrying out the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry such as programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) can be personalized using state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to implement various aspects of the present application.

Aspects of the present application are described here with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium and cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device to cause a series of operational steps to be performed on it, producing a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to several embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. Those skilled in the art know that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are equivalent.

The embodiments of the present application have been described above. The description is illustrative, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used here was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others skilled in the art to understand the embodiments disclosed here. The scope of the present application is defined by the appended claims.

Claims

A data generation method, comprising:
obtaining first image data, the first image data being data representing a real environment in which a user is located;
obtaining category information and plane information of a target object, the target object being an object in the first image data, and the plane information including information on an outer surface of the target object;
obtaining second image data, the second image data being data including a virtual object; and
generating target image data by mixing the first image data and the second image data based on the category information and the plane information, the target image data being data including the target object and the virtual object.

The method according to claim 1, wherein generating the target image data by mixing the first image data and the second image data based on the category information and the plane information comprises:
determining a relative positional relationship between the virtual object in the second image data and the target object in the first image data based on the category information; and
rendering the virtual object at a predetermined position of the target object based on the plane information and the relative positional relationship to obtain the target image data.
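
By way of non-limiting illustration, the two steps recited above (choosing a relative positional relationship from the category information, then positioning the virtual object using the plane information) can be sketched as follows. The category names, the relationship labels "on_top" and "attached", the table `RELATION_BY_CATEGORY`, and the function `anchor_position` are hypothetical and are not taken from the application; the sketch only computes a placement point and leaves the actual rendering to a graphics engine.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Plane:
    """Plane information for one outer surface of the target object."""
    center: Vec3  # center point of the plane, world coordinates
    normal: Vec3  # unit normal vector of the plane


# Hypothetical mapping from category information to a relative
# positional relationship (assumption, not specified by the claims).
RELATION_BY_CATEGORY: Dict[str, str] = {
    "table": "on_top",
    "floor": "on_top",
    "wall": "attached",
}


def anchor_position(category: str, plane: Plane, object_half_height: float) -> Vec3:
    """Return the world position at which the virtual object's center is placed.

    "on_top":   the object rests on the plane, so its center is lifted
                by half its height along the plane normal.
    "attached": the object's center lies on the plane itself.
    """
    relation = RELATION_BY_CATEGORY.get(category)
    if relation is None:
        raise ValueError(f"no placement rule for category {category!r}")
    offset = object_half_height if relation == "on_top" else 0.0
    x, y, z = (c + offset * n for c, n in zip(plane.center, plane.normal))
    return (x, y, z)
```

For a tabletop plane with center (0, 1, 0) and an upward normal, a virtual object with half height 0.25 would be placed with its center at height 1.25 under these assumptions, whereas the same plane classified as "wall" would keep the object's center on the plane.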

The method according to claim 1, wherein obtaining the category information and the plane information of the target object comprises:
inputting the first image data into a target image segmentation model to obtain mask information of the target object; and
obtaining the category information and the plane information based on the mask information.

The method according to claim 3, wherein obtaining the category information based on the mask information comprises:
inputting the mask information into a target category recognition model to obtain the category information.

The method according to claim 3, wherein obtaining the plane information based on the mask information comprises:
obtaining a target image block corresponding to the target object in the first image data based on the mask information;
obtaining target position information of key points of the target object in a world coordinate system based on the target image block, the key points including corner points of the target object; and
obtaining the plane information based on the target position information and a predetermined plane fitting algorithm, the plane information including center point coordinates and a plane normal vector corresponding to each plane of the target object.
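
By way of non-limiting illustration, the last step above can be sketched for a single plane of the target object. The claims do not name the predetermined plane fitting algorithm; this sketch assumes Newell's method for the normal and the centroid of the corner points for the center, and the function name `fit_plane` is hypothetical. The corner points are assumed to be given in order around the face.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def fit_plane(corners: Sequence[Vec3]) -> Tuple[Vec3, Vec3]:
    """Fit one plane to ordered corner points given in world coordinates.

    Returns (center, normal): the centroid of the corners and a unit
    normal computed with Newell's method (an assumed choice of fitting
    algorithm). Counter-clockwise corner order, seen from the side the
    normal points toward, determines the normal's sign.
    """
    n = len(corners)
    if n < 3:
        raise ValueError("at least three corner points are required")

    center = (
        sum(p[0] for p in corners) / n,
        sum(p[1] for p in corners) / n,
        sum(p[2] for p in corners) / n,
    )

    nx = ny = nz = 0.0
    for i in range(n):
        x0, y0, z0 = corners[i]
        x1, y1, z1 = corners[(i + 1) % n]  # next corner, wrapping around
        nx += (y0 - y1) * (z0 + z1)
        ny += (z0 - z1) * (x0 + x1)
        nz += (x0 - x1) * (y0 + y1)

    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("corner points are collinear or coincident")
    return center, (nx / length, ny / length, nz / length)
```

To obtain plane information for each plane of the target object, such a function would be called once per face with that face's corner points. Newell's method tolerates small deviations from planarity in the detected corners, which is why it is used here in place of a three-point cross product.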

The method according to claim 5, wherein the method is applied to an electronic device, and obtaining the target position information of the key points of the target object in the world coordinate system based on the target image block comprises:
detecting first position information of the key points in the first image data based on the target image block;
acquiring position and orientation information of the electronic device at a first time, and second position information of the key points in third image data acquired at a second time, wherein the first time includes a current time and the second time is earlier than the first time; and
obtaining the target position information based on the first position information, the position and orientation information, and the second position information.
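
By way of non-limiting illustration, one way to combine the first position information, the position and orientation information, and the second position information is two-view triangulation. The claim does not specify the computation. This sketch assumes a pinhole camera with known intrinsics, assumes that the camera pose at the second time is also available (for example from device tracking), and uses midpoint triangulation. All function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]  # 3x3 camera-to-world rotation, row-major


def pixel_to_world_ray(pixel: Tuple[float, float], focal: Tuple[float, float],
                       principal: Tuple[float, float], cam_to_world: Mat3) -> Vec3:
    """Back-project a pixel into a ray direction in world coordinates
    (pinhole model; the intrinsics are assumed known)."""
    u, v = pixel
    d_cam = ((u - principal[0]) / focal[0], (v - principal[1]) / focal[1], 1.0)
    x, y, z = (sum(row[k] * d_cam[k] for k in range(3)) for row in cam_to_world)
    return (x, y, z)


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangulate_midpoint(o1: Vec3, d1: Vec3, o2: Vec3, d2: Vec3) -> Vec3:
    """Return the midpoint of the shortest segment between two rays
    o1 + s*d1 and o2 + t*d2 (camera centers o, ray directions d)."""
    w = (o1[0] - o2[0], o1[1] - o2[1], o1[2] - o2[2])
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the two views give no parallax")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = (o1[0] + s * d1[0], o1[1] + s * d1[1], o1[2] + s * d1[2])
    p2 = (o2[0] + t * d2[0], o2[1] + t * d2[1], o2[2] + t * d2[2])
    return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2, (p1[2] + p2[2]) / 2)
```

In this sketch, the key point's pixel in the first image data and the device pose at the first time yield one ray, the pixel in the third image data and the earlier pose yield the other, and the returned point serves as the target position information in the world coordinate system. The computation fails when the device has not moved between the two times, since the rays are then parallel.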

The method according to claim 4, wherein the target image segmentation model and the target category recognition model are obtained by training, the training comprising:
obtaining sample data, the sample data including sample objects in a predetermined scene; and
training an initial image segmentation model and an initial category recognition model based on the sample data to obtain the target image segmentation model and the target category recognition model.

The method according to claim 1, further comprising, after the target image data is obtained:
displaying the target image data.

A data generation device, comprising:
a first image data acquisition module used to acquire first image data, the first image data being data representing a real environment in which a user is located;
an information acquisition module used to acquire category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on an outer surface of the target object;
a second image data acquisition module used to acquire second image data, the second image data being data including a virtual object; and
a target image data generation module used to generate target image data by mixing the first image data and the second image data based on the category information and the plane information, wherein the target image data is data including the target object and the virtual object.

An electronic device, comprising:
the device according to claim 9; or
a memory for storing executable instructions and a processor for operating the electronic device under control of the instructions to carry out the method according to any one of claims 1 to 8.