WO2022073415A1 - Data generation method, apparatus, and storage medium - Google Patents

Data generation method, apparatus, and storage medium

Info

Publication number
WO2022073415A1
WO2022073415A1 (PCT/CN2021/119393)
Authority
WO
WIPO (PCT)
Prior art keywords
image data
rgbd
key points
data
game engine
Prior art date
Application number
PCT/CN2021/119393
Other languages
English (en)
French (fr)
Inventor
付强
杜国光
马世奎
彭飞
Original Assignee
达闼机器人有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 达闼机器人有限公司
Priority to US 17/563,692 (published as US20220126447A1)
Publication of WO2022073415A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1671Programme controls characterised by programming, planning systems for manipulators characterised by simulation, either to verify existing program or to create and verify new program, CAD/CAM oriented, graphic oriented programming systems
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50Controlling the output signals based on the game progress
    • A63F13/52Controlling the output signals based on the game progress involving aspects of the displayed game scene
    • A63F13/525Changing parameters of virtual cameras
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/55Controlling game characters or game objects based on the game progress
    • A63F13/57Simulating properties, behaviour or motion of objects in the game world, e.g. computing tyre load in a car race game
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/67Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J19/00Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
    • B25J19/02Sensing devices
    • B25J19/021Optical sensing devices
    • B25J19/023Optical sensing devices including video camera means
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1612Programme controls characterised by the hand, wrist, grip control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/10Geometric effects
    • G06T15/20Perspective computation
    • G06T15/205Image-based rendering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/257Colour aspects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/271Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1602Programme controls characterised by the control system, structure, architecture
    • B25J9/1605Simulation of manipulator lay-out, design, modelling of manipulator
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/30Nc systems
    • G05B2219/37Measurements
    • G05B2219/37537Virtual sensor
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/30Nc systems
    • G05B2219/37Measurements
    • G05B2219/37572Camera, tv, vision
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/30Nc systems
    • G05B2219/40Robotics, robotics mapping to robotics vision
    • G05B2219/40309Simulation of human hand motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • G06T2207/10012Stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a data generation method, apparatus, and computer-readable storage medium.
  • the embodiments of the present application creatively provide a data generation method, an apparatus, and a computer-readable storage medium.
  • a data generation method includes: importing a robot model using a game engine; simulating an RGBD camera through a scene capture component in the game engine;
  • a joint control module in the game engine controls the hand of the imported robot model to move within the field of view of the RGBD camera; the RGBD camera is used to collect RGBD image data; and an annotated dataset with the coordinates of 21 key points is generated from the RGBD image data and the coordinate information of the 3D poses of the 21 key points.
  • the importing the robot model using the game engine includes: importing each joint of the robot into the game engine in a manner of stacking joints according to the 3D model of the robot.
  • simulating an RGBD camera through a scene capture component in the game engine includes: capturing a scene with the scene capture component to obtain image data; rendering the image data to a texture render target; selecting a capture data source and recombining the color image data and the depth image data in the image data to obtain recombined image data; and performing channel isolation on the color image data and unit unification on the depth image data of the recombined image data to obtain a simulated RGBD camera.
  • the RGBD image includes a 2D color image and a depth image; generating an annotated dataset with the coordinates of 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points includes: converting the coordinates of the 3D poses of the 21 key points into the 2D color image to mark the position of each key point in the 2D color image; and obtaining the depth information of each key point from the depth image.
  • before converting the coordinates of the 3D poses of the 21 key points into the 2D color image, the method further includes: converting the coordinates of the 3D poses of the 21 key points into coordinates in the RGBD camera coordinate system to obtain the relative coordinates of the 21 key points; and associating the RGBD image data with the relative coordinates of the 21 key points.
  • a data generation apparatus includes: a model import module configured to import a robot model using a game engine; a camera simulation module configured to simulate an RGBD camera through a scene capture component in the game engine; a joint control module configured to control the hand of the imported robot model to move within the field of view of the RGBD camera; an image acquisition control module configured to collect RGBD image data with the RGBD camera; and a data generation module configured to generate an annotated dataset with the coordinates of 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points.
  • the model import module is configured to import each joint of the robot into the game engine in a manner of stacking joints according to the 3D model of the robot.
  • the camera simulation module is configured to capture a scene with a scene capture component to obtain image data; render the image data to a texture render target; select a capture data source and recombine the color image data and the depth image data in the image data to obtain recombined image data; and perform channel isolation on the color image data and unit unification on the depth image data of the recombined image data to simulate an RGBD camera.
  • the RGBD image includes a 2D color image and a depth image
  • the data generation module is configured to convert the coordinates of the 3D poses of the 21 key points into the 2D color image so as to mark the position of each key point in the 2D color image, and to obtain the depth information of each key point from the depth image.
  • the data generation module is further configured to, before converting the coordinates of the 3D poses of the 21 key points into the 2D color image, convert those coordinates into coordinates in the RGBD camera coordinate system to obtain the relative coordinates of the 21 key points, and to associate the RGBD image data with the relative coordinates of the 21 key points.
  • a data generating apparatus includes: one or more processors; and a memory configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the above data generation methods.
  • a computer-readable storage medium is further provided, wherein the storage medium includes a set of computer-executable instructions for executing any of the above data generation methods when the instructions are executed.
  • the data generation method, apparatus, and computer-readable storage medium first import a robot model using a game engine; then simulate an RGBD camera through a scene capture component in the game engine; then use a joint control module in the game engine to control the hand of the imported robot model to move within the field of view of the RGBD camera so as to collect RGBD image data; and finally generate an annotated dataset with the coordinates of 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points.
  • through the game engine, the present application generates a dataset, difficult to provide in real scenes, that contains RGBD images of the robot hand and the 3D poses of the 21 key points on the hand; a dataset with the coordinates of 21 key points can thus be generated very quickly and accurately, and the generated dataset is already annotated.
  • a dataset of tens of thousands of images that would otherwise take days or even weeks to generate can be completed in half a day, which greatly improves efficiency.
  • the generated simulation dataset can be used to verify the performance of learning algorithms, and the high-fidelity modeling of the game engine also makes datasets generated in simulation valuable in real scenarios.
  • FIG. 1 shows a schematic diagram of the implementation flow of the data generation method according to the embodiment of the present application
  • Fig. 2 shows the positions of the 21 key points in an application example of the present application
  • Fig. 3 shows a rendering of the scene with the annotated data generated by an application example of the present application
  • FIG. 4 shows a schematic diagram of the composition and structure of a data generating apparatus according to an embodiment of the present application
  • FIG. 5 shows a schematic diagram of a composition structure of an electronic device provided by an embodiment of the present application.
  • first and second are only used for descriptive purposes, and should not be construed as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature delimited with “first”, “second” may expressly or implicitly include at least one of that feature.
  • plurality means two or more, unless otherwise expressly and specifically defined.
  • Fig. 1 shows a schematic diagram of the implementation flow of the data generation method according to an embodiment of the present application
  • Fig. 2 shows the positions of the 21 key points in an application example of the present application
  • Fig. 3 shows a rendering of the scene after the annotated data is generated in an application example of the present application.
  • an embodiment of the present application provides a data generation method, which includes the following steps:
  • Step 101 import a robot model using a game engine.
  • the electronic device uses a game engine (Unreal Engine 4, UE4) to import the robot model.
  • the UE4 game engine ensures a high degree of fidelity between the imported robot model and the real robot.
  • the electronic device may be any form of smart device installed with a game engine.
  • Step 102 simulate an RGBD camera through a scene capture component in the game engine.
  • the electronic device captures the scene with the scene capture component to obtain image data; renders the image data to the texture render target; selects a capture data source and recombines the color image data and the depth image data in the image data to obtain recombined image data; and performs channel isolation on the color image data and unit unification on the depth image data of the recombined data to simulate an RGBD camera.
  • the electronic device develops a custom camera module by using the scene capture component (SceneCaptureComponent2D) in the UE4 game engine.
  • SceneCaptureComponent2D can capture the scene and render it to the texture render target (TextureRenderTarget2D); by selecting an appropriate capture data source (CaptureSource) and recombining the color data and depth data, the same scene capture component can obtain color image and depth image data at the same time.
  • the image data is read from the render target, channel isolation is performed on the color image and unit unification on the depth image, and standard RGBD data is then obtained.
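  • The channel isolation and unit unification step can be sketched as follows. This is an illustrative outline only: the HxWx4 buffer layout, the linear color packing in channels 0-2, and the centimeter depth unit in channel 3 are assumptions for the sketch, not details taken from the patent.

```python
import numpy as np

def to_standard_rgbd(capture, depth_unit_cm=True):
    """Split a captured HxWx4 float buffer into standard RGBD data.

    Assumes linear color in channels 0-2 and scene depth in channel 3;
    the engine's actual CaptureSource packing may differ.
    """
    color = capture[..., :3]                       # channel isolation
    rgb = np.clip(color * 255.0, 0, 255).astype(np.uint8)
    depth = capture[..., 3].astype(np.float32)
    if depth_unit_cm:                              # unit unification:
        depth = depth / 100.0                      # engine cm -> meters
    return rgb, depth

# Example: a 2x2 capture with mid-gray color and 150 cm depth.
buf = np.zeros((2, 2, 4), dtype=np.float32)
buf[..., :3] = 0.5
buf[..., 3] = 150.0
rgb, depth = to_standard_rgbd(buf)
```

The same function would run once per captured frame, yielding a color image and a metric depth map aligned pixel for pixel.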
  • the camera simulation module is simple to use: it can be attached to an action node (actor) as an internal component, and it transmits RGBD images in real time just like a real camera.
  • the camera simulation module supports modifying the camera intrinsics to ensure that the generated images are consistent with those of the real camera.
  • the electronic device simulates an RGBD camera through the scene capture component in the game engine.
  • the internal parameter matrix of the real camera is used, so that the data in the simulation can be consistent with the image data of the real camera.
  • Step 103 using the joint control module in the game engine to control the human hand of the imported robot model to move within the field of view of the RGBD camera.
  • the electronic device can use the joint control module in the game engine to control the hand of the imported robot model, for example the left or right hand, to make random movements within the field of view of the RGBD camera, so as to collect a large number of usable data images.
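  • Sampling random hand motions within joint limits, as the joint control module does between captured frames, can be sketched as follows. The joint names and limit values here are hypothetical placeholders; the imported robot model defines the real joints and their ranges.

```python
import random

# Hypothetical joint names and limits (radians); the imported robot
# model defines the real ones.
JOINT_LIMITS = {
    "wrist_pitch": (-0.8, 0.8),
    "thumb_base": (-0.5, 0.5),
    "index_base": (0.0, 1.6),
}

def random_hand_pose(rng=random):
    """Sample one random hand pose within the joint limits, as the
    joint control module might do before each capture."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in JOINT_LIMITS.items()}

pose = random_hand_pose()
```

Repeating this sampling once per frame keeps the hand moving through varied configurations inside the camera's field of view.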
  • Step 104 using the RGBD camera to collect RGBD image data.
  • the RGBD image includes a 2D color image and a depth image.
  • Step 105 according to the RGBD image data and the coordinate information of the 3D poses of the 21 key points, generate an annotation data set with coordinates of 21 key points.
  • the electronic device converts the coordinates of the 3D poses of the 21 key points into the 2D color image, so as to mark the position of each key point in the 2D color image; obtain each key point by using the depth image Depth information for keypoints.
  • the electronic device obtains the coordinate information of the 3D poses of the 21 key points through the game engine; converts these coordinates into coordinates in the RGBD camera coordinate system to obtain the relative coordinates of the 21 key points; and associates the RGBD image data with the relative coordinates of the 21 key points.
  • each key point is bound to an empty actor, and the game engine can obtain the coordinate information of each empty actor in real time.
  • a Blueprint is written in UE4 to convert the coordinates of the 3D poses of the 21 key points into coordinates in the RGBD camera coordinate system and store them in a file in a fixed order.
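  • The conversion of world-frame key-point coordinates into the RGBD camera coordinate system amounts to a rigid transform; a minimal sketch follows, assuming the camera pose is given as its world position and a camera-to-world rotation matrix (the UE4 Blueprint exposes equivalent transform data, so this is an illustration of the math rather than the Blueprint itself):

```python
import numpy as np

def world_to_camera(points_w, cam_pos, cam_R):
    """Transform Nx3 world-frame key points into the camera frame.

    cam_pos is the camera origin in world coordinates and cam_R the
    3x3 camera-to-world rotation, so p_cam = R^T (p_world - t); these
    conventions are assumptions for illustration.
    """
    points_w = np.asarray(points_w, dtype=float)
    offset = points_w - np.asarray(cam_pos, dtype=float)
    return offset @ cam_R  # row-vector form of R^T (p - t)
```

With an identity rotation, a point one unit in front of the camera along x simply becomes (1, 0, 0) in the camera frame.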
  • the collected RGBD image data is associated with the obtained relative coordinates of the 21 key points, the 3D coordinates of the 21 key points are converted into the 2D color image using the camera's intrinsic matrix, and the position of each key point is marked in the 2D image, thereby determining the extent of the hand in the image for labeling.
  • the marked image is shown in Figure 3 below.
  • the extent of the hand is completely enclosed by a bounding box of a specific color, and the depth information of each key point is obtained from the depth image.
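  • Projecting the camera-frame key points into the 2D color image with the intrinsic matrix, reading each point's depth from the depth image, and deriving the hand's bounding box can be sketched as follows; a pinhole model is assumed, with all points in front of the camera and inside the image bounds.

```python
import numpy as np

def annotate_keypoints(points_cam, K, depth_img):
    """Project camera-frame 3D key points into the 2D image using the
    3x3 intrinsic matrix K, look up per-point depth, and compute the
    bounding box enclosing the hand (illustrative sketch)."""
    pts = np.asarray(points_cam, dtype=float)
    uv_h = (K @ pts.T).T                # homogeneous pixel coordinates
    uv = uv_h[:, :2] / uv_h[:, 2:3]     # perspective divide by z
    px = uv.round().astype(int)         # integer pixel positions (u, v)
    depths = depth_img[px[:, 1], px[:, 0]]  # depth image indexed [row, col]
    u0, v0 = px.min(axis=0)
    u1, v1 = px.max(axis=0)
    return px, depths, (u0, v0, u1, v1)
```

Run over all 21 key points of a frame, this yields the per-point image positions and depths together with the hand's bounding box for the annotation.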
  • in the present application, a game engine is first used to import a robot model; an RGBD camera is then simulated through a scene capture component in the game engine; a joint control module in the game engine then controls the hand of the imported robot model to move within the field of view of the RGBD camera so as to collect RGBD image data; and finally, an annotated dataset with the coordinates of 21 key points is generated from the RGBD image data and the coordinate information of the 3D poses of the 21 key points.
  • through the game engine, the present application generates a dataset, difficult to provide in real scenes, that contains RGBD images of the robot hand and the 3D poses of the 21 key points on the hand; a dataset with the coordinates of 21 key points can thus be generated very quickly and accurately, and the generated dataset is already annotated.
  • a dataset of tens of thousands of images that would otherwise take days or even weeks to generate can be completed in half a day, which greatly improves efficiency.
  • the generated simulation dataset can be used to verify the performance of learning algorithms, and the high-fidelity modeling of the game engine also makes datasets generated in simulation valuable in real scenarios.
  • FIG. 4 shows a schematic structural diagram of a data generating apparatus according to an embodiment of the present application.
  • the data generation apparatus 40 includes: a model import module 401, configured to import a robot model using a game engine; a camera simulation module 402, configured to simulate an RGBD camera through a scene capture component in the game engine;
  • the joint control module 403 is configured to control the human hand of the imported robot model to move within the field of view of the RGBD camera;
  • the image acquisition control module 404 is configured to use the RGBD camera to collect RGBD image data;
  • the data generation module 405 is configured to generate an annotated dataset with the coordinates of 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points.
  • the model importing module 401 is configured to import each joint of the robot into the game engine in a manner of stacking joints according to the 3D model of the robot.
  • the camera simulation module 402 is configured to capture the scene with the scene capture component to obtain image data; render the image data to the texture render target; select a capture data source and recombine the color image data and the depth image data in the image data to obtain recombined image data; and perform channel isolation on the color image data and unit unification on the depth image data of the recombined image data to simulate an RGBD camera.
  • the RGBD image includes a 2D color image and a depth image
  • the data generation module 405 is configured to convert the coordinates of the 3D poses of the 21 key points into the 2D color image to mark the position of each key point in the 2D color image, and to obtain the depth information of each key point from the depth image.
  • the data generation module 405 is further configured to, before converting the coordinates of the 3D poses of the 21 key points into the 2D color image, convert those coordinates into coordinates in the RGBD camera coordinate system to obtain the relative coordinates of the 21 key points, and to associate the RGBD image data with the relative coordinates of the 21 key points.
  • FIG. 5 shows a schematic diagram of a composition structure of an electronic device provided by an embodiment of the present application.
  • the electronic device may be the data generating device 40 or a stand-alone device separate from it that can communicate with the data generating device 40 to receive the collected input signals therefrom.
  • FIG. 5 illustrates a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 11 includes one or more processors 111 and a memory 112 .
  • the processor 111 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 11 to perform desired functions.
  • Memory 112 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 111 may execute the program instructions to implement the data generation methods of the various embodiments of the present disclosure described above and/or other desired functionality.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the electronic device 11 may also include an input device 113 and an output device 114 interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the input device 113 may be the above-mentioned microphone or microphone array configured to capture the input signal of the sound source.
  • the input device 113 may be a communication network connector configured to receive the collected input signal from the data generating device 40 .
  • the input device 113 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 114 can output various information to the outside, including the determined distance information, direction information, and the like.
  • the output device 114 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
  • the electronic device 11 may also include any other appropriate components according to the specific application.
  • embodiments of the present disclosure may also be computer program products comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the "exemplary method" described above in this specification.
  • the computer program product may have the program code for performing the operations of the embodiments of the present disclosure written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • embodiments of the present disclosure may also be computer-readable storage media having computer program instructions stored thereon that, when executed by a processor, cause the processor to perform the steps of the data generation method according to the various embodiments of the present disclosure described in the "exemplary method" section above.
  • the computer-readable storage medium may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may include, for example, but not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses or devices, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • each component or each step may be decomposed and/or recombined. These disaggregations and/or recombinations should be considered equivalents of the present disclosure.


Abstract

A data generation method, apparatus, and computer-readable storage medium. The method comprises: importing a robot model using a game engine (101); simulating an RGBD (color and depth) camera through a scene capture component in the game engine (102); using a joint control module in the game engine to control the hand of the imported robot model to move within the field of view of the RGBD camera (103); collecting RGBD image data with the RGBD camera (104); and generating an annotated dataset with the coordinates of 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points (105).

Description

一种数据生成方法、装置及存储介质
相关申请的交叉引用
本申请基于申请号为202011076496.2、申请日为2020年10月10日的中国专利申请提出,并要求中国专利申请的优先权,该中国专利申请的全部内容在此引入本申请作为参考。
技术领域
本申请涉及人工智能技术领域,尤其涉及一种数据生成方法、装置及计算机可读存储介质。
背景技术
当前,机器学习和深度学习被广泛应用到社会的各个方面,在机器人领域应用尤其广泛。质量好的数据集可以让算法发挥最大的性能实现最好的效果。但是,数据集的生成是一个比较繁琐的过程,一般数据集的数量都比较大(以万为单位),而且标注的工作比较繁琐,许多还是要靠人手动标注。另外,对于一些数据的采集并不方便,比如对于一些3D位姿的获取,在实际情况下需要借助额外的传感器等设备。
发明内容
本申请实施例为了解决现有机器人设备与用户进行信息交互时所存在的问题,创造性地提供了一种数据生成方法、装置及计算机可读存储介质。
According to a first aspect of this application, a data generation method is creatively provided. The method includes: importing a robot model using a game engine; simulating an RGBD camera through a scene capture component in the game engine; controlling, by a joint control module in the game engine, a human hand of the imported robot model to move within the field of view of the RGBD camera; capturing RGBD image data with the RGBD camera; and generating, from the RGBD image data and coordinate information of the 3D poses of 21 key points, an annotated dataset carrying the coordinates of the 21 key points.
According to an embodiment of this application, importing the robot model using the game engine includes: importing each joint of the robot into the game engine separately, joint by joint in a stacked manner, according to the robot's 3D model.
According to an embodiment of this application, simulating the RGBD camera through the scene capture component in the game engine includes: capturing the scene with the scene capture component to obtain image data; rendering the image data to a texture render target; selecting a capture source to reorganize the color image data and depth image data in the image data, obtaining reorganized image data; and performing channel isolation on the color image data and unit unification on the depth image data of the reorganized image data, thereby simulating an RGBD camera.
According to an embodiment of this application, the RGBD image includes a 2D color image and a depth image, and generating the annotated dataset carrying the coordinates of the 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points includes: transforming the coordinates of the 3D poses of the 21 key points into the 2D color image to annotate the position of each key point in the 2D color image; and obtaining the depth information of each key point from the depth image.
According to an embodiment of this application, before transforming the coordinates of the 3D poses of the 21 key points into the 2D color image, the method further includes: converting the coordinates of the 3D poses of the 21 key points into coordinates in the RGBD camera's coordinate system to obtain the relative coordinates of the 21 key points; and associating the RGBD image data with the relative coordinates of the 21 key points.
According to a second aspect of this application, a data generation apparatus is further provided. The apparatus includes: a model import module configured to import a robot model using a game engine; a camera simulation module configured to simulate an RGBD camera through a scene capture component in the game engine; a joint control module configured to control a human hand of the imported robot model to move within the field of view of the RGBD camera; an image capture control module configured to capture RGBD image data with the RGBD camera; and a data generation module configured to generate, from the RGBD image data and coordinate information of the 3D poses of 21 key points, an annotated dataset carrying the coordinates of the 21 key points.
According to an embodiment of this application, the model import module is configured to import each joint of the robot into the game engine separately, joint by joint in a stacked manner, according to the robot's 3D model.
According to an embodiment of this application, the camera simulation module is configured to: capture the scene with the scene capture component to obtain image data; render the image data to a texture render target; select a capture source to reorganize the color image data and depth image data in the image data, obtaining reorganized image data; and perform channel isolation on the color image data and unit unification on the depth image data of the reorganized image data, thereby simulating an RGBD camera.
According to an embodiment of this application, the RGBD image includes a 2D color image and a depth image, and the data generation module is configured to: transform the coordinates of the 3D poses of the 21 key points into the 2D color image to annotate the position of each key point in the 2D color image; and obtain the depth information of each key point from the depth image.
According to an embodiment of this application, the data generation module is further configured to, before transforming the coordinates of the 3D poses of the 21 key points into the 2D color image, convert the coordinates of the 3D poses of the 21 key points into coordinates in the RGBD camera's coordinate system to obtain the relative coordinates of the 21 key points, and associate the RGBD image data with the relative coordinates of the 21 key points.
According to a third aspect of this application, a data generation apparatus is also provided, including: one or more processors; and a memory configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the data generation methods described above.
According to a fourth aspect of this application, a computer-readable storage medium is also provided. The storage medium includes a set of computer-executable instructions which, when executed, perform any of the data generation methods described above.
With the data generation method and apparatus and the computer-readable storage medium of the embodiments of this application, a robot model is first imported using a game engine; an RGBD camera is then simulated through a scene capture component in the game engine; next, a joint control module in the game engine controls a human hand of the imported robot model to move within the field of view of the RGBD camera so that RGBD image data can be captured; finally, an annotated dataset carrying the coordinates of 21 key points is generated from the RGBD image data and the coordinate information of the 3D poses of the 21 key points. In this way, this application uses a game engine to generate a dataset that is difficult to obtain in real scenarios: RGBD images containing a robot hand together with the 3D poses of 21 key points on the hand. A dataset with 21 key-point coordinates can be generated very quickly and accurately, and it comes out already annotated. A dataset of tens of thousands of images that would otherwise take days or even weeks to produce can thus be completed within half a day, greatly improving efficiency. Moreover, the generated simulation dataset can be used to validate the performance of learning algorithms, and the high-fidelity modeling of the game engine also makes datasets generated in simulation valuable in real scenarios.
It should be understood that the teachings of this application need not achieve all of the above benefits; rather, a particular technical solution may achieve a particular technical effect, and other embodiments of this application may achieve benefits not mentioned above.
Brief description of the drawings
The above and other objects, features and advantages of exemplary embodiments of this application will become easy to understand by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of this application are shown by way of example and not limitation, in which:
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
FIG. 1 is a schematic flowchart of an implementation of the data generation method of an embodiment of this application;
FIG. 2 shows the displayed positions of the 21 key points in an application example of this application;
FIG. 3 shows a scene rendering of the annotated data generated in an application example of this application;
FIG. 4 is a schematic structural diagram of the data generation apparatus of an embodiment of this application;
FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed description
To make the objects, features and advantages of this application more apparent and understandable, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples" and the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this application. Moreover, the described specific features, structures, materials or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Accordingly, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of this application, "a plurality of" means two or more, unless otherwise expressly and specifically defined.
FIG. 1 is a schematic flowchart of an implementation of the data generation method of an embodiment of this application; FIG. 2 shows the displayed positions of the 21 key points in an application example of this application; FIG. 3 shows a scene rendering of the annotated data generated in an application example of this application.
Referring to FIG. 1, an embodiment of this application provides a data generation method that includes the following steps.
Step 101: import a robot model using a game engine.
Specifically, the electronic device imports the robot model using a game engine (Unreal Engine 4, UE4). The UE4 game engine ensures high fidelity between the imported robot model and the real robot.
Here, the electronic device may be any form of smart device on which the game engine is installed.
Step 102: simulate an RGBD camera through a scene capture component in the game engine.
Specifically, the electronic device captures the scene with the scene capture component to obtain image data; renders the image data to a texture render target; selects a capture source to reorganize the color image data and depth image data in the image data, obtaining reorganized image data; and performs channel isolation on the color image data and unit unification on the depth image data of the reorganized image data, thereby simulating an RGBD camera.
In an application example, the electronic device develops a custom camera module using the scene capture component (SceneCaptureComponent2D) of the UE4 game engine. SceneCaptureComponent2D can capture the scene and render it to a texture render target (TextureRenderTarget2D); by choosing a suitable capture source (CaptureSource) and reorganizing the color data and depth data, a single scene capture component can deliver color image and depth image data simultaneously. After the image data is read back from the render target, channel isolation of the color image and unit unification of the depth image yield standard RGBD data. The camera simulation module is very simple to use: bound directly to an actor as an internal component, it streams out RGBD images in real time just like a physical camera. It also supports modifying the camera intrinsics, which ensures the generated images are consistent with a real camera.
In this way, the electronic device simulates an RGBD camera through the scene capture component in the game engine. Because the intrinsic matrix of a real camera is used when modeling the RGBD camera, the data in simulation stay consistent with the image data of the real camera.
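The channel isolation and unit unification described above can be sketched in a few lines. This is a minimal illustration rather than the patent's implementation: it assumes the capture has already been read back as an H x W x 4 float buffer whose first three channels carry linear color in [0, 1] and whose fourth channel carries scene depth in engine units (centimeters in UE4); the buffer layout and the function name `split_rgbd` are assumptions.

```python
import numpy as np

def split_rgbd(raw: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a simulated capture buffer into a color image and a depth map.

    Assumed layout: raw is H x W x 4 float; channels 0-2 hold linear
    color in [0, 1], channel 3 holds scene depth in engine centimeters.
    """
    # channel isolation: keep only the color channels, quantize to 8-bit
    color = (np.clip(raw[..., :3], 0.0, 1.0) * 255.0).astype(np.uint8)
    # unit unification: engine centimeters -> meters
    depth_m = raw[..., 3] / 100.0
    return color, depth_m

# usage: a dummy 480x640 capture with every pixel 1.5 m from the camera
raw = np.zeros((480, 640, 4), dtype=np.float32)
raw[..., 3] = 150.0
color, depth = split_rgbd(raw)  # color: uint8 (480, 640, 3); depth in meters
```

Separating the two maps this way lets one capture pass feed both the 2D annotation step (color) and the per-key-point depth lookup (depth).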
Step 103: use the joint control module in the game engine to control the human hand of the imported robot model to move within the field of view of the RGBD camera.
Specifically, the electronic device may use the joint control module in the game engine to make a human hand of the imported robot model, such as the left or right hand, perform random motions within the field of view of the RGBD camera, so that a large number of usable data images can be collected.
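The random motions of step 103 can be sketched as drawing each joint angle uniformly within its limits before every capture. The joint names and limit values below are hypothetical, not taken from the patent's robot model.

```python
import random

def sample_hand_pose(joint_limits: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Draw one random joint configuration, uniform within per-joint limits (radians)."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in joint_limits.items()}

# hypothetical limits for a few finger joints of the hand model
limits = {
    "index_mcp": (0.0, 1.6),
    "index_pip": (0.0, 1.9),
    "thumb_cmc": (-0.5, 0.5),
}
pose = sample_hand_pose(limits)  # one target configuration per captured frame
```

Sampling a fresh configuration per frame spreads the dataset over the hand's reachable pose space, which is what makes the tens of thousands of images useful for training.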
Step 104: capture RGBD image data with the RGBD camera.
Here, the RGBD image includes a 2D color image and a depth image.
Step 105: generate, from the RGBD image data and the coordinate information of the 3D poses of the 21 key points, an annotated dataset carrying the coordinates of the 21 key points.
Specifically, the electronic device transforms the coordinates of the 3D poses of the 21 key points into the 2D color image to annotate the position of each key point in the 2D color image, and obtains the depth information of each key point from the depth image.
Of course, before step 105, the electronic device obtains the coordinate information of the 3D poses of the 21 key points through the game engine; converts those coordinates into coordinates in the RGBD camera's coordinate system to obtain the relative coordinates of the 21 key points; and associates the RGBD image data with the relative coordinates of the 21 key points.
In an application example, referring to the positions of the 21 key points on the left hand of the robot model shown in FIG. 2, an empty character is bound to each key point so that the game engine can obtain the coordinate information of each empty character in real time. A blueprint is then written in UE4 to convert the coordinates of the 3D poses of the 21 key points into coordinates in the RGBD camera's coordinate system and store them in a file in a fixed order. The captured RGBD image data are associated with the obtained relative coordinates of the 21 key points; the 3D coordinates of the 21 key points are transformed into the 2D color image using the camera intrinsic matrix, and the position of each key point in the 2D image is marked, thereby determining the extent of the hand in the image and achieving the annotation. An annotated image is shown in FIG. 3: an annotation box of a specific color completely encloses the hand region, and the depth information of each key point is then obtained from the depth image.
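The coordinate pipeline just described — world coordinates to camera-relative coordinates, projection into the 2D color image with the intrinsic matrix, then a bounding box around the projected key points — can be sketched as follows. The function names, the 4x4 transform convention, and the box margin are illustrative assumptions, not the patent's blueprint code.

```python
import numpy as np

def project_keypoints(pts_world: np.ndarray, T_cam_world: np.ndarray,
                      K: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Map 3D key points into the camera frame, then project them into
    the 2D color image with the pinhole intrinsic matrix K.

    pts_world   : (N, 3) key-point positions in the world frame
    T_cam_world : (4, 4) world-to-camera rigid transform
    K           : (3, 3) camera intrinsic matrix
    Returns (uv, z): pixel coordinates (N, 2) and per-point depth (N,).
    """
    pts_h = np.hstack([pts_world, np.ones((len(pts_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]   # camera-relative coordinates
    uvw = (K @ pts_cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]                # perspective divide
    return uv, pts_cam[:, 2]

def hand_bbox(uv: np.ndarray, margin: float = 10.0):
    """Axis-aligned box enclosing all projected key points, for annotation."""
    x0, y0 = uv.min(axis=0) - margin
    x1, y1 = uv.max(axis=0) + margin
    return x0, y0, x1, y1

# usage: a point 2 m straight ahead of the camera projects to the
# principal point of a 640x480 image
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
uv, z = project_keypoints(np.array([[0.0, 0.0, 2.0]]), np.eye(4), K)
```

The same `z` values (or a lookup into the depth map at each `uv`) supply the per-key-point depth annotation, so every sample carries both the 2D positions and the 3D information.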
With the data generation method of the embodiment of this application, a robot model is first imported using a game engine; an RGBD camera is then simulated through a scene capture component in the game engine; next, a joint control module in the game engine controls a human hand of the imported robot model to move within the field of view of the RGBD camera so that RGBD image data can be captured; finally, an annotated dataset carrying the coordinates of 21 key points is generated from the RGBD image data and the coordinate information of the 3D poses of the 21 key points. In this way, this application uses a game engine to generate a dataset that is difficult to obtain in real scenarios: RGBD images containing a robot hand together with the 3D poses of 21 key points on the hand. A dataset with 21 key-point coordinates can be generated very quickly and accurately, and it comes out already annotated. A dataset of tens of thousands of images that would otherwise take days or even weeks to produce can thus be completed within half a day, greatly improving efficiency. Moreover, the generated simulation dataset can be used to validate the performance of learning algorithms, and the high-fidelity modeling of the game engine also makes datasets generated in simulation valuable in real scenarios.
FIG. 4 is a schematic structural diagram of the data generation apparatus of an embodiment of this application.
Referring to FIG. 4, the data generation apparatus 40 of an embodiment of this application includes: a model import module 401 configured to import a robot model using a game engine; a camera simulation module 402 configured to simulate an RGBD camera through a scene capture component in the game engine; a joint control module 403 configured to control a human hand of the imported robot model to move within the field of view of the RGBD camera; an image capture control module 404 configured to capture RGBD image data with the RGBD camera; and a data generation module 405 configured to generate, from the RGBD image data and coordinate information of the 3D poses of 21 key points, an annotated dataset carrying the coordinates of the 21 key points.
In an implementation, the model import module 401 is configured to import each joint of the robot into the game engine separately, joint by joint in a stacked manner, according to the robot's 3D model.
In an implementation, the camera simulation module 402 is configured to: capture the scene with the scene capture component to obtain image data; render the image data to a texture render target; select a capture source to reorganize the color image data and depth image data in the image data, obtaining reorganized image data; and perform channel isolation on the color image data and unit unification on the depth image data of the reorganized image data, thereby simulating an RGBD camera.
In an implementation, the RGBD image includes a 2D color image and a depth image, and the data generation module 405 is configured to: transform the coordinates of the 3D poses of the 21 key points into the 2D color image to annotate the position of each key point in the 2D color image; and obtain the depth information of each key point from the depth image.
In an implementation, the data generation module 405 is further configured to, before transforming the coordinates of the 3D poses of the 21 key points into the 2D color image, convert the coordinates of the 3D poses of the 21 key points into coordinates in the RGBD camera's coordinate system to obtain the relative coordinates of the 21 key points, and associate the RGBD image data with the relative coordinates of the 21 key points.
FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
An electronic device according to an embodiment of the present disclosure is described below with reference to FIG. 5. The electronic device may be the data generation apparatus 40, or a standalone device independent of it that can communicate with the data generation apparatus 40 to receive the collected input signals from it.
FIG. 5 illustrates a block diagram of the electronic device according to an embodiment of the present disclosure.
As shown in FIG. 5, the electronic device 11 includes one or more processors 111 and a memory 112.
The processor 111 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic device 11 to perform desired functions.
The memory 112 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 111 may run the program instructions to implement the data generation methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components and noise components may also be stored in the computer-readable storage medium.
In one example, the electronic device 11 may further include an input device 113 and an output device 114, and these components are interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, when the electronic device is the data generation apparatus 40, the input device 113 may be the above-mentioned microphone or microphone array configured to capture the input signal of a sound source. When the electronic device is a standalone device, the input device 113 may be a communication network connector configured to receive the collected input signals from the data generation apparatus 40.
In addition, the input device 113 may also include, for example, a keyboard, a mouse, and the like.
The output device 114 may output various information to the outside, including determined distance information, direction information, and the like. The output device 114 may include, for example, a display, a speaker, a printer, a communication network and the remote output devices connected to it, and so on.
Of course, for simplicity, FIG. 5 shows only some of the components of the electronic device 11 that are relevant to the present disclosure, omitting components such as buses and input/output interfaces. Depending on the specific application, the electronic device 11 may also include any other suitable components.
In addition to the above methods and devices, embodiments of the present disclosure may also be a computer program product including computer program instructions which, when run by a processor, cause the processor to perform the steps of the data generation methods according to the various embodiments of the present disclosure described in the "Exemplary methods" part of this specification.
The computer program product may carry program code for performing the operations of the embodiments of the present disclosure written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar. The program code may execute entirely on the user's computing device, partly on the user's device, as a standalone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to perform the steps of the data generation methods according to the various embodiments of the present disclosure described in the "Exemplary methods" part of this specification.
The computer-readable storage medium may use any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present disclosure have been described above in connection with specific embodiments. It should be pointed out, however, that the advantages, benefits, effects and the like mentioned in this disclosure are merely examples, not limitations, and cannot be regarded as necessarily possessed by every embodiment of the present disclosure. In addition, the specific details disclosed above serve only for illustration and ease of understanding, not limitation; they do not restrict the present disclosure to being implemented with those specific details.
The block diagrams of the devices, apparatuses, equipment and systems involved in this disclosure are only illustrative examples and are not intended to require or imply that they must be connected, arranged or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these devices, apparatuses, equipment and systems may be connected, arranged and configured in any manner. Words such as "include", "comprise" and "have" are open-ended, mean "including but not limited to", and may be used interchangeably with that phrase. The words "or" and "and" as used here mean "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The word "such as" as used here means the phrase "such as but not limited to" and may be used interchangeably with it.
It should also be pointed out that in the apparatuses, devices and methods of the present disclosure, each component or each step may be decomposed and/or recombined. These decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present disclosure. Therefore, the present disclosure is not intended to be limited to the aspects shown here, but accords with the widest scope consistent with the principles and novel features disclosed herein.
The above description has been given for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present disclosure to the forms disclosed herein. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions and sub-combinations thereof.

Claims (10)

  1. A data generation method, the method comprising:
    importing a robot model using a game engine;
    simulating an RGBD camera through a scene capture component in the game engine;
    controlling, by a joint control module in the game engine, a human hand of the imported robot model to move within the field of view of the RGBD camera;
    capturing RGBD image data with the RGBD camera; and
    generating, from the RGBD image data and coordinate information of 3D poses of 21 key points, an annotated dataset carrying coordinates of the 21 key points.
  2. The method according to claim 1, wherein importing the robot model using the game engine comprises:
    importing each joint of the robot into the game engine separately, joint by joint in a stacked manner, according to a 3D model of the robot.
  3. The method according to claim 1, wherein simulating the RGBD camera through the scene capture component in the game engine comprises:
    capturing a scene with the scene capture component to obtain image data;
    rendering the image data to a texture render target;
    selecting a capture source to reorganize color image data and depth image data in the image data, obtaining reorganized image data; and
    performing channel isolation on the color image data and unit unification on the depth image data of the reorganized image data, thereby simulating an RGBD camera.
  4. The method according to any one of claims 1 to 3, wherein the RGBD image comprises a 2D color image and a depth image, and generating the annotated dataset carrying the coordinates of the 21 key points from the RGBD image data and the coordinate information of the 3D poses of the 21 key points comprises:
    transforming the coordinates of the 3D poses of the 21 key points into the 2D color image to annotate a position of each key point in the 2D color image; and
    obtaining depth information of each key point from the depth image.
  5. The method according to claim 4, wherein before transforming the coordinates of the 3D poses of the 21 key points into the 2D color image, the method further comprises:
    converting the coordinates of the 3D poses of the 21 key points into coordinates in a coordinate system of the RGBD camera to obtain relative coordinates of the 21 key points; and
    associating the RGBD image data with the relative coordinates of the 21 key points.
  6. A data generation apparatus, the apparatus comprising:
    a model import module configured to import a robot model using a game engine;
    a camera simulation module configured to simulate an RGBD camera through a scene capture component in the game engine;
    a joint control module configured to control a human hand of the imported robot model to move within the field of view of the RGBD camera;
    an image capture control module configured to capture RGBD image data with the RGBD camera; and
    a data generation module configured to generate, from the RGBD image data and coordinate information of 3D poses of 21 key points, an annotated dataset carrying coordinates of the 21 key points.
  7. The apparatus according to claim 6, wherein
    the model import module is configured to import each joint of the robot into the game engine separately, joint by joint in a stacked manner, according to a 3D model of the robot.
  8. The apparatus according to claim 7, wherein
    the camera simulation module is configured to: capture a scene with the scene capture component to obtain image data; render the image data to a texture render target; select a capture source to reorganize color image data and depth image data in the image data, obtaining reorganized image data; and perform channel isolation on the color image data and unit unification on the depth image data of the reorganized image data, thereby simulating an RGBD camera.
  9. A data generation apparatus, comprising: one or more processors; and a memory configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data generation method according to any one of claims 1 to 5.
  10. A computer-readable storage medium, the storage medium comprising a set of computer-executable instructions which, when executed, perform the data generation method according to any one of claims 1 to 5.
PCT/CN2021/119393 2020-10-10 2021-09-18 Data generation method and apparatus, and storage medium WO2022073415A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/563,692 US20220126447A1 (en) 2020-10-10 2021-12-28 Data generation method and apparatus, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011076496.2A CN112308910B (zh) 2020-10-10 2020-10-10 Data generation method and apparatus, and storage medium
CN202011076496.2 2020-10-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/563,692 Continuation US20220126447A1 (en) 2020-10-10 2021-12-28 Data generation method and apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2022073415A1 true WO2022073415A1 (zh) 2022-04-14

Family

ID=74489531

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119393 WO2022073415A1 (zh) 2020-10-10 2021-09-18 Data generation method and apparatus, and storage medium

Country Status (3)

Country Link
US (1) US20220126447A1 (zh)
CN (1) CN112308910B (zh)
WO (1) WO2022073415A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308910B (zh) 2020-10-10 2024-04-05 达闼机器人股份有限公司 Data generation method and apparatus, and storage medium
CN115167534B (zh) 2022-07-11 2023-06-20 深圳市乐唯科技开发有限公司 Multi-directional steering control system and method for an amusement game device
CN115578236A (zh) 2022-08-29 2023-01-06 上海智能制造功能平台有限公司 Method for generating a virtual pose-estimation dataset based on a physics engine and collision entities

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399634A * 2018-01-16 2018-08-14 达闼科技(北京)有限公司 Cloud-computing-based RGB-D data generation method and apparatus
CN111368667A * 2020-02-25 2020-07-03 达闼科技(北京)有限公司 Data acquisition method, electronic device and storage medium
CN111414409A * 2020-03-17 2020-07-14 网易(杭州)网络有限公司 Method and apparatus for data exchange between game engines, storage medium and electronic device
US10796489B1 * 2017-09-13 2020-10-06 Lucasfilm Entertainment Company Ltd. Game engine responsive to motion-capture data for mixed-reality environments
CN112308910A * 2020-10-10 2021-02-02 达闼机器人有限公司 Data generation method and apparatus, and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530619B * 2013-10-29 2016-08-31 北京交通大学 Gesture recognition method with a small number of training samples constructed from RGB-D data
WO2018140656A1 * 2017-01-26 2018-08-02 Matterport, Inc. Capturing and aligning panoramic image and depth data
CN108564642A * 2018-03-16 2018-09-21 中国科学院自动化研究所 Markerless performance capture system based on the UE engine
CN108776773B * 2018-05-04 2022-03-29 华南理工大学 Depth-image-based three-dimensional gesture recognition method and interaction system
WO2020061432A1 * 2018-09-21 2020-03-26 Cubic Corporation Markerless human movement tracking in virtual simulation
CN110751716B * 2019-05-08 2024-02-02 叠境数字科技(上海)有限公司 Virtual shoe try-on method based on a single-view RGBD sensor
CN110956065B * 2019-05-11 2022-06-10 魔门塔(苏州)科技有限公司 Face image processing method and apparatus for model training
CN111161387B * 2019-12-31 2023-05-30 华东理工大学 Method and system for synthesizing images in stacked scenes, storage medium and terminal device
CN111274927A * 2020-01-17 2020-06-12 北京三快在线科技有限公司 Training data generation method and apparatus, electronic device and storage medium


Also Published As

Publication number Publication date
US20220126447A1 (en) 2022-04-28
CN112308910A (zh) 2021-02-02
CN112308910B (zh) 2024-04-05

Similar Documents

Publication Publication Date Title
WO2022073415A1 (zh) Data generation method and apparatus, and storage medium
CN106200983B (zh) 一种结合虚拟现实与bim实现虚拟现实场景建筑设计的系统
Elliott et al. TBAG: A high level framework for interactive, animated 3D graphics applications
CN112836064B (zh) 知识图谱补全方法、装置、存储介质及电子设备
JP7394977B2 (ja) アニメーションを作成する方法、装置、コンピューティング機器及び記憶媒体
CN109710357B (zh) 一种基于Unity3D引擎实现服务器操作的方法及系统
US9472119B2 (en) Computer-implemented operator training system and method of controlling the system
CN107566793A (zh) 用于远程协助的方法、装置、系统及电子设备
CN110348109A (zh) 三维仿真培训数据处理的方法及终端设备
JP7267068B2 (ja) 学習済みモデル生成装置、プログラム及び学習済みモデル生成システム
CN109360274A (zh) 沉浸式虚拟现实构建方法、装置、智能升降桌及存储介质
US11710039B2 (en) Systems and methods for training image detection systems for augmented and mixed reality applications
CN112233208B (zh) 机器人状态处理方法、装置、计算设备和存储介质
WO2021220658A1 (ja) 情報処理装置およびプログラム
KR102568699B1 (ko) 360도 파노라마 실내 영상으로부터 생성된 포인트 클라우드의 바닥면을 고려한 후처리 방법
WO2024000480A1 (zh) 3d虚拟对象的动画生成方法、装置、终端设备及介质
Šiđanin et al. Immersive virtual reality course at the digital production studies
CN115994981B (zh) 一种用于应急演练方案的三维自动推演方法
Sisyukov et al. Web Based GPU Acceleration in Embodied Agent Training Workflow
D’Ambrogio et al. Supporting Engineering Areas
Dewberry et al. Problems and Solutions of Point Cloud Mapping for VR and CAVE Environments for Data Visualization and Physics Simulation
Lin Optimal Design of Vocal Music Teaching Platform Based on Virtual Reality Technology
Pynkyawati et al. Virtual Reality as a Tool in Architectural Design Process
Abramova et al. Real-time motion tracking for dance visualization using Kalman filters
Singh et al. Enabling Real-time Gesture Recognition Data Delivery over ROS and OpenISS

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21876933

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.09.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21876933

Country of ref document: EP

Kind code of ref document: A1