CN115205324A - Target object orientation determining method and device

Info

Publication number: CN115205324A
Application number: CN202110378944.2A
Authority: CN
Inventors: 朱静; 王兵; 卿泉; 王刚; 刘挺
Original assignee: Taobao China Software Co Ltd
Current assignee: Taobao China Software Co Ltd
Priority date: 2021-04-08
Filing date: 2021-04-08
Publication date: 2022-10-18

Abstract

The embodiments of this specification provide a method and an apparatus for determining the orientation of a target object. The method comprises: receiving an (i+1)-th video frame containing a target object, and acquiring the shape, key points, and edge detection box of the target object in the (i+1)-th video frame; determining a target attribute value of the target object based on the shape and the key points of the target object; determining a first orientation of the target object in the (i+1)-th video frame according to the target attribute value of the target object and the edge detection box; determining a second orientation of the target object in the (i+1)-th video frame based on the target orientation of the target object in the i-th video frame; and determining the target orientation of the target object in the (i+1)-th video frame based on the first orientation and the second orientation.

Description

Target object orientation determination method and device

Technical Field

The embodiments of this specification relate to the field of computer technology, and in particular to a method for determining the orientation of a target object. One or more embodiments of this specification also relate to an apparatus for determining the orientation of a target object, a computing device, and a computer-readable storage medium.

Background

As autonomous driving matures and moves toward deployment and mass production, vehicles face higher requirements for the stability, completeness, and accuracy of obstacle perception and structured output. In autonomous driving, computing the orientation of a vehicle is a basic task of the perception system and is part of vehicle pose estimation; subsequent trajectory prediction and planning and control also depend on it. Existing autonomous driving solutions compute vehicle orientation mainly from lidar, so the orientation of a vehicle cannot be identified well in scenarios without lidar or in lidar blind spots.

Therefore, there is an urgent need for a method for determining the orientation of a target object that improves the accuracy and stability of vehicle orientation recognition.

Summary of the Invention

In view of this, the embodiments of this specification provide a method for determining the orientation of a target object. One or more embodiments of this specification also relate to an apparatus for determining the orientation of a target object, a computing device, and a computer-readable storage medium, so as to address the technical deficiencies of the prior art.

According to a first aspect of the embodiments of this specification, a method for determining the orientation of a target object is provided, including:

receiving an (i+1)-th video frame containing a target object, and acquiring the shape, key points, and edge detection box of the target object in the (i+1)-th video frame;

determining a target attribute value of the target object based on the shape and the key points of the target object;

determining a first orientation of the target object in the (i+1)-th video frame according to the target attribute value of the target object and the edge detection box;

determining a second orientation of the target object in the (i+1)-th video frame based on the target orientation of the target object in the i-th video frame, and determining the target orientation of the target object in the (i+1)-th video frame based on the first orientation and the second orientation.

According to a second aspect of the embodiments of this specification, an apparatus for determining the orientation of a target object is provided, including:

a first video receiving module, configured to receive an (i+1)-th video frame containing a target object, and to acquire the shape, key points, and edge detection box of the target object in the (i+1)-th video frame;

a first determining module, configured to determine a target attribute value of the target object based on the shape and the key points of the target object;

a second determining module, configured to determine a first orientation of the target object in the (i+1)-th video frame according to the target attribute value of the target object and the edge detection box;

a first target orientation determining module, configured to determine a second orientation of the target object in the (i+1)-th video frame based on the target orientation of the target object in the i-th video frame, and to determine the target orientation of the target object in the (i+1)-th video frame based on the first orientation and the second orientation.

According to a third aspect of the embodiments of this specification, a computing device is provided, including:

a memory and a processor;

the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which, when executed by the processor, implement the steps of the method for determining the orientation of a target object.

According to a fourth aspect of the embodiments of this specification, a computer-readable storage medium is provided, which stores computer-executable instructions that, when executed by a processor, implement the steps of the method for determining the orientation of a target object.

An embodiment of this specification implements a method and an apparatus for determining the orientation of a target object. The method includes: receiving an (i+1)-th video frame containing a target object, and acquiring the shape, key points, and edge detection box of the target object in the (i+1)-th video frame; determining a target attribute value of the target object based on the shape and the key points of the target object; determining a first orientation of the target object in the (i+1)-th video frame according to the target attribute value of the target object and the edge detection box; determining a second orientation of the target object in the (i+1)-th video frame based on the target orientation of the target object in the i-th video frame; and determining the target orientation of the target object in the (i+1)-th video frame based on the first orientation and the second orientation. Specifically, the method decomposes the computation of the target orientation into several steps, such as shape estimation, target attribute value calculation, and orientation calculation, so that the steps can be decoupled from and fused with one another, and an accurate and stable target orientation of the target object is ultimately obtained.

Description of Drawings

FIG. 1 is an example diagram of a specific application scenario of a method for determining the orientation of a target object provided by an embodiment of this specification;

FIG. 2 is a flowchart of a method for determining the orientation of a target object provided by an embodiment of this specification;

FIG. 3 is a schematic diagram of a video frame containing a target object in a method for determining the orientation of a target object provided by an embodiment of this specification;

FIG. 4 is a schematic diagram of the projection relationship of the target object in the world coordinate system in a method for determining the orientation of a target object provided by an embodiment of this specification;

FIG. 5 is a flowchart of a method for determining the orientation of a target object, provided by an embodiment of this specification, as applied to autonomous driving of a vehicle;

FIG. 6 is a schematic structural diagram of an apparatus for determining the orientation of a target object provided by an embodiment of this specification;

FIG. 7 is a structural block diagram of a computing device provided by an embodiment of this specification.

Detailed Description

Many specific details are set forth in the following description to provide a thorough understanding of this specification. However, this specification can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its substance. This specification is therefore not limited by the specific implementations disclosed below.

The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to limit the one or more embodiments of this specification. As used in one or more embodiments of this specification and in the appended claims, the singular forms "a", "said", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of this specification refers to and includes any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, and so on may be used in one or more embodiments of this specification to describe various information, such information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, a first could be termed a second, and similarly, a second could be termed a first. Depending on the context, the word "if" as used herein can be interpreted as "at the time of", "when", or "in response to determining".

First, the terms used in one or more embodiments of this specification are explained.

Projective geometry: distinct from Euclidean geometry; usually applied to transformations and calculations between the real world and the image plane.

Long-tail problem: limited by sample diversity and model performance, autonomous driving typically faces many corner cases that cause existing systems to make mistakes.

Interpretability: an end-to-end model is usually a black box and suffers from weak interpretability.

Edge computing: computation must be completed in real time on the device, usually with an emphasis on strong real-time performance, as distinct from cloud computing.

Camera extrinsics: camera extrinsic parameters are parameters expressed in the world coordinate system, such as the position and rotation of the camera.

Kalman filter: an algorithm that uses the state equation of a linear system, together with the system's input and output observations, to optimally estimate the system state.

Bicycle model: the kinematic bicycle model. Given the vehicle orientation at the current moment, the bicycle model can be used to predict the vehicle orientation at the next moment.
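As a rough illustration of the bicycle model defined above, the following sketch predicts the next heading with the standard kinematic bicycle relation (heading rate = speed / wheelbase × tan(steering angle)). The function name, its parameters, and the choice of inputs are assumptions made for illustration; the specification does not state which inputs its bicycle model uses.

```python
import math


def predict_heading(theta, speed, steer_angle, wheelbase, dt):
    """Predict the heading at the next time step (hypothetical helper).

    theta        current heading in radians
    speed        forward speed in m/s (assumed to be known)
    steer_angle  front-wheel steering angle in radians (assumed to be known)
    wheelbase    distance between front and rear axles in metres
    dt           time between two frames in seconds
    """
    # Kinematic bicycle model: d(theta)/dt = v / L * tan(delta)
    theta_next = theta + speed / wheelbase * math.tan(steer_angle) * dt
    # Wrap the result back into [-pi, pi]
    return math.atan2(math.sin(theta_next), math.cos(theta_next))
```

With a zero steering angle the predicted heading equals the current heading; a positive steering angle increases it.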

In autonomous driving, computing the orientation of a vehicle is a basic task of the perception system and is part of vehicle pose estimation; vehicle trajectory prediction and planning and control in the subsequent pipeline also depend on the orientation. The embodiments of this specification provide a vision-based method for determining the orientation of a target object, so that vehicle orientation can be computed more accurately. For vehicles equipped with lidar, combining lidar with the high-precision visual orientation calculation of this specification can greatly improve the accuracy and stability of the orientation calculation. For vehicles without lidar, or in lidar blind spots, the vision solution provided by this specification can compute and output the orientation on its own. The method can therefore be applied in a variety of scenarios and improves the user experience.

On this basis, this specification provides a method for determining the orientation of a target object. One or more embodiments of this specification also relate to an apparatus for determining the orientation of a target object, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.

Referring to FIG. 1, FIG. 1 shows an example diagram of a specific application scenario of a method for determining the orientation of a target object provided by an embodiment of this specification.

The application scenario of FIG. 1 includes an image capture terminal 102, an image receiving terminal 104, and a server 106. Specifically, the image receiving terminal 104 receives an image a containing a vehicle, captured in real time by the image capture terminal 102, and forwards image a to the server 106. After receiving image a, the server 106 inputs it into a vehicle-type detection model to obtain the type of the vehicle in image a, into a key point detection model to obtain the key points of the vehicle in image a, and into a full-vehicle detection model to obtain the full-vehicle detection box of the vehicle in image a. The vehicle-type detection model, the key point detection model, and the full-vehicle detection model can be understood as deep learning models trained with convolutional neural networks.

In the calculation itself, the type of the vehicle in image a is first used to determine the range of the vehicle's length and width, labeled vehicle length-width 1. For example, if the vehicle is an SUV, a compact SUV is typically about 4.4 m to 4.65 m long and about 1.8 m wide, while a large SUV is about 4.7 m to 5 m long and about 1.9 m wide. The length range determined from the vehicle type is therefore 4.4 m to 5 m, and the width range is 1.8 m to 1.9 m.
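The SUV example can be reproduced with a small lookup that merges the per-class ranges into one range per vehicle type. This is only a sketch: the table layout and the names `SUV_CLASSES_M` and `merged_range` are hypothetical, and the numbers are the approximate figures quoted in the text.

```python
# Approximate SUV dimensions in metres, taken from the example in the text.
# The dictionary layout is an assumption made for illustration.
SUV_CLASSES_M = {
    "compact": {"length": (4.4, 4.65), "width": (1.8, 1.8)},
    "large": {"length": (4.7, 5.0), "width": (1.9, 1.9)},
}


def merged_range(classes, key):
    """Return the (min, max) range that covers every size class."""
    lows = [c[key][0] for c in classes.values()]
    highs = [c[key][1] for c in classes.values()]
    return (min(lows), max(highs))


length_range = merged_range(SUV_CLASSES_M, "length")  # (4.4, 5.0)
width_range = merged_range(SUV_CLASSES_M, "width")    # (1.8, 1.9)
```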

Next, the key points of the vehicle in image a and the extrinsic parameters of the camera that captured image a are used to calculate the real length and width of the vehicle in image a, labeled vehicle length-width 2. The key points are points such as E, F, D, and G in image a of FIG. 1. If image a is the first frame containing the vehicle, and vehicle length-width 2 lies within the range of vehicle length-width 1, then vehicle length-width 2 is used as the target length and width of the vehicle in image a, labeled vehicle length-width 3 in FIG. 1. If image a is a later frame (for example the second, third, or fourth frame containing the vehicle), the real length and width of the vehicle in the previous frame are used to predict the length and width of the vehicle in the current image a. The real length and width obtained from the key points and camera extrinsics, together with the length and width predicted from the previous frame, are then input into a Kalman filter for fusion and correction to obtain the target length and width of the vehicle in image a. In practice, the camera extrinsics corresponding to each frame also change while the vehicle is driving. Therefore, when calculating the target length and width of the vehicle in an image, the length and width must be computed iteratively based on the camera extrinsics, so that accurate dimensions are obtained for each frame and the subsequent orientation calculation remains accurate.
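The fusion step for a later frame can be sketched as a scalar Kalman update applied separately to length and to width. This is a minimal sketch under stated assumptions: the specification does not give its state model, so the constant-size prediction, the variance values, and the function names `predict_size` and `fuse_size` are all hypothetical.

```python
def predict_size(prev_est, prev_var, process_var=1e-4):
    """Prediction step (assumed model: a vehicle's size does not change).

    The previous estimate is carried forward and its uncertainty grows
    by a small process noise term.
    """
    return prev_est, prev_var + process_var


def fuse_size(pred, pred_var, meas, meas_var):
    """Standard scalar Kalman update of a predicted value with a measurement."""
    gain = pred_var / (pred_var + meas_var)   # Kalman gain in [0, 1]
    est = pred + gain * (meas - pred)         # corrected estimate
    var = (1.0 - gain) * pred_var             # reduced uncertainty
    return est, var


# Example: length carried over from the previous frame vs. the length
# measured from key points and camera extrinsics in the current frame.
pred, pred_var = predict_size(4.5, 0.04)
target_length, target_var = fuse_size(pred, pred_var, meas=4.7, meas_var=0.04)
```

When the prediction and the measurement are equally uncertain, the fused value lies halfway between them; a noisier measurement pulls the result toward the prediction.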

After the target length and width of the vehicle in image a (vehicle length-width 3) are obtained, the full-vehicle detection box of the vehicle in image a is projected into a preset coordinate system using projective geometry, which yields the mapped key points of the detection box in that coordinate system. The preset coordinate system is the world coordinate system of the camera. Once the target length and width and the mapped key points are determined, the orientation of the vehicle in image a can be calculated from the target length and width and the coordinates of the mapped key points in the world coordinate system. In practice, if image a is the first frame containing the vehicle, this orientation is the target vehicle orientation for image a. If a previous frame containing the vehicle exists, a predicted vehicle orientation for image a is obtained from the bicycle model, and the predicted orientation and the orientation calculated from the length, width, and mapped key points are input into a Kalman filter to obtain the target vehicle orientation of the vehicle in image a.
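The choice between the first-frame case and the fused case can be sketched as follows. Because headings are angles, the difference between the two inputs is wrapped before the update so that values near +π and −π are treated as close. The function names, the default variances, and the use of a single scalar Kalman update are assumptions for illustration, not the filter design of the specification.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians into [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def target_heading(measured, predicted=None, measured_var=0.05, predicted_var=0.05):
    """Return the target heading for the current frame (hypothetical helper).

    measured   heading computed from length/width and mapped key points
    predicted  heading predicted from the previous frame, or None for the
               first frame that contains the vehicle
    """
    if predicted is None:
        # First frame: no history, so the measured heading is the target.
        return wrap_angle(measured)
    gain = predicted_var / (predicted_var + measured_var)
    innovation = wrap_angle(measured - predicted)  # shortest angular difference
    return wrap_angle(predicted + gain * innovation)
```

For example, a prediction of 3.0 rad and a measurement of −3.1 rad are only about 0.18 rad apart on the circle, so the fused heading stays near π instead of jumping toward 0.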

When the method for determining the orientation of a target object provided by the embodiments of this specification is applied to vehicle orientation calculation, the calculation is broken down into several parts, such as vehicle-type estimation, length and width calculation, and Kalman filtering. Decoupling and fusing these parts, each of which can be implemented differently, ultimately yields a relatively accurate and stable vehicle orientation. In addition, the overall orientation calculation relies only on key points, so its computational cost is small. Using this method to calculate vehicle orientation therefore offers significant advantages in accuracy, stability, scope of application, and real-time performance.

Referring to FIG. 2, FIG. 2 shows a flowchart of a method for determining the orientation of a target object provided according to an embodiment of this specification, which specifically includes the following steps.

Step 202: Receive an (i+1)-th video frame containing a target object, and acquire the shape, key points, and edge detection box of the target object in the (i+1)-th video frame.

The target object includes, but is not limited to, two-wheeled, three-wheeled, four-wheeled, or other multi-wheeled vehicles, as well as logistics vehicles, public service vehicles, medical service vehicles, terminal service vehicles, and the like. In addition, i is a positive integer; for example, if i is 1, then i+1 is 2. In practice, the method provided by the embodiments of this specification can predict the orientation of a stationary vehicle as well as that of a moving vehicle. For ease of understanding, this specification describes the solution in detail using a moving four-wheeled vehicle as an example.

Specifically, receiving the (i+1)-th video frame containing the target object can be understood as receiving the (i+1)-th video frame, captured by a camera, that contains a moving vehicle, and acquiring the shape, key points, and edge detection box of the target object in that frame.

Taking i = 1 as an example, the (i+1)-th video frame is the second video frame containing the moving vehicle. After the second video frame is obtained, the shape, key points, and edge detection box of the vehicle in that frame are acquired. Here, the shape of the vehicle can be understood as the vehicle type, and the edge detection box of the vehicle can be understood as the full-vehicle detection box.

In a specific implementation, acquiring the shape, key points, and edge detection box of the target object in the (i+1)-th video frame includes:

inputting the (i+1)-th video frame into a first recognition model, a second recognition model, and a third recognition model respectively, to acquire the shape, key points, and edge detection box of the target object in the (i+1)-th video frame.

The first recognition model, the second recognition model, and the third recognition model can be understood as deep learning models trained with convolutional neural networks.

Continuing the example above, inputting the second video frame into the first recognition model yields the type of the vehicle in that frame; inputting it into the second recognition model yields the key points of the vehicle; and inputting it into the third recognition model yields the full-vehicle detection box. The full-vehicle detection box indicates the extent of the vehicle's three-dimensional detection and the vehicle's edge information. The key points mark wheel points and lamp points, which have clear physical meaning and texture, and the key point information also carries information about the side box of the vehicle.
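The three-model step can be sketched as a small function that runs three independent callables on the same frame and bundles their outputs. The `FrameObservation` container, the `observe` function, and the output shapes (a type string, a dict of named image points, a four-value box) are assumptions for illustration; the specification does not define the model interfaces.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class FrameObservation:
    """Outputs of the three recognition models for one frame (hypothetical)."""
    vehicle_type: str                          # shape, e.g. "suv"
    keypoints: Dict[str, Tuple[float, float]]  # named image points, e.g. "E"
    box: Tuple[float, float, float, float]     # full-vehicle detection box


def observe(
    frame: Any,
    type_model: Callable[[Any], str],
    keypoint_model: Callable[[Any], Dict[str, Tuple[float, float]]],
    box_model: Callable[[Any], Tuple[float, float, float, float]],
) -> FrameObservation:
    """Run each model on the same frame; the models do not depend on each other."""
    return FrameObservation(
        vehicle_type=type_model(frame),
        keypoints=keypoint_model(frame),
        box=box_model(frame),
    )
```

Keeping the three models behind separate callables mirrors the decoupling described in the text: any one of them can be replaced without touching the other two.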

In the embodiments of this specification, the shape, key points, and edge detection box of the target object are obtained by different recognition models. Each feature of the target object is extracted by a model dedicated to it, and the results are later fused to compute an accurate orientation. This is faster and more accurate than obtaining the orientation directly from a single large model, as was done previously. For example, the key points are obtained with a key point model that has strong image texture analysis capability, which gives it a considerable advantage over the single large model in stability and in handling the long-tail problem.

Referring to FIG. 3, FIG. 3 shows a schematic diagram of a video frame containing a target object in a method for determining the orientation of a target object provided according to an embodiment of this specification.

As FIG. 3 shows, the target object in the video frame is a vehicle. The video frame in FIG. 3 is input into the three recognition models to obtain the full-vehicle detection box of the vehicle, that is, the rectangular edge detection box ABCD surrounding the vehicle; the key points of the vehicle, E, F, G, and D; and the type of the vehicle.

Specifically, before receiving the (i+1)-th video frame containing the target object, the method further includes:

receiving an i-th video frame containing the target object, and acquiring the shape, key points, and edge detection box of the target object in the i-th video frame;

determining the target orientation of the target object in the i-th video frame according to the target attribute value of the target object and the edge detection box.

The target object in the i-th video frame and the target object in the (i+1)-th video frame are the same target object.

In practice, the method for determining the orientation of a target object is applied to scenarios in which a vehicle is driving. For a moving vehicle, in order to obtain an accurate and stable orientation, the vehicle orientation at the moment preceding the current frame is used to predict the orientation in the current frame. The target orientation of the vehicle in the current frame is then obtained from this predicted orientation and from the orientation calculated using the vehicle's length, width, and key points in the current frame.

Therefore, determining the target orientation of the vehicle in the (i+1)-th video frame requires the target orientation of the vehicle in the i-th video frame.

Continuing the example above with i still equal to 1: first, the first video frame containing the vehicle is received, and the shape, key points, and edge detection box of the vehicle in the first video frame are acquired; next, the target attribute value of the vehicle is determined based on the shape and key points of the vehicle; finally, the target orientation of the target object in the first video frame is determined according to the target attribute value of the vehicle and the edge detection box.

For the manner of acquiring the shape, key points, and edge detection box of the target object in the i-th video frame, refer to the description given above for the (i+1)-th video frame; it is not repeated here.

具体实施时，所述基于所述目标对象的形状以及关键点确定所述目标对象的目标属性值，包括：During specific implementation, the determining the target attribute value of the target object based on the shape and key points of the target object includes:

基于所述第i个视频帧中目标对象的形状确定所述第i个视频帧中目标对象的第一初始属性值；Determine the first initial attribute value of the target object in the ith video frame based on the shape of the target object in the ith video frame;

基于所述第i个视频帧中目标对象的关键点以及获取所述第i个视频帧的相机外参，确定所述第i个视频帧中目标对象的第二初始属性值；Determine the second initial attribute value of the target object in the i-th video frame based on the key points of the target object in the i-th video frame and the camera extrinsic parameters for obtaining the i-th video frame;

在所述第二初始属性值小于等于所述第一初始属性值的情况下，将所述第二初始属性值作为所述第i个视频帧中目标对象的目标属性值。In the case that the second initial attribute value is less than or equal to the first initial attribute value, the second initial attribute value is used as the target attribute value of the target object in the ith video frame.

Here, the attribute value can be understood as length and width; when the target object is a vehicle, the attribute value is the length and width of the vehicle.

Specifically, when the i-th video frame is the first video frame containing the target object, no earlier video frame exists. The second initial attribute value of the target object in the i-th video frame, computed from the key points of the target object in that frame and the extrinsic parameters of the camera that captured it, is therefore used as the target attribute value, and this target attribute value must fall within the first initial attribute value.

Following the example above, after the vehicle type, key points, and full-vehicle detection frame of the vehicle in the 1st video frame are obtained, the first initial length and width of the vehicle, that is, its length and width range, is determined from the vehicle type. The second initial length and width of the vehicle in the 1st video frame is then computed from the key points of the vehicle in that frame and the extrinsic parameters of the camera that captured the frame. Finally, when the second initial length and width is less than or equal to the first initial length and width, the second initial length and width is taken as the target length and width of the vehicle in the 1st video frame.
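The specification does not give the formula that turns key points and camera extrinsic parameters into a length or width. One common way to do this, shown here only as a minimal sketch, is to back-project two ground-contact key points onto a flat ground plane and measure the distance between them. The pinhole model, the horizontal optical axis, the flat ground, and all names (`pixel_to_ground`, `ground_distance`, `cam_height`) are assumptions of this sketch, not part of the original disclosure.

```python
import math

def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height):
    """Back-project pixel (u, v) onto a flat ground plane.

    Assumes a pinhole camera whose optical axis is horizontal and whose
    centre is cam_height metres above the ground (hypothetical setup).
    Returns (lateral offset, forward distance) in metres.
    """
    if v <= cy:
        raise ValueError("pixel is at or above the horizon; it never meets the ground")
    forward = fy * cam_height / (v - cy)
    lateral = (u - cx) * forward / fx
    return lateral, forward

def ground_distance(p, q, **cam):
    """Distance on the ground between two pixels, e.g. two wheel contact points."""
    x1, z1 = pixel_to_ground(p[0], p[1], **cam)
    x2, z2 = pixel_to_ground(q[0], q[1], **cam)
    return math.hypot(x2 - x1, z2 - z1)

# Hypothetical intrinsics and mounting height.
cam = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=500.0, cam_height=1.5)
measured_length = ground_distance((740.0, 650.0), (640.0, 600.0), **cam)
```

A camera with non-zero pitch or roll would need the full extrinsic rotation applied to the viewing ray before intersecting the ground plane; the sketch omits this for brevity.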

In practice, the maximum and minimum length and width for a vehicle type can be obtained from the type, and these are used as the range of the length and width of the vehicle. After the real length and width of the vehicle are computed from its key points and the camera extrinsic parameters, if they fall within this range, the computed values are considered accurate and can be used as the target length and width of the vehicle in the 1st video frame; if they fall outside this range, the computed values are considered erroneous.

In this embodiment, when the i-th video frame is the first video frame containing the target object, the first initial attribute value determined from the shape of the target object serves as a constraint, and the second initial attribute value of the vehicle in the i-th video frame, computed from the key points of the target object and the extrinsic parameters of the camera that captured the i-th video frame, is used as the target attribute value, so that the orientation of the target object in the first video frame can subsequently be obtained quickly.
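The range check described above for the first video frame can be sketched as follows. The table `SIZE_RANGES`, its numeric values, and the function name are hypothetical placeholders; the specification only states that the vehicle type yields a minimum and maximum length and width, and that a measurement outside this range is treated as erroneous.

```python
from typing import Optional, Tuple

# Hypothetical ((min_length, max_length), (min_width, max_width)) in metres per vehicle type.
SIZE_RANGES = {
    "car": ((3.5, 5.5), (1.5, 2.1)),
    "truck": ((5.0, 12.0), (2.0, 2.6)),
}

def first_frame_target_size(
    vehicle_type: str, measured_length: float, measured_width: float
) -> Optional[Tuple[float, float]]:
    """Accept the key-point-based measurement only if it lies within the type's range.

    Returns the (length, width) to use as the target attribute value, or None
    when the measurement violates the constraint and is considered erroneous.
    """
    (min_l, max_l), (min_w, max_w) = SIZE_RANGES[vehicle_type]
    if min_l <= measured_length <= max_l and min_w <= measured_width <= max_w:
        return (measured_length, measured_width)
    return None
```

What happens after a rejected measurement is not stated in the text, so the sketch simply returns `None` and leaves that decision to the caller.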

步骤204：基于所述目标对象的形状以及关键点确定所述目标对象的目标属性值。Step 204: Determine a target attribute value of the target object based on the shape and key points of the target object.

具体的，所述基于所述目标对象的形状以及关键点确定所述目标对象的目标属性值，包括：Specifically, determining the target attribute value of the target object based on the shape and key points of the target object includes:

基于所述第i+1个视频帧中目标对象的形状，确定所述第i+1个视频帧中目标对象的第一初始属性值；Based on the shape of the target object in the i+1 th video frame, determine the first initial attribute value of the target object in the i+1 th video frame;

Determine the second initial attribute value of the target object in the (i+1)-th video frame based on the key points of the target object in the (i+1)-th video frame and the extrinsic parameters of the camera that captured the (i+1)-th video frame;

In the case that the second initial attribute value is less than or equal to the first initial attribute value, determine the third initial attribute value of the target object in the (i+1)-th video frame based on the target attribute value of the target object in the i-th video frame;

基于所述第二初始属性值和所述第三初始属性值确定所述第i+1个视频帧中目标对象的目标属性值。The target attribute value of the target object in the i+1 th video frame is determined based on the second initial attribute value and the third initial attribute value.

具体的，第i+1个视频帧中目标对象的目标属性值的具体计算方式与上述实施例中对第i个视频帧中目标对象的目标属性值的计算方式不同。Specifically, the specific calculation method of the target attribute value of the target object in the i+1 th video frame is different from the calculation method of the target attribute value of the target object in the ith video frame in the above embodiment.

First, the first initial attribute value of the target object in the (i+1)-th video frame is determined from the shape of the target object in that frame; then the second initial attribute value is obtained from the key points of the target object in the (i+1)-th video frame and the extrinsic parameters of the camera that captured the frame. These two values are computed in the same way as the first and second initial attribute values of the target object in the i-th video frame described above, so the details are not repeated here.

Because the first initial attribute value of the target object in the (i+1)-th video frame constrains the second initial attribute value, once both values are obtained it must be determined whether the second initial attribute value falls within the first initial attribute value. If so, the third initial attribute value of the target object in the (i+1)-th video frame can be determined from the target attribute value of the target object in the i-th video frame. In practice, the target attribute value of the target object in the i-th video frame is input into a Kalman filter, which predicts the third initial attribute value of the target object in the (i+1)-th video frame.

最后将第i+1个视频帧中目标对象的第二初始属性值和第三初始属性值进行融合矫正，以获得第i+1个视频帧中目标对象的目标属性值。Finally, the second initial attribute value and the third initial attribute value of the target object in the i+1 th video frame are fused and corrected to obtain the target attribute value of the target object in the i+1 th video frame.

In this embodiment, the method for determining the orientation of a target object is applied to a vehicle driving scenario. A camera captures images containing the vehicle in real time and sends them to a server. On receiving such an image, the server predicts the length and width of the vehicle in the current video frame from the length and width of the vehicle in the previous video frame. The predicted length and width, together with the length and width obtained from the key points of the vehicle in the current frame and the extrinsic parameters of the camera that captured it, are then input into a Kalman filter for fusion and correction, yielding an accurate and stable length and width for the vehicle in the current video frame.
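The predict-then-fuse step for the length or width can be illustrated with a one-dimensional Kalman filter. This is a minimal sketch that assumes a static model (the size of a vehicle does not change between frames) and scalar variances; the function name `kalman_fuse_size` and the noise values are hypothetical, since the specification does not give the filter design.

```python
def kalman_fuse_size(prev_value, prev_var, measured, measured_var, process_var=0.01):
    """Fuse the size predicted from frame i with the size measured in frame i+1.

    prev_value / prev_var: target attribute value of frame i and its variance.
    measured / measured_var: second initial attribute value of frame i+1 and its variance.
    Returns the fused target attribute value and its variance.
    """
    # Predict: static model, so the value carries over and only uncertainty grows.
    # pred_value plays the role of the "third initial attribute value".
    pred_value = prev_value
    pred_var = prev_var + process_var
    # Update: weight the measurement by the Kalman gain.
    gain = pred_var / (pred_var + measured_var)
    fused_value = pred_value + gain * (measured - pred_value)
    fused_var = (1.0 - gain) * pred_var
    return fused_value, fused_var

# Example: previous length 4.5 m, new key-point-based measurement 4.7 m.
length, length_var = kalman_fuse_size(4.5, 0.04, 4.7, 0.05)
```

With equal predicted and measured variances the gain is 0.5, so the fused length lies midway between the two inputs; a noisier measurement would pull the result toward the prediction.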

步骤206：根据所述目标对象的目标属性值以及边缘检测框确定所述第i+1个视频帧中目标对象的第一朝向。Step 206: Determine the first orientation of the target object in the i+1 th video frame according to the target attribute value of the target object and the edge detection frame.

具体的，所述根据所述目标对象的目标属性值以及边缘检测框确定所述第i+1个视频帧中目标对象的第一朝向，包括：Specifically, the determining the first orientation of the target object in the i+1 th video frame according to the target attribute value of the target object and the edge detection frame includes:

将所述第i+1个视频帧中目标对象的边缘检测框在预设坐标系中映射，获得所述第i+1个视频帧中目标对象的边缘检测框的映射关键点；Mapping the edge detection frame of the target object in the i+1 th video frame in a preset coordinate system to obtain the mapping key point of the edge detection frame of the target object in the i+1 th video frame;

基于所述映射关键点在所述预设坐标系中的坐标值以及所述目标对象的目标属性值确定所述第i+1个视频帧中目标对象的第一朝向。The first orientation of the target object in the i+1 th video frame is determined based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object.

Here, the preset coordinate system can be understood as the world coordinate system of the camera. In practice, to compute the first orientation of the target object in the (i+1)-th video frame, the edge detection frame of the target object in that frame is first mapped into the world coordinate system to obtain the mapped key points of the edge detection frame; the first orientation is then computed from the coordinates of these mapped key points in the world coordinate system and the target attribute value of the target object in the (i+1)-th video frame.

参见图4，图4示出了根据本说明书一个实施例提供的一种目标对象朝向确定方法中目标对象在世界坐标系的投影关系示意图。Referring to FIG. 4 , FIG. 4 shows a schematic diagram of the projection relationship of the target object in the world coordinate system in the method for determining the orientation of the target object provided according to an embodiment of the present specification.

With reference to FIG. 3, the rectangle ABCD in FIG. 3 is the edge detection frame of the vehicle, and A, B, and C in FIG. 4 are three visible points of the vehicle. Through camera imaging, A, B, and C correspond to three points A', B', and C' on the bottom horizontal line in FIG. 4 (the imaging plane). The pixel values, that is, the coordinates, of A', B', and C' on the imaging plane can be obtained from the XOY coordinate system in FIG. 4. From these coordinates together with AC (the width of the vehicle) and BC (the length of the vehicle), the direction of BC in the XOY coordinate system can be obtained, and this direction is the orientation of the vehicle in the video frame.

In this embodiment, the orientation of the target object is computed by geometric modeling. By design, the approach is insensitive to the visual imaging model and applies to both pinhole and fisheye cameras, so it is insensitive to camera modules with different parameters and is well suited to mass production. In addition, by exploiting the constraint relationship between the length and width of the target object and its orientation, a relatively accurate orientation of the target object in the video frame can be computed from the coordinates of the mapped key points of the edge detection frame in the world coordinate system and the length and width of the target object.

具体的，所述基于所述映射关键点在所述预设坐标系中的坐标值以及所述目标对象的目标属性值确定所述第i+1个视频帧中目标对象的第一朝向，包括：Specifically, the determining the first orientation of the target object in the i+1 th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object, including :

基于所述边缘检测框中确定所述目标对象的第一目标边缘和第二目标边缘；Determine the first target edge and the second target edge of the target object based on the edge detection frame;

根据所述目标对象的目标属性值确定所述第一目标边缘的边缘值和所述第二目标边缘的边缘值；Determine the edge value of the first target edge and the edge value of the second target edge according to the target attribute value of the target object;

Compute the first orientation of the target object in the (i+1)-th video frame based on the coordinate values of the mapped key points in the preset coordinate system, the edge value of the first target edge, and the edge value of the second target edge.

Here, the first target edge is the long side of the target object and the second target edge is the wide side. In practice, the edge detection frame makes it possible to determine which side is the length of the target object and which is the width. Once the long and wide sides are determined, length information is assigned to the long side and width information to the wide side according to the target attribute value of the target object. The orientation of the target object is then computed from the coordinates of the mapped key points in the world coordinate system, the length of the long side, and the width of the wide side.

In this embodiment, by exploiting the constraint relationship between the length and width of the target object and its orientation, a relatively accurate orientation of the target object in the video frame can be computed from the coordinates of the mapped key points of the edge detection frame in the world coordinate system and the length and width of the target object.
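The constraint between length, width, and orientation can be made concrete with a small top-down (XOY) model. The specification does not state a closed-form formula, so the following is one possible derivation rather than the disclosed computation. It assumes that the camera sits at the origin O, that the image coordinates of A', B', and C' have already been converted into viewing-ray directions `ray_a`, `ray_b`, `ray_c` in the XOY plane, that C is the corner shared by AC and BC, and that AC is perpendicular to BC with A on the left of the direction from C to B. All function names are hypothetical.

```python
import math

def cross(p, q):
    """z-component of the 2-D cross product."""
    return p[0] * q[1] - p[1] * q[0]

def solve_orientation(ray_a, ray_b, ray_c, width, length):
    """Return (theta, depth_c): the direction of BC in XOY and the distance of C along ray_c.

    Model: C = depth_c * ray_c, B = C + length * u, A = C + width * n,
    with u = (cos t, sin t) and n = (-sin t, cos t).
    """
    k_a = cross(ray_c, ray_a)
    k_b = cross(ray_c, ray_b)
    if k_a == 0 or k_b == 0:
        raise ValueError("degenerate view: two viewing rays coincide")
    # Requiring A to lie on ray_a and B on ray_b gives two equations.
    # Eliminating depth_c leaves sin_coef * sin(t) + cos_coef * cos(t) = 0.
    sin_coef = -width * ray_a[1] / k_a + length * ray_b[0] / k_b
    cos_coef = -width * ray_a[0] / k_a - length * ray_b[1] / k_b
    theta = math.atan2(-cos_coef, sin_coef)
    # The equation fixes theta only up to pi; keep the root with C in front of the camera.
    for cand in (theta, theta + math.pi):
        u = (math.cos(cand), math.sin(cand))
        depth_c = -length * cross(u, ray_b) / k_b
        if depth_c > 0:
            return math.atan2(u[1], u[0]), depth_c
    raise ValueError("no solution in front of the camera")
```

For a pinhole camera, a ray direction could be formed from a pixel column x as `((x - cx) / fx, 1.0)`, where `cx` and `fx` are the principal point and focal length; a fisheye camera would only change this pixel-to-ray step, which is consistent with the statement that the geometric model itself does not depend on the imaging model.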

Step 208: Determine the second orientation of the target object in the (i+1)-th video frame based on the target orientation of the target object in the i-th video frame, and determine the target orientation of the target object in the (i+1)-th video frame based on the first orientation and the second orientation.

具体的，所述根据所述目标对象的目标属性值以及边缘检测框确定所述第i个视频帧中目标对象的目标朝向，包括：Specifically, determining the target orientation of the target object in the ith video frame according to the target attribute value of the target object and the edge detection frame includes:

将所述第i个视频帧中目标对象的边缘检测框在预设坐标系中映射，获得所述第i个视频帧中目标对象的边缘检测框的映射关键点；The edge detection frame of the target object in the i-th video frame is mapped in a preset coordinate system, and the mapping key point of the edge detection frame of the target object in the i-th video frame is obtained;

基于所述映射关键点在所述预设坐标系中的坐标值以及所述目标对象的目标属性值确定所述第i个视频帧中目标对象的第一朝向；Determine the first orientation of the target object in the i-th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object;

在i为1的情况下，将所述第i个视频帧中目标对象的第一朝向作为所述第i个视频帧中目标对象的目标朝向。In the case where i is 1, the first orientation of the target object in the i-th video frame is used as the target orientation of the target object in the i-th video frame.

Specifically, when the i-th video frame is the first video frame containing the target object, the target orientation of the target object in the i-th video frame is obtained directly from the coordinates of the mapped key points in the world coordinate system and the length and width of the target object. When the i-th video frame is not the first video frame, the target orientation is obtained in the same way as for the (i+1)-th video frame.

In this embodiment, when the i-th video frame is the first video frame containing the target object, or is a stationary video frame, no constraint involving speed, distance, or the like exists. The target orientation in the i-th video frame can then be obtained quickly and directly from the coordinates of the mapped key points in the world coordinate system and the length and width of the target object.

其中，所述基于所述映射关键点在所述预设坐标系中的坐标值以及所述目标对象的目标属性值确定所述第i个视频帧中目标对象的第一朝向，包括：Wherein, determining the first orientation of the target object in the ith video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object includes:

Compute the first orientation of the target object in the i-th video frame based on the coordinate values of the mapped key points in the preset coordinate system, the edge value of the first target edge, and the edge value of the second target edge.

In this embodiment, when the i-th video frame is the first video frame or a stationary video frame, the first orientation of the target object in the i-th video frame can be computed from the mapped key points of the edge detection frame of the target object and the length and width of the target object; exploiting the constraint relationship between length, width, and orientation yields a more accurate orientation of the target object.

When the video frame is neither the first video frame nor a stationary video frame, a bicycle model and a filter are needed to better compute and correct the orientation of the target object, so that a stable and accurate target orientation is obtained. The specific implementation is as follows:

所述基于第i个视频帧中目标对象的目标朝向确定第i+1个视频帧中目标对象的第二朝向，包括：The determining of the second orientation of the target object in the i+1 th video frame based on the target orientation of the target object in the ith video frame includes:

将所述第i个视频帧中目标对象的目标朝向输入单车模型，获得所述第i+1个视频帧中目标对象的第二朝向。Input the target orientation of the target object in the i-th video frame into the bicycle model, and obtain the second orientation of the target object in the i+1-th video frame.

In practice, a bicycle model of the vehicle is established in advance, and the orientation computed above is used as an observation of the Kalman filter. In the bicycle model, the orientation of the vehicle coincides with its velocity direction, and differencing the range measurements over time provides velocity information. For a fast-moving vehicle, the velocity direction can therefore be trusted more as the orientation, whereas for a slow-moving or stationary vehicle the orientation computed from the geometric constraints can be trusted more. For a moving vehicle, the speed can be obtained from time-series ranging of the vehicle, from lidar, or the like. The target orientation of the target object in the i-th video frame is input into the bicycle model, and the second orientation of the target object in the (i+1)-th video frame is predicted from the constraint relationship between the target orientation and the speed of the vehicle.

The first orientation and the second orientation of the target object in the (i+1)-th video frame are then input into the Kalman filter, which fuses them by weighting to obtain the target orientation of the target object in the (i+1)-th video frame. In addition, information on the key points of the target object in the (i+1)-th video frame can be added to the observations of the Kalman filter, since the key points contribute positively to the orientation computation.
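The prediction and weighted fusion of the orientation can be sketched as below. The specification names a bicycle model and a Kalman filter but does not give their equations; this sketch assumes the standard kinematic bicycle yaw update and a scalar Kalman update with angle wrapping. The parameters `steering_angle` and `wheelbase`, the variances, and the function names are all hypothetical.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians to the interval around zero spanning one full turn."""
    return math.atan2(math.sin(a), math.cos(a))

def predict_orientation(prev_orientation, speed, steering_angle, wheelbase, dt):
    """Second orientation: kinematic bicycle yaw update from frame i to frame i+1."""
    yaw_rate = speed / wheelbase * math.tan(steering_angle)
    return wrap_angle(prev_orientation + yaw_rate * dt)

def fuse_orientation(predicted, predicted_var, observed, observed_var):
    """Weighted fusion of the second (predicted) and first (geometric) orientations."""
    gain = predicted_var / (predicted_var + observed_var)
    # Wrap the innovation so that headings on either side of +/- pi fuse correctly.
    innovation = wrap_angle(observed - predicted)
    fused = wrap_angle(predicted + gain * innovation)
    return fused, (1.0 - gain) * predicted_var

second = predict_orientation(0.30, speed=10.0, steering_angle=0.0, wheelbase=2.7, dt=0.1)
target, target_var = fuse_orientation(second, 0.02, 0.34, 0.02)
```

To reflect the statement that fast vehicles should rely more on the motion-based prediction and slow or stationary vehicles more on the geometric result, `predicted_var` could be made smaller at high speed and larger at low speed; that tuning is left open here because the text does not quantify it.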

In this embodiment, the method for determining the orientation of a target object proposes a new geometric modeling scheme that applies to near and distant vehicles as well as stationary and moving vehicles, with no limitation in principle. Geometric modeling replaces the deep learning model of the prior art, which is advantageous in computing resources and latency and suits autonomous driving applications with strict real-time requirements. The scheme also splits the end-to-end deep learning model into several sub-models plus the new geometric model, which adds redundancy, benefits stability, mitigates long-tail problems, and improves the safety of autonomous driving applications. The geometric model is by design insensitive to the visual imaging model and applies to both pinhole and fisheye cameras, so it is insensitive to camera modules with different parameters (for example, camera extrinsic parameters) and is well suited to mass production. Finally, because it is a physical modeling scheme, it is highly interpretable.

The method for determining the orientation of a target object is further described below with reference to FIG. 5, taking its application to autonomous vehicle driving as an example. It specifically includes the following steps.

步骤502：接收包含车辆的视频帧，获取该视频帧中车辆的全车检测框、关键点以及车型。Step 502 : Receive a video frame including a vehicle, and obtain a full-vehicle detection frame, key points, and vehicle type of the vehicle in the video frame.

Specifically, the full-vehicle detection frame, key points, and vehicle type of the vehicle in the video frame are obtained using three separate deep learning models.

步骤504：根据车辆的车型获得该视频帧中车辆的初始长宽。Step 504: Obtain the initial length and width of the vehicle in the video frame according to the vehicle type.

Step 506: Under the constraint of the initial length and width of the vehicle, determine the real length and width of the vehicle from the key points of the vehicle and the extrinsic parameters of the camera that captured the video frame.

Step 508: Calculate the initial orientation of the vehicle in the video frame from the constraint relationship between the length and width of the vehicle and the mapped key points of the full-vehicle detection frame in the world coordinate system.

Step 510: Establish a bicycle motion model for the vehicle in advance, obtain the predicted orientation of the vehicle in the video frame, and use a Kalman filter to perform estimation on the initial orientation and the predicted orientation.

Step 512: After the Kalman filter fuses the initial orientation and the predicted orientation by weighting, the stable orientation of the vehicle is obtained.

The method for determining the orientation of a target object provided in this embodiment is applied to autonomous vehicle driving. By modeling the constraint relationship between the length and width of the vehicle and its orientation, a stable and accurate orientation of the vehicle can be obtained. The method applies to stationary and moving vehicles and to near and distant vehicles, has no limitation in principle, and performs well overall. Because the model used is a key point model, and key points carry strong image texture, it has advantages over the original large model in stability and on long-tail problems. Splitting the orientation computation into vehicle type estimation, length and width computation, Kalman filtering, and other parts allows the parts to be decoupled, handled separately, and fused, which makes the result robust. Finally, the overall orientation computation depends only on the key points and is itself computationally light. The method therefore performs well in accuracy, stability, range of application, and real-time behavior.

与上述方法实施例相对应，本说明书还提供了目标对象朝向确定装置实施例，图6示出了本说明书一个实施例提供的一种目标对象朝向确定装置的结构示意图。如图6所示，该装置包括：Corresponding to the above method embodiments, the present specification also provides an embodiment of an apparatus for determining the orientation of a target object. FIG. 6 shows a schematic structural diagram of an apparatus for determining the orientation of a target object provided by an embodiment of the present specification. As shown in Figure 6, the device includes:

第一视频接收模块602，被配置为接收包含目标对象的第i+1个视频帧，并获取所述第i+1个视频帧中目标对象的形状、关键点以及边缘检测框；The first video receiving module 602 is configured to receive the i+1 th video frame including the target object, and obtain the shape, key points and edge detection frame of the target object in the i+1 th video frame;

第一确定模块604，被配置为基于所述目标对象的形状以及关键点确定所述目标对象的目标属性值；a first determining module 604, configured to determine a target attribute value of the target object based on the shape and key points of the target object;

第二确定模块606，被配置为根据所述目标对象的目标属性值以及边缘检测框确定所述第i+1个视频帧中目标对象的第一朝向；The second determination module 606 is configured to determine the first orientation of the target object in the i+1 th video frame according to the target attribute value of the target object and the edge detection frame;

The first target orientation determination module 608 is configured to determine the second orientation of the target object in the (i+1)-th video frame based on the target orientation of the target object in the i-th video frame, and to determine the target orientation of the target object in the (i+1)-th video frame based on the first orientation and the second orientation.

可选的，所述装置，还包括：Optionally, the device further includes:

第二视频接收模块，被配置为接收包含目标对象的第i个视频帧，并获取所述第i个视频帧中目标对象的形状、关键点以及边缘检测框；The second video receiving module is configured to receive the ith video frame containing the target object, and obtain the shape, key points and edge detection frame of the target object in the ith video frame;

第三确定模块，被配置为基于所述目标对象的形状以及关键点确定所述目标对象的目标属性值；a third determining module, configured to determine a target attribute value of the target object based on the shape and key points of the target object;

第二目标朝向确定模块，被配置为根据所述目标对象的目标属性值以及边缘检测框确定所述第i个视频帧中目标对象的目标朝向。The second target orientation determination module is configured to determine the target orientation of the target object in the ith video frame according to the target attribute value of the target object and the edge detection frame.

可选的，所述第一视频接收模块602，进一步被配置为：Optionally, the first video receiving module 602 is further configured as:

可选的，所述第三确定模块，进一步被配置为：Optionally, the third determining module is further configured to:

可选的，所述第一确定模块604，进一步被配置为：Optionally, the first determining module 604 is further configured to:

可选的，所述第二确定模块606，进一步被配置为：Optionally, the second determining module 606 is further configured to:

可选的，所述第二目标朝向确定模块，进一步被配置为：Optionally, the second target orientation determination module is further configured as:

可选的，所述第一目标朝向确定模块608，进一步被配置为：Optionally, the first target orientation determination module 608 is further configured to:

One embodiment of this specification implements an apparatus for determining the orientation of a target object. The computation of the target orientation is split into several steps, including shape estimation of the target object, target attribute value computation, and orientation computation, so that the steps can be decoupled and fused well, and an accurate and stable target orientation of the target object is finally obtained.

The above is a schematic solution of a device for determining the orientation of a target object according to this embodiment. It should be noted that the technical solution of the device belongs to the same concept as the technical solution of the above method for determining the orientation of a target object; for details not described in the technical solution of the device, refer to the description of the technical solution of the method.

The device for determining the orientation of a target object provided in the embodiments of this specification can be applied to autonomous driving scenarios, in which the functional modules described in the above embodiments (for example, the first determining module, the second determining module, and the first target orientation determination module) calculate the orientation of the autonomous vehicle.

Of course, these algorithm modules (such as the above functional modules) vary with the type of autonomous vehicle. For example, logistics vehicles, public service vehicles, medical service vehicles, and terminal service vehicles involve different algorithm modules. The algorithm modules for these four types of autonomous vehicles are illustrated below:

A logistics vehicle is a vehicle used in a logistics scenario, for example a logistics vehicle with an automatic sorting function, a logistics vehicle with a refrigeration and heat-preservation function, or a logistics vehicle with a measurement function. These logistics vehicles involve different algorithm modules.

For example, a logistics vehicle can be equipped with an automated sorting device that automatically takes out, transports, sorts, and stores the goods after the vehicle arrives at its destination. This involves an algorithm module for cargo sorting, which mainly implements the control logic for taking out, handling, sorting, and storing goods.

As another example, for cold-chain logistics scenarios, the logistics vehicle can also be equipped with a refrigeration and heat-preservation device, which refrigerates or keeps warm the transported fruits, vegetables, aquatic products, frozen foods, and other perishable foods so that they stay in a suitable temperature environment, solving the problem of long-distance transportation of perishable foods. This involves an algorithm module for refrigeration and heat-preservation control. The module dynamically and adaptively calculates a suitable refrigeration or heat-preservation temperature from information such as the nature of the food (or item), its perishability, the transportation time, the current season, and the climate, and automatically adjusts the refrigeration and heat-preservation device to that temperature. Transport personnel therefore do not need to adjust the temperature manually when the vehicle carries different foods or items, which frees them from tedious temperature control and improves the efficiency of refrigerated and heat-preserved transportation.

As another example, in most logistics scenarios charges are based on the volume and/or weight of the package. Because the number of logistics packages is very large, relying solely on couriers to measure package volume and/or weight is very inefficient and incurs high labor costs. Some logistics vehicles are therefore fitted with a measuring device that automatically measures the volume and/or weight of a logistics package and calculates its fee. This involves an algorithm module for logistics package measurement, which identifies the type of the package, determines the measurement method (volume measurement, weight measurement, or a combined measurement of volume and weight), performs the volume and/or weight measurement according to the determined method, and calculates the fee from the measurement result.

A public service vehicle is a vehicle that provides some kind of public service, for example a fire truck, a de-icing vehicle, a sprinkler truck, a snow plow, a garbage disposal vehicle, or a traffic command vehicle. These public service vehicles involve different algorithm modules.

For example, the main task of an autonomous fire truck is to carry out appropriate fire-extinguishing work at the fire scene. This involves an algorithm module for the fire-extinguishing task, which at least needs to implement logic such as identification of the fire conditions, planning of the fire-extinguishing scheme, and automatic control of the fire-extinguishing device.

As another example, the main task of a de-icing vehicle is to remove ice and snow from the road surface. This involves a de-icing algorithm module, which at least needs to implement logic such as identifying the ice and snow conditions on the road, formulating a de-icing plan according to those conditions (for example, which road sections need de-icing and which do not, whether to spread salt, and how many grams of salt to spread), and automatically controlling the de-icing device once the plan has been determined.

A medical service vehicle is an autonomous vehicle that can provide one or more medical services, such as disinfection, temperature measurement, dispensing of medicine, and isolation. This involves algorithm modules that provide various self-service medical services. These modules mainly identify disinfection needs and control the disinfection device so that it disinfects the patient; or identify the patient's position and control the temperature measurement device to automatically approach the patient's forehead or a similar location to take the patient's temperature; or assess the patient's condition, issue a prescription according to the assessment, identify medicines and medicine containers, and control a medicine-picking manipulator so that it picks up medicine for the patient according to the prescription; and so on.

A terminal service vehicle is a self-service autonomous vehicle that can replace certain terminal devices and provide users with some convenient service. For example, such vehicles can provide users with printing, attendance, scanning, unlocking, payment, and retail services.

For example, in some application scenarios, users often need to go to a specific location to print or scan documents, which is time-consuming and laborious. Terminal service vehicles that provide printing/scanning services have therefore emerged. Such a service vehicle can be interconnected with the user's terminal device: the user issues a print instruction through the terminal device, and the service vehicle responds to the instruction, automatically prints the required documents, and can automatically deliver the printed documents to the user's location, so the user does not need to queue at a printer, which can greatly improve printing efficiency. Alternatively, the vehicle can move to the user's location in response to a scan instruction issued by the user through the terminal device, and the user places the document to be scanned on the scanning tool of the service vehicle to complete the scan, without queuing at a printer/scanner, saving time and effort. This involves an algorithm module for providing printing/scanning services, which at least needs to identify the interconnection with the user's terminal device, respond to printing/scanning instructions, locate the user's position, and control travel.

As another example, with the development of new retail business, more and more e-commerce merchants use self-service vending machines to bring goods to office buildings and public areas. However, these vending machines are placed in fixed positions and cannot be moved, so users must walk up to the machine to buy the goods they need, which is still inconvenient. Self-driving vehicles that provide retail services have therefore emerged. These service vehicles can carry goods and move automatically, and can provide a corresponding self-service shopping app or shopping portal. Using a terminal such as a mobile phone, the user can place an order with the retail service vehicle through the app or portal; the order includes the names and quantities of the goods to be purchased and the user's location. After receiving the order request, the vehicle can determine whether its remaining stock contains the goods the user wants and whether the quantity is sufficient. If so, it can carry the goods to the user's location automatically and hand them over, which further improves shopping convenience, saves the user's time, and lets the user spend that time on more important matters. This involves algorithm modules for providing retail services, which mainly implement logic such as responding to user order requests, order processing, product information maintenance, user location positioning, and payment management.

Referring to FIG. 7, FIG. 7 shows a structural block diagram of a computing device 700 provided according to an embodiment of this specification. The components of the computing device 700 include, but are not limited to, a memory 710 and a processor 720. The processor 720 is connected to the memory 710 through a bus 730, and a database 750 is used for storing data.

The computing device 700 also includes an access device 740, which enables the computing device 700 to communicate via one or more networks 760. Examples of these networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 740 may include one or more of any type of wired or wireless network interface (for example, a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and the like.

In one embodiment of this specification, the above components of the computing device 700 and other components not shown in FIG. 7 may also be connected to each other, for example through a bus. It should be understood that the structural block diagram of the computing device shown in FIG. 7 is for illustrative purposes only and does not limit the scope of this specification. Those skilled in the art can add or replace other components as required.

The computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (for example, a tablet computer, personal digital assistant, laptop computer, notebook computer, or netbook), a mobile phone (for example, a smartphone), a wearable computing device (for example, a smart watch or smart glasses), or another type of mobile device, or a stationary computing device such as a desktop computer or PC. The computing device 700 may also be a mobile or stationary server.

The processor 720 is configured to execute computer-executable instructions which, when executed by the processor, implement the steps of the method for determining the orientation of the target object.

The above is a schematic solution of a computing device according to this embodiment. It should be noted that the technical solution of the computing device belongs to the same concept as the technical solution of the above method for determining the orientation of a target object; for details not described in the technical solution of the computing device, refer to the description of the technical solution of the method.

An embodiment of this specification further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the method for determining the orientation of the target object.

The above is a schematic solution of a computer-readable storage medium according to this embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the above method for determining the orientation of a target object; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the method.

Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the figures do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.

The computer instructions include computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be expanded or reduced as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions. However, those skilled in the art should understand that the embodiments of this specification are not limited by the described order of actions, because according to the embodiments of this specification certain steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by the embodiments of this specification.

In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.

The preferred embodiments of this specification disclosed above are intended only to help explain this specification. The optional embodiments do not describe every detail exhaustively, nor do they limit the invention to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of the embodiments of this specification. These embodiments were selected and described in detail in order to better explain the principles and practical applications of the embodiments of this specification, so that those skilled in the art can understand and make good use of this specification. This specification is limited only by the claims and their full scope and equivalents.

Claims

1. A method for determining the orientation of a target object, comprising:

Receive the i+1 th video frame including the target object, and obtain the shape, key points and edge detection frame of the target object in the i+1 th video frame;

Determine the target attribute value of the target object based on the shape and key points of the target object;

Determine the first orientation of the target object in the i+1 th video frame according to the target attribute value of the target object and the edge detection frame;

Determine a second orientation of the target object in the i+1 th video frame based on the target orientation of the target object in the ith video frame, and determine the target orientation of the target object in the i+1 th video frame based on the first orientation and the second orientation.
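The final step of claim 1 combines a per-frame estimate (the first orientation) with an estimate carried over from the previous frame (the second orientation). The claim does not state the combination rule, so the following is only a minimal sketch under two assumptions that are not from the source: the second orientation is taken to be the previous frame's target orientation, and the two are blended as a weighted mean on the unit circle. The names `fuse_orientations` and `weight_first` are hypothetical.

```python
import math


def fuse_orientations(first: float, second: float, weight_first: float = 0.5) -> float:
    """Blend two headings (radians) into one target orientation.

    ASSUMPTION: a weighted circular mean stands in for the fusion rule,
    which claim 1 leaves unspecified. Averaging unit vectors instead of
    raw angles avoids errors at the +pi / -pi wrap-around.
    """
    x = weight_first * math.cos(first) + (1.0 - weight_first) * math.cos(second)
    y = weight_first * math.sin(first) + (1.0 - weight_first) * math.sin(second)
    return math.atan2(y, x)


# Hypothetical usage: the second orientation is assumed to be the target
# orientation already computed for frame i.
previous_target = math.radians(170.0)
first_orientation = math.radians(-175.0)  # estimate from frame i+1
target = fuse_orientations(first_orientation, previous_target)
```

Blending on the unit circle keeps headings near 180 degrees from averaging to a value near 0, which a plain arithmetic mean of the angles would produce.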

2. The method for determining the orientation of a target object according to claim 1, before receiving the i+1 th video frame including the target object, further comprising:

Receive the ith video frame containing the target object, and obtain the shape, key points and edge detection frame of the target object in the ith video frame;

The target orientation of the target object in the ith video frame is determined according to the target attribute value of the target object and the edge detection frame.

3. The method for determining the orientation of a target object according to claim 1 or 2, wherein the acquisition of the shape, key points and edge detection frame of the target object in the i+1 th video frame comprises:

The i+1 th video frame is input into the first recognition model, the second recognition model and the third recognition model respectively, and the shape, key points and edge detection frame of the target object in the i+1 th video frame are acquired.

4. The method for determining the orientation of a target object according to claim 2, wherein determining the target attribute value of the target object based on the shape and key points of the target object, comprising:

Determine the first initial attribute value of the target object in the ith video frame based on the shape of the target object in the ith video frame;

Determine the second initial attribute value of the target object in the i-th video frame based on the key points of the target object in the i-th video frame and the camera extrinsic parameters for obtaining the i-th video frame;

In the case that the second initial attribute value is less than or equal to the first initial attribute value, the second initial attribute value is used as the target attribute value of the target object in the ith video frame.
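The selection rule in claim 4 can be sketched as a small function. The claim only covers the case where the second initial attribute value does not exceed the first; the fallback branch below (returning the shape-based value) is an assumption added so the sketch is total, and the function name is hypothetical.

```python
def select_attribute_value(first_initial: float, second_initial: float) -> float:
    """Choose the target attribute value for the i-th frame.

    first_initial:  value derived from the object's shape.
    second_initial: value derived from key points and camera extrinsics.
    """
    if second_initial <= first_initial:
        # Case stated in claim 4: use the key-point-based value.
        return second_initial
    # ASSUMPTION: the claim does not cover this case; the shape-based
    # value is used here only as a placeholder fallback.
    return first_initial
```

Under this assumed fallback the rule reduces to taking the smaller of the two values, so the shape-based value acts as an upper bound on the key-point-based one.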

5. The method for determining the orientation of a target object according to claim 4, wherein determining the target attribute value of the target object based on the shape and key points of the target object, comprising:

Based on the shape of the target object in the i+1 th video frame, determine the first initial attribute value of the target object in the i+1 th video frame;

Determine the second initial attribute value of the target object in the i+1 th video frame based on the key points of the target object in the i+1 th video frame and the camera extrinsic parameters used to obtain the i+1 th video frame;

In the case that the second initial attribute value is less than or equal to the first initial attribute value, determine the third initial attribute value of the target object in the i+1 th video frame based on the target attribute value of the target object in the i th video frame;

The target attribute value of the target object in the i+1 th video frame is determined based on the second initial attribute value and the third initial attribute value.
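Claim 5 adds a temporal term: a third initial attribute value derived from the previous frame's target attribute value is combined with the second initial attribute value. How the third value is derived and how the two are combined are not stated, so this sketch assumes the third value simply carries the previous target value forward and that the combination is an exponential-smoothing blend. The name `attribute_value_next_frame` and the parameter `alpha` are hypothetical.

```python
def attribute_value_next_frame(
    first_initial: float,
    second_initial: float,
    prev_target: float,
    alpha: float = 0.5,
) -> float:
    """Target attribute value for frame i+1 (illustrative only)."""
    if second_initial <= first_initial:
        # ASSUMPTION: the third initial value is the previous frame's
        # target attribute value carried forward unchanged.
        third_initial = prev_target
        # ASSUMPTION: blend by exponential smoothing with weight alpha.
        return alpha * second_initial + (1.0 - alpha) * third_initial
    # ASSUMPTION: case not covered by the claim; fall back to the
    # shape-based value.
    return first_initial
```

With `alpha = 1.0` the function ignores the previous frame and matches the single-frame rule of claim 4; smaller values of `alpha` give the previous frame more influence and damp frame-to-frame jitter in the attribute value.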

6. The method for determining the orientation of a target object according to claim 1, wherein determining the first orientation of the target object in the i+1 th video frame according to the target attribute value of the target object and an edge detection frame, comprising:

Mapping the edge detection frame of the target object in the i+1 th video frame in a preset coordinate system to obtain the mapping key point of the edge detection frame of the target object in the i+1 th video frame;

The first orientation of the target object in the i+1 th video frame is determined based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object.

7. The method for determining the orientation of a target object according to claim 6, wherein determining the first orientation of the target object in the i+1 th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object comprises:

Determine the first target edge and the second target edge of the target object based on the edge detection frame;

Determine the edge value of the first target edge and the edge value of the second target edge according to the target attribute value of the target object;

Based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge, calculate the first orientation of the target object in the i+1 th video frame.

8. The method for determining the orientation of a target object according to claim 2, wherein the target orientation of the target object in the i-th video frame is determined according to the target attribute value of the target object and an edge detection frame, comprising:

The edge detection frame of the target object in the i-th video frame is mapped in a preset coordinate system, and the mapping key point of the edge detection frame of the target object in the i-th video frame is obtained;

Determine the first orientation of the target object in the i-th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object;

In the case where i is 1, the first orientation of the target object in the i-th video frame is used as the target orientation of the target object in the i-th video frame.

9. The method for determining the orientation of a target object according to claim 8, wherein determining the first orientation of the target object in the i-th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object comprises:

Based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge, calculate the first orientation of the target object in the i-th video frame.

10. A device for determining the orientation of a target object, comprising:

The first video receiving module is configured to receive the i+1 th video frame including the target object, and obtain the shape, key points and edge detection frame of the target object in the i+1 th video frame;

a first determining module, configured to determine a target attribute value of the target object based on the shape and key points of the target object;

a second determining module, configured to determine the first orientation of the target object in the i+1 th video frame according to the target attribute value of the target object and the edge detection frame;

a first target orientation determination module configured to determine a second orientation of the target object in the i+1 th video frame based on the target orientation of the target object in the i th video frame, based on the first orientation and the second orientation Determine the target orientation of the target object in the i+1 th video frame.
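The four modules of claim 10 can be pictured as a small pipeline with one pluggable callable per module. This is a structural sketch only: the class name, field names, and callable signatures are hypothetical, the recognition and geometry steps are left to the injected callables, and the second orientation is assumed to equal the previous frame's target orientation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

# (shape, key points, edge detection frame) for one video frame
Detection = Tuple[Any, Any, Any]


@dataclass
class OrientationDevice:
    receive: Callable[[Any], Detection]               # first video receiving module
    attribute: Callable[[Any, Any], float]            # first determining module
    first_orientation: Callable[[float, Any], float]  # second determining module
    fuse: Callable[[float, float], float]             # first target orientation determination module
    prev_target: Optional[float] = None               # target orientation of the previous frame

    def process(self, frame: Any) -> float:
        shape, key_points, box = self.receive(frame)
        attr = self.attribute(shape, key_points)
        first = self.first_orientation(attr, box)
        if self.prev_target is None:
            # First frame: no history, so the first orientation is the target.
            target = first
        else:
            # ASSUMPTION: second orientation = previous target orientation.
            target = self.fuse(first, self.prev_target)
        self.prev_target = target
        return target
```

Keeping each module behind its own callable mirrors the decoupling the description emphasizes: a recognition model or a fusion rule can be swapped without touching the other steps.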

11. A computing device comprising:

memory and processor;

The memory is used for storing computer-executable instructions, and the processor is used for executing the computer-executable instructions; when the computer-executable instructions are executed by the processor, the steps of the method for determining the orientation of a target object according to any one of claims 1-9 are implemented.

12. A computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the steps of the method for determining the orientation of a target object according to any one of claims 1-9.