TWI739601B

TWI739601B - Image processing method, electronic equipment and storage medium

Info

Publication number: TWI739601B
Application number: TW109131909A
Authority: TW
Inventors: 夏鵬程; 侯軍; 伊帥
Original assignee: 大陸商上海商湯智能科技有限公司
Priority date: 2020-05-28
Filing date: 2020-09-16
Publication date: 2021-09-11
Also published as: CN111724442B; WO2021237960A1; CN111724442A; TW202145147A

Abstract

This application discloses an image processing method, an electronic device, and a computer-readable storage medium. The method includes: obtaining a first position of a first person point in an image to be processed and a second position of a first person frame in the image to be processed, where the first position and the second position both characterize the position of a first person in the image to be processed; and obtaining, according to the first position and the second position, a third position of the first person in the image to be processed.

Description

Image processing method, electronic equipment and storage medium

The present invention relates to the field of computer vision, and relates to, but is not limited to, an image processing method, an electronic device, and a computer-readable storage medium.

In public places (such as squares, supermarkets, subway stations, and docks), the flow of people is sometimes excessive, which leads to overly dense crowds. Public accidents such as stampedes are then prone to occur. Determining the position of a person in an image is therefore of great significance.

At present, computer vision techniques can obtain the position of a person in an image by performing head detection on the image, but the accuracy of the resulting position is low.

Embodiments of the present invention provide an image processing method, an electronic device, and a computer-readable storage medium.

An embodiment of the present invention provides an image processing method. The method includes: obtaining a first position of a first person point in an image to be processed and a second position of a first person frame in the image to be processed, where the first position and the second position both characterize the position of a first person in the image to be processed; and obtaining, according to the first position and the second position, a third position of the first person in the image to be processed.

In some embodiments of the present invention, the method further includes: obtaining a first confidence of the first position and a second confidence of the second position, where the first confidence is negatively correlated with the scale of the first position and the second confidence is positively correlated with the scale of the second position. Obtaining the third position of the first person in the image to be processed according to the first position and the second position includes: taking whichever of the first position and the second position has the highest confidence as a fourth position; and obtaining the third position according to the fourth position.
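The confidence-based selection described above can be sketched as follows. This is only a minimal sketch under stated assumptions: the text says no more than that the first confidence falls as the scale grows and the second confidence rises with it, so the two formulas, the `reference_scale` parameter, the function names, and the frame layout `(x_min, y_min, x_max, y_max)` are all hypothetical.

```python
from typing import Tuple, Union

Point = Tuple[float, float]
Frame = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def point_confidence(scale: float, reference_scale: float = 20.0) -> float:
    """Hypothetical first confidence: decreases as the scale grows."""
    return reference_scale / (reference_scale + scale)


def frame_confidence(scale: float, reference_scale: float = 20.0) -> float:
    """Hypothetical second confidence: increases as the scale grows."""
    return scale / (reference_scale + scale)


def select_fourth_position(
    point: Point, point_scale: float, frame: Frame, frame_scale: float
) -> Union[Point, Frame]:
    """Return whichever of the two positions has the higher confidence."""
    c1 = point_confidence(point_scale)
    c2 = frame_confidence(frame_scale)
    return point if c1 >= c2 else frame


# Small scale (distant person): the point position is kept.
far = select_fourth_position((5.0, 5.0), 5.0, (3.0, 3.0, 7.0, 7.0), 5.0)
# Large scale (nearby person): the frame position is kept.
near = select_fourth_position((50.0, 50.0), 80.0, (10.0, 10.0, 90.0, 90.0), 80.0)
```

With these assumed formulas the choice flips from the point to the frame once the scale exceeds `reference_scale`, which matches the stated intent of using points for distant people and frames for nearby people.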

In some embodiments of the present invention, obtaining the first position of the first person point in the image to be processed and the second position of the first person frame in the image to be processed includes: performing person positioning processing on the image to be processed to obtain the first position and the position of at least one person frame; determining, according to the first position and the position of the at least one person frame, the distance between the at least one person frame and the first person point to obtain at least one first distance; and taking the person frame corresponding to a second distance as the first person frame and determining the second position of the first person frame in the image to be processed, where the second distance is a distance, among the at least one first distance, that does not exceed a distance threshold.
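The matching of a person point to a person frame by thresholded distance can be illustrated with a short sketch. Several details are assumptions rather than statements from the text: frames are taken to be axis-aligned rectangles `(x_min, y_min, x_max, y_max)`, the distance is measured from the point to the frame's geometric center, and when several frames fall within the threshold the nearest one is chosen. The names `frame_center` and `match_frame` are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Frame = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def frame_center(frame: Frame) -> Point:
    """Geometric center of an axis-aligned frame."""
    x_min, y_min, x_max, y_max = frame
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def match_frame(point: Point, frames: Sequence[Frame], threshold: float) -> Optional[int]:
    """Return the index of the nearest frame within `threshold`, or None."""
    candidates = []
    for index, frame in enumerate(frames):
        cx, cy = frame_center(frame)
        distance = math.hypot(point[0] - cx, point[1] - cy)  # a "first distance"
        if distance <= threshold:  # only distances not exceeding the threshold qualify
            candidates.append((distance, index))
    if not candidates:
        return None
    return min(candidates)[1]  # smallest distance wins; ties go to the lower index


frames = [(0.0, 0.0, 4.0, 4.0), (8.0, 8.0, 14.0, 14.0), (100.0, 100.0, 110.0, 110.0)]
matched = match_frame((10.0, 10.0), frames, threshold=5.0)
```

Returning `None` when no frame qualifies leaves the caller free to fall back to the point position alone.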

In some embodiments of the present invention, performing person positioning processing on the image to be processed to obtain the first position includes: performing person positioning processing on the image to be processed to obtain the position of at least one person point; and taking the position with the highest confidence among the positions of the at least one person point as the first position.

In some embodiments of the present invention, the at least one person frame includes a second person frame, and the at least one first distance includes a third distance between the first person point and the second person frame. Determining the distance between the at least one person frame and the first person point according to the first position and the position of the at least one person frame to obtain at least one first distance includes: obtaining a fourth distance between the first person point and the second person frame according to the first position and the position of the second person frame; determining the difference between a first scale and a second scale to obtain a first difference, where the first scale is the scale of the first person point in the image to be processed and the second scale is the scale of the second person frame in the image to be processed; and obtaining the third distance according to the fourth distance and the first difference, where the third distance is positively correlated with the first difference.
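One way to read the scale-adjusted distance is sketched below. The text only requires that the third distance grow with the first difference; the additive form, the absolute difference, and the `weight` parameter are assumptions made for illustration, and the function name `third_distance` is hypothetical.

```python
def third_distance(
    fourth_distance: float, first_scale: float, second_scale: float, weight: float = 1.0
) -> float:
    """Combine a geometric distance with a scale difference.

    Assumed form: third = fourth + weight * |first_scale - second_scale|,
    which is positively correlated with the scale difference when weight > 0.
    """
    first_difference = abs(first_scale - second_scale)
    return fourth_distance + weight * first_difference


# Same geometric distance, but the second frame's scale differs more from the point's.
similar_scale = third_distance(3.0, 10.0, 12.0)
different_scale = third_distance(3.0, 10.0, 20.0)
```

Under this form, a frame whose scale differs greatly from that of the point appears farther away and is therefore less likely to fall within a distance threshold.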

In some embodiments of the present invention, the method further includes: determining a second person point according to the position of the second person frame; determining the midpoint between the first person point and the second person point to obtain a third person point; and obtaining a first scale index, where the first scale index characterizes the mapping between a first size and a second size, the first size is the size of a first reference object located at a first scale position, the second size is the size of the first reference object in the real world, and the first scale position is the position of the third person point in the image to be processed. Determining the difference between the first scale and the second scale to obtain the first difference includes: obtaining the first difference according to the first scale index.
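The construction of the third person point can be sketched in a few lines. The sketch assumes that the second person point is the geometric center of the second person frame and that frames are axis-aligned rectangles `(x_min, y_min, x_max, y_max)`; the function names are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]
Frame = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def second_person_point(frame: Frame) -> Point:
    """Second person point, taken here as the frame's geometric center."""
    x_min, y_min, x_max, y_max = frame
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def third_person_point(first_point: Point, second_point: Point) -> Point:
    """Midpoint between the first and second person points."""
    return (
        (first_point[0] + second_point[0]) / 2.0,
        (first_point[1] + second_point[1]) / 2.0,
    )


p2 = second_person_point((10.0, 20.0, 30.0, 40.0))
p3 = third_person_point((0.0, 10.0), p2)
```

The position of the third person point is where the first scale index would then be read.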

In some embodiments of the present invention, obtaining the first scale index includes: performing object detection processing on the image to be processed to obtain a first object frame and a second object frame; obtaining a first length according to the length of the first object frame in the y-axis direction, and obtaining a second length according to the length of the second object frame in the y-axis direction, where the y-axis is the vertical axis of the pixel coordinate system of the image to be processed; obtaining a second scale index according to the first length and a third length of a first object in the real world, and obtaining a third scale index according to the second length and a fourth length of a second object in the real world, where the first object is the detection object contained in the first object frame and the second object is the detection object contained in the second object frame; the second scale index characterizes the mapping between a third size and a fourth size, the third size is the size of a second reference object located at a second scale position, the fourth size is the size of the second reference object in the real world, and the second scale position is a position determined in the image to be processed according to the position of the first object frame; the third scale index characterizes the mapping between a fifth size and a sixth size, the fifth size is the size of a third reference object located at a third scale position, the sixth size is the size of the third reference object in the real world, and the third scale position is a position determined in the image to be processed according to the position of the second object frame; performing curve fitting processing on the second scale index and the third scale index to obtain a scale index map of the image to be processed, where a first pixel value in the scale index map characterizes the mapping between a seventh size and an eighth size, the seventh size is the size of a fourth reference object located at a fourth scale position, the eighth size is the size of the fourth reference object in the real world, the first pixel value is the pixel value of a first pixel, the fourth scale position is the position of a second pixel in the image to be processed, and the position of the first pixel in the scale index map is the same as the position of the second pixel in the image to be processed; and obtaining the first scale index according to the scale index map and the position of the third person point.
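The steps for building a scale index map can be sketched as follows. The sketch makes several assumptions that the text does not fix: a scale index is expressed as pixels per real-world unit of length, each object's scale position is the geometric center of its frame, the index varies only with the image row, and the "curve fitting" is an ordinary least-squares straight line. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def scale_sample(frame: Frame, real_length: float) -> Tuple[float, float]:
    """Return (row of the object point, pixels per real-world unit) for one object frame."""
    _, y_min, _, y_max = frame
    pixel_length = y_max - y_min      # length of the frame along the y-axis
    y_center = (y_min + y_max) / 2.0  # object point taken as the geometric center
    return y_center, pixel_length / real_length


def fit_line(samples: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of index = slope * y + intercept (assumed curve model)."""
    n = len(samples)
    mean_y = sum(y for y, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    variance = sum((y - mean_y) ** 2 for y, _ in samples)
    if variance == 0:
        raise ValueError("samples must lie on at least two different rows")
    slope = sum((y - mean_y) * (v - mean_v) for y, v in samples) / variance
    return slope, mean_v - slope * mean_y


def build_scale_index_map(
    samples: Sequence[Tuple[float, float]], height: int, width: int
) -> List[List[float]]:
    """Scale index map with the same size as the image; every row shares one value."""
    slope, intercept = fit_line(samples)
    return [[slope * y + intercept] * width for y in range(height)]


def lookup(scale_map: List[List[float]], point: Tuple[float, float]) -> float:
    """Read the scale index at a pixel position given as (x, y)."""
    x, y = point
    return scale_map[int(round(y))][int(round(x))]


# Two detected objects, each assumed to be 2.0 units tall in the real world.
samples = [
    scale_sample((10.0, 100.0, 30.0, 140.0), 2.0),  # 40 px tall, centered on row 120
    scale_sample((50.0, 280.0, 90.0, 360.0), 2.0),  # 80 px tall, centered on row 320
]
scale_map = build_scale_index_map(samples, height=400, width=4)
first_scale_index = lookup(scale_map, (2.0, 200.0))  # value at a third person point
```

Two samples determine a line exactly; with more detected objects the same least-squares fit averages out detection noise. A different curve model could be substituted in `fit_line` without changing the rest of the sketch.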

In some embodiments of the present invention, the second scale position is the position of a first object point in the image to be processed, and the third scale position is the position of a second object point in the image to be processed. The first object point is one of the following: the geometric center of the first object frame, or a vertex of the first object frame. The second object point is one of the following: the geometric center of the second object frame, or a vertex of the second object frame.

In some embodiments of the present invention, the second person point is one of the following: the geometric center of the second person frame, or a vertex of the second person frame.

In some embodiments of the present invention, the pixel region covered by the first person point and the pixel region contained in the first person frame are both human head regions.

In some embodiments of the present invention, the shape of the first person frame is a rectangle, a rhombus, a circle, an ellipse, or a polygon.

An embodiment of the present invention further provides an image processing device. The device includes: an acquiring unit configured to obtain a first position of a first person point in an image to be processed and a second position of a first person frame in the image to be processed, where the first position and the second position both characterize the position of a first person in the image to be processed; and a first processing unit configured to obtain, according to the first position and the second position, a third position of the first person in the image to be processed.

In some embodiments of the present invention, the acquiring unit is further configured to obtain a first confidence of the first position and a second confidence of the second position, where the first confidence is negatively correlated with the scale of the first position and the second confidence is positively correlated with the scale of the second position. The first processing unit is configured to: take whichever of the first position and the second position has the highest confidence as a fourth position; and obtain the third position according to the fourth position.

In some embodiments of the present invention, the acquiring unit is further configured to: perform person positioning processing on the image to be processed to obtain the first position and the position of at least one person frame; determine, according to the first position and the position of the at least one person frame, the distance between the at least one person frame and the first person point to obtain at least one first distance; and take the person frame corresponding to a second distance as the first person frame, where the second distance is a distance, among the at least one first distance, that does not exceed a distance threshold.

In some embodiments of the present invention, the acquiring unit is configured to: perform person positioning processing on the image to be processed to obtain the position of at least one person point; and take the position with the highest confidence among the positions of the at least one person point as the first position.

In some embodiments of the present invention, the at least one person frame includes a second person frame, and the at least one first distance includes a third distance between the first person point and the second person frame. The acquiring unit is configured to: obtain a fourth distance between the first person point and the second person frame according to the first position and the position of the second person frame; determine the difference between a first scale and a second scale to obtain a first difference, where the first scale is the scale of the first person point in the image to be processed and the second scale is the scale of the second person frame in the image to be processed; and obtain the third distance according to the fourth distance and the first difference, where the third distance is positively correlated with the first difference.

In some embodiments of the present invention, the device further includes a second processing unit configured to: determine a second person point according to the position of the second person frame; and determine the midpoint between the first person point and the second person point to obtain a third person point. The acquiring unit is further configured to: obtain a first scale index, where the first scale index characterizes the mapping between a first size and a second size, the first size is the size of a first reference object located at a first scale position, the second size is the size of the first reference object in the real world, and the first scale position is the position of the third person point in the image to be processed; and obtain the first difference according to the first scale index.

In some embodiments of the present invention, the acquiring unit is configured to: perform object detection processing on the image to be processed to obtain a first object frame and a second object frame; obtain a first length according to the length of the first object frame in the y-axis direction, and obtain a second length according to the length of the second object frame in the y-axis direction, where the y-axis is the vertical axis of the pixel coordinate system of the image to be processed; obtain a second scale index according to the first length and a third length of a first object in the real world, and obtain a third scale index according to the second length and a fourth length of a second object in the real world, where the first object is the detection object contained in the first object frame and the second object is the detection object contained in the second object frame; the second scale index characterizes the mapping between a third size and a fourth size, the third size is the size of a second reference object located at a second scale position, the fourth size is the size of the second reference object in the real world, and the second scale position is a position determined in the image to be processed according to the position of the first object frame; the third scale index characterizes the mapping between a fifth size and a sixth size, the fifth size is the size of a third reference object located at a third scale position, the sixth size is the size of the third reference object in the real world, and the third scale position is a position determined in the image to be processed according to the position of the second object frame; perform curve fitting processing on the second scale index and the third scale index to obtain a scale index map of the image to be processed, where a first pixel value in the scale index map characterizes the mapping between a seventh size and an eighth size, the seventh size is the size of a fourth reference object located at a fourth scale position, the eighth size is the size of the fourth reference object in the real world, the first pixel value is the pixel value of a first pixel, the fourth scale position is the position of a second pixel in the image to be processed, and the position of the first pixel in the scale index map is the same as the position of the second pixel in the image to be processed; and obtain the first scale index according to the scale index map and the position of the third person point.

An embodiment of the present invention further provides a processor configured to execute any one of the image processing methods described above.

An embodiment of the present invention further provides an electronic device, including a processor and a memory. The memory is used to store computer program code, and the computer program code includes computer instructions. When the processor executes the computer instructions, the electronic device executes any one of the image processing methods described above.

An embodiment of the present invention further provides a computer-readable storage medium in which a computer program is stored. The computer program includes program instructions that, when executed by a processor, cause the processor to execute any one of the image processing methods described above.

An embodiment of the present invention further provides a computer program. The computer program includes a computer program or instructions that, when run on a computer, cause the computer to execute any one of the image processing methods described above.

In the embodiments of the present invention, when a person is far away in the image to be processed, the position of the person in the image to be processed is determined according to the position of the person point; when a person is nearby in the image to be processed, the position of the person in the image to be processed is determined according to the position of the person frame. This improves the accuracy of the position of the person in the image to be processed.

It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present invention.

To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish different objects, not to describe a specific order. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such a process, method, product, or device.

Reference herein to an "embodiment" means that a specific feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, both explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.

Some concepts that appear below are defined first. In the embodiments of the present invention, [a, b] denotes the interval of values greater than or equal to a and less than or equal to b.

In the embodiments of the present invention, a nearby person in an image corresponds to a large image scale, and a distant person in an image corresponds to a small image scale. "Far" means that the distance between the real person corresponding to the person in the image and the imaging device that captured the image is large; "near" means that this distance is small.

In an image, the pixel region covered by a nearby person has a larger area than the pixel region covered by a distant person. For example, person A in FIG. 1 is nearer than person B, and the pixel region covered by person A has a larger area than the pixel region covered by person B. The scale of the pixel region covered by a nearby person is large, and the scale of the pixel region covered by a distant person is small. Therefore, the area of the pixel region covered by a person is positively correlated with the scale of that pixel region.

In the embodiments of the present invention, positions in an image all refer to positions in the pixel coordinate system of the image. The abscissa of the pixel coordinate system indicates the column in which a pixel is located, and the ordinate indicates the row in which the pixel is located. For example, in the image shown in FIG. 2, the pixel coordinate system XOY is constructed with the upper-left corner of the image as the origin O, the direction parallel to the rows of the image as the X-axis, and the direction parallel to the columns of the image as the Y-axis. The units of the abscissa and the ordinate are pixels. For example, in FIG. 2 the coordinates of pixel A₁₁ are (1, 1), the coordinates of pixel A₂₃ are (3, 2), the coordinates of pixel A₄₂ are (2, 4), and the coordinates of pixel A₃₄ are (4, 3).
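Because the coordinate order (column, row) is the reverse of the usual row-major array index order, a short sketch may help. It reproduces the four example pixels above on a hypothetical 4 × 4 grid of labels; the helper names and the grid itself are illustrative assumptions, and the 1-based numbering follows the example coordinates.

```python
from typing import List, Tuple


def pixel_coordinate(row: int, column: int) -> Tuple[int, int]:
    """Pixel A_{row,column} has coordinates (x, y) = (column, row)."""
    return (column, row)


def pixel_at(image: List[List[str]], coordinate: Tuple[int, int]) -> str:
    """Read a pixel from a row-major grid using 1-based (x, y) coordinates."""
    x, y = coordinate
    return image[y - 1][x - 1]  # row index comes from y, column index from x


# Hypothetical 4 x 4 image whose pixels are labelled by row and column.
image = [[f"A{r}{c}" for c in range(1, 5)] for r in range(1, 5)]

coords = [pixel_coordinate(1, 1), pixel_coordinate(2, 3),
          pixel_coordinate(4, 2), pixel_coordinate(3, 4)]
```

Here `coords` evaluates to `[(1, 1), (3, 2), (2, 4), (4, 3)]`, matching the coordinates listed for A₁₁, A₂₃, A₄₂, and A₃₄.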

In public places (such as squares, supermarkets, subway stations, and docks), the flow of people is sometimes excessive, which leads to overly dense crowds. Public accidents such as stampedes are then prone to occur. Determining the number of people in an image, the crowd density in the image, and the distribution of the crowd in the image is therefore of great significance.

In some embodiments of the present invention, the number of people in an image, the crowd density in the image, and the distribution of the crowd in the image can be determined by determining the position of each person in the image. With the development of computer vision technology, computer vision-based methods can determine the position of each person in an image (for ease of description, determining the position of each person in an image is referred to below as crowd positioning).

In the related art, computer vision techniques can obtain the head frames in an image (that is, frames containing human heads) by performing head detection processing on the image. The position of a person in the image can then be determined from the position of the head frame. Because nearby heads appear larger in the image than distant heads, and because distant heads are densely packed when the crowd density is high, the accuracy of the positions of distant head frames is low, which in turn lowers the accuracy of crowd positioning. On this basis, the embodiments of the present invention provide a method for improving the accuracy of crowd positioning.

The image processing method of the embodiments of the present invention may be implemented by an image processing device. In some embodiments, the image processing device may be one of the following: a mobile phone, a computer, a server, or a tablet computer. The image processing method of the embodiments of the present invention may also be implemented by a processor executing computer code. The embodiments of the present invention are described below with reference to the accompanying drawings.

Please refer to FIG. 3, which is a schematic flowchart of an image processing method according to an embodiment of the present invention.

Step 301: Obtain a first position of a first person point in an image to be processed and a second position of a first person frame in the image to be processed, where the first position and the second position both characterize the position of a first person in the image to be processed.

In the embodiments of the present invention, both the first person point and the first person frame can be obtained by performing person detection processing on the image to be processed. In some embodiments, the person detection processing can be implemented by a person detection network. The person detection network is obtained by training a convolutional neural network with training images, where the annotation information of a training image includes at least one of the following: the positions of person points and the positions of person frames. When the annotation information of the training images includes the positions of person points, processing the image to be processed with the person detection network yields the position of at least one person point, including the first position. When the annotation information of the training images includes the positions of person frames, processing the image to be processed with the person detection network yields the position of at least one person frame, including the second position. When the annotation information of the training images includes both the positions of person points and the positions of person frames, processing the image to be processed with the person detection network yields the position of at least one person point, including the first position, and the position of at least one person frame, including the second position.

In the image to be processed, the pixel area covered by a character point can be regarded as a person region, where a person region is a pixel area covered by a human body. For example, the area covered by the first character point may lie within the pixel area covered by a head, by an arm, or by a torso. Likewise, the pixel area contained in a character frame can be regarded as a person region. For example, the area contained in the first character frame may be the pixel area covered by a head, by a face, or by a torso.

In the embodiments of the present invention, the first character point may have any shape; the present invention does not limit its shape. In some embodiments, the shape of the first character point includes at least one of the following: a circle, a rhombus, a rectangle, an ellipse, a polygon.

In the embodiments of the present invention, the first position is the position of the first character point in the pixel coordinate system of the image to be processed. For example, if the first character point is a circle, the first position may be the position of its center in the pixel coordinate system. If the first character point is a rectangle, the first position may be the position of its geometric center in the pixel coordinate system.

In some embodiments of the present invention, the first character point is a pixel in the image to be processed, and the first position is the position of that pixel in the pixel coordinate system. For example, in the image to be processed shown in FIG. 4, if the first character point is pixel A₁₃, the first position is the position of pixel A₁₃ in the pixel coordinate system.

In some embodiments of the present invention, the first position also carries size information of the first character point. For example, if the first character point is a circle, the first position also carries its radius; if the first character point is a rectangle, the first position also carries its length and width. From the first position, the pixel area covered by the first character point in the image to be processed can be determined. Note that if the first character point is a single pixel of the image to be processed, the pixel area it covers can be determined from the first position even when the first position carries no size information.

In the embodiments of the present invention, the first character frame may have any shape; the present invention does not limit its shape. In some embodiments of the present invention, the shape of the first character frame includes at least one of the following: a rectangle, a rhombus, a circle, an ellipse, a polygon.

In the embodiments of the present invention, the second position is the position of the first character frame in the image to be processed. For example, if the first character frame is a rectangle, the second position may include the coordinates of any pair of diagonal corners of the rectangle, that is, the two vertices on one diagonal of the rectangle. Alternatively, for a rectangular frame, the second position may include the position of the geometric center of the rectangle together with its length and width. If the first character frame is a circle, the second position may include the center and the radius of the frame. From the second position, the pixel area contained in the first character frame in the image to be processed can be determined.
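The two rectangular encodings of the second position described above carry the same information. The following is a minimal Python sketch of the conversion between them, assuming an axis-aligned rectangle. The helper names `corners_to_center_size` and `center_size_to_corners`, and the choice to call the x-extent "length" and the y-extent "width", are illustrative assumptions that do not come from the source.

```python
from typing import Tuple

Point = Tuple[float, float]


def corners_to_center_size(p: Point, q: Point) -> Tuple[Point, float, float]:
    """Convert a pair of diagonal corners to (center, length, width).

    Assumption: the rectangle is axis-aligned; "length" is the extent
    along x and "width" the extent along y (naming chosen for this sketch).
    """
    (x1, y1), (x2, y2) = p, q
    center = ((x1 + x2) / 2, (y1 + y2) / 2)
    return center, abs(x2 - x1), abs(y2 - y1)


def center_size_to_corners(center: Point, length: float, width: float) -> Tuple[Point, Point]:
    """Convert (center, length, width) back to one pair of diagonal corners."""
    cx, cy = center
    return (cx - length / 2, cy - width / 2), (cx + length / 2, cy + width / 2)
```

Either encoding is enough to recover the pixel area contained in the frame, which is why the description treats them as interchangeable.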

In the embodiments of the present invention, the first position can be used to indicate the position of a person in the image to be processed. In one possible implementation, the first position indicates that a person is present at the first position in the image to be processed. For example, if the first position is (7, 8), the position of the person in the image to be processed is (7, 8).

In some embodiments of the present invention, the first position indicates that a pixel neighborhood constructed around the first position in the image to be processed is a person region. For example, suppose the first position is (7, 8). A pixel neighborhood n1 is constructed with the first position as its center and a radius of 2 pixels; n1 is then a person region, and the position of any pixel in n1 can be taken as the position of the person in the image to be processed.
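As an illustration of the neighborhood n1 in this example, the sketch below enumerates the integer pixel coordinates within a Euclidean distance of 2 pixels from (7, 8). It assumes a circular neighborhood measured with the Euclidean distance and ignores the image bounds; the function name `pixel_neighborhood` is hypothetical.

```python
from typing import List, Tuple


def pixel_neighborhood(center: Tuple[int, int], radius: int) -> List[Tuple[int, int]]:
    """Return all integer pixel coordinates within `radius` of `center`.

    Assumption: Euclidean distance, boundary included; clipping to the
    image size is left out of this sketch.
    """
    cx, cy = center
    return [
        (x, y)
        for x in range(cx - radius, cx + radius + 1)
        for y in range(cy - radius, cy + radius + 1)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    ]


# Neighborhood n1 from the example: center (7, 8), radius 2 pixels.
n1 = pixel_neighborhood((7, 8), 2)
```

Under these assumptions any coordinate in `n1`, for instance (9, 8), could serve as the position of the person.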

In the embodiments of the present invention, the second position can also be used to indicate the position of a person in the image to be processed. In some embodiments of the present invention, a person is present at every pixel inside the first character frame, and the position of any pixel inside the first character frame can be determined from the second position, so the position of the person in the image to be processed can be determined from the second position. For example, suppose the first character frame is a rectangle and the second position includes the coordinates of a pair of its diagonal corners: (7, 8) and (10, 10). Then any coordinate (x, y) with x in [7, 10] and y in [8, 10] can be taken as the position of the person in the image to be processed.
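The range check in this example can be written as a short membership test. This is a sketch assuming an axis-aligned rectangular frame with inclusive bounds; the function name `in_frame` is hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def in_frame(x: float, y: float, p: Point, q: Point) -> bool:
    """Return True if (x, y) lies inside the rectangle with diagonal corners p, q.

    Assumption: axis-aligned rectangle, bounds inclusive, as in the
    ranges [7, 10] and [8, 10] of the example.
    """
    (x1, y1), (x2, y2) = p, q
    return min(x1, x2) <= x <= max(x1, x2) and min(y1, y2) <= y <= max(y1, y2)


# Frame from the example: diagonal corners (7, 8) and (10, 10).
inside = in_frame(9, 9, (7, 8), (10, 10))
outside = in_frame(11, 9, (7, 8), (10, 10))
```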

In some embodiments of the present invention, it is determined from the second position that a person is present at the geometric center of the first character frame. For example, suppose the first character frame is a rectangle and the second position includes the coordinates of a pair of its diagonal corners: (8, 8) and (12, 10). The geometric center of the first character frame is then (10, 9), so the position of the person in the image to be processed is (10, 9).

In the embodiments of the present invention, the first position and the second position both indicate the position of the same person (the first person) in the image to be processed. In one implementation of obtaining the first position, the image processing device receives the first position entered by a user through an input component. Input components include a keyboard, a mouse, a touch screen, a touch pad, and an audio input device. In another implementation, the image processing device receives the first position sent by a first terminal. In some embodiments of the present invention, the first terminal may be any one of the following: a mobile phone, a computer, a tablet computer, a server, a wearable device. In yet another implementation, the image processing device performs person detection processing on the image to be processed to obtain the first position.

In one implementation of obtaining the second position, the image processing device receives the second position entered by a user through an input component. Input components include a keyboard, a mouse, a touch screen, a touch pad, and an audio input device. In another implementation, the image processing device receives the second position sent by a second terminal. In some embodiments of the present invention, the second terminal may be any one of the following: a mobile phone, a computer, a tablet computer, a server, a wearable device. The second terminal may be the same as or different from the first terminal; the present invention does not limit this. In yet another implementation, the image processing device performs person detection processing on the image to be processed to obtain the second position.

Step 302: Obtain a third position of the first person in the image to be processed according to the first position and the second position.

In the image to be processed, nearby persons appear larger than distant persons, and in a dense crowd the distant part of the crowd appears especially dense (people stand next to each other, and two different person regions may even overlap in the image). As a result, the position of a distant first character frame has low accuracy, and so does the position of the person determined from that frame.

Because a nearby person region has a large area in the image to be processed, more than one character point may correspond to it. In that case more than one position corresponds to the person of that region, so the position of the person determined from the character point has low accuracy. In other words, for a nearby person the position determined from the character frame has high accuracy and the position determined from the character point has low accuracy; for a distant person the position determined from the character frame has low accuracy and the position determined from the character point has high accuracy.

In the embodiments of the present invention, the position of a given person in the image to be processed is determined from both the position of that person's character point and the position of that person's character frame. When the person is near in the image to be processed, the position is determined from the person's character frame; when the person is far, the position is determined from the person's character point. This improves the accuracy of the person's position in the image to be processed.

In some embodiments of the present invention, the ordinate of the center of the first character point (hereinafter the first ordinate) is obtained from the first position, and the ordinate of the center of the first character frame (hereinafter the second ordinate) is obtained from the second position. Because the magnitude of the ordinate in the image to be processed indicates how far or near a location is, it can be used to determine whether the first person is near or far in the image to be processed. When the first person is near, the position of the first person in the image to be processed is determined from the second position; when the first person is far, it is determined from the first position.

For example, when the first ordinate lies in [first value, second value], the position of the first person in the image to be processed (the third position) is obtained from the first position; when the first ordinate lies in (second value, third value], the third position is obtained from the second position. Here the first value is the largest ordinate in the image to be processed, the third value is the smallest ordinate in the image to be processed, and the second value is the mean of the first value and the third value.

In some embodiments of the present invention, the farther a real person is from the imaging device that captured the image, the smaller that person appears in the image. The image processing device can therefore determine whether the first person is far or near from the ratio of the length of a body part of the first person in the image to be processed to the length of that body part in the real world (hereinafter the reference ratio).

For example, suppose the first character frame is a body frame, that is, it contains the entire body of the first person. The image processing device can take the ratio of the height of the first person in the image to be processed to the height of the first person in the real world as the reference ratio, and then determine from the reference ratio whether the first person is far or near. In some embodiments of the present invention, a fourth value (such as the average human height) is used as the real-world height of the first person.

As another example, suppose the first character frame is a face frame, that is, it contains the face of the first person. The image processing device can take the ratio of the length of the first person's face in the image to be processed to its length in the real world as the reference ratio, and then determine from the reference ratio whether the first person is far or near. In some embodiments of the present invention, a fifth value (the average length of a human face) is used as the real-world length of the first person's face.

As a further example, suppose the first character frame is a head frame, that is, it contains the head of the first person. The image processing device can take the ratio of the length of the first person's head in the image to be processed to its length in the real world as the reference ratio, and then determine from the reference ratio whether the first person is far or near. In some embodiments of the present invention, a sixth value (the average length of a human head) is used as the real-world length of the first person's head.

Once the reference ratio has been used to determine whether the first person is near or far, the image processing device can decide whether to determine the position of the first person from the first position or from the second position.

For example, a reference ratio that does not exceed a seventh value indicates that the first person is far, and the position of the first person in the image to be processed (the third position) is obtained from the first position. A reference ratio that exceeds the seventh value indicates that the first person is near, and the third position is obtained from the second position. In some embodiments of the present invention, the seventh value is 1.
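The reference-ratio rule can be sketched in a few lines of Python. The function names, the units (pixels for the length in the image, centimetres for the real-world length) and the sample numbers are assumptions made for illustration; the source does not fix units, and it only states that the seventh value may be 1.

```python
from typing import Tuple

Position = Tuple[float, float]


def reference_ratio(length_in_image: float, real_length: float) -> float:
    """Ratio of a body part's length in the image to its real-world length."""
    if real_length <= 0:
        raise ValueError("real_length must be positive")
    return length_in_image / real_length


def choose_third_position(
    first_position: Position,
    second_position: Position,
    ratio: float,
    seventh_value: float = 1.0,
) -> Position:
    """Pick the source of the third position from the reference ratio.

    ratio <= seventh_value: the person is treated as far, so the character
    point position (first position) is used.
    ratio >  seventh_value: the person is treated as near, so the character
    frame position (second position) is used.
    """
    return first_position if ratio <= seventh_value else second_position


# Hypothetical numbers: a body 120 px tall in the image, assumed 170 cm tall.
far_case = choose_third_position((7, 8), (10, 9), reference_ratio(120, 170))
# Hypothetical numbers: a body 340 px tall in the image, assumed 170 cm tall.
near_case = choose_third_position((7, 8), (10, 9), reference_ratio(340, 170))
```

In this sketch the second position is passed as a single coordinate, for example the geometric center of the frame; any of the frame encodings described earlier could be reduced to such a coordinate first.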

In the embodiments of the present invention, when a person is far in the image to be processed, the position of the person is determined from the position of the character point; when a person is near, the position of the person is determined from the position of the character frame. This improves the accuracy of the person's position in the image to be processed.

In some embodiments of the present invention, before performing step 302, the image processing device also performs the following step.

Step 1: Obtain a first confidence of the first position and a second confidence of the second position.

In the embodiments of the present invention, the first confidence can be obtained in the course of performing person detection processing on the image to be processed to obtain the first position, and the second confidence can be obtained in the course of performing person detection processing on the image to be processed to obtain the second position.

The first confidence is negatively correlated with the scale of the first position: the smaller the distance between the first position and the x-axis of the pixel coordinate system, the higher the first confidence, and the larger that distance, the lower the first confidence. The second confidence is positively correlated with the scale of the second position: the smaller the distance between the second position and the x-axis of the pixel coordinate system, the lower the second confidence, and the larger that distance, the higher the second confidence.

In one implementation of obtaining the first confidence, the image processing device receives the first confidence entered by a user through an input component. Input components include a keyboard, a mouse, a touch screen, a touch pad, and an audio input device. In another implementation, the image processing device receives the first confidence sent by a third terminal. Optionally, the third terminal may be any one of the following: a mobile phone, a computer, a tablet computer, a server, a wearable device. The third terminal may be the same as or different from the first terminal; the present invention does not limit this. In yet another implementation, the first position carries its own confidence information, and the image processing device obtains the first confidence by obtaining the first position.

In one implementation of obtaining the second confidence, the image processing device receives the second confidence entered by a user through an input component. Input components include a keyboard, a mouse, a touch screen, a touch pad, and an audio input device. In another implementation, the image processing device receives the second confidence sent by a fourth terminal. Optionally, the fourth terminal may be any one of the following: a mobile phone, a computer, a tablet computer, a server, a wearable device. The fourth terminal may be the same as or different from the third terminal; the present invention does not limit this. In yet another implementation, the second position carries its own confidence information, and the image processing device obtains the second confidence by obtaining the second position.

After performing step 1, the image processing device performs the following steps as part of step 302.

Step 2: Take whichever of the first position and the second position has the higher confidence as a fourth position.

For example, if the first confidence is 0.8 and the second confidence is 0.9, the second position has the higher confidence, so the image processing device takes the second position as the fourth position.
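Step 2 amounts to a comparison of the two confidences. The sketch below is one way to write it; the function name `select_fourth_position` is hypothetical, and keeping the first position when the two confidences are equal is an assumption, since the source does not say how ties are handled.

```python
from typing import Tuple

Position = Tuple[float, float]


def select_fourth_position(
    first_position: Position,
    first_confidence: float,
    second_position: Position,
    second_confidence: float,
) -> Position:
    """Return whichever position has the higher confidence.

    Assumption: on a tie the first position is kept; the source does not
    specify tie handling.
    """
    if second_confidence > first_confidence:
        return second_position
    return first_position


# Confidences from the example: 0.8 for the first position, 0.9 for the second.
# The coordinates themselves are hypothetical.
fourth_position = select_fourth_position((7, 8), 0.8, (10, 9), 0.9)
```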

Step 3: Obtain the third position according to the fourth position.

Suppose the third position is [symbol missing in source] and the fourth position is [symbol missing in source].

In one possible implementation, the third position and the fourth position satisfy the following formula:

[Formula (1) missing in source]

where k is a positive number. For example, k = 1.

In another possible implementation, the third position and the fourth position satisfy the following formula:

[Formula (2) missing in source]

where k is a positive number and c is a real number. For example, k = 1 and c = 0.

In yet another possible implementation, the third position and the fourth position satisfy the following formula:

[Formula (3) missing in source]

where k is a positive number and c is a real number. For example, k = 1 and c = 0.

In the embodiments of the present invention, determining the position of a given person from whichever of the character point position and the character frame position has the higher confidence improves the accuracy of that person's position.

As an optional implementation, before performing step 301, the image processing device also performs the following step.

Step 4: Obtain the image to be processed.

In the embodiments of the present invention, the image to be processed may be any image. For example, the image to be processed may contain a person. It may include only a head, without a torso or limbs (hereinafter the torso and limbs are together referred to as the body). It may include only a body, without a head. It may include only lower limbs or only upper limbs. The present invention does not limit which human body regions the image to be processed contains. As further examples, the image to be processed may contain animals or plants. The present invention does not limit the content of the image to be processed.

In one implementation of obtaining the image to be processed, the image processing device receives the image to be processed entered by a user through an input component. Input components include a keyboard, a mouse, a touch screen, a touch pad, and an audio input device. In another implementation, the image processing device receives the image to be processed sent by a fifth terminal. In some embodiments, the fifth terminal may be any one of the following: a mobile phone, a computer, a tablet computer, a server, a wearable device. The fifth terminal may be the same as or different from the first terminal; the present invention does not limit this. In yet another implementation, the image processing device captures the image to be processed through an imaging component. In some embodiments, the imaging component is a camera.

After performing step 4, the image processing device performs the following step as part of step 301.

Step 5: Perform person positioning processing on the image to be processed to obtain the first position and the position of at least one character frame.

In the embodiments of the present invention, performing person positioning processing on the image to be processed determines whether the image contains a person. If it does, the position of the person in the image to be processed is also obtained. This position includes the position of a character point and the position of a character frame.

In some embodiments of the present invention, the person positioning processing is implemented by a convolutional neural network. The convolutional neural network is trained with annotated images as training data, so that the trained network can perform person positioning processing on an image. The annotation information of the images in the training data consists of the positions of character points and the positions of character frames. During training, the convolutional neural network extracts feature data from an image and uses the feature data to determine whether the image contains a person. If it does, the network obtains the position of the character point and the position of the character frame from the feature data. The annotation information serves as supervision for the results the network produces during training, and the parameters of the network are updated accordingly until training is complete. The trained convolutional neural network can then be used to process the image to be processed and obtain the positions of the persons in it (both character point positions and character frame positions).

In some embodiments of the present invention, the person positioning processing may be implemented by a person detection algorithm, which may be one of the following: You Only Look Once (YOLO), the deformable part model (DPM), the Single Shot MultiBox Detector (SSD), Faster Region-based Convolutional Neural Networks (Faster-RCNN), and so on. The present invention does not limit the person detection algorithm used to implement the person positioning processing.

Because the image to be processed may contain more than one person, person positioning processing yields at least one person point and at least one person frame. Accordingly, there is at least one person-point position and at least one person-frame position.

In some embodiments of the present invention, when the image processing device performs person detection processing on the image to be processed and obtains the position of at least one person point and the position of at least one person frame, it also obtains a confidence for the position of each person point and a confidence for the position of each person frame. If there is more than one person point, the position of the person point with the highest confidence is taken as the first position. For example, person positioning processing of the image to be processed yields position 1, position 2, and position 3, with confidences of 0.7, 0.9, and 0.8, respectively. Because position 2 has the highest confidence, position 2 is taken as the first position.
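The highest-confidence selection described above can be sketched as follows. This is a minimal illustration only: the function name `pick_first_position`, the list-of-pairs layout, and the pixel coordinates are assumptions made for the example, and only the confidences 0.7, 0.9, and 0.8 come from the text.

```python
def pick_first_position(candidates):
    """Return the (x, y) position with the highest confidence.

    `candidates` is a list of ((x, y), confidence) pairs. This layout is an
    assumption of the sketch, not something the source specifies.
    """
    if not candidates:
        raise ValueError("no person point was detected")
    position, _confidence = max(candidates, key=lambda item: item[1])
    return position


# Confidences follow the example in the text; the coordinates are made up.
detections = [((10, 20), 0.7), ((30, 40), 0.9), ((50, 60), 0.8)]
first_position = pick_first_position(detections)  # position 2 is selected
```

The same helper could be applied to person-frame positions, since the text assigns a confidence to those as well.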

Step 6: According to the first position and the position of the at least one person frame, determine the distance between the at least one person frame and the first person point to obtain at least one first distance.

From the first position and the position of one person frame, the distance between that person frame and the first person point can be determined, giving one first distance. For example, the first position is (7, 8), the at least one person frame includes person frame 1, person frame 1 is rectangular, and the position of person frame 1 includes the coordinates of a pair of diagonal corners of person frame 1: (6, 8) and (4, 12). The distance between the first person point and person frame 1 may be the distance between the first person point and the center of person frame 1, which lies at (5, 10). That distance is:

$\sqrt{(7-5)^2+(8-10)^2}=2\sqrt{2}$.

As another example, the first position is (7, 8), the at least one person frame includes person frame 2, person frame 2 is rectangular, and the position of person frame 2 includes the coordinates of a pair of diagonal corners of person frame 2: (6, 8) and (4, 12). The distance between the first person point and person frame 2 may be the distance between the first person point and the vertex of person frame 2 that is closest to the origin of the pixel coordinate system, which lies at (4, 8). That distance is:

$\sqrt{(7-4)^2+(8-8)^2}=3$.

Determining the distance between the first person point and each person frame in turn yields the at least one first distance.
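The two point-to-frame distances described above (to the frame center, and to the vertex nearest the origin of the pixel coordinate system) can be sketched as follows. The function names are hypothetical, and the frame is assumed to be an axis-aligned rectangle given by two diagonal corners, as in the example.

```python
import math


def box_vertices(corner_a, corner_b):
    """Return the four vertices of an axis-aligned rectangle from two diagonal corners."""
    x_min, x_max = sorted((corner_a[0], corner_b[0]))
    y_min, y_max = sorted((corner_a[1], corner_b[1]))
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


def distance_to_center(point, corner_a, corner_b):
    """Distance from `point` to the center of the rectangle."""
    center_x = (corner_a[0] + corner_b[0]) / 2
    center_y = (corner_a[1] + corner_b[1]) / 2
    return math.hypot(point[0] - center_x, point[1] - center_y)


def distance_to_vertex_nearest_origin(point, corner_a, corner_b):
    """Distance from `point` to the vertex closest to the pixel-coordinate origin."""
    vertex = min(box_vertices(corner_a, corner_b),
                 key=lambda v: math.hypot(v[0], v[1]))
    return math.hypot(point[0] - vertex[0], point[1] - vertex[1])


first_position = (7, 8)
center_distance = distance_to_center(first_position, (6, 8), (4, 12))
vertex_distance = distance_to_vertex_nearest_origin(first_position, (6, 8), (4, 12))
```

With the coordinates from the example, the center is (5, 10), so `center_distance` is 2√2 (about 2.83), and the vertex nearest the origin is (4, 8), so `vertex_distance` is 3.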

Step 7: Take the person frame corresponding to a second distance as the first person frame, and determine the second position of the first person frame in the image to be processed, where the second distance is a distance, among the at least one first distance, that does not exceed a distance threshold.

In this embodiment of the present invention, a distance between a person point and a person frame that does not exceed the distance threshold indicates that the person point and the person frame belong to the same person; that is, the position of the person point and the position of the person frame both characterize the position of the same person in the image to be processed.

Therefore, the distance among the at least one first distance that does not exceed the distance threshold is determined as the second distance, and the person frame corresponding to the second distance is taken as the first person frame. For example, the at least one person frame includes person frame 1 and person frame 2. The distance between the first person point and person frame 1 is 20, and the distance between the first person point and person frame 2 is 30. If the distance threshold is 25, the second distance is 20, and the person frame corresponding to the second distance is person frame 1.

In some embodiments of the present invention, if more than one person frame corresponds to a second distance, the position of the person frame with the highest confidence is taken as the second position, and the person frame corresponding to the second position is taken as the first person frame. In this embodiment of the present invention, the distance between the first person point and a person frame is used to determine the person frame that belongs to the same person as the first person point, and thereby to determine the first person frame.
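The threshold test of step 7, together with the confidence tie-break, can be sketched as follows. The dictionary layout, the function name `pick_first_frame`, and the confidence values are assumptions made for illustration; the distances 20 and 30 and the threshold 25 are taken from the example above.

```python
def pick_first_frame(frames, distance_threshold):
    """Pick the person frame that belongs to the same person as the first person point.

    A frame qualifies when its distance does not exceed the threshold. If several
    frames qualify, the one with the highest confidence wins. Returns None when
    no frame qualifies.
    """
    qualifying = [f for f in frames if f["distance"] <= distance_threshold]
    if not qualifying:
        return None
    return max(qualifying, key=lambda f: f["confidence"])


# Distances follow the example in the text; the confidences are made up.
frames = [
    {"name": "person frame 1", "distance": 20, "confidence": 0.8},
    {"name": "person frame 2", "distance": 30, "confidence": 0.9},
]
first_frame = pick_first_frame(frames, distance_threshold=25)
```

Here only person frame 1 is within the threshold, so it is selected even though person frame 2 has the higher (assumed) confidence.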

In some embodiments of the present invention, after obtaining the position of at least one person point and the position of at least one person frame by performing step 5, the image processing device may take the position with the highest confidence among the positions of the at least one person frame as the second position, and take the person frame corresponding to the second position as the first person frame. According to the second position and the position of the at least one person point, the image processing device determines the distance between the at least one person point and the first person frame to obtain at least one first intermediate distance. The image processing device takes the person point corresponding to a second intermediate distance as the first person point, where the second intermediate distance is a distance among the at least one first intermediate distance that does not exceed the distance threshold.

Please refer to FIG. 5, which is a schematic flowchart of one possible implementation of step 6 provided by an embodiment of the present invention.

Step 501: According to the first position and the position of a second person frame, obtain a fourth distance between the first person point and the second person frame.

In this embodiment of the present invention, the at least one person frame includes the second person frame. According to the first position and the position of the second person frame, the distance between the first person point and the second person frame, that is, the fourth distance, is determined. For how the fourth distance is obtained from the first position and the position of the second person frame, see how a first distance is obtained from the first position and the position of one person frame, as described under step 6.

It should be understood that, in this step, the distance obtained from the first position and the position of the second person frame is the fourth distance, not a first distance.

Step 502: Determine the difference between a first scale and a second scale to obtain a first difference, where the first scale is the scale of the first person point in the image to be processed and the second scale is the scale of the second person frame in the image to be processed.

In some embodiments of the present invention, the first difference is obtained by calculating the difference between the ordinate of the first person point and the ordinate of the center of the second person frame. In some embodiments of the present invention, the first scale and the second scale are obtained by processing the image to be processed with a scale neural network, which is trained with the scales of persons in images as supervision.

Step 503: Obtain a third distance according to the fourth distance and the first difference, where the third distance is positively correlated with the first difference.

In this embodiment of the present invention, the third distance is the distance between the first person point and the second person frame; that is, the at least one first distance includes the third distance.

In the image to be processed, a unit length at a small-scale location corresponds to a longer real-world length than a unit length at a large-scale location. The distance between a person point and a person frame should therefore be positively correlated with the scale; that is, the third distance is positively correlated with the first difference. For example, suppose the unit length is 10 pixels. In the image to be processed, 10 pixels at a large-scale location represent 0.5 meters in the real world, whereas 10 pixels at a small-scale location represent 1 meter in the real world.

The first difference, the third distance, and the fourth distance are each denoted by a symbol in formulas (4) to (6). [The symbols and the bodies of formulas (4) to (6) were images in the source and are not reproduced in this text; only the parameter descriptions survive.]

In some embodiments of the present invention, the first difference, the third distance, and the fourth distance satisfy formula (4), where t is a positive number. For example, t = 1.

In some embodiments of the present invention, the first difference, the third distance, and the fourth distance satisfy formula (5), where t is a positive number and b is a real number. For example, t = 1, b = 0.

In some embodiments of the present invention, the first difference, the third distance, and the fourth distance satisfy formula (6), where t is a positive number and b is a real number. For example, t = 1, b = 0.

In this embodiment of the present invention, determining the third distance from the first difference and the fourth distance improves the accuracy of the third distance.

In some embodiments of the present invention, before performing step 502, the image processing device further performs the following steps.

Step 8: Determine a second person point according to the position of the second person frame.

In this embodiment of the present invention, one person point can be determined from the position of one person frame. For example, person frame 1 is rectangular. From the position of person frame 1, the image processing device can determine the position of any vertex of person frame 1 and take that vertex as a person point. As another example, person frame 1 is a rectangle abcd whose center is point e. From the position of person frame 1, the image processing device can determine the coordinates of point e and take point e as a person point. As yet another example, person frame 1 is circular. From the position of person frame 1, the image processing device can determine the position of any point on the circle and take that point as a person point. In this embodiment of the present invention, the person point obtained from the position of the second person frame is the second person point. In some embodiments, the second person point is one of the following: the geometric center of the second person frame, or a vertex of the second person frame.

Step 9: Determine the midpoint between the first person point and the second person point to obtain a third person point.
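Steps 8 and 9 can be sketched as follows for the case where the second person point is the geometric center of a rectangular frame. The function names are hypothetical, and the coordinates are borrowed from the earlier distance example purely for illustration; the source does not tie those values to the second person frame.

```python
def frame_center(corner_a, corner_b):
    """Geometric center of an axis-aligned rectangle given two diagonal corners."""
    return ((corner_a[0] + corner_b[0]) / 2, (corner_a[1] + corner_b[1]) / 2)


def midpoint(point_a, point_b):
    """Midpoint of the segment joining two points."""
    return ((point_a[0] + point_b[0]) / 2, (point_a[1] + point_b[1]) / 2)


first_person_point = (7, 8)
# Step 8: derive the second person point from the (assumed) second person frame.
second_person_point = frame_center((6, 8), (4, 12))
# Step 9: the third person point is the midpoint of the first and second person points.
third_person_point = midpoint(first_person_point, second_person_point)
```

With these values the second person point is (5.0, 10.0) and the third person point is (6.0, 9.0). Choosing a vertex instead of the center in step 8 would only change `frame_center` to a vertex lookup.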

Step 10: Obtain a first scale index.

In this embodiment of the present invention, the scale index at a location in an image (including the first scale index above and the second and third scale indexes introduced below) characterizes the mapping between the size in the image of an object at that location and the size of that object in the real world.

In some embodiments of the present invention, the scale index at a location represents the number of pixels needed at that location to represent 1 meter in the real world. For example, suppose that in the image shown in FIG. 4 the scale index at the location of pixel A₃₁ is 50 and the scale index at the location of pixel A₁₃ is 20. Then 50 pixels are needed at the location of pixel A₃₁ to represent 1 meter in the real world.
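Under this pixels-per-meter reading of the scale index, a pixel length can be converted to a real-world length by a single division. The sketch below assumes that reading only (the text also describes other readings of the scale index); the function name and the 100-pixel length are made up for the example.

```python
def pixels_to_meters(length_in_pixels, scale_index):
    """Convert a pixel length to meters when scale_index means pixels per meter."""
    if scale_index <= 0:
        raise ValueError("scale index must be positive")
    return length_in_pixels / scale_index


# Scale indexes 50 (at pixel A31) and 20 (at pixel A13) follow the example in the text.
length_at_a31 = pixels_to_meters(100, 50)
length_at_a13 = pixels_to_meters(100, 20)
```

The same 100 pixels correspond to 2 meters at the location of A₃₁ and to 5 meters at the location of A₁₃, which is why distances measured in pixels need a scale correction before they are compared with a threshold.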

In some embodiments of the present invention, the scale index at a location represents the ratio between the size in the image of an object at that location and the size of that object in the real world. For example, suppose that in the image shown in FIG. 4 object 1 is located at pixel A₁₃ and object 2 is located at pixel A₃₁. The scale index at the location of pixel A₃₁ is 50, and the scale index at the location of pixel A₁₃ is 20. Then the ratio between the size of object 1 in the image and the size of object 1 in the real world is 20, and the ratio between the size of object 2 in the image and the size of object 2 in the real world is 50.

In some embodiments of the present invention, the scale index at a location represents the reciprocal of the ratio between the size in the image of an object at that location and the size of that object in the real world. For example, suppose that in the image shown in FIG. 4 object 1 is located at pixel A₁₃ and object 2 is located at pixel A₃₁. The scale index at the location of pixel A₃₁ is 50, and the scale index at the location of pixel A₁₃ is 20. Then the ratio between the size of object 1 in the real world and the size of object 1 in the image is 20, and the ratio between the size of object 2 in the real world and the size of object 2 in the image is 50.

In some embodiments of the present invention, locations with the same scale have the same scale index. For example, in the image shown in FIG. 4, pixels A₁₁, A₁₂, and A₁₃ have the same scale; pixels A₂₁, A₂₂, and A₂₃ have the same scale; and pixels A₃₁, A₃₂, and A₃₃ have the same scale. Accordingly, pixels A₁₁, A₁₂, and A₁₃ have the same scale index; pixels A₂₁, A₂₂, and A₂₃ have the same scale index; and pixels A₃₁, A₃₂, and A₃₃ have the same scale index.

In this embodiment of the present invention, the first scale index is the scale index of a first scale position, where the first scale position is the position of the third person point in the image to be processed. Assuming a first reference object is located at the first scale position, the first scale index characterizes the mapping between a first size and a second size, where the first size is the size of the first reference object in the image to be processed and the second size is the size of the first reference object in the real world. In one implementation of obtaining the first scale index, the image processing device receives the first scale index entered by a user through an input component. The input component includes a keyboard, a mouse, a touch screen, a touchpad, an audio input device, and the like.

In another implementation of obtaining the first scale index, the image processing device receives the first scale index sent by a terminal other than the image processing device. Optionally, that terminal may be any of the following: a mobile phone, a computer, a tablet computer, a server, or a wearable device. That terminal and the first terminal may be the same or different.

After obtaining the first scale index, the image processing device specifically performs the following step in the course of performing step 502.

Step 11: Obtain the first difference according to the first scale index.

In this embodiment of the present invention, the first scale index is positively correlated with the first difference. The first difference and the first scale index are each denoted by a symbol in formulas (7) to (9). [The symbols and the bodies of formulas (7) to (9) were images in the source and are not reproduced in this text; only the parameter descriptions survive.]

In some embodiments of the present invention, the first difference and the first scale index satisfy formula (7), where r is a positive number. For example, r = 1/2.

In some embodiments of the present invention, the first difference and the first scale index satisfy formula (8), where r is a positive number and a is a real number. Optionally, r = 1/2, a = 0.

In some embodiments of the present invention, the first difference and the first scale index satisfy formula (9), where r is a positive number and a is a real number. Optionally, r = 1/2, a = 0.

Because the first scale index reflects the scale at the location of the third person point fairly accurately, determining the first difference from the first scale index improves the accuracy of the first difference.

Please refer to FIG. 6, which is a schematic flowchart of one possible implementation of step 10 provided by an embodiment of the present invention.

Step 601: Perform object detection processing on the image to be processed to obtain a first object frame and a second object frame.

In this embodiment of the present invention, the real-world length of the detection target of the object detection processing is close to a fixed value. For example, the average length of a human face is 20 centimeters, so the detection target may be a human face. As another example, the average height of a person is 1.65 meters, so the detection target may be a human body. As yet another example, in an airport waiting room, signs such as the one shown in FIG. 7A all have a fixed height (for example, 2.5 meters), so the detection target may be a sign. Optionally, the object detection processing is face detection processing.

In some embodiments of the present invention, the object detection processing of the image to be processed may be implemented by a convolutional neural network. The convolutional neural network is trained with annotated images as training data, so that the trained network can perform object detection processing on an image. The annotation of each training image is the position of an object frame, and the object frame contains a detection target of the object detection processing.

In some embodiments of the present invention, the object detection processing may be implemented by a person detection algorithm, which may be one of the following: YOLO, DPM, SSD, Faster-RCNN, and so on. The present invention does not limit the person detection algorithm used to implement the object detection processing.

In this embodiment of the present invention, the detection target contained in the first object frame is different from the detection target contained in the second object frame. For example, the first object frame contains Zhang San's face and the second object frame contains Li Si's face. As another example, the first object frame contains Zhang San's face and the second object frame contains a sign.

Step 602: Obtain a first length according to the length of the first object frame in the y-axis direction, and obtain a second length according to the length of the second object frame in the y-axis direction.

In this embodiment of the present invention, the y-axis is the vertical axis of the pixel coordinate system of the image to be processed. By performing step 601, the image processing device obtains the position of an object frame, and from that position it can obtain the length of the object frame in the y-axis direction.

For example, rectangle abcd is object frame 1, where the coordinates of a are (4, 8), the coordinates of b are (6, 8), the coordinates of c are (6, 12), and the coordinates of d are (4, 12). The length of object frame 1 in the y-axis direction is then 12 - 8 = 4.

From the position of the first object frame, the image processing device can obtain the length of the first object frame in the y-axis direction, that is, the first length. From the position of the second object frame, the image processing device can obtain the length of the second object frame in the y-axis direction, that is, the second length.
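The y-axis length used in step 602 can be sketched as the spread of the vertex ordinates. The function name `frame_length_y` and the vertex-list layout are assumptions of this example; the coordinates are those of rectangle abcd above.

```python
def frame_length_y(vertices):
    """Length of an object frame along the y-axis of the pixel coordinate system."""
    ordinates = [y for _x, y in vertices]
    return max(ordinates) - min(ordinates)


# Vertices a, b, c, d of object frame 1 from the example in the text.
object_frame_1 = [(4, 8), (6, 8), (6, 12), (4, 12)]
length_y = frame_length_y(object_frame_1)
```

For object frame 1 this gives 12 - 8 = 4, matching the worked example.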

Step 603: Obtain a second scale index according to the first length and a third length of a first object in the real world, and obtain a third scale index according to the second length and a fourth length of a second object in the real world.

In this embodiment of the present invention, the second scale index is the scale index of a second scale position, where the second scale position is a position determined in the image to be processed according to the position of the first object frame. Assuming a second reference object is located at the second scale position, the second scale index characterizes the mapping between a third size and a fourth size, where the third size is the size of the second reference object in the image to be processed and the fourth size is the size of the second reference object in the real world. The third scale index is the scale index of a third scale position, where the third scale position is a position determined in the image to be processed according to the position of the second object frame. Assuming a third reference object is located at the third scale position, the third scale index characterizes the mapping between a fifth size and a sixth size, where the fifth size is the size of the third reference object in the image to be processed and the sixth size is the size of the third reference object in the real world.

In this embodiment of the present invention, the image processing device can determine one object point from one object frame. For the specific implementation, see how a person point is determined from a person frame in step 8; the details are not repeated here.

The image processing device determines a first object point according to the position of the first object frame, and determines a second object point according to the position of the second object frame.

In some embodiments of the present invention, the first object point is one of the following: the geometric center of the first object frame, or a vertex of the first object frame. The second object point is one of the following: the geometric center of the second object frame, or a vertex of the second object frame.

After determining the position of the first object point and the position of the second object point, the image processing device may take the position of the first object point as the second scale position and the position of the second object point as the third scale position.

In this embodiment of the present invention, the first object and the second object are both detection targets of the object detection processing. The first object is the detection target contained in the first object frame, and the second object is the detection target contained in the second object frame. The length of the first object in the real world is the third length, and the length of the second object in the real world is the fourth length. For example, if the first object and the second object are both human faces, the third length and the fourth length may both be 20 centimeters. As another example, if the first object is a human face and the second object is a human body, the third length may be 20 centimeters and the fourth length may be 170 centimeters.

The first length, the second length, the third length, the fourth length, the second scale index, and the third scale index are each denoted by a symbol in formulas (10) to (12). [The symbols and the bodies of formulas (10) to (12) were images in the source and are not reproduced in this text; only the parameter descriptions survive.]

In some embodiments of the present invention, these quantities satisfy formula (10), where q is a positive number. For example, q = 1.

In some embodiments of the present invention, these quantities satisfy formula (11), where q is a positive number and m is a real number. For example, q = 1, m = 0.

In some embodiments of the present invention, these quantities satisfy formula (12), where q is a positive number and m is a real number. For example, q = 1, m = 0.

Step 604: Perform curve fitting processing on the second scale index and the third scale index to obtain a scale index map of the image to be processed.

Because the relationship between scale and ordinate in the image to be processed can be regarded as linear, and the scale index characterizes the scale, the image processing device can obtain the scale index map of the image to be processed by performing curve fitting processing on the second scale index and the third scale index. The scale index map contains the scale index at the location of every pixel in the image to be processed.

Take the first pixel in the scale index map as an example. Suppose the pixel value of the first pixel (that is, the first pixel value) is 40, and the position of the first pixel in the scale index map is the same as the position of the second pixel in the image to be processed. Then the scale index at the position of the second pixel in the image to be processed (that is, the fourth scale position) is the first pixel value. Supposing a fourth reference object is located at the fourth scale position, the first pixel value represents the mapping between a seventh size and an eighth size, where the seventh size is the size of the fourth reference object at the fourth scale position and the eighth size is the real-world size of the fourth reference object.
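The curve fitting of step 604 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiments: it assumes that a scale index is the ratio of an object frame's pixel length along the y-axis to the object's real-world length, that each index is attached to the y-coordinate of the frame's center, and that a least-squares straight line is an acceptable fit. The function names `scale_index`, `fit_line`, and `build_scale_index_map` are hypothetical.

```python
def scale_index(pixel_length, real_length_cm):
    """Assumed form of a scale index: image pixels per real-world centimetre."""
    return pixel_length / real_length_cm


def fit_line(samples):
    """Least-squares fit of index = a * y + b to a list of (y, index) samples."""
    n = len(samples)
    mean_y = sum(y for y, _ in samples) / n
    mean_i = sum(i for _, i in samples) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in samples)
    if var_y == 0:
        raise ValueError("samples must cover at least two distinct y-coordinates")
    a = sum((y - mean_y) * (i - mean_i) for y, i in samples) / var_y
    b = mean_i - a * mean_y
    return a, b


def build_scale_index_map(samples, height, width):
    """Evaluate the fitted line on every row; the index is constant along a row."""
    a, b = fit_line(samples)
    return [[a * y + b] * width for y in range(height)]


# Two detected faces, each assumed to be 20 cm long in the real world.
second_index = scale_index(pixel_length=10.0, real_length_cm=20.0)  # frame centered at y = 100
third_index = scale_index(pixel_length=40.0, real_length_cm=20.0)   # frame centered at y = 400
index_map = build_scale_index_map(
    [(100.0, second_index), (400.0, third_index)], height=480, width=640
)
```

Because the fitted index depends only on the ordinate, every pixel in a row shares one value. With more than two object frames, the same least-squares fit averages out detection noise.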

Step 605: Obtain the first scale index according to the scale index map and the position of the third person point.

As described in step 604, the scale index map contains the scale index at the position of every pixel in the image to be processed. Therefore, the scale index of the third person point, that is, the first scale index, can be determined from the scale index map and the position of the third person point in the image to be processed.
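The lookup in step 605 amounts to reading the scale index map at the third person point. The sketch below assumes the map is stored as a list of rows and that a point slightly outside the image is clamped to the nearest pixel. The clamping and the name `lookup_scale_index` are illustrative assumptions that the embodiments do not specify.

```python
def lookup_scale_index(index_map, point):
    """Return the scale index stored at an (x, y) pixel position, clamped to the map."""
    height, width = len(index_map), len(index_map[0])
    x = min(max(int(round(point[0])), 0), width - 1)
    y = min(max(int(round(point[1])), 0), height - 1)
    return index_map[y][x]


# A toy 3 x 4 map whose value grows with the row number (the ordinate).
toy_map = [[10.0 * row] * 4 for row in range(3)]
first_scale_index = lookup_scale_index(toy_map, (1.2, 2.0))
```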

In the embodiments of the present invention, the second scale index is obtained from the first length and the third length, and the third scale index is obtained from the second length and the fourth length. Curve fitting on the second scale index and the third scale index yields the scale index map, from which the scale index at the position of any pixel in the image to be processed can be determined.

In some embodiments of the present invention, the person points (including the first person point, the second person point, and the third person point) may be human head points, and the person frames (including the first person frame and the second person frame) may be human head frames. The pixel region covered by a human head point and the pixel region contained in a human head frame are both human head regions.

Based on the technical solutions provided by the present invention, the embodiments of the present invention also provide some possible application scenarios.

As mentioned above, excessive foot traffic in public places often makes crowds too dense, which can lead to public accidents. Locating the people in a crowd in public places is therefore of great significance.

In the related art, to improve safety in work, daily life, and social environments, surveillance cameras are installed in public places so that security protection can be carried out based on video stream information. Processing the video stream captured by a surveillance camera with the technical solution provided by the embodiments of the present invention makes it possible to determine the position of each person in an image, which can in turn effectively prevent public accidents.

For example, the server of the video stream processing center for the surveillance cameras can execute the technical solution provided by the embodiments of the present invention, and the server can be connected to at least one surveillance camera. After obtaining the video stream sent by a surveillance camera, the server can perform person detection processing on each frame of the video stream to obtain the positions of the person points and the positions of the person frames in that frame. FIG. 7B is a schematic diagram of an application scenario provided by an embodiment of the present invention. As shown in FIG. 7B, the image to be processed 71 is any frame of the above video stream. After the image to be processed is input to the image processing device 1, the technical solution provided by the embodiments of the present invention can be used to process the positions of the person points and the positions of the person frames in each frame, so as to determine the position of each person in that frame. The user can then view the position of each person in the image through the server, in order to further determine the real-world position of the people in the image.

In some embodiments, after obtaining the positions of the people in each frame of the video stream, the server can determine the number of people in each frame from those positions. When the number of people in an image is greater than or equal to a people-count threshold, the server can send an instruction to a related device to issue a prompt or an alarm.

For example, the server may send an instruction to the camera that captured the image, instructing that camera to raise an alarm. As another example, the server may send an instruction to the terminal of the management personnel responsible for the area where that camera is located, prompting the terminal to output a message that the number of people has exceeded the people-count threshold.
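The counting and alarm logic described above can be sketched in a few lines. The sketch assumes that each located person is represented by one (x, y) position, so the people count is simply the number of positions; the function name `crowd_alert` is hypothetical, and how the instruction is delivered to a camera or terminal is left out.

```python
def crowd_alert(person_positions, count_threshold):
    """Count located people and report whether the count reaches the threshold."""
    count = len(person_positions)
    # The alarm condition is "greater than or equal to" the threshold.
    return count, count >= count_threshold


positions_in_frame = [(120, 80), (300, 95), (410, 220)]
count, should_alert = crowd_alert(positions_in_frame, count_threshold=3)
```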

Those skilled in the art will understand that, in the methods of the specific embodiments above, the order in which the steps are written does not imply a strict execution order and does not limit the implementation in any way. The actual execution order of the steps should be determined by their functions and possible internal logic.

The methods of the embodiments of the present invention have been described in detail above. The devices of the embodiments of the present invention are described below.

Please refer to FIG. 8, which is a schematic structural diagram of an image processing device provided by an embodiment of the present invention. The image processing device 1 includes an acquisition unit 11, a first processing unit 12, and a second processing unit 13, where: the acquisition unit 11 is configured to acquire a first position of a first person point in an image to be processed and a second position of a first person frame in the image to be processed, the first position and the second position both being used to characterize the position of a first person in the image to be processed; and the first processing unit 12 is configured to obtain a third position of the first person in the image to be processed according to the first position and the second position.

In some embodiments of the present invention, the acquisition unit 11 is further configured to acquire a first confidence of the first position and a second confidence of the second position, where the first confidence is negatively correlated with the scale at the first position and the second confidence is positively correlated with the scale at the second position. The first processing unit 12 is configured to: take whichever of the first position and the second position has the highest confidence as a fourth position; and obtain the third position according to the fourth position.
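The confidence-based selection can be illustrated as follows. The embodiments only state the direction of the correlations, so the two confidence functions below are purely hypothetical monotone choices, as is the tie-breaking rule; a larger scale is taken to mean that the person appears larger (nearer) in the image.

```python
def point_confidence(scale):
    """Hypothetical confidence of a person point: decreases as scale grows."""
    return 1.0 / (1.0 + scale)


def frame_confidence(scale):
    """Hypothetical confidence of a person frame: increases as scale grows."""
    return scale / (1.0 + scale)


def select_fourth_position(first_position, first_scale, second_position, second_scale):
    """Return whichever position has the higher confidence (ties go to the point)."""
    c1 = point_confidence(first_scale)
    c2 = frame_confidence(second_scale)
    return first_position if c1 >= c2 else second_position


head_point = (200.0, 50.0)                # first position, as (x, y)
head_frame = (190.0, 40.0, 210.0, 60.0)   # second position, as (x1, y1, x2, y2)
far_choice = select_fourth_position(head_point, 0.2, head_frame, 0.2)
near_choice = select_fourth_position(head_point, 3.0, head_frame, 3.0)
```

With these assumed functions, a small-scale (distant) person is located by the person point and a large-scale (nearby) person by the person frame.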

In some embodiments of the present invention, the acquisition unit 11 is further configured to: perform person positioning processing on the image to be processed to obtain the first position and the position of at least one person frame; determine, according to the first position and the position of the at least one person frame, the distance between the at least one person frame and the first person point to obtain at least one first distance; and take the person frame corresponding to a second distance as the first person frame, where the second distance is a distance among the at least one first distance that does not exceed a distance threshold.
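Matching a person frame to the first person point can be sketched as below. The sketch assumes that the distance between a frame and a point is the Euclidean distance from the point to the frame's center, and that when several frames fall within the threshold the nearest one is chosen. Both are assumptions for illustration, as are the function names.

```python
import math


def frame_center(frame):
    """Center of a frame given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = frame
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def match_person_frame(person_point, frames, distance_threshold):
    """Return the nearest frame whose distance does not exceed the threshold, else None."""
    best_frame, best_distance = None, None
    for frame in frames:
        cx, cy = frame_center(frame)
        d = math.hypot(cx - person_point[0], cy - person_point[1])
        if d <= distance_threshold and (best_distance is None or d < best_distance):
            best_frame, best_distance = frame, d
    return best_frame


frames = [(0.0, 0.0, 20.0, 20.0), (40.0, 40.0, 64.0, 64.0)]
first_person_frame = match_person_frame((50.0, 50.0), frames, distance_threshold=10.0)
```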

In some embodiments of the present invention, the acquisition unit 11 is configured to: perform person positioning processing on the image to be processed to obtain the position of at least one person point; and take the position with the highest confidence among the positions of the at least one person point as the first position.

In some embodiments of the present invention, the at least one person frame includes a second person frame, and the at least one first distance includes a third distance between the first person point and the second person frame. The acquisition unit 11 is configured to: obtain a fourth distance between the first person point and the second person frame according to the first position and the position of the second person frame; determine the difference between a first scale and a second scale to obtain a first difference, where the first scale is the scale of the first person point in the image to be processed and the second scale is the scale of the second person frame in the image to be processed; and obtain the third distance according to the fourth distance and the first difference, where the third distance is positively correlated with the first difference.
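One way to make the third distance grow with the first difference is to inflate the fourth distance by a factor that increases with the difference. The formula below is a hypothetical choice made only to show the positive correlation; the embodiments do not give it, and the `weight` parameter and the use of the frame center are likewise assumptions.

```python
import math


def third_distance(person_point, frame_center, first_difference, weight=1.0):
    """Scale-aware distance: the plain distance inflated by the scale difference."""
    fourth_distance = math.hypot(
        frame_center[0] - person_point[0], frame_center[1] - person_point[1]
    )
    # Assumed form: a non-negative difference can only increase the distance.
    return fourth_distance * (1.0 + weight * first_difference)


same_scale = third_distance((0.0, 0.0), (3.0, 4.0), first_difference=0.0)
different_scale = third_distance((0.0, 0.0), (3.0, 4.0), first_difference=0.5)
```

A point and a frame that are close in pixels but differ strongly in scale are thus less likely to be matched as the same person.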

In some embodiments of the present invention, the device further includes a second processing unit 13, which is configured to: determine a second person point according to the position of the second person frame; and determine the midpoint between the first person point and the second person point to obtain a third person point. The acquisition unit 11 is further configured to acquire a first scale index, where the first scale index represents a mapping between a first size and a second size, the first size is the size of a first reference object located at a first scale position, the second size is the real-world size of the first reference object, and the first scale position is the position of the third person point in the image to be processed; and to obtain the first difference according to the first scale index.
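The construction of the third person point can be sketched as follows, assuming that the second person point is the center of the second person frame; the embodiments only say it is determined from the frame's position, so the center is an illustrative choice and the function names are hypothetical.

```python
def second_person_point(frame):
    """Assumed second person point: the center of a frame given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = frame
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def third_person_point(first_point, frame):
    """Midpoint between the first person point and the second person point."""
    sx, sy = second_person_point(frame)
    return ((first_point[0] + sx) / 2.0, (first_point[1] + sy) / 2.0)


midpoint = third_person_point((10.0, 20.0), (30.0, 40.0, 50.0, 60.0))
```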

In some embodiments of the present invention, the acquisition unit 11 is configured to: perform object detection processing on the image to be processed to obtain a first object frame and a second object frame; obtain a first length according to the length of the first object frame in the y-axis direction, and obtain a second length according to the length of the second object frame in the y-axis direction, where the y-axis is the vertical axis of the pixel coordinate system of the image to be processed; obtain a second scale index according to the first length and a third length of a first object in the real world, and obtain a third scale index according to the second length and a fourth length of a second object in the real world, where the first object is the detection target contained in the first object frame, the second object is the detection target contained in the second object frame, the second scale index represents a mapping between a third size and a fourth size, the third size is the size of a second reference object located at a second scale position, the fourth size is the real-world size of the second reference object, the second scale position is a position determined in the image to be processed according to the position of the first object frame, the third scale index represents a mapping between a fifth size and a sixth size, the fifth size is the size of a third reference object located at a third scale position, the sixth size is the real-world size of the third reference object, and the third scale position is a position determined in the image to be processed according to the position of the second object frame; perform curve fitting on the second scale index and the third scale index to obtain a scale index map of the image to be processed, where a first pixel value in the scale index map represents a mapping between a seventh size and an eighth size, the seventh size is the size of a fourth reference object located at a fourth scale position, the eighth size is the real-world size of the fourth reference object, the first pixel value is the pixel value of a first pixel, the fourth scale position is the position of a second pixel in the image to be processed, and the position of the first pixel in the scale index map is the same as the position of the second pixel in the image to be processed; and obtain the first scale index according to the scale index map and the position of the third person point.

In some embodiments, the functions or modules of the device provided by the embodiments of the present invention can be used to execute the methods described in the method embodiments above. For their specific implementation, refer to the description of those method embodiments; for brevity, details are not repeated here.

An embodiment of the present invention also provides a computer-readable storage medium storing computer program instructions that, when executed by a processor, implement any one of the above image processing methods. The computer-readable storage medium may be a non-volatile computer-readable storage medium.

An embodiment of the present invention also provides a computer program including computer-readable code. When the computer-readable code runs in an electronic device, a processor in the electronic device executes instructions for implementing any one of the above image processing methods.

An embodiment of the present invention also provides an electronic device. FIG. 9 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present invention. The electronic device 2 may include a processor 21 and a memory 22. The memory 22 is used to store computer program code that includes computer instructions. When the processor 21 executes the computer instructions, the electronic device 2 executes any one of the above image processing methods.

The processor 21 may be one or more graphics processing units (GPUs). When the processor 21 is a single GPU, the GPU may be a single-core GPU or a multi-core GPU. Optionally, the processor 21 may be a processor group composed of multiple GPUs, with the processors coupled to one another through one or more buses. In some embodiments, the processor may also be another type of processor, which is not limited in the embodiments of the present invention. The memory 22 can be used to store computer program instructions and various kinds of computer program code, including the program code used to execute the solution of the present invention. Optionally, the memory includes but is not limited to random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used to store related instructions and data.

In some optional implementations, the electronic device 2 may further include an input device 23 and/or an output device 24. The processor 21, the memory 22, the input device 23, and the output device 24 are coupled through a connector, which includes various interfaces, transmission lines, buses, and the like; this is not limited in the embodiments of the present invention. It should be understood that, in the embodiments of the present invention, coupling refers to interconnection in a specific manner, including direct connection or indirect connection through other devices, for example through various interfaces, transmission lines, or buses. The input device 23 is used to input data and/or signals, and the output device 24 is used to output data and/or signals. The input device 23 and the output device 24 may be separate devices or a single integrated device.

It can be understood that, in the embodiments of the present invention, the memory 22 can be used to store not only related instructions but also related data. For example, the memory 22 can store the first position and the second position obtained through the input device 23, or the third position obtained by the processor 21. The embodiments of the present invention do not limit the specific data stored in the memory.

It can be understood that FIG. 9 shows only a simplified design of an electronic device. In practical applications, the electronic device may also contain other necessary components, including but not limited to any number of input/output devices, processors, and memories, and all electronic devices that can implement the embodiments of the present invention fall within the scope of protection of the present invention.

A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or in software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here. Those skilled in the art will also clearly understand that the description of each embodiment of the present invention has its own emphasis. For convenience and brevity, the same or similar parts may not be repeated in different embodiments; therefore, for parts that are not described, or not described in detail, in one embodiment, refer to the descriptions of the other embodiments.

In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect coupling or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid-state drive).

A person of ordinary skill in the art will understand that all or part of the processes in the method embodiments above can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, and optical discs.

301, 302: steps
501~503: steps
601~605: steps
1: image processing device
11: acquisition unit
12: first processing unit
13: second processing unit
2: electronic device
21: processor
22: memory
23: input device
24: output device
71: image to be processed

To describe the technical solutions in the embodiments of the present invention or the background art more clearly, the drawings used in the embodiments of the present invention or the background art are described below. The drawings here are incorporated into and form part of this specification; they show embodiments consistent with the present invention and are used together with the specification to explain the technical solutions of the present invention.
FIG. 1 is a schematic diagram of a crowd image provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a pixel coordinate system provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of an image provided by an embodiment of the present invention;
FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present invention;
FIG. 6 is a schematic flowchart of yet another image processing method provided by an embodiment of the present invention;
FIG. 7A is a schematic diagram of a sign provided by an embodiment of the present invention;
FIG. 7B is a schematic diagram of an application scenario provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an image processing device provided by an embodiment of the present invention;
FIG. 9 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present invention.


Claims

An image processing method, the method comprising: obtaining a first position of a first person point in an image to be processed and a second position of a first person frame in the image to be processed, the first position and the second position both being used to characterize the position of a first person in the image to be processed; and obtaining a third position of the first person in the image to be processed according to the first position and the second position, including: when the first person is at a near location in the image to be processed, obtaining the third position according to the second position; and when the first person is at a far location in the image to be processed, determining the third position according to the first position.

The method according to claim 1, further comprising: obtaining a first confidence of the first position and a second confidence of the second position, where the first confidence is negatively correlated with the scale at the first position and the second confidence is positively correlated with the scale at the second position; wherein the obtaining the third position of the first person in the image to be processed according to the first position and the second position includes: taking whichever of the first position and the second position has the highest confidence as a fourth position; and obtaining the third position according to the fourth position.

The method according to claim 1 or 2, wherein the obtaining the first position of the first person point in the image to be processed and the second position of the first person frame in the image to be processed includes: performing person positioning processing on the image to be processed to obtain the first position and the position of at least one person frame; determining, according to the first position and the position of the at least one person frame, the distance between the at least one person frame and the first person point to obtain at least one first distance; and taking the person frame corresponding to a second distance as the first person frame and determining the second position of the first person frame in the image to be processed, where the second distance is a distance among the at least one first distance that does not exceed a distance threshold.

The method according to claim 3, wherein the performing character positioning processing on the image to be processed to obtain the first position includes: performing character positioning processing on the image to be processed to obtain the position of at least one character point; and taking the position with the highest confidence among the positions of the at least one character point as the first position.

The method according to claim 3, wherein the at least one character frame includes a second character frame; the at least one first distance includes a third distance between the first character point and the second character frame; and the determining, according to the first position and the position of the at least one character frame, the distance between the at least one character frame and the first character point to obtain at least one first distance includes: obtaining a fourth distance between the first character point and the second character frame according to the first position and the position of the second character frame; determining the difference between a first scale and a second scale to obtain a first difference, wherein the first scale is the scale of the first character point in the image to be processed and the second scale is the scale of the second character frame in the image to be processed; and obtaining the third distance according to the fourth distance and the first difference, wherein the third distance is positively correlated with the first difference.
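
As an illustration of a scale-aware distance, the sketch below inflates the geometric (fourth) distance by the scale difference. The claim only requires that the third distance grow with the first difference; the multiplicative form and the `weight` parameter are hypothetical.

```python
def third_distance(fourth_distance: float, first_scale: float,
                   second_scale: float, weight: float = 1.0) -> float:
    """Combine a geometric distance with a scale difference.

    Assumed form: fourth_distance * (1 + weight * |first_scale - second_scale|),
    which increases with the scale difference for any positive distance and weight.
    """
    first_difference = abs(first_scale - second_scale)
    return fourth_distance * (1.0 + weight * first_difference)
```

With this form, a point and a frame that are close in the image but differ strongly in scale (and therefore probably belong to different people) end up with a larger third distance and are less likely to be matched.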

The method according to claim 5, further comprising: determining a second character point according to the position of the second character frame; determining the midpoint between the first character point and the second character point to obtain a third character point; and obtaining a first scale index, wherein the first scale index represents the mapping between a first size and a second size; the first size is the size of a first reference object located at a first scale position; the second size is the size of the first reference object in the real world; and the first scale position is the position of the third character point in the image to be processed; wherein the determining the difference between the first scale and the second scale to obtain the first difference includes: obtaining the first difference according to the first scale index.
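
The construction of the third character point and the lookup of the first scale index can be sketched as follows. Taking the second character point as the frame's geometric center is one of the options the claims allow; the `scale_index_at` callable is a hypothetical stand-in for whatever provides a scale index at an image position.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]
# Assumed frame layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Frame = Tuple[float, float, float, float]


def frame_center(frame: Frame) -> Point:
    x_min, y_min, x_max, y_max = frame
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def midpoint(a: Point, b: Point) -> Point:
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def first_scale_index(first_point: Point, second_frame: Frame,
                      scale_index_at: Callable[[Point], float]) -> float:
    """Look up the scale index at the midpoint of the point and the frame center."""
    second_point = frame_center(second_frame)          # second character point
    third_point = midpoint(first_point, second_point)  # third character point
    return scale_index_at(third_point)
```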

The method according to claim 6, wherein the obtaining the first scale index includes: performing object detection processing on the image to be processed to obtain a first object frame and a second object frame; obtaining a first length according to the length of the first object frame in the y-axis direction, and obtaining a second length according to the length of the second object frame in the y-axis direction, wherein the y-axis is the vertical axis of the pixel coordinate system of the image to be processed; obtaining a second scale index according to the first length and a third length of a first object in the real world, and obtaining a third scale index according to the second length and a fourth length of a second object in the real world, wherein the first object is the detection object contained in the first object frame and the second object is the detection object contained in the second object frame; the second scale index represents the mapping between a third size and a fourth size; the third size is the size of a second reference object located at a second scale position; the fourth size is the size of the second reference object in the real world; the second scale position is a position determined in the image to be processed according to the position of the first object frame; the third scale index represents the mapping between a fifth size and a sixth size; the fifth size is the size of a third reference object located at a third scale position; the sixth size is the size of the third reference object in the real world; the third scale position is a position determined in the image to be processed according to the position of the second object frame; performing curve fitting processing on the second scale index and the third scale index to obtain a scale index map of the image to be processed, wherein a first pixel value in the scale index map represents the mapping between a seventh size and an eighth size; the seventh size is the size of a fourth reference object located at a fourth scale position; the eighth size is the size of the fourth reference object in the real world; the first pixel value is the pixel value of a first pixel; the fourth scale position is the position of a second pixel in the image to be processed; and the position of the first pixel in the scale index map is the same as the position of the second pixel in the image to be processed; and obtaining the first scale index according to the scale index map and the position of the third character point.
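
The scale index map can be pictured with the sketch below. Several choices in it are assumptions rather than the claimed method: the scale index is taken as pixel length divided by real-world length, each scale position is the frame's geometric center, the curve fit is a straight line in the row coordinate y through the two samples, and the scale is assumed constant along each image row.

```python
from typing import List, Tuple

# Assumed frame layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Frame = Tuple[float, float, float, float]


def scale_sample(frame: Frame, real_length: float) -> Tuple[float, float]:
    """Return (center_y, scale_index) for one detected object.

    Assumed scale index: frame height in pixels divided by real-world length.
    """
    _, y_min, _, y_max = frame
    return ((y_min + y_max) / 2.0, (y_max - y_min) / real_length)


def build_scale_index_map(width: int, height: int,
                          first_frame: Frame, first_real_length: float,
                          second_frame: Frame, second_real_length: float) -> List[List[float]]:
    """Fit scale = slope * y + intercept through two samples and rasterize it."""
    y1, s1 = scale_sample(first_frame, first_real_length)
    y2, s2 = scale_sample(second_frame, second_real_length)
    if y1 == y2:
        raise ValueError("the two objects must lie at different image rows")
    slope = (s2 - s1) / (y2 - y1)
    intercept = s1 - slope * y1
    # Every pixel in a row shares that row's fitted scale index.
    return [[slope * row + intercept for _ in range(width)] for row in range(height)]


def lookup(scale_map: List[List[float]], x: float, y: float) -> float:
    """Read the scale index at the pixel nearest to (x, y), clamped to the map."""
    row = min(max(int(round(y)), 0), len(scale_map) - 1)
    col = min(max(int(round(x)), 0), len(scale_map[0]) - 1)
    return scale_map[row][col]
```

For instance, two objects of real-world length 2 whose frames span rows 1 to 3 and rows 4 to 8 give scale indices 1.0 at row 2 and 2.0 at row 6, so the fitted map holds 1.5 at row 4. With more than two detected objects, a least-squares fit could replace the two-point line.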

The method according to claim 7, wherein the second scale position is the position of a first object point in the image to be processed; the third scale position is the position of a second object point in the image to be processed; the first object point is one of the following: the geometric center of the first object frame, or a vertex of the first object frame; and the second object point is one of the following: the geometric center of the second object frame, or a vertex of the second object frame.

The method according to claim 6, wherein the second character point is one of the following: the geometric center of the second character frame, or a vertex of the second character frame.

The method according to claim 1 or 2, wherein the pixel area covered by the first character point and the pixel area included in the first character frame are both human head areas.

The method according to claim 1 or 2, wherein the shape of the first character frame is a rectangle, a diamond, a circle, an ellipse, or a polygon.

An electronic device, comprising: a processor and a memory, wherein the memory is used to store computer program code, the computer program code includes computer instructions, and when the processor executes the computer instructions, the electronic device performs the method described in any one of claims 1 to 11.

A computer-readable storage medium in which a computer program is stored, wherein the computer program includes program instructions, and when the program instructions are executed by a processor, the processor is caused to execute the method described in any one of claims 1 to 11.