WO2021017882A1 - Image coordinate system conversion method, apparatus, device, and storage medium - Google Patents

Image coordinate system conversion method, apparatus, device, and storage medium Download PDF

Info

Publication number
WO2021017882A1
WO2021017882A1 · PCT/CN2020/102493
Authority
WO
WIPO (PCT)
Prior art keywords
camera
target object
detection
key points
adjacent cameras
Prior art date
Application number
PCT/CN2020/102493
Other languages
English (en)
French (fr)
Inventor
黄湘琦 (HUANG Xiangqi)
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to JP2021545821A (patent JP7266106B2)
Publication of WO2021017882A1
Priority to US17/373,768 (patent US11928800B2)

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N 7/181 CCTV systems for receiving images from a plurality of remote sources
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20221 Image fusion; Image merging
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30204 Marker
    • G06T 2207/30244 Camera pose

Definitions

  • The embodiments of the present application relate to the field of computer vision technology, and in particular to a method, apparatus, device, and storage medium for converting an image coordinate system.
  • Multiple cameras are usually deployed in a large-scale video surveillance scene. By analyzing and processing the images captured by the cameras, the conversion relationship between the image coordinate systems of different cameras can be obtained.
  • At present, the related art usually adopts Zhang's calibration method for the conversion between the image coordinate systems of different cameras.
  • In the related art, a checkerboard used for calibration is first placed on a fixed plane, then multiple groups of checkerboard feature points are detected and a transformation model is computed, yielding the conversion relationship between the checkerboard coordinate system and the camera coordinate system; finally, the image coordinate systems of different cameras are converted to the same checkerboard coordinate system.
  • The embodiments of the present application provide a method, apparatus, device, and storage medium for converting an image coordinate system, which help improve the efficiency of processing images captured by cameras and are applicable to large-scale video surveillance scenarios.
  • the technical solution is as follows:
  • An embodiment of the present application provides a method for converting an image coordinate system, applied to a computer device, the method including:
  • acquiring video images collected by adjacent cameras, the adjacent cameras including a first camera and a second camera having an overlapping shooting area on the ground plane;
  • identifying, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, where each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point being the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N being an integer greater than or equal to 3; and
  • calculating, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras.
  • An embodiment of the present application provides an image coordinate system conversion apparatus, the apparatus including:
  • a video acquisition module, configured to acquire video images collected by adjacent cameras, the adjacent cameras including a first camera and a second camera having an overlapping shooting area on the ground plane;
  • a detection and recognition module, configured to identify, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, where each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point being the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N being an integer greater than or equal to 3; and
  • a relationship calculation module, configured to calculate, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras.
  • An embodiment of the present application provides a computer device including a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above-mentioned image coordinate system conversion method.
  • An embodiment of the present application provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above-mentioned image coordinate system conversion method.
  • The embodiments of the present application provide a computer program product which, when executed by a processor, is used to implement the above-mentioned image coordinate system conversion method.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
  • FIG. 2 is a flowchart of a method for converting an image coordinate system provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of an image coordinate system conversion method provided by an embodiment of the present application;
  • FIG. 4 is a block diagram of an image coordinate system conversion apparatus provided by an embodiment of the present application;
  • FIG. 5 is a block diagram of an image coordinate system conversion apparatus provided by another embodiment of the present application;
  • FIG. 6 is a structural block diagram of a computer device provided by an embodiment of the present application.
  • Artificial Intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
  • In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can respond in a manner similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology.
  • Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • Computer Vision (CV) is a science that studies how to make machines "see". More specifically, it refers to using cameras and computers instead of human eyes to identify, track, and measure targets, and further performing graphics processing so that the processed images are more suitable for human observation or for transmission to instruments for detection.
  • As a scientific discipline, computer vision studies related theories and technologies, attempting to establish artificial intelligence systems that can obtain information from images or multi-dimensional data.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also includes common biometric recognition technologies such as face recognition and fingerprint recognition.
  • FIG. 1 shows a schematic diagram of an implementation environment involved in an embodiment of the present application.
  • the implementation environment may include: a camera 10 and a computer device 20.
  • the camera 10 is used to capture images within its field of view to generate a video stream.
  • multiple cameras 10 are arranged at different positions of a certain real scene 30, and each camera 10 is used to monitor a part of the real scene 30 to obtain a corresponding video stream.
  • The computer device 20 refers to a device capable of processing and storing data, such as a PC (Personal Computer), a server, or another electronic device with computing capability, which is not limited in the embodiments of the present application.
  • the computer device 20 can receive the video streams of multiple cameras 10, and can decode the video streams to form an image, and then perform subsequent processing, such as calculating the conversion relationship between the image coordinate systems of the two cameras.
  • the camera 10 and the computer device 20 can communicate in a wired or wireless manner.
  • The data transmission between the camera 10 and the computer device 20 can be carried out device-to-device (Ad-Hoc), or under the coordination of a base station or a wireless access point (AP), which is not limited in the embodiments of the present application.
  • FIG. 2 shows a flowchart of a method for converting an image coordinate system provided by an embodiment of the present application.
  • This method can be applied to the computer device in the implementation environment shown in FIG. 1.
  • the method can include the following steps (201-203):
  • Step 201 Obtain video images collected by adjacent cameras.
  • The adjacent cameras include a first camera and a second camera having an overlapping shooting area on the ground plane. If two cameras are arranged adjacently and their shooting areas overlap on the ground plane, the two cameras are adjacent cameras.
  • the computer device may decode the video stream collected by the first camera and the video stream collected by the second camera to obtain multiple frames of video images collected by the first camera and multiple frames of video images collected by the second camera.
  • To capture people or objects passing under the cameras as completely as possible, the frame rate of the video stream collected by a camera should not be too low; for example, the frame rate should be greater than or equal to 25 frames per second, which is not limited in the embodiments of the present application.
  • the computer device may also align the time of the first camera and the second camera, that is, keep the time of the first camera and the second camera synchronized.
  • the computer device may align the time of the first camera and the time of the second camera with the standard time respectively. In this way, when the subsequent key point detection is performed, the accuracy of the extracted key points in the time domain can be ensured.
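  • As an illustrative sketch only (not part of the patent text): once the two cameras' clocks are aligned to a standard time, synchronized frame pairs can be selected by nearest timestamp. The function name and the 20 ms tolerance below are assumptions, the tolerance being half the frame interval of the 25 frames-per-second example above.

```python
def nearest_synced_frames(ts_cam1, ts_cam2, tolerance=0.02):
    """Pair frames of two time-aligned cameras whose timestamps (in seconds)
    differ by at most `tolerance`. Both timestamp lists must be sorted."""
    pairs, j = [], 0
    for i, t1 in enumerate(ts_cam1):
        # Advance j while the next timestamp in camera 2 is at least as close.
        while j + 1 < len(ts_cam2) and abs(ts_cam2[j + 1] - t1) <= abs(ts_cam2[j] - t1):
            j += 1
        if ts_cam2 and abs(ts_cam2[j] - t1) <= tolerance:
            pairs.append((i, j))  # frame i of camera 1 matches frame j of camera 2
    return pairs
```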
  • Step 202 Identify N groups of key points where the target object is located on the ground plane from the video images collected by the adjacent cameras.
  • The computer device first uses object detection techniques to detect target objects in the video images.
  • Optionally, the computer device may use methods such as SSD (Single Shot MultiBox Detector) or the YOLO (You Only Look Once) series to detect the target object, which is not limited in the embodiments of this application.
  • After detecting the target object, the computer device uses object tracking techniques to track it.
  • Optionally, the computer device may use correlation filtering algorithms such as KCF (Kernelized Correlation Filters) or tracking algorithms based on deep neural networks (such as a Siamese network) to track the target object, which is not limited in the embodiments of this application.
  • A target object refers to an object detected by the computer device in the video images collected by the adjacent cameras.
  • The target object may include one object or multiple objects.
  • In the embodiments of the present application, an object may be a movable physical object such as a pedestrian, an animal, or a means of transport (e.g., a vehicle), that is, a dynamic object, or an immovable physical object such as a rock, a tree, or a building, that is, a static object.
  • Optionally, the dynamic object may be an autonomously moving object, such as a pedestrian or a mobile robot, or a non-autonomously moving object, such as a remote-controlled racing car or a vehicle.
  • After the computer device detects the target object, it uses key point detection techniques to detect the key points of the target object.
  • Optionally, the computer device may use deep neural network algorithms such as Mask R-CNN to detect the key points of the target object, which is not limited in the embodiment of the present application.
  • The computer device uses key point detection techniques to identify the N groups of key points at which the target object is located on the ground plane, where N is an integer greater than or equal to 3.
  • Each group of key points includes the first key point extracted from the video image of the first camera and the second key point extracted from the video image of the second camera, and the first key point and the second key point are the same feature point of the same target object that appears in the aforementioned adjacent cameras at the same moment.
  • Optionally, the N groups of key points may all come from video images of the same target object at N different moments, may all come from video images of N different target objects at the same moment, or may partly come from video images of the same target object at different moments and partly from video images of different target objects at the same moment.
  • In addition, the N groups of key points may all come from the above-mentioned dynamic objects, may all come from the above-mentioned static objects, or may partly come from the dynamic objects and partly from the static objects.
  • The embodiments of the present application do not limit the specific way the N groups of key points are acquired.
  • Step 203 Calculate the conversion relationship between the image coordinate systems of adjacent cameras according to the above N groups of key points.
  • the image coordinate system of the camera refers to the coordinate system of the image captured by the camera.
  • The conversion relationship between the image coordinate systems of adjacent cameras refers to the conversion relationship of an object's position coordinates between the image coordinate systems of the adjacent cameras.
  • Assuming that the camera imaging process conforms to the pinhole imaging model and that the captured video images are free of distortion, the property that the projection of the physical-world ground plane into a camera image satisfies an affine transformation can be used to deduce that the mapping between the ground-plane portions of the images of adjacent cameras with an overlapping ground-plane shooting area also satisfies an affine transformation.
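  • Written out, the affine model referred to here maps a ground-plane point $(x, y)$ in the first camera's image coordinate system to $(x', y')$ in the second camera's:

$$
\begin{pmatrix} x' \\ y' \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} +
\begin{pmatrix} t_x \\ t_y \end{pmatrix}.
$$

  • The model has six unknown parameters, and each matched pair of key points contributes two equations, which is why at least N = 3 non-collinear groups of key points are needed.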
  • the computer device can model the conversion relationship between the image coordinate systems corresponding to adjacent cameras according to the above N groups of key points, and obtain a mathematical model for characterizing the conversion relationship.
  • the mathematical model may be an affine transformation model, and the affine transformation model is used to transform the position coordinates of the above-mentioned object between the image coordinate systems of adjacent cameras.
  • Optionally, to eliminate possible interference key points, the RANSAC (Random Sample Consensus) algorithm may be used to estimate the parameters of the mathematical model.
  • The interference key points refer to the M groups of key points that do not fit the mathematical model with the smallest error, where M is a natural number.
  • For example, the computer device obtains 100 groups of key points from the adjacent cameras and randomly selects 3 of the 100 groups to compute the mathematical model; the remaining 97 groups of key points are used to compute the error of the resulting model, and the computer device finally selects the mathematical model with the smallest mean error or error variance for operations such as parameter estimation.
  • When the computer device estimates the parameters of the mathematical model, the RANSAC algorithm can be used, so that the M groups of key points whose error with respect to the model exceeds an error threshold are eliminated, making the estimated parameters more accurate.
  • The error threshold is a value set according to actual application requirements; for example, in scenarios with high accuracy requirements the error threshold is small, which is not limited in the embodiments of the present application.
  • In a possible implementation, the foregoing step 203 may specifically be: calculating, according to the foregoing N groups of key points, an affine transformation matrix between the image coordinate systems of the adjacent cameras, the affine transformation matrix being used to represent the conversion relationship between the image coordinate systems of the adjacent cameras.
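  • As an illustration only, such an affine transformation matrix could be estimated from the N matched key points with OpenCV's built-in RANSAC-based estimator. This is a minimal sketch under the assumption that the matched key points are already available as arrays; the function name is ours, not the patent's.

```python
import cv2
import numpy as np

def estimate_transfer_matrix(first_pts, second_pts, err_thresh=3.0):
    """Estimate the 2x3 affine matrix mapping key points from the first
    camera's image coordinate system to the second camera's.

    first_pts, second_pts: (N, 2) arrays of matched key points, N >= 3 and
    not collinear. err_thresh plays the role of the error threshold used by
    RANSAC to reject interference key points."""
    first_pts = np.asarray(first_pts, dtype=np.float32)
    second_pts = np.asarray(second_pts, dtype=np.float32)
    matrix, inlier_mask = cv2.estimateAffine2D(
        first_pts, second_pts,
        method=cv2.RANSAC, ransacReprojThreshold=err_thresh)
    return matrix, inlier_mask  # mask entries of 0 mark rejected key points
```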
  • After step 203, the method may further include: for any object detected and tracked in the video images of the first camera, calculating the object's position coordinates in the image coordinate system corresponding to the second camera according to the object's position coordinates in the image coordinate system corresponding to the first camera and the conversion relationship.
  • For example, in an application scenario with multiple cameras, denoted camera 1, camera 2, ..., camera N, where every pair of neighboring cameras has an overlapping ground-plane shooting area, the computer device can calculate a pedestrian's position coordinates in the image coordinate system of camera 2 from the pedestrian's position coordinates in the image coordinate system of camera 1, then continue to calculate the position coordinates in the image coordinate system of camera 3 from those in camera 2, and so on, calculating the position coordinates in the image coordinate system of camera N from those in camera N-1, finally completing the conversion of the above-mentioned object between cameras whose ground-plane shooting areas do not overlap.
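  • For the camera 1 to camera N relay described above, the pairwise affine matrices can simply be composed; a minimal sketch, assuming each 2x3 matrix was estimated as in the previous snippet:

```python
import numpy as np

def convert_point(point_xy, affine_2x3):
    """Apply one 2x3 affine matrix to a position in image coordinates."""
    x, y = point_xy
    return affine_2x3 @ np.array([x, y, 1.0])

def relay_point(point_xy, pairwise_affines):
    """Map a position through camera 1 -> 2 -> ... -> N, given the list of
    affine matrices between consecutive adjacent cameras."""
    p = np.asarray(point_xy, dtype=np.float64)
    for affine in pairwise_affines:
        p = convert_point(p, affine)
    return p  # position coordinates in the last camera's image coordinate system
```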
  • The technical solutions provided by the embodiments of the present application model the conversion relationship between the image coordinate systems corresponding to adjacent cameras from N groups of key points extracted from the video images captured by the adjacent cameras, solving the time- and labor-consuming problem in the related art of having to manually place a calibration checkerboard.
  • This application obtains key point recognition results through target object tracking and key point recognition; based on these results, the conversion relationships between the image coordinate systems corresponding to different cameras can be obtained.
  • The entire process can be completed autonomously by the computer device without manual participation, which helps improve the efficiency of processing images captured by cameras and is applicable to large-scale video surveillance scenarios.
  • In an exemplary embodiment, the computer device identifies the N groups of key points at which the target object is located on the ground plane from the video images collected by the adjacent cameras through the following steps:
  • 1. Perform target detection and tracking separately on the video images collected by the adjacent cameras to obtain the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera.
  • The detection and tracking result corresponding to the first camera refers to the detection and tracking result of the target object in the first camera and may include information such as the target object's position, appearance features, and timestamp; the detection and tracking result corresponding to the second camera refers to the detection and tracking result of the target object in the second camera and may likewise include information such as the target object's position, appearance features, and timestamp.
  • For the video stream collected by the first camera, the computer device can detect and track the target object in every frame of the video stream, or once every several frames; for example, once every 5 frames, that is, in the 1st, 6th, 11th, 16th, and subsequent video frames.
  • Likewise, for the video stream collected by the second camera, the computer device can also detect and track the target object in every frame of the video stream, or once every several frames.
  • If the computer device detects and tracks the target object once every several frames, the interval selected when processing the video stream of the first camera is the same as the interval selected when processing the video stream of the second camera.
  • 2. Screen out standard target objects according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera.
  • A standard target object refers to the same target object that appears in the adjacent cameras at the same moment. Taking pedestrian A as an example: if, at the same moment, pedestrian A appears in both the first camera and the second camera, pedestrian A can serve as a standard target object.
  • In a possible implementation, the computer device screens out the standard target object as follows:
  • (1) According to the detection and tracking result corresponding to the first camera, obtain the appearance features of a first target object detected and tracked in a first video image collected by the first camera.
  • (2) According to the detection and tracking result corresponding to the second camera, obtain the appearance features of a second target object detected and tracked in a second video image collected by the second camera.
  • The appearance features reflect characteristics of the target object such as color, shape, and texture. For example, the appearance features of the target object are obtained by performing feature extraction on the image region corresponding to the target object in the video image. Taking a pedestrian as an example, the appearance features can be obtained using person re-identification and/or face recognition techniques; the embodiments of the present application do not limit the specific way the appearance features are acquired.
  • In addition, the first video image and the second video image are video images collected by the adjacent cameras at the same moment.
  • Steps (1) and (2) can be executed simultaneously or sequentially, for example, step (1) before step (2), or step (2) before step (1), which is not limited in the embodiments of this application.
  • (3) Calculate the similarity between the appearance features of the first target object and the appearance features of the second target object.
  • The similarity is used to characterize the degree of resemblance between the appearance features of the first target object and those of the second target object.
  • Optionally, the similarity is calculated using the following steps:
  • (3-1) Calculate the distance value between the k-dimensional appearance features included in the detection and tracking result of the first target object and the k-dimensional appearance features included in the detection and tracking result of the second target object, where k is a positive integer;
  • (3-2) Determine the similarity between the two appearance features according to the distance value between the k-dimensional appearance features. The distance value may be represented by a cosine distance or a Euclidean distance; optionally, it is represented by a non-normalized Euclidean distance, which makes the numerical value of the similarity more intuitive.
  • In addition, the computer device may directly determine the distance value as the similarity, or convert the distance value into the similarity based on a preset conversion rule, which is not limited in the embodiments of the present application.
  • (4) If the similarity is greater than a similarity threshold, determine that the first target object and the second target object are the standard target object; if the similarity is less than the similarity threshold, the first target object and the second target object are eliminated.
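  • A minimal sketch of the similarity computation, assuming a 1/(1 + d) rule for turning a distance value into a similarity (the text only states that some preset conversion rule may be used):

```python
import numpy as np

def appearance_similarity(feat_a, feat_b, use_cosine=False):
    """Similarity between two k-dimensional appearance feature vectors."""
    feat_a = np.asarray(feat_a, dtype=np.float64)
    feat_b = np.asarray(feat_b, dtype=np.float64)
    if use_cosine:
        # Cosine distance: 1 minus the cosine of the angle between the vectors.
        d = 1.0 - feat_a @ feat_b / (np.linalg.norm(feat_a) * np.linalg.norm(feat_b))
    else:
        # Non-normalized Euclidean distance, as preferred in the text above.
        d = np.linalg.norm(feat_a - feat_b)
    return 1.0 / (1.0 + d)  # assumed conversion rule: larger means more similar
```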
  • 3. Perform key point detection on the standard target object to obtain the N groups of key points.
  • Performing key point detection on the standard target object refers to detecting the position of each key point of the standard target object.
  • In the embodiments of the present application, because what is calculated is the conversion relationship between the image coordinate systems of adjacent cameras, the key points at which the standard target object is located on the ground plane are mainly detected, to improve the accuracy of the calculation.
  • Taking a pedestrian as an example, the key points can include foot key points, the center point of the line connecting the two feet, or key points of other parts.
  • Taking a rock as an example, its key point can be the center point of the surface where it intersects the ground plane.
  • In the embodiments of the present application, the N groups of key points are not collinear, so as to ensure that the N groups of key points can define a plane.
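  • As an illustrative sketch only: if a pose estimator returns per-joint image coordinates in COCO keypoint order, the pedestrian's ground-plane key point can be taken as the midpoint of the two ankles. The ankle indices below follow the COCO convention, an assumption that must be adjusted to whatever key point model is actually used.

```python
import numpy as np

LEFT_ANKLE, RIGHT_ANKLE = 15, 16  # COCO keypoint indices (assumed convention)

def biped_center(keypoints_xy):
    """Center point of the line connecting a pedestrian's two feet, given an
    array of per-joint (x, y) image coordinates."""
    kp = np.asarray(keypoints_xy, dtype=np.float64)
    return (kp[LEFT_ANKLE] + kp[RIGHT_ANKLE]) / 2.0
```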
  • Optionally, to make the selected N groups of key points more reliable, after the N groups of key points are obtained the method further includes: for each group of key points, obtaining the confidence corresponding to the key point, and eliminating the key point if its confidence is less than a confidence threshold.
  • The confidence corresponding to a key point is used to indicate the degree of credibility of the key point.
  • The confidence corresponding to a key point can be given at the same time as, or after, the key point detection is performed on the standard target object, which is not limited in the embodiments of the present application.
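  • The confidence screening then amounts to dropping any group in which either side's key point is not credible enough; a minimal sketch, with the threshold value left as a tunable assumption:

```python
def filter_by_confidence(groups, conf_thresh=0.5):
    """Keep only groups whose first and second key points are both credible.

    groups: iterable of (first_xy, first_conf, second_xy, second_conf) tuples.
    conf_thresh: assumed value; the text leaves the threshold unspecified."""
    return [
        (first_xy, second_xy)
        for first_xy, first_conf, second_xy, second_conf in groups
        if first_conf >= conf_thresh and second_conf >= conf_thresh
    ]
```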
  • Optionally, to avoid data mismatches and improve the accuracy of the conversion-relationship calculation between the adjacent cameras, before the standard target object is screened out based on the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the method further includes: screening out, according to those detection and tracking results, a first video image and a second video image that meet a condition.
  • Optionally, the condition includes that the number of target objects detected and tracked in the first video image is 1 and that the number of target objects detected and tracked in the second video image is also 1; that is, frames in which the first video image or the second video image contains multiple people are excluded, which further avoids data mismatches.
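  • The mismatch-avoidance condition above amounts to keeping only synchronized frame pairs in which each camera sees exactly one target; a minimal sketch, assuming detections are grouped per timestamp:

```python
def select_single_target_pairs(dets_cam1, dets_cam2):
    """Keep only moments at which both cameras detect exactly one target.

    dets_cam1, dets_cam2: dicts mapping a timestamp to the list of detections
    in that frame (the detection format itself does not matter here)."""
    shared = set(dets_cam1) & set(dets_cam2)
    return [
        (t, dets_cam1[t][0], dets_cam2[t][0])
        for t in sorted(shared)
        if len(dets_cam1[t]) == 1 and len(dets_cam2[t]) == 1
    ]
```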
  • In summary, when extracting the N groups of key points of the adjacent cameras, the embodiments of the present application comprehensively consider the appearance features of the target object and the confidence of the key points, so that the obtained N groups of key points are more reliable, improving the accuracy of the conversion relationship calculated from the N groups of key points.
  • FIG. 4 shows a block diagram of an image coordinate system conversion device provided by an embodiment of the present application.
  • The apparatus 400 has the functions of implementing the above method embodiments, and the functions can be realized by hardware or by hardware executing corresponding software.
  • The apparatus 400 may be the computer device introduced above, or may be disposed in the computer device.
  • the device 400 may include: a video acquisition module 410, a detection and recognition module 420, and a relationship calculation module 430.
  • the video acquisition module 410 is configured to acquire video images collected by adjacent cameras, and the adjacent cameras include a first camera and a second camera with an overlapping area for shooting on the ground plane.
  • The detection and recognition module 420 is configured to identify, from the video images collected by the adjacent cameras, N groups of key points at which the target object is located on the ground plane; each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, and the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment.
  • The N is an integer greater than or equal to 3.
  • the relationship calculation module 430 is configured to calculate the conversion relationship between the image coordinate systems of the adjacent cameras according to the N groups of key points.
  • In an exemplary embodiment, the detection and recognition module 420 includes: a detection and tracking sub-module 421, a standard screening sub-module 422, and a key point detection sub-module 423.
  • the detection and tracking sub-module 421 is configured to perform target detection and tracking on the video images collected by the adjacent cameras to obtain detection and tracking results corresponding to the first camera and detection and tracking results corresponding to the second camera.
  • The standard screening sub-module 422 is configured to screen out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the standard target object being the same target object that appears in the adjacent cameras at the same moment.
  • the key point detection sub-module 423 is configured to perform key point detection on the standard target object to obtain the N groups of key points.
  • The standard screening sub-module 422 is configured to:
  • obtain, according to the detection and tracking result corresponding to the first camera, the appearance features of a first target object detected and tracked in a first video image collected by the first camera;
  • obtain, according to the detection and tracking result corresponding to the second camera, the appearance features of a second target object detected and tracked in a second video image collected by the second camera, the first video image and the second video image being video images collected by the adjacent cameras at the same moment;
  • calculate the similarity between the appearance features of the first target object and the appearance features of the second target object; and
  • if the similarity is greater than a similarity threshold, determine that the first target object and the second target object are the standard target object.
  • In an exemplary embodiment, the detection and recognition module 420 further includes:
  • an image screening sub-module 424, configured to screen out, according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, a first video image and a second video image that meet a condition, the condition including that the number of target objects detected and tracked in the first video image is 1 and that the number of target objects detected and tracked in the second video image is also 1.
  • In an exemplary embodiment, the key point detection sub-module 423 is configured to, when the standard target object is a pedestrian, extract the center point of the line connecting the two feet of the standard target object to obtain the N groups of key points.
  • In an exemplary embodiment, the detection and recognition module 420 further includes:
  • the key point screening sub-module 425 is configured to obtain the confidence level corresponding to the key point for each group of key points; if the confidence level corresponding to the key point is less than the confidence level threshold, the key point is eliminated.
  • the N sets of key points come from video images of the same target object at N different moments.
  • the relationship calculation module 430 is configured to calculate the affine transformation matrix between the image coordinate systems of the adjacent cameras according to the N groups of key points.
  • the device 400 further includes:
  • The coordinate calculation module 440 is configured to, for any object detected and tracked in the video images of the first camera, calculate the object's position coordinates in the image coordinate system corresponding to the second camera according to the object's position coordinates in the image coordinate system corresponding to the first camera and the conversion relationship.
  • The technical solutions provided by the embodiments of the present application model the conversion relationship between the image coordinate systems corresponding to adjacent cameras from N groups of key points extracted from the video images captured by the adjacent cameras, solving the time- and labor-consuming problem in the related art of having to manually place a calibration checkerboard.
  • This application obtains key point recognition results through target object tracking and key point recognition; based on these results, the conversion relationships between the image coordinate systems corresponding to different cameras can be obtained.
  • The entire process can be completed autonomously by the computer device without manual participation, which helps improve the efficiency of processing images captured by cameras and is applicable to large-scale video surveillance scenarios.
  • It should be noted that, when the apparatus provided in the embodiments of the present application implements its functions, the division into the above-mentioned functional modules is merely used as an example for description.
  • In practical applications, the above-mentioned functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the apparatus and method embodiments provided by the above-mentioned embodiments belong to the same concept, and the specific implementation process is detailed in the method embodiments, which will not be repeated here.
  • FIG. 6 shows a structural block diagram of a computer device provided by an embodiment of the present application.
  • the computer device can be used to implement the image coordinate system conversion method provided in the foregoing embodiment.
  • the computer device may be the computer device 20 in the implementation environment shown in FIG. 1. Specifically:
  • The computer device 600 includes a processing unit 601 (such as a central processing unit (CPU), a graphics processing unit (GPU), or a field-programmable gate array (FPGA)), a system memory 604 including a random access memory (RAM) 602 and a read-only memory (ROM) 603, and a system bus 605 connecting the system memory 604 and the central processing unit 601.
  • The computer device 600 also includes a basic input/output system (I/O system) 606 that helps transfer information between the various components within the computer device, and a mass storage device 607 for storing an operating system 613, application programs 614, and other program modules 612.
  • the basic input/output system 606 includes a display 608 for displaying information and an input device 609 such as a mouse and a keyboard for the user to input information.
  • the display 608 and the input device 609 are both connected to the central processing unit 601 through the input and output controller 610 connected to the system bus 605.
  • the basic input/output system 606 may also include an input and output controller 610 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus.
  • the input and output controller 610 also provides output to a display screen, a printer, or other types of output devices.
  • the mass storage device 607 is connected to the central processing unit 601 through a mass storage controller (not shown) connected to the system bus 605.
  • the mass storage device 607 and its associated computer readable medium provide non-volatile storage for the computer device 600. That is, the mass storage device 607 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • the computer-readable medium may include computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices.
  • Certainly, those skilled in the art know that the computer storage media are not limited to the above. The above-mentioned system memory 604 and mass storage device 607 may be collectively referred to as the memory.
  • According to the embodiments of the present application, the computer device 600 may also be connected to a remote computer on a network, such as the Internet, to run. That is, the computer device 600 can be connected to the network 612 through the network interface unit 611 connected to the system bus 605, or the network interface unit 611 can be used to connect to other types of networks or remote computer systems (not shown).
  • The memory also includes at least one instruction, at least one program, a code set, or an instruction set, which is stored in the memory and configured to be executed by one or more processors to implement the above-mentioned image coordinate system conversion method.
  • In an exemplary embodiment, a computer-readable storage medium is also provided, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to realize the above-mentioned image coordinate system conversion method.
  • A computer program product is also provided; when the computer program product is executed by a processor, it is used to implement the above-mentioned image coordinate system conversion method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The embodiments of this application disclose an image coordinate system conversion method, apparatus, device, and storage medium, belonging to the field of computer vision technology. The method includes: acquiring video images collected by adjacent cameras; identifying, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane; and calculating, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras. In the technical solution provided by the embodiments of this application, the conversion relationship between the image coordinate systems corresponding to adjacent cameras is modeled from N groups of key points extracted from the video images captured by the adjacent cameras, which solves the time- and labor-consuming problem in the related art of having to manually place a calibration checkerboard, helps improve the efficiency of processing images captured by cameras, and is applicable to large-scale video surveillance scenarios.

Description

Image coordinate system conversion method, apparatus, device, and storage medium
This application claims priority to Chinese Patent Application No. 201910704514.8, entitled "Image coordinate system conversion method, apparatus, device, and storage medium" and filed on July 31, 2019, the entire content of which is incorporated herein by reference.
Technical Field
The embodiments of this application relate to the field of computer vision technology, and in particular to an image coordinate system conversion method, apparatus, device, and storage medium.
Background
Multiple cameras are usually deployed in a large-scale video surveillance scene. By analyzing and processing the images captured by the cameras, the conversion relationship between the image coordinate systems of different cameras can be obtained.
At present, the related art usually adopts Zhang's calibration method for the conversion between the image coordinate systems of different cameras. In the related art, a checkerboard used for calibration is first placed on a fixed plane, then multiple groups of checkerboard feature points are detected and a transformation model is computed, yielding the conversion relationship between the checkerboard coordinate system and the camera coordinate system; finally, the image coordinate systems of different cameras are converted to the same checkerboard coordinate system.
For the conversion between the image coordinate systems of different cameras, the related art requires manually placing the calibration checkerboard. When the processing involves images captured by many cameras, this is time- and labor-consuming and is not suitable for large-scale video surveillance scenarios.
Summary
The embodiments of this application provide an image coordinate system conversion method, apparatus, device, and storage medium, which help improve the efficiency of processing images captured by cameras and are applicable to large-scale video surveillance scenarios. The technical solutions are as follows:
In one aspect, the embodiments of this application provide an image coordinate system conversion method, applied to a computer device, the method including:
acquiring video images collected by adjacent cameras, the adjacent cameras including a first camera and a second camera having an overlapping shooting area on the ground plane;
identifying, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, where each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N is an integer greater than or equal to 3; and
calculating, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras.
In another aspect, the embodiments of this application provide an image coordinate system conversion apparatus, the apparatus including:
a video acquisition module, configured to acquire video images collected by adjacent cameras, the adjacent cameras including a first camera and a second camera having an overlapping shooting area on the ground plane;
a detection and recognition module, configured to identify, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, where each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N is an integer greater than or equal to 3; and
a relationship calculation module, configured to calculate, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras.
In still another aspect, the embodiments of this application provide a computer device, including a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the foregoing image coordinate system conversion method.
In yet another aspect, the embodiments of this application provide a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the foregoing image coordinate system conversion method.
In a further aspect, the embodiments of this application provide a computer program product which, when executed by a processor, is used to implement the foregoing image coordinate system conversion method.
The technical solutions provided by the embodiments of this application can bring the following beneficial effects:
By modeling the conversion relationship between the image coordinate systems corresponding to adjacent cameras from N groups of key points extracted from the video images captured by the adjacent cameras, the time- and labor-consuming problem in the related art of having to manually place a calibration checkerboard is solved. This application obtains key point recognition results through target object tracking and key point recognition, and from these results the conversion relationships between the image coordinate systems corresponding to different cameras can be obtained. The entire process can be completed autonomously by a computer device without manual participation, which helps improve the efficiency of processing images captured by cameras and is applicable to large-scale video surveillance scenarios.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the accompanying drawings in the following description show merely some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of this application;
FIG. 2 is a flowchart of an image coordinate system conversion method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of an image coordinate system conversion method provided by an embodiment of this application;
FIG. 4 is a block diagram of an image coordinate system conversion apparatus provided by an embodiment of this application;
FIG. 5 is a block diagram of an image coordinate system conversion apparatus provided by another embodiment of this application;
FIG. 6 is a structural block diagram of a computer device provided by an embodiment of this application.
Detailed Description
Artificial Intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, with both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include several major directions: computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer Vision (CV) is a science that studies how to make machines "see". More specifically, it refers to using cameras and computers instead of human eyes to identify, track, and measure targets, and further performing graphics processing so that the processed images are more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies, attempting to establish artificial intelligence systems that can obtain information from images or multi-dimensional data. Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also includes common biometric recognition technologies such as face recognition and fingerprint recognition.
The solutions provided in the embodiments of this application relate to the computer vision technology of artificial intelligence and are specifically described through the following embodiments.
To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are further described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, which shows a schematic diagram of an implementation environment involved in an embodiment of this application, the implementation environment may include a camera 10 and a computer device 20.
The camera 10 is used to capture images within its field of view and generate a video stream. In the embodiments of this application, there are multiple cameras 10. For example, as shown in FIG. 1, multiple cameras 10 are arranged at different positions of a certain real scene 30, and each camera 10 monitors a part of the real scene 30 to obtain a corresponding video stream.
The computer device 20 refers to a device capable of processing and storing data, such as a PC (Personal Computer), a server, or another electronic device with computing capability, which is not limited in the embodiments of this application. The computer device 20 can receive the video streams of the multiple cameras 10, decode the video streams into images, and then perform subsequent processing, such as calculating the conversion relationship between the image coordinate systems of two cameras.
The camera 10 and the computer device 20 can communicate in a wired or wireless manner. For example, data transmission between the camera 10 and the computer device 20 can be carried out device-to-device (Ad-Hoc), or under the coordination of a base station or a wireless access point (AP), which is not limited in the embodiments of this application.
Referring to FIG. 2, which shows a flowchart of an image coordinate system conversion method provided by an embodiment of this application, the method can be applied to the computer device in the implementation environment shown in FIG. 1 and may include the following steps (201-203):
Step 201: Acquire video images collected by adjacent cameras.
In the embodiments of this application, the adjacent cameras include a first camera and a second camera having an overlapping shooting area on the ground plane. If two cameras are arranged adjacently and their shooting areas overlap on the ground plane, the two cameras are adjacent cameras.
In addition, the computer device may separately decode the video stream collected by the first camera and the video stream collected by the second camera to obtain multiple frames of video images collected by each camera.
In addition, to capture people or objects passing under the cameras as completely as possible, the frame rate of the video streams collected by the cameras should not be too low; for example, the frame rate should be greater than or equal to 25 frames per second, which is not limited in the embodiments of this application.
Optionally, the computer device may also align the times of the first camera and the second camera, that is, keep the times of the two cameras synchronized. For example, the computer device may align the time of the first camera and the time of the second camera with a standard time respectively. In this way, the accuracy in the time domain of each extracted group of key points can be ensured during subsequent key point detection.
Step 202: Identify, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane.
The computer device first uses object detection techniques to detect target objects in the video images. Optionally, the computer device may use methods such as SSD (Single Shot MultiBox Detector) or the YOLO (You Only Look Once) series to detect the target object, which is not limited in the embodiments of this application. Optionally, after detecting the target object, the computer device uses object tracking techniques to track it. Optionally, the computer device may use correlation filtering algorithms such as KCF (Kernelized Correlation Filters) or tracking algorithms based on deep neural networks (such as a Siamese network) to track the target object, which is not limited in the embodiments of this application.
A target object refers to an object detected by the computer device in the video images collected by the adjacent cameras; it may include one object or multiple objects. In the embodiments of this application, an object may be a movable physical object such as a pedestrian, an animal, or a means of transport (e.g., a vehicle), that is, a dynamic object, or an immovable physical object such as a rock, a tree, or a building, that is, a static object. Optionally, the dynamic object may be an autonomously moving object, such as a pedestrian or a mobile robot, or a non-autonomously moving object, such as a remote-controlled racing car or a vehicle.
After the computer device detects the target object, it uses key point detection techniques to perform key point detection on the target object. Optionally, the computer device may use deep neural network algorithms such as Mask R-CNN for key point detection, which is not limited in the embodiments of this application.
The computer device uses key point detection techniques to identify N groups of key points at which the target object is located on the ground plane, where N is an integer greater than or equal to 3. Each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, and the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment.
Optionally, the N groups of key points may all come from video images of the same target object at N different moments, may all come from video images of N different target objects at the same moment, or may partly come from video images of the same target object at different moments and partly from video images of different target objects at the same moment. Moreover, the N groups of key points may all come from the above-mentioned dynamic objects, may all come from the above-mentioned static objects, or may partly come from each. The embodiments of this application do not limit the specific way the N groups of key points are acquired.
Step 203: Calculate, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras.
The image coordinate system of a camera refers to the coordinate system of the images captured by that camera. The conversion relationship between the image coordinate systems of adjacent cameras refers to the conversion relationship of an object's position coordinates between the image coordinate systems of the adjacent cameras.
Assuming that the camera imaging process conforms to the pinhole imaging model and that the video images captured by the cameras are free of distortion, the property that the projection of the physical-world ground plane into a camera image satisfies an affine transformation can be used to deduce that the mapping between the ground-plane portions of the images of adjacent cameras with an overlapping ground-plane shooting area also satisfies an affine transformation. The computer device can model the conversion relationship between the image coordinate systems corresponding to the adjacent cameras according to the N groups of key points, obtaining a mathematical model that characterizes the conversion relationship.
Optionally, the mathematical model may be an affine transformation model, used to convert an object's position coordinates between the image coordinate systems of the adjacent cameras.
Optionally, to eliminate possible interference key points, the RANSAC (Random Sample Consensus) algorithm may be used to estimate the parameters of the mathematical model, the interference key points being the M groups of key points that do not fit the mathematical model with the smallest error, where M is a natural number. For example, the computer device obtains 100 groups of key points from the adjacent cameras, randomly selects 3 of the 100 groups to compute the mathematical model, and uses the remaining 97 groups to compute the error of the resulting model; finally, the computer device selects the mathematical model with the smallest mean error or error variance for operations such as parameter estimation. When the computer device estimates the parameters of the mathematical model, the RANSAC algorithm can be used, so that the M groups of key points whose error with respect to the model exceeds an error threshold are eliminated, making the estimated parameters more accurate. The error threshold is a value set according to actual application requirements; for example, where accuracy requirements are high, the error threshold is small, which is not limited in the embodiments of this application.
In a possible implementation, step 203 may specifically be: calculating, according to the N groups of key points, an affine transformation matrix between the image coordinate systems of the adjacent cameras, the affine transformation matrix being used to characterize the conversion relationship between the image coordinate systems of the adjacent cameras.
Optionally, after step 203, the method may further include: for any object detected and tracked in the video images of the first camera, calculating the object's position coordinates in the image coordinate system corresponding to the second camera according to the object's position coordinates in the image coordinate system corresponding to the first camera and the conversion relationship. For example, referring to FIG. 3, the application scenario contains multiple cameras 10, denoted camera 1, camera 2, ..., camera N, where every pair of neighboring cameras has an overlapping ground-plane shooting area: camera 1 overlaps with camera 2, camera 2 overlaps with camera 3, and so on. If there is a pedestrian in the overlapping area of camera 1 and camera 2, the pedestrian's position coordinates in the image coordinate system of camera 2 can be calculated from the pedestrian's position coordinates in the image coordinate system of camera 1 and the conversion relationship. Optionally, the computer device can continue to calculate the pedestrian's position coordinates in the image coordinate system of camera 3 from those in camera 2, and so on, calculating the coordinates in camera N from those in camera N-1, finally completing the conversion of the above-mentioned object between cameras whose ground-plane shooting areas do not overlap.
In summary, in the technical solution provided by the embodiments of this application, the conversion relationship between the image coordinate systems corresponding to adjacent cameras is modeled from N groups of key points extracted from the video images captured by the adjacent cameras, solving the time- and labor-consuming problem in the related art of having to manually place a calibration checkerboard. This application obtains key point recognition results through target object tracking and key point recognition; based on these results, the conversion relationships between the image coordinate systems corresponding to different cameras can be obtained. The entire process can be completed autonomously by a computer device without manual participation, which helps improve the efficiency of processing images captured by cameras and is applicable to large-scale video surveillance scenarios.
In an exemplary embodiment, the computer device identifies, from the video images collected by the adjacent cameras, the N groups of key points at which the target object is located on the ground plane through the following steps:
1. Perform target detection and tracking separately on the video images collected by the adjacent cameras to obtain the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera.
The detection and tracking result corresponding to the first camera refers to the detection and tracking result of the target object in the first camera and may include information such as the target object's position, appearance features, and timestamp; the detection and tracking result corresponding to the second camera refers to the detection and tracking result of the target object in the second camera and may likewise include such information.
For the video stream collected by the first camera, the computer device may perform detection and tracking of the target object in every frame of the video stream, or once every several frames; for example, once every 5 frames, that is, in the 1st, 6th, 11th, 16th, and subsequent frames. Likewise, for the video stream collected by the second camera, the computer device may perform detection and tracking in every frame or once every several frames. If the computer device performs detection and tracking once every several frames for both video streams, the interval selected when processing the first camera's stream is the same as the interval selected when processing the second camera's stream.
2. Screen out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera.
A standard target object refers to the same target object appearing in the adjacent cameras at the same moment. Taking pedestrian A as the target object for example: if, at the same moment, pedestrian A appears in both the first camera and the second camera, pedestrian A can serve as a standard target object.
In a possible implementation, the computer device screens out the standard target object as follows:
(1) According to the detection and tracking result corresponding to the first camera, obtain the appearance features of a first target object detected and tracked in a first video image collected by the first camera.
(2) According to the detection and tracking result corresponding to the second camera, obtain the appearance features of a second target object detected and tracked in a second video image collected by the second camera.
The appearance features reflect characteristics of the target object such as color, shape, and texture. For example, the appearance features of the target object are obtained by performing feature extraction on the image region corresponding to the target object in the video image. Taking a pedestrian as the target object for example, the appearance features can be obtained using person re-identification and/or face recognition techniques; the embodiments of this application do not limit the specific means of acquiring the appearance features. In addition, the first video image and the second video image are video images collected by the adjacent cameras at the same moment.
Steps (1) and (2) can be executed simultaneously or sequentially, for example, step (1) before step (2), or step (2) before step (1), which is not limited in the embodiments of this application.
(3) Calculate the similarity between the appearance features of the first target object and the appearance features of the second target object.
The similarity characterizes the degree of resemblance between the appearance features of the first target object and those of the second target object. Optionally, it is calculated through the following steps:
(3-1) Calculate the distance value between the k-dimensional appearance features included in the detection and tracking result of the first target object and the k-dimensional appearance features included in the detection and tracking result of the second target object, where k is a positive integer;
(3-2) Determine the similarity between the appearance features of the first target object and those of the second target object according to the distance value.
The similarity is determined from the distance value between the k-dimensional appearance features, and the distance value may be represented by a cosine distance or a Euclidean distance, among others. Optionally, the distance value is represented by a non-normalized Euclidean distance, which makes the similarity value more intuitive. In addition, the computer device may directly determine the distance value as the similarity, or convert the distance value into the similarity based on a preset conversion rule, which is not limited in the embodiments of this application.
(4) If the similarity is greater than a similarity threshold, determine that the first target object and the second target object are the standard target object.
Otherwise, if the similarity is less than the similarity threshold, the first target object and the second target object are eliminated.
3. Perform key point detection on the standard target object to obtain the N groups of key points.
Performing key point detection on the standard target object means detecting the positions of the key points of the standard target object. In the embodiments of this application, because what is calculated is the conversion relationship between the image coordinate systems of adjacent cameras, the key points at which the standard target object is located on the ground plane are mainly detected, to improve the accuracy of the calculation. Taking a pedestrian as the standard target object for example, the key points can include foot key points, the center point of the line connecting the two feet, or key points of other parts. Taking a rock as the standard target object for example, its key point can be the center point of the surface where it intersects the ground plane. In the embodiments of this application, the N groups of key points are not collinear, which ensures that the N groups of key points can define a plane.
Optionally, to make the selected N groups of key points more reliable, after the N groups of key points are obtained, the method further includes: for each group of key points, obtaining the confidence corresponding to the key point, and eliminating the key point if its confidence is less than a confidence threshold. The confidence corresponding to a key point indicates how credible the key point is, and it may be given at the same time as, or after, the key point detection is performed on the standard target object, which is not limited in the embodiments of this application.
Optionally, to avoid data mismatches and improve the accuracy of the conversion-relationship calculation between adjacent cameras, before the standard target object is screened out according to the detection and tracking results corresponding to the first and second cameras, the method further includes: screening out, according to those detection and tracking results, a first video image and a second video image that meet a condition. Optionally, the condition includes that the number of target objects detected and tracked in the first video image is 1 and the number detected and tracked in the second video image is also 1; that is, frames with multiple people in the first or second video image are excluded, which further avoids data mismatches.
In summary, when extracting the N groups of key points of adjacent cameras, the embodiments of this application comprehensively consider the appearance features of the target object and the confidence of the key points, so that the acquired N groups of key points are more reliable, improving the accuracy of the conversion relationship calculated from them.
The following are apparatus embodiments of this application, which can be used to perform the method embodiments of this application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of this application.
Referring to FIG. 4, which shows a block diagram of an image coordinate system conversion apparatus provided by an embodiment of this application: the apparatus 400 has the functions of implementing the foregoing method embodiments, and the functions may be implemented by hardware or by hardware executing corresponding software. The apparatus 400 may be the computer device described above, or may be disposed in the computer device. The apparatus 400 may include a video acquisition module 410, a detection and recognition module 420, and a relationship calculation module 430.
The video acquisition module 410 is configured to acquire video images collected by adjacent cameras, the adjacent cameras including a first camera and a second camera having an overlapping shooting area on the ground plane.
The detection and recognition module 420 is configured to identify, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, where each group of key points includes a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N is an integer greater than or equal to 3.
The relationship calculation module 430 is configured to calculate, according to the N groups of key points, the conversion relationship between the image coordinate systems of the adjacent cameras.
In an exemplary embodiment, referring to FIG. 5, the detection and recognition module 420 includes a detection and tracking sub-module 421, a standard screening sub-module 422, and a key point detection sub-module 423.
The detection and tracking sub-module 421 is configured to perform target detection and tracking separately on the video images collected by the adjacent cameras to obtain the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera.
The standard screening sub-module 422 is configured to screen out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the standard target object being the same target object appearing in the adjacent cameras at the same moment.
The key point detection sub-module 423 is configured to perform key point detection on the standard target object to obtain the N groups of key points.
In an exemplary embodiment, referring to FIG. 5, the standard screening sub-module 422 is configured to:
obtain, according to the detection and tracking result corresponding to the first camera, the appearance features of a first target object detected and tracked in a first video image collected by the first camera;
obtain, according to the detection and tracking result corresponding to the second camera, the appearance features of a second target object detected and tracked in a second video image collected by the second camera, the first video image and the second video image being video images collected by the adjacent cameras at the same moment;
calculate the similarity between the appearance features of the first target object and the appearance features of the second target object; and
if the similarity is greater than a similarity threshold, determine that the first target object and the second target object are the standard target object.
In an exemplary embodiment, referring to FIG. 5, the detection and recognition module 420 further includes:
an image screening sub-module 424, configured to screen out, according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, a first video image and a second video image that meet a condition, the condition including that the number of target objects detected and tracked in the first video image is 1 and the number of target objects detected and tracked in the second video image is also 1.
In an exemplary embodiment, referring to FIG. 5, the key point detection sub-module 423 is configured to, when the standard target object is a pedestrian, extract the center point of the line connecting the two feet of the standard target object to obtain the N groups of key points.
In an exemplary embodiment, referring to FIG. 5, the detection and recognition module 420 further includes:
a key point screening sub-module 425, configured to obtain, for each group of key points, the confidence corresponding to the key point, and to eliminate the key point if its confidence is less than a confidence threshold.
In an exemplary embodiment, the N groups of key points come from video images of the same target object at N different moments.
In an exemplary embodiment, referring to FIG. 5, the relationship calculation module 430 is configured to calculate, according to the N groups of key points, the affine transformation matrix between the image coordinate systems of the adjacent cameras.
In an exemplary embodiment, referring to FIG. 5, the apparatus 400 further includes:
a coordinate calculation module 440, configured to, for any object detected and tracked in the video images of the first camera, calculate the object's position coordinates in the image coordinate system corresponding to the second camera according to the object's position coordinates in the image coordinate system corresponding to the first camera and the conversion relationship.
In summary, in the technical solution provided by the embodiments of this application, the conversion relationship between the image coordinate systems corresponding to adjacent cameras is modeled from N groups of key points extracted from the video images captured by the adjacent cameras, solving the time- and labor-consuming problem in the related art of having to manually place a calibration checkerboard. This application obtains key point recognition results through target object tracking and key point recognition; based on these results, the conversion relationships between the image coordinate systems corresponding to different cameras can be obtained. The entire process can be completed autonomously by a computer device without manual participation, which helps improve the efficiency of processing images captured by cameras and is applicable to large-scale video surveillance scenarios.
It should be noted that, when the apparatus provided in the foregoing embodiments implements its functions, the division into the foregoing functional modules is merely used as an example for description. In practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments provided in the foregoing embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, and details are not repeated here.
Referring to FIG. 6, which shows a structural block diagram of a computer device provided by an embodiment of this application: the computer device can be used to implement the image coordinate system conversion method provided in the foregoing embodiments. For example, the computer device may be the computer device 20 in the implementation environment shown in FIG. 1. Specifically:
The computer device 600 includes a processing unit 601 (such as a central processing unit (CPU), a graphics processing unit (GPU), or a field-programmable gate array (FPGA)), a system memory 604 including a random access memory (RAM) 602 and a read-only memory (ROM) 603, and a system bus 605 connecting the system memory 604 and the central processing unit 601. The computer device 600 also includes a basic input/output system (I/O system) 606 that helps transfer information between components within the computer device, and a mass storage device 607 for storing an operating system 613, application programs 614, and other program modules 612.
The basic input/output system 606 includes a display 608 for displaying information and an input device 609, such as a mouse or a keyboard, for the user to input information. The display 608 and the input device 609 are both connected to the central processing unit 601 through an input/output controller 610 connected to the system bus 605. The basic input/output system 606 may also include the input/output controller 610 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus. Similarly, the input/output controller 610 also provides output to a display screen, a printer, or other types of output devices.
The mass storage device 607 is connected to the central processing unit 601 through a mass storage controller (not shown) connected to the system bus 605. The mass storage device 607 and its associated computer-readable media provide non-volatile storage for the computer device 600. That is, the mass storage device 607 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Certainly, those skilled in the art know that the computer storage media are not limited to the above. The system memory 604 and the mass storage device 607 may be collectively referred to as the memory.
According to the embodiments of this application, the computer device 600 may also be connected to a remote computer on a network, such as the Internet, to run. That is, the computer device 600 may be connected to the network 612 through the network interface unit 611 connected to the system bus 605, or the network interface unit 611 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes at least one instruction, at least one program, a code set, or an instruction set, which is stored in the memory and configured to be executed by one or more processors to implement the foregoing image coordinate system conversion method.
In an exemplary embodiment, a computer-readable storage medium is also provided, storing at least one instruction, at least one program, a code set, or an instruction set, which, when executed by a processor, implements the foregoing image coordinate system conversion method.
In an exemplary embodiment, a computer program product is also provided, which, when executed by a processor, is used to implement the foregoing image coordinate system conversion method.
It should be understood that "multiple" mentioned herein means two or more. "And/or" describes an association relationship of associated objects, indicating that three relationships may exist; for example, A and/or B may indicate the following three cases: A alone, both A and B, and B alone. The character "/" generally indicates an "or" relationship between the associated objects.
The foregoing descriptions are merely exemplary embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (15)

  1. An image coordinate system conversion method, applied to a computer device, the method comprising:
    acquiring video images collected by adjacent cameras, the adjacent cameras comprising a first camera and a second camera having an overlapping shooting area on the ground plane;
    identifying, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, wherein each group of key points comprises a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N is an integer greater than or equal to 3; and
    calculating, according to the N groups of key points, a conversion relationship between the image coordinate systems of the adjacent cameras.
  2. The method according to claim 1, wherein the identifying, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane comprises:
    performing target detection and tracking separately on the video images collected by the adjacent cameras to obtain a detection and tracking result corresponding to the first camera and a detection and tracking result corresponding to the second camera;
    screening out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the standard target object being the same target object appearing in the adjacent cameras at the same moment; and
    performing key point detection on the standard target object to obtain the N groups of key points.
  3. The method according to claim 2, wherein the screening out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera comprises:
    obtaining, according to the detection and tracking result corresponding to the first camera, appearance features of a first target object detected and tracked in a first video image collected by the first camera;
    obtaining, according to the detection and tracking result corresponding to the second camera, appearance features of a second target object detected and tracked in a second video image collected by the second camera, the first video image and the second video image being video images collected by the adjacent cameras at the same moment;
    calculating a similarity between the appearance features of the first target object and the appearance features of the second target object; and
    if the similarity is greater than a similarity threshold, determining that the first target object and the second target object are the standard target object.
  4. The method according to claim 3, wherein before the screening out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the method further comprises:
    screening out, according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the first video image and the second video image that meet a condition,
    wherein the condition comprises that the number of target objects detected and tracked in the first video image is 1, and the number of target objects detected and tracked in the second video image is also 1.
  5. The method according to claim 2, wherein the performing key point detection on the standard target object to obtain the N groups of key points comprises:
    when the standard target object is a pedestrian, extracting the center point of the line connecting the two feet of the standard target object to obtain the N groups of key points.
  6. The method according to claim 2, wherein before the calculating, according to the N groups of key points, a conversion relationship between the image coordinate systems of the adjacent cameras, the method further comprises:
    for each group of key points, obtaining a confidence corresponding to the key point; and
    eliminating the key point if the confidence corresponding to the key point is less than a confidence threshold.
  7. The method according to any one of claims 1 to 6, wherein the N groups of key points come from video images of the same target object at N different moments.
  8. The method according to any one of claims 1 to 6, wherein the calculating, according to the N groups of key points, a conversion relationship between the image coordinate systems of the adjacent cameras comprises:
    calculating, according to the N groups of key points, an affine transformation matrix between the image coordinate systems of the adjacent cameras.
  9. The method according to any one of claims 1 to 6, wherein after the calculating, according to the N groups of key points, a conversion relationship between the image coordinate systems of the adjacent cameras, the method further comprises:
    for any object detected and tracked in the video images of the first camera, calculating the position coordinates of the object in the image coordinate system corresponding to the second camera according to the position coordinates of the object in the image coordinate system corresponding to the first camera and the conversion relationship.
  10. An image coordinate system conversion apparatus, the apparatus comprising:
    a video acquisition module, configured to acquire video images collected by adjacent cameras, the adjacent cameras comprising a first camera and a second camera having an overlapping shooting area on the ground plane;
    a detection and recognition module, configured to identify, from the video images collected by the adjacent cameras, N groups of key points at which a target object is located on the ground plane, wherein each group of key points comprises a first key point extracted from a video image of the first camera and a second key point extracted from a video image of the second camera, the first key point and the second key point are the same feature point of the same target object appearing in the adjacent cameras at the same moment, and N is an integer greater than or equal to 3; and
    a relationship calculation module, configured to calculate, according to the N groups of key points, a conversion relationship between the image coordinate systems of the adjacent cameras.
  11. The apparatus according to claim 10, wherein the detection and recognition module comprises:
    a detection and tracking sub-module, configured to perform target detection and tracking separately on the video images collected by the adjacent cameras to obtain a detection and tracking result corresponding to the first camera and a detection and tracking result corresponding to the second camera;
    a standard screening sub-module, configured to screen out a standard target object according to the detection and tracking result corresponding to the first camera and the detection and tracking result corresponding to the second camera, the standard target object being the same target object appearing in the adjacent cameras at the same moment; and
    a key point detection sub-module, configured to perform key point detection on the standard target object to obtain the N groups of key points.
  12. The apparatus according to claim 11, wherein the standard screening sub-module is configured to:
    obtain, according to the detection and tracking result corresponding to the first camera, appearance features of a first target object detected and tracked in a first video image collected by the first camera; obtain, according to the detection and tracking result corresponding to the second camera, appearance features of a second target object detected and tracked in a second video image collected by the second camera, the first video image and the second video image being video images collected by the adjacent cameras at the same moment;
    calculate a similarity between the appearance features of the first target object and the appearance features of the second target object; and
    if the similarity is greater than a similarity threshold, determine that the first target object and the second target object are the standard target object.
  13. The apparatus according to claim 11, wherein the key point detection sub-module is configured to, when the standard target object is a pedestrian, extract the center point of the line connecting the two feet of the standard target object to obtain the N groups of key points.
  14. A computer device, comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the image coordinate system conversion method according to any one of claims 1 to 9.
  15. A computer-readable storage medium, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the image coordinate system conversion method according to any one of claims 1 to 9.
PCT/CN2020/102493 2019-07-31 2020-07-16 Image coordinate system conversion method, apparatus, device, and storage medium WO2021017882A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021545821A JP7266106B2 (ja) 2019-07-31 2020-07-16 Image coordinate system conversion method, apparatus, device, and computer program
US17/373,768 US11928800B2 (en) 2019-07-31 2021-07-12 Image coordinate system transformation method and apparatus, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910704514.8A CN110458895B (zh) 2019-07-31 2019-11-15 Image coordinate system conversion method, apparatus, device, and storage medium
CN201910704514.8 2019-07-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/373,768 Continuation US11928800B2 (en) 2019-07-31 2021-07-12 Image coordinate system transformation method and apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021017882A1 (zh)

Family

ID=68484414

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/102493 WO2021017882A1 (zh) 2019-07-31 2020-07-16 Image coordinate system conversion method, apparatus, device, and storage medium

Country Status (4)

Country Link
US (1) US11928800B2 (zh)
JP (1) JP7266106B2 (zh)
CN (1) CN110458895B (zh)
WO (1) WO2021017882A1 (zh)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458895B (zh) * 2019-07-31 2020-12-25 腾讯科技(深圳)有限公司 Image coordinate system conversion method, apparatus, device, and storage medium
CN112907454B (zh) * 2019-11-19 2023-08-08 杭州海康威视数字技术股份有限公司 Method and apparatus for acquiring images, computer device, and storage medium
KR20210073281A (ko) * 2019-12-10 2021-06-18 삼성전자주식회사 Method and apparatus for estimating motion information
CN113011445A (zh) * 2019-12-19 2021-06-22 斑马智行网络(香港)有限公司 Calibration method, recognition method, apparatus, and device
CN111126257B (zh) * 2019-12-23 2023-08-11 上海商汤智能科技有限公司 Behavior detection method and apparatus
CN113362392B (zh) * 2020-03-05 2024-04-23 杭州海康威视数字技术股份有限公司 Visible-area generation method and apparatus, computing device, and storage medium
CN111783724B (zh) * 2020-07-14 2024-03-26 上海依图网络科技有限公司 Target object recognition method and apparatus
CN112184787A (zh) * 2020-10-27 2021-01-05 北京市商汤科技开发有限公司 Image registration method and apparatus, electronic device, and storage medium
CN112528957A (zh) * 2020-12-28 2021-03-19 北京万觉科技有限公司 Method, system, and electronic device for detecting basic human-motion information
CN113516036B (zh) * 2021-05-08 2024-05-24 上海依图网络科技有限公司 Method and apparatus for detecting the number of target objects in a monitored area
CN113286086B (zh) * 2021-05-26 2022-02-18 南京领行科技股份有限公司 Camera usage control method and apparatus, electronic device, and storage medium
CN115294622B (zh) * 2022-06-15 2023-04-18 北京邮电大学 Speech-driven talking-head video synthesis enhancement method, system, and storage medium
CN117974417A (zh) * 2024-03-28 2024-05-03 腾讯科技(深圳)有限公司 AI chip, electronic device, and image processing method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6473536B1 (en) * 1998-09-18 2002-10-29 Sanyo Electric Co., Ltd. Image synthesis method, image synthesizer, and recording medium on which image synthesis program is recorded
CN101616310A (zh) * 2009-07-17 2009-12-30 清华大学 Target image stabilization method for a binocular vision system with variable viewing angle and resolution
CN101639747A (zh) * 2009-08-31 2010-02-03 广东威创视讯科技股份有限公司 Spatial three-dimensional positioning method
CN109740413A (zh) * 2018-11-14 2019-05-10 平安科技(深圳)有限公司 Pedestrian re-identification method and apparatus, computer device, and computer storage medium
CN110458895A (zh) * 2019-07-31 2019-11-15 腾讯科技(深圳)有限公司 Image coordinate system conversion method, apparatus, device, and storage medium
CN111091025A (zh) * 2018-10-23 2020-05-01 阿里巴巴集团控股有限公司 Image processing method, apparatus, and device

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5687249A (en) * 1993-09-06 1997-11-11 Nippon Telephone And Telegraph Method and apparatus for extracting features of moving objects
JP2006145419A (ja) 2004-11-22 2006-06-08 Univ Nihon Image processing method
JP2008046761A (ja) * 2006-08-11 2008-02-28 Sumitomo Electric Ind Ltd Moving object image processing system, apparatus, and method
US7821958B2 (en) * 2007-12-21 2010-10-26 Belair Networks Inc. Method for estimating and monitoring timing errors in packet data networks
CN101710932B (zh) 2009-12-21 2011-06-22 华为终端有限公司 Image stitching method and apparatus
CN101950426B (zh) * 2010-09-29 2014-01-01 北京航空航天大学 Vehicle relay tracking method in a multi-camera scene
JP5588812B2 (ja) * 2010-09-30 2014-09-10 日立オートモティブシステムズ株式会社 Image processing device and imaging device using the same
US8831290B2 (en) * 2012-08-01 2014-09-09 Mitsubishi Electric Research Laboratories, Inc. Method and system for determining poses of vehicle-mounted cameras for in-road obstacle detection
CN104729429B (zh) * 2015-03-05 2017-06-30 深圳大学 Calibration method for a telecentric-imaging three-dimensional topography measurement system
US10019806B2 (en) * 2015-04-15 2018-07-10 Sportsmedia Technology Corporation Determining x,y,z,t biomechanics of moving actor with multiple cameras
JP2016218849A (ja) 2015-05-22 2016-12-22 日本電信電話株式会社 Plane transformation parameter estimation device, method, and program
CN104994360B (zh) * 2015-08-03 2018-10-26 北京旷视科技有限公司 Video surveillance method and video surveillance system
US10681257B2 (en) * 2015-08-26 2020-06-09 Zhejiang Dahua Technology Co., Ltd. Methods and systems for traffic monitoring
US20170132476A1 (en) * 2015-11-08 2017-05-11 Otobrite Electronics Inc. Vehicle Imaging System
US10424070B2 (en) * 2016-04-21 2019-09-24 Texas Instruments Incorporated Methods and apparatus for structure from motion estimation
CN106709436B (zh) * 2016-12-08 2020-04-24 华中师范大学 Cross-camera suspicious pedestrian tracking system for rail transit panoramic surveillance
JP2018120283A (ja) 2017-01-23 2018-08-02 キヤノン株式会社 Information processing apparatus, information processing method, and program
US10467454B2 (en) * 2017-04-26 2019-11-05 Mashgin Inc. Synchronization of image data from multiple three-dimensional cameras for image recognition
DE102017207614A1 (de) * 2017-05-05 2018-11-08 Conti Temic Microelectronic Gmbh Device and method for calibrating a camera system of a vehicle
CN107845066B (zh) * 2017-10-09 2021-03-30 中国电子科技集团公司第二十八研究所 Urban remote sensing image stitching method and apparatus based on a piecewise affine transformation model
CN109040700A (zh) * 2018-09-10 2018-12-18 合肥巨清信息科技有限公司 Video stitching system based on a large-scene multi-GPU mode
CN110264572B (zh) * 2019-06-21 2021-07-30 哈尔滨工业大学 Terrain modeling method and system fusing geometric and mechanical properties


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI, ZHIHUA: "Continuous Target Tracking Based on Multiple Cameras", JOURNAL OF ELECTRONIC MEASUREMENT AND INSTRUMENTATION, 28 February 2009 (2009-02-28), pages 1 - 1, XP055777482, [retrieved on 20210218] *

Also Published As

Publication number Publication date
US11928800B2 (en) 2024-03-12
US20210342990A1 (en) 2021-11-04
CN110458895B (zh) 2020-12-25
JP2022542204A (ja) 2022-09-30
JP7266106B2 (ja) 2023-04-27
CN110458895A (zh) 2019-11-15

Similar Documents

Publication Publication Date Title
WO2021017882A1 (zh) Image coordinate system conversion method, apparatus, device, and storage medium
CN110428448B (zh) Target detection and tracking method, apparatus, device, and storage medium
CN108764024B (zh) Apparatus and method for generating a face recognition model, and computer-readable storage medium
CN108764048B (zh) Face key point detection method and apparatus
US10573018B2 (en) Three dimensional scene reconstruction based on contextual analysis
JP2022533309A (ja) Image-based localization
WO2016034059A1 (zh) Target object tracking method based on color-structure features
CN110428449B (zh) Target detection and tracking method, apparatus, device, and storage medium
CN110705478A (zh) Face tracking method, apparatus, device, and storage medium
CN107346414B (zh) Pedestrian attribute recognition method and apparatus
CN102243765A (zh) Multi-camera-based multi-target localization and tracking method and system
JP6590609B2 (ja) Image analysis apparatus and image analysis method
WO2021136386A1 (zh) Data processing method, terminal, and server
CN112200056B (zh) Face liveness detection method and apparatus, electronic device, and storage medium
WO2022095514A1 (zh) Image detection method and apparatus, electronic device, and storage medium
CN109934873B (zh) Annotated image acquisition method, apparatus, and device
JP6290760B2 (ja) Work similarity calculation method, apparatus, and program
JP6662382B2 (ja) Information processing apparatus and method, and program
WO2023016182A1 (zh) Pose determination method and apparatus, electronic device, and readable storage medium
CN113283408A (zh) Social distance monitoring method, apparatus, device, and medium based on surveillance video
US8164633B2 (en) Calibration apparatus and method for imaging devices and computer program
WO2022247126A1 (zh) Visual localization method, apparatus, device, medium, and program
JP6456244B2 (ja) Camera calibration method and apparatus
CN111079470A (zh) Face liveness detection method and apparatus
JP5217917B2 (ja) Object detection and tracking device, object detection and tracking method, and object detection and tracking program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20847824

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021545821

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20847824

Country of ref document: EP

Kind code of ref document: A1