WO2022160748A1 - Method and apparatus for video processing - Google Patents

Method and apparatus for video processing

Info

Publication number
WO2022160748A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
field
view
view frame
target
Prior art date
Application number
PCT/CN2021/120411
Other languages
English (en)
Chinese (zh)
Inventor
陈文明
邓高锋
张世明
吕周谨
倪世坤
Original Assignee
深圳壹秘科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹秘科技有限公司
Publication of WO2022160748A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30232 Surveillance

Definitions

  • The invention relates to the technical field of video processing, and in particular to video processing for portrait tracking.
  • The video images of one conference site are acquired through cameras, transmitted to the other conference site, and displayed on the display device of the other conference site.
  • The camera device of the venue needs to automatically track and focus on the participants.
  • Otherwise, the empty space occupies the screen and makes the picture of the participants smaller, which is not conducive to exchanges between the participants on the two sides.
  • the present application provides a video processing method and device that can automatically track participants in a conference venue.
  • A video processing method is provided, comprising: acquiring a sensor frame captured by a video sensor, where the sensor frame is the full image frame captured by the video sensor; detecting target frames in the sensor frame, where a target frame is a human body image frame and/or an image frame containing a human body in the sensor frame; determining a field of view frame according to the target frames, where the field of view frame is an image frame that includes all the target frames; determining all the target frames that can determine the boundary of the field of view frame, and determining whether all such target frames are stationary; and outputting the field of view frame when it is determined that all the target frames that can determine the boundary of the field of view frame are stationary.
  • A video processing device is provided, comprising: a video acquisition unit for acquiring a sensor frame captured by a video sensor, where the sensor frame is the full image frame captured by the video sensor; a humanoid capture unit for detecting target frames in the sensor frame, where a target frame is a human body image frame and/or an image frame containing a human body in the sensor frame; a video detection unit for determining a field of view frame according to the target frames, determining all the target frames that can determine the boundary of the field of view frame, and determining whether all such target frames are still, where the field of view frame is an image frame including all the target frames;
  • and an image processing unit that outputs the field of view frame when it is determined that all the target frames that can determine the boundary of the field of view frame are stationary.
  • The beneficial effect of the present application is that a complete image is acquired through the sensor, and the human body in each sensor frame is detected to determine the image range that needs to be displayed to the user, that is, the field of view frame.
  • The field of view frame is then output and displayed. Because each sensor frame is monitored in real time, position changes of the participants at the venue can be captured in real time.
  • When positions change, a new field of view frame is recalculated and output, thus enabling automatic, real-time tracking of participants in the venue.
  • FIG. 1 is a system architecture diagram of an application of an embodiment of the present application.
  • FIG. 2 is a flowchart of a video processing method according to Embodiment 1 of the present application.
  • FIG. 3 is a flowchart of specific steps of determining a field of view frame according to a target frame in Embodiment 1 of the present application.
  • FIG. 4 is a schematic diagram of extending up and down all target frames in Embodiment 1 of the present application.
  • FIG. 5 is a schematic diagram of a field of view frame in Embodiment 1 of the present application.
  • FIG. 6 is a flowchart of determining a target frame that can determine the boundary of the field of view frame in Embodiment 1 of the present application.
  • FIG. 7 is a schematic diagram of cropping a sensor frame to obtain a field of view frame in Embodiment 1 of the present application.
  • FIG. 8 is a schematic diagram of smoothing a video image in Embodiment 1 of the present application.
  • FIG. 9 is a schematic block diagram of a video processing apparatus according to Embodiment 2 of the present application.
  • FIG. 10 is a schematic structural diagram of a video processing apparatus according to Embodiment 3 of the present application.
  • the embodiments of the present application may be applied to various camera devices or systems, for example, a camera device, a network camera device, and a conference terminal of an audio-video conference, and the specific device or system is not limited by the embodiments of the present application.
  • FIG. 1 shows a system architecture diagram 100 applied by an embodiment of the present application.
  • the system architecture 100 includes: a camera device 110 , a main processing device 120 and a display device 130 .
  • The camera device 110, the main processing device 120, and the display device 130 may be communicatively connected through an electrical connection, a network connection, a communication connection, or the like.
  • the camera device 110 includes a video sensor for acquiring sensor frames. After the main processing device 120 processes the sensor frames, the field of view frame is sent to the display device 130 for display.
  • The camera device 110, the main processing device 120, and the display device 130 may be three mutually independent hardware entities. Alternatively, the camera device 110 and the main processing device 120 may be set in the same hardware entity; for example, a camera device may include, in addition to a video sensor, a device for processing video images. Alternatively, the main processing device 120 and the display device 130 may be set in the same hardware entity; for example, the display device 130 may include, in addition to the display, a device for processing video images.
  • the camera device 110 sends the acquired field of view frame to the display device 130, and the display device 130 processes the field of view frame before displaying it on the display.
  • The camera device 110 may be a camera; the display device 130 may be a display, a projector, a computer screen, etc.; and the main processing device 120 may be a processing device built into the camera device 110 or the display device 130, or an independent processing device such as a computer or other electronic device (for example, a mobile intelligent electronic device) that can communicate with the camera device 110 and the display device 130, respectively.
  • In the conference scene, the meeting place is fixed. In small and medium-sized conference venues, the camera can use a high-definition wide-angle lens to obtain images of the entire venue. As a result, the camera can capture every participant in real time.
  • For ease of description, the full image frame captured by the video sensor is referred to as the sensor frame; a human body image frame and/or an image frame containing a human body in the sensor frame is referred to as a target frame; and the image frame that includes all the target frames is called the field of view frame.
  • FIG. 2 shows a video processing method provided in Embodiment 1 of the present application.
  • the method can be applied to the camera device 110 with video processing capability, can be applied to the display device 130 with video processing capability, and can also be applied to the independent main processing device 120 .
  • the video processing method includes:
  • S210: Acquire a sensor frame captured by a video sensor, where the sensor frame is the full image frame captured by the video sensor. Optionally, the sensor frame is acquired from a high-definition wide-angle camera; for example, the lens part of the camera adopts a 4K lens (5 million pixels or more) together with a wide-angle lens, so that when there are many participants in a multi-person conference scene, all participants are included in the view range of the lens while the clarity of the video is still ensured. The sensor in the camera mainly converts the optical signal received by the lens into an electrical signal, and the electrical signal (i.e., the video signal) is then transmitted as real-time image frames to the main processing device 120.
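  • For illustration only, such a sensor frame could be acquired as in the following sketch; OpenCV, the device index 0, and the 3840x2160 resolution are assumptions of the example, not requirements of the method.

```python
import cv2  # assumed dependency of this illustration

# Open the wide-angle camera (device index 0 is an assumption) and request
# a 4K sensor frame, matching the "5 million pixels or more" guidance above.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 3840)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 2160)

ok, sensor_frame = cap.read()  # sensor_frame is the full image frame of S210
if not ok:
    raise RuntimeError("failed to read a sensor frame from the camera")
cap.release()
```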
  • S220: Detect target frames in the sensor frame, where a target frame is a human body image frame and/or an image frame containing a human body in the sensor frame. Optionally, methods for detecting a human body include but are not limited to face detection, upper body detection, lower body detection, human body pose estimation (SPPE, DensePose), and other methods. It should be noted that the human body referred to in this application may include the entire body of a person, or may refer to a part of the body, such as the face or the upper body.
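  • The method does not mandate a particular detector. As one hedged example, the sketch below uses OpenCV's bundled frontal-face Haar cascade to produce target frames as (x, y, w, h) rectangles; any of the detection methods listed above could be substituted.

```python
import cv2

# One possible detector for this step: OpenCV's bundled frontal-face Haar
# cascade. Any of the methods named above (face, upper/lower body, SPPE,
# DensePose) could be substituted; this choice is only an illustration.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_target_frames(sensor_frame):
    """Return target frames as a list of (x, y, w, h) rectangles."""
    gray = cv2.cvtColor(sensor_frame, cv2.COLOR_BGR2GRAY)
    return [tuple(r) for r in cascade.detectMultiScale(gray, 1.1, 5)]
```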
  • S240: Determine all the target frames that can determine the boundary of the field of view frame, and determine whether all such target frames are static;
  • The field of view frame may be directly displayed on the device running the method, or output to another display device for display by means of wireless or wired transmission.
  • S230: Determining a field of view frame according to the target frames includes:
  • As shown in FIG. 4, expand all target frames up and down by a height of a certain proportion, such as e*H, where e is the scale coefficient and H is the height of the corresponding target frame;
  • in this way, the range that needs to be displayed to the user is determined.
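  • A minimal sketch of this step, assuming target frames are (x, y, w, h) tuples and that e is the scale coefficient of FIG. 4 (the default value 0.2 is an assumed example, not taken from this text):

```python
def field_of_view_frame(target_frames, e=0.2):
    """Return the smallest frame (x0, y0, x1, y1) containing all target
    frames after each is expanded up and down by e*H (FIG. 4). Assumes at
    least one target frame; e=0.2 is an assumed example value."""
    x0 = min(x for x, y, w, h in target_frames)
    y0 = min(y - e * h for x, y, w, h in target_frames)
    x1 = max(x + w for x, y, w, h in target_frames)
    y1 = max(y + (1 + e) * h for x, y, w, h in target_frames)
    return (x0, y0, x1, y1)
```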
  • However, the field of view frame at this time may not conform to the required display size or display aspect ratio. The field of view frame can be further adjusted. Therefore, determining the field of view frame according to the target frames in S230 may further include the following adjustment mode 1 and/or adjustment mode 2.
  • In adjustment mode 1, step S230 further includes:
  • The preset maximum of the field of view frame is View_max, and the minimum width and height are W_min and H_min, respectively.
  • View_max is generally predefined as the size of the sensor original image.
  • W_min and H_min are set according to the local area of the sensor original image that needs to be enlarged; the smaller W_min and H_min are set, the smaller the local area that can be enlarged.
  • The coordinates of the field of view frame cannot exceed View_max, and the width/height values cannot be less than W_min/H_min.
  • The coordinates of the minimum frame View_O that exceed the boundary or fall short of the minimum size are corrected.
  • The field of view frame obtained after coordinate correction is marked as View_F.
  • The coordinates of the 4 vertices of the View_O frame must be within the range of the View_max coordinates; coordinates beyond the maximum boundary are replaced by the maximum boundary coordinates.
  • The width/height values of View_O must be greater than or equal to W_min/H_min. If the width/height of View_O is less than W_min/H_min, the width/height of View_O is supplemented to W_min/H_min.
  • Step S234 specifically includes: supplementing one half of the difference between the minimum height value of the field of view frame and the height value of the field of view frame to each of the upper and lower boundaries of the field of view frame. If the supplemented upper or lower boundary of the field of view frame exceeds the maximum boundary of the field of view frame, the coordinates beyond the maximum boundary are replaced by the maximum boundary coordinates, and the amount beyond the maximum boundary is supplemented to the opposite boundary.
  • Step S235 specifically includes: supplementing one half of the difference between the minimum width value of the field of view frame and the width value of the field of view frame to each of the left and right boundaries of the field of view frame. If the supplemented left or right boundary of the field of view frame exceeds the maximum boundary of the field of view frame, the coordinates beyond the maximum boundary are replaced by the maximum boundary coordinates, and the amount beyond the maximum boundary is supplemented to the opposite boundary.
  • Adjustment mode 2 adjusts the aspect ratio of the field of view frame. That is, step S230 further includes:
  • Step S236: Adjust the width value and/or the height value of the field of view frame according to the aspect ratio of the current video resolution.
  • The field of view frame obtained after adjustment in step S236 is marked as View.
  • This field of view frame is the field of view frame that is output and displayed to the user.
  • Adjustment mode 1 and adjustment mode 2 of the field of view frame can each be used alone, or both can be used together: first adjust the size with adjustment mode 1, and then adjust the aspect ratio with adjustment mode 2.
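  • The two adjustment modes could be sketched as follows, with View_max taken as the sensor image rectangle; the helper names and the rounding-free float coordinates are assumptions of the example:

```python
def clamp_view(view, sensor_w, sensor_h, w_min, h_min):
    """Adjustment mode 1: keep the view frame (x0, y0, x1, y1) inside
    View_max (taken here as the sensor image) and enforce W_min/H_min."""
    x0, y0, x1, y1 = view
    # Coordinates beyond the maximum boundary are replaced by it.
    x0, x1 = max(0.0, x0), min(float(sensor_w), x1)
    y0, y1 = max(0.0, y0), min(float(sensor_h), y1)

    def grow(lo, hi, minimum, limit):
        # S234/S235: supplement half of the deficit to each boundary; any
        # excess past the maximum boundary goes to the opposite boundary.
        deficit = minimum - (hi - lo)
        if deficit > 0:
            lo, hi = lo - deficit / 2, hi + deficit / 2
            if lo < 0:
                hi, lo = min(limit, hi - lo), 0.0
            if hi > limit:
                lo, hi = max(0.0, lo - (hi - limit)), float(limit)
        return lo, hi

    y0, y1 = grow(y0, y1, h_min, sensor_h)  # S234: height first
    x0, x1 = grow(x0, x1, w_min, sensor_w)  # S235: then width
    return (x0, y0, x1, y1)


def match_aspect(view, res_w, res_h):
    """Adjustment mode 2 (S236): pad the view frame to the aspect ratio of
    the current video resolution, e.g. res_w=1920, res_h=1080."""
    x0, y0, x1, y1 = view
    w, h = x1 - x0, y1 - y0
    if w * res_h < h * res_w:          # too narrow: pad the width
        pad = (h * res_w / res_h - w) / 2
        x0, x1 = x0 - pad, x1 + pad
    else:                              # too flat: pad the height
        pad = (w * res_h / res_w - h) / 2
        y0, y1 = y0 - pad, y1 + pad
    return (x0, y0, x1, y1)
```

  • Because mode 2 can pad the frame past the sensor boundary, in practice clamp_view might be re-applied after match_aspect, keeping the order given above (size first, then aspect ratio).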
  • Rect_ti is the set of target frames detected at time t_i.
  • determining the target frame that can determine the boundary of the field of view frame specifically includes:
  • A frame rect_j is removed from the set of target frames Rect_ti detected at time t_i to obtain a new set Rect'_ti = Rect_ti \ {rect_j};
  • a field of view frame View' is then calculated from Rect'_ti. If View' = View, the target frame rect_j does not affect the calculation result of the field of view frame; otherwise, if View' ≠ View, the target frame rect_j determines the boundary coordinates of the field of view frame.
  • DecisionRect_ti is the set of target frames that can determine the boundary of the field of view frame View at time t_i.
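  • This leave-one-out test could be sketched as follows; compute_view stands for any field-of-view computation, e.g. the field_of_view_frame helper from the earlier sketch:

```python
def boundary_targets(target_frames, compute_view):
    """Return DecisionRect_ti: the target frames that determine the boundary
    of the view frame. compute_view maps a list of target frames to a view
    frame, e.g. field_of_view_frame from the earlier sketch."""
    view = compute_view(target_frames)
    deciders = []
    for j, rect_j in enumerate(target_frames):
        rest = target_frames[:j] + target_frames[j + 1:]
        # If removing rect_j changes the computed view frame, then rect_j
        # determines the boundary coordinates of the view frame.
        if not rest or compute_view(rest) != view:
            deciders.append(rect_j)
    return deciders
```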
  • it is determined that all the target frames that can determine the boundary of the field of view frame are static, specifically including:
  • The motion factor Factor_12 of a target frame is determined as follows:
  • After the detection unit receives a sensor frame transmitted by the sensor, it performs real-time detection on the sensor frame. First, the human body is detected and the target frame containing the human body is framed, called target frame 1 here. Taking the upper left corner of the sensor frame as the coordinate origin (0,0), the coordinates of the center point C1 of target frame 1 are calculated as (x1, y1), with width W1 and height H1, and the result is saved.
  • After the detection unit receives the next sensor frame transmitted by the sensor, it also performs real-time detection on that frame. The same method is used to frame target frame 2 containing the human body, and the coordinates of the center point C2 of target frame 2 (x2, y2), the width W2, and the height H2 are saved.
  • The above is used to calculate the motion factor Factor_12 between two sensor frames (which can be the current frame and the previous frame, or the current frame and the next frame).
  • When the motion factors within a certain period of time T1 are all smaller than the threshold, the target frame is determined to be static; when a motion factor within T1 exceeds (for example, is greater than) the threshold, the target frame is determined to be moving.
  • The threshold value of the motion factor can be taken as 0.5; this is an empirical value, which will differ under different conditions.
  • The value of T1 ranges from 0 to 10 seconds. To focus on a person who is currently moving, T1 only needs to be small enough.
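  • The exact formula for Factor_12 is not reproduced in this text, so the sketch below assumes one plausible definition (center displacement plus size change, normalized by the frame size); only the 0.5 threshold and the period T1 come from the description above:

```python
import math
import time

def motion_factor(box1, box2):
    """Assumed motion factor Factor_12 between the same target frame in two
    sensor frames; the patent's exact formula is not reproduced in this
    text. A box is ((cx, cy), w, h) as saved by the detection unit."""
    (x1, y1), w1, h1 = box1
    (x2, y2), w2, h2 = box2
    shift = math.hypot(x2 - x1, y2 - y1) / max(w1, h1)  # center displacement
    resize = (abs(w2 - w1) + abs(h2 - h1)) / (w1 + h1)  # size change
    return shift + resize

class StillnessTracker:
    """Declare a target frame static once every motion factor observed over
    the last T1 seconds stayed below the threshold (0.5 is the empirical
    value cited above; T1 = 2 s is an assumed value in the 0-10 s range)."""
    def __init__(self, t1=2.0, threshold=0.5):
        self.t1, self.threshold = t1, threshold
        self.last_move = time.monotonic()

    def update(self, factor):
        now = time.monotonic()
        if factor >= self.threshold:
            self.last_move = now       # the target frame just moved
        return now - self.last_move >= self.t1  # True means static
```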
  • S250 may specifically include:
  • the sensor frame is cropped and scaled by invoking an ISP (Image Signal Processor) chip.
  • S250 specifically further includes:
  • S254 Update the field of view frame frame by frame according to the number of moving steps until the target field of view frame is reached.
  • the field of view frame of each frame of image moves according to a fixed step size to avoid moving too fast.
  • step_max: the maximum per-frame step size of the field of view frame coordinate values;
  • View_dist = (x_0, y_0, x_1, y_1): the difference coordinates between the target field of view frame and the current field of view frame;
  • MoveNum = max{x_0, y_0, x_1, y_1} / step_max: the number of moving steps;
  • View_step = View_dist / MoveNum: the per-frame increment of the field of view frame coordinates.
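  • Under the definitions above, the number of moving steps and the per-frame update could be computed as in this sketch (float coordinates and ceiling rounding of MoveNum are assumptions):

```python
import math

def plan_smooth_move(current, target, step_max):
    """Move the view frame from `current` to `target` in equal per-frame
    increments of at most step_max per coordinate, per the definitions
    above. View frames are (x0, y0, x1, y1) tuples of floats."""
    # View_dist: difference coordinates between target and current frame.
    view_dist = [t - c for t, c in zip(target, current)]
    # MoveNum = max{x0, y0, x1, y1} / step_max (rounded up, at least 1).
    move_num = max(1, math.ceil(max(abs(d) for d in view_dist) / step_max))
    # View_step = View_dist / MoveNum: per-frame coordinate increment.
    view_step = [d / move_num for d in view_dist]
    frames, view = [], list(current)
    for _ in range(move_num):  # S254: update the view frame frame by frame
        view = [v + s for v, s in zip(view, view_step)]
        frames.append(tuple(view))
    return frames
```

  • For example, with step_max = 32 sensor pixels, a pan of 400 pixels would be spread over 13 image frames instead of jumping at once.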
  • The cropping and/or scaling processing and the video image smoothing processing in the above S250 may be used together in practical applications; for example, the cropping and/or scaling processing is performed first, and then the video image smoothing processing is performed.
  • In the above video processing method, the image of the entire conference site is acquired by the sensor, and the human body in the sensor frame is detected to determine the image range that needs to be displayed to the user. By comparing the position change of the same target frame in each sensor frame, it is determined whether the target frame is in a static state. When it is determined that all the people in the venue who have an influence on the output field of view frame are in a still state, the field of view frame including all the human bodies is output and displayed. Since each sensor frame is monitored in real time, even if the positions of the participants change for some reason after they are seated,
  • the described video processing method can capture this change in real time, and after the participants are seated again, a new field of view frame is recalculated, output, and displayed for the user to watch. Because the above method does not need to control the rotation of the camera or refocus, but simply recalculates a new field of view frame from the sensor frame captured by the sensor and outputs it for display, it performs automatic, real-time tracking of the participants. An apparatus using the method can thus also be a plug-and-play device.
  • FIG. 9 shows a video processing apparatus 300 provided in Embodiment 2 of the present application. The video processing apparatus includes:
  • the video acquisition unit 310 is configured to acquire the sensor frame captured by the video sensor, where the sensor frame is the image frame of the entire frame captured by the video sensor; optionally, the video acquisition unit 310 acquires the sensor frame captured by the high-definition wide-angle camera ;
  • a humanoid capture unit 320 configured to detect a target frame in the sensor frame, where the target frame is a human body image frame and/or an image frame containing a human body in the sensor frame;
  • the video detection unit 330 is configured to determine a field of view frame according to the target frames, determine all the target frames that can determine the boundary of the field of view frame, and determine whether all such target frames are still, where the field of view frame is an image frame including all the target frames;
  • the image processing unit 340 outputs the view frame when it is determined that all the target frames that can determine the boundary of the view frame are stationary.
  • the video detection unit 330 is specifically configured to, when it is determined that the target frames are static, expand the heights of all the target frames up and down by a certain proportion, and
  • take the smallest frame that contains all the expanded target frames as the field of view frame.
  • the video detection unit 330 is further configured to: if the coordinates of the four vertices of the field of view frame exceed the maximum boundary coordinates of the field of view frame, replace them with the maximum boundary coordinates; and/or, if the height value of the field of view frame is less than the minimum height value of the field of view frame, adjust the height value of the field of view frame to the minimum height value; and/or, if the width value of the field of view frame is less than the minimum width value of the field of view frame, adjust the width value of the field of view frame to the minimum width value.
  • the video detection unit 330 is specifically used for the following:
  • if the height value of the field of view frame is smaller than the minimum height value of the field of view frame, add half of the difference between the minimum height value and the height value to each of the upper and lower boundaries of the field of view frame; if the supplemented upper or lower boundary exceeds the maximum boundary of the field of view frame, replace the coordinates that exceed the maximum boundary with the maximum boundary coordinates, and at the same time supplement the amount beyond the maximum boundary to the opposite boundary; and/or,
  • if the width value of the field of view frame is smaller than the minimum width value of the field of view frame, add half of the difference between the minimum width value and the width value to each of the left and right boundaries of the field of view frame; if the supplemented left or right boundary exceeds the maximum boundary of the field of view frame, replace the coordinates that exceed the maximum boundary with the maximum boundary coordinates, and at the same time supplement the amount beyond the maximum boundary to the opposite boundary.
  • the video detection unit 330 is further configured to adjust the width value and/or the height value of the field of view frame according to the aspect ratio of the current video resolution.
  • the video detection unit 330 configured to determine the target frame that can determine the boundary of the field of view frame, includes:
  • the video detection unit 330 is specifically configured to calculate and obtain a first field of view frame according to all target frames; delete one of the target frames; calculate and obtain a second field of view frame according to the remaining target frames; When it is not equal to the second field of view frame, it is determined that the target frame to be deleted is the target frame that can determine the boundary of the field of view frame.
  • the detection unit 330 determines by calculation whether a certain target frame is a target frame that can determine the boundary of the field of view frame, please refer to the description in Embodiment 1, which will not be repeated here.
  • the video detection unit 330 is configured to determine that all the target frames that can determine the boundary of the field of view frame are static, including:
  • the video detection unit 330 is specifically configured to determine that all the target frames that can determine the boundary of the field of view frame are in a stationary state when their motion factors within the preset time interval are all smaller than a preset threshold. For how the video detection unit 330 determines by calculation whether a certain target frame is in a static state, please refer to the specific description in Embodiment 1, which will not be repeated here.
  • the image processing unit 340 is specifically configured to, when it is determined that all the target frames that can determine the boundary of the field of view frame are stationary, crop and/or scale the sensor frame according to the field of view frame and output the field of view frame. Specifically,
  • the image processing unit 340 is specifically configured to, when it is determined that all the target frames that can determine the boundary of the field of view frame are stationary, crop and/or scale the sensor frame according to the field of view frame; calculate the difference coordinates between the target field of view frame and the current field of view frame; according to the preset maximum movement step size of the field of view frame for each frame of image, calculate the number of moving steps from the current field of view frame to the target field of view frame; and update the field of view frame frame by frame according to the number of moving steps until the target field of view frame is reached.
  • For how the image processing unit 340 gradually updates the current field of view frame until reaching the target field of view frame, please refer to the examples in S252 to S254 in Embodiment 1, which will not be repeated here.
  • the video processing device 300 may be a camera device with a built-in video processing function, such as the combination of the camera device 110 and the main processing device 120 in FIG. 1; it may also be a display device (such as a computer or an intelligent electronic device) with a built-in video processing function, such as the combination of the main processing device 120 and the display device 130 in FIG. 1; or it may be an independent hardware electronic device. This is not limited in this application.
  • FIG. 10 is a schematic structural diagram of a video processing apparatus 400 according to Embodiment 3 of the present application.
  • the video processing apparatus 400 includes: a processor 410 , a memory 420 and a communication interface 430 .
  • the processor 410, the memory 420 and the communication interface 430 are connected to each other through a bus system.
  • the processor 410 may be an independent component, or may be a collective term for multiple processing components. For example, it may be a CPU, an ASIC, or one or more integrated circuits configured to implement the above method, such as at least one digital signal processor (DSP), or at least one field programmable gate array (FPGA), etc.
  • the memory 420 is a computer-readable storage medium on which programs executable on the processor 410 are stored.
  • the processor 410 invokes the program in the memory 420 to execute a video processing method provided in the first embodiment, and transmits the result obtained by the processor 410 to other devices through the communication interface 430 in a wireless or wired manner.
  • the video processing apparatus 400 may further include a camera 440 .
  • the camera 440 acquires the sensor frame and sends it to the processor 410, and the processor 410 calls the program in the memory 420, executes the video processing method provided in the first embodiment above, processes the sensor frame, and transmits the result to other devices through the communication interface 430 in a wireless or wired manner.
  • the functions described in the specific embodiments of the present application may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • When implemented in software, it may be implemented by a processor executing software instructions.
  • the software instructions may consist of corresponding software modules.
  • the software modules may be stored in a computer-readable storage medium, which may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that includes one or more available mediums integrated.
  • the available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., Digital Video Disc (DVD)), or semiconductor media (e.g., Solid State Disk (SSD)), etc.
  • the computer-readable storage medium includes but is not limited to random access memory (Random Access Memory, RAM), flash memory, read only memory (Read Only Memory, ROM), Erasable Programmable Read Only Memory (Erasable Programmable ROM, EPROM) ), Electrically Erasable Programmable Read-Only Memory (Electrically EPROM, EEPROM), registers, hard disks, removable hard disks, compact disks (CD-ROMs), or any other form of storage medium known in the art.
  • An exemplary computer-readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the computer-readable storage medium.
  • the computer-readable storage medium can also be an integral part of the processor.
  • the processor and computer-readable storage medium may reside in an ASIC. Additionally, the ASIC may reside in access network equipment, target network equipment or core network equipment.
  • the processor and the computer-readable storage medium may also exist as discrete components in the access network device, the target network device, or the core network device. When implemented in software, it can also be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer program instructions may be stored in, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a video processing method and apparatus. The method comprises: acquiring a sensor frame captured by a video sensor, the sensor frame being the full image frame captured by the video sensor; detecting target frames in the sensor frame, i.e., human body image frames and/or image frames containing a human body in the sensor frame; determining a field of view frame according to the target frames, i.e., an image frame comprising all the target frames; determining all the target frames that can determine the boundaries of the field of view frame, and determining whether all the target frames that can determine the boundaries of the field of view frame are stationary; and, when it is determined that all the target frames that can determine the boundaries of the field of view frame are stationary, outputting the field of view frame. This solution enables automatic, real-time tracking of participants in a conference room.
PCT/CN2021/120411 2021-01-29 2021-09-24 Procédé et appareil de traitement vidéo WO2022160748A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110129029.XA CN112907617B (zh) 2021-01-29 2021-01-29 一种视频处理方法及其装置
CN202110129029.X 2021-01-29

Publications (1)

Publication Number Publication Date
WO2022160748A1 true WO2022160748A1 (fr) 2022-08-04

Family

ID=76121324

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/120411 WO2022160748A1 (fr) 2021-01-29 2021-09-24 Procédé et appareil de traitement vidéo

Country Status (2)

Country Link
CN (1) CN112907617B (fr)
WO (1) WO2022160748A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112907617B (zh) * 2021-01-29 2024-02-20 深圳壹秘科技有限公司 一种视频处理方法及其装置
CN115633255B (zh) * 2021-08-31 2024-03-22 荣耀终端有限公司 视频处理方法和电子设备
CN114222065B (zh) * 2021-12-20 2024-03-08 北京奕斯伟计算技术股份有限公司 图像处理方法、装置、电子设备、存储介质及程序产品

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140002742A1 (en) * 2012-06-29 2014-01-02 Thomson Licensing Method for reframing images of a video sequence, and apparatus for reframing images of a video sequence
CN104125390A (zh) * 2013-04-28 2014-10-29 浙江大华技术股份有限公司 一种用于球型摄像机的定位方法及装置
US20180063482A1 (en) * 2016-08-25 2018-03-01 Dolby Laboratories Licensing Corporation Automatic Video Framing of Conference Participants
CN111756996A (zh) * 2020-06-18 2020-10-09 影石创新科技股份有限公司 视频处理方法、视频处理装置、电子设备及计算机可读存储介质
WO2020220289A1 (fr) * 2019-04-30 2020-11-05 深圳市大疆创新科技有限公司 Procédé, appareil et système permettant de régler un champ de vision d'observation, support de stockage et appareil mobile
CN112073613A (zh) * 2020-09-10 2020-12-11 广州视源电子科技股份有限公司 会议人像的拍摄方法、交互平板、计算机设备及存储介质
CN112907617A (zh) * 2021-01-29 2021-06-04 深圳壹秘科技有限公司 一种视频处理方法及其装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2564668B (en) * 2017-07-18 2022-04-13 Vision Semantics Ltd Target re-identification
CN109766919B (zh) * 2018-12-18 2020-11-10 通号通信信息集团有限公司 级联目标检测系统中的渐变式分类损失计算方法及系统
WO2020133170A1 (fr) * 2018-12-28 2020-07-02 深圳市大疆创新科技有限公司 Procédé et appareil de traitement d'image
CN111401383B (zh) * 2020-03-06 2023-02-10 中国科学院重庆绿色智能技术研究院 基于图像检测的目标框预估方法、系统、设备及介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140002742A1 (en) * 2012-06-29 2014-01-02 Thomson Licensing Method for reframing images of a video sequence, and apparatus for reframing images of a video sequence
CN104125390A (zh) * 2013-04-28 2014-10-29 浙江大华技术股份有限公司 一种用于球型摄像机的定位方法及装置
US20180063482A1 (en) * 2016-08-25 2018-03-01 Dolby Laboratories Licensing Corporation Automatic Video Framing of Conference Participants
WO2020220289A1 (fr) * 2019-04-30 2020-11-05 深圳市大疆创新科技有限公司 Procédé, appareil et système permettant de régler un champ de vision d'observation, support de stockage et appareil mobile
CN111756996A (zh) * 2020-06-18 2020-10-09 影石创新科技股份有限公司 视频处理方法、视频处理装置、电子设备及计算机可读存储介质
CN112073613A (zh) * 2020-09-10 2020-12-11 广州视源电子科技股份有限公司 会议人像的拍摄方法、交互平板、计算机设备及存储介质
CN112907617A (zh) * 2021-01-29 2021-06-04 深圳壹秘科技有限公司 一种视频处理方法及其装置

Also Published As

Publication number Publication date
CN112907617B (zh) 2024-02-20
CN112907617A (zh) 2021-06-04

Similar Documents

Publication Publication Date Title
WO2022160748A1 (fr) Procédé et appareil de traitement vidéo
WO2021208371A1 (fr) Procédé et appareil de commande de zoom multicaméra, et système électronique et support de stockage
US11012614B2 (en) Image processing device, image processing method, and program
JP5592006B2 (ja) 三次元画像処理
US8988529B2 (en) Target tracking apparatus, image tracking apparatus, methods of controlling operation of same, and digital camera
TWI808987B (zh) 將相機與陀螺儀融合在一起的五維視頻穩定化裝置及方法
WO2019114617A1 (fr) Procédé, dispositif et système de capture rapide d'image fixe
US11825183B2 (en) Photographing method and photographing apparatus for adjusting a field of view of a terminal
WO2020007320A1 (fr) Procédé de fusion d'images à plusieurs angles de vision, appareil, dispositif informatique, et support de stockage
US20150103184A1 (en) Method and system for visual tracking of a subject for automatic metering using a mobile device
WO2017045326A1 (fr) Procédé de traitement de photographie pour un véhicule aérien sans équipage
WO2019237745A1 (fr) Procédé et appareil de traitement d'image faciale, dispositif électronique et support de stockage lisible par ordinateur
US20200099854A1 (en) Image capturing apparatus and image recording method
WO2023072030A1 (fr) Procédé et appareil de mise au point automatique pour lentille, et dispositif électronique et support de stockage lisible par ordinateur
WO2021147650A1 (fr) Procédé et appareil de photographie, support de stockage et dispositif électronique
WO2021136035A1 (fr) Procédé et appareil de photographie, support d'enregistrement et dispositif électronique
JP2013172446A (ja) 情報処理装置、端末装置、撮像装置、情報処理方法、及び撮像装置における情報提供方法
JP7424076B2 (ja) 画像処理装置、画像処理システム、撮像装置、画像処理方法およびプログラム
WO2022042669A1 (fr) Procédé de traitement d'image, appareil, dispositif et support de stockage
CN110570441B (zh) 一种超高清低延时视频控制方法及系统
WO2023165535A1 (fr) Procédé et appareil de traitement d'image et dispositif
WO2023225825A1 (fr) Procédé et appareil de génération de graphe de différence de position, dispositif électronique, puce et support
WO2021147648A1 (fr) Procédé et dispositif de suggestion, support de stockage et appareil électronique
CN114647983A (zh) 显示设备及基于人像的距离检测方法
US20230368343A1 (en) Global motion detection-based image parameter control

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21922340; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21922340; Country of ref document: EP; Kind code of ref document: A1)