WO2024040535A1 - Video processing method, apparatus, device and computer storage medium - Google Patents
Video processing method, apparatus, device and computer storage medium
- Publication number
- WO2024040535A1 (PCT/CN2022/114915)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/142—Detection of scene cut or scene change
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
Definitions
- the present application relates to the field of video processing technology, and in particular, to a video processing method, device, equipment and computer storage medium.
- the captured images/videos usually need to be transmitted in real time or with low latency, which requires a large amount of transmission resources.
- Traditionally all areas in a single image frame are encoded using a unified strategy.
- a unified encoding strategy cannot provide users with a clear view of the Region of Interest (ROI). Therefore, how to save transmission bandwidth while ensuring the user's subjective image quality experience is an issue that needs to be solved urgently.
- this application provides a video processing method, device, equipment and computer storage medium, which can save transmission bandwidth.
- embodiments of the present application provide a video processing method, which method includes:
- dividing the video into a plurality of regions according to information associated with a global motion state between frames of the video, wherein the plurality of regions includes an area of interest and a non-interest area;
- embodiments of the present application provide a video processing method, which method includes:
- a video processing device which includes:
- Memory used to store computer programs
- a processor configured to call the computer program.
- the video processing device performs the following operations:
- dividing the video into a plurality of regions according to information associated with a global motion state between frames of the video, wherein the plurality of regions includes an area of interest and a non-interest area;
- embodiments of the present application provide a video processing device, which is characterized in that it includes:
- Memory used to store computer programs
- a processor configured to call the computer program.
- the video processing device performs the following operations:
- embodiments of the present application provide a device, which includes:
- the photographing device being mounted on the equipment
- embodiments of the present application provide a computer storage medium.
- Computer program instructions are stored in the computer storage medium. When the computer program instructions are executed by a processor, they perform the method of the first aspect or the second aspect.
- the video is divided into multiple areas, including ROI areas and non-ROI areas, according to the information associated with the global motion state between frames of the video, and different image processing is performed on the ROI areas and the non-ROI areas so that their definition differs. This improves the user's subjective image quality experience while reducing the overall occupation of transmission resources and improving the timeliness of transmission.
- Figure 1 is a schematic scene diagram of an unmanned aerial vehicle provided by an embodiment of the present application.
- Figure 2 is a schematic flow chart of a video processing method provided by an embodiment of the present application.
- Figure 3 is a flow chart of dividing multiple areas in a video provided by an embodiment of the present application.
- Figure 4 is a schematic diagram of the position changes of the ROI area in the video provided by the embodiment of the present application.
- Figure 5 is a schematic flow chart of yet another video processing method provided by an embodiment of the present application.
- Figure 6 is a schematic block diagram of a video processing device provided by an embodiment of the present application.
- Figure 7 is a schematic block diagram of yet another video processing device provided by an embodiment of the present application.
- Unmanned aerial vehicle 10, remote control 20, intelligent terminal 30, shooting device 11, pan/tilt 12;
- Memory 301, processor 302;
- Memory 401, processor 402.
- Embodiments of the present application are applicable to video transmission/communication scenarios of any device with a video capture function, including but not limited to handheld gimbals, action cameras, movable devices that carry shooting devices directly or indirectly through carriers, and electronic devices with shooting functions such as mobile phones, tablets, smart wearable devices, and computers.
- the movable device may be a vehicle capable of self-propelled movement.
- the vehicle may have one or more propulsion units that would be capable of allowing the vehicle to move within the environment.
- The movable device is capable of traveling on land or underground, on or in water, in the air, in space, or any combination thereof.
- the movable device may be an aircraft (e.g., rotorcraft, fixed-wing aircraft), land-based vehicle, water-based vehicle, or air-based vehicle.
- The movable device can be manned or unmanned.
- the carrier may include one or more devices configured to accommodate the shooting device and/or allow the shooting device to be adjusted (e.g., rotated) relative to the movable device.
- the carrier may be a gimbal.
- the carrier can be configured to allow the shooting device to rotate around one or more rotation axes, including a yaw axis (Yaw), a pitch axis (Pitch), a roll axis (Roll), etc.
- the carrier can be configured to allow rotation of 360° or more around each axis, giving greater control over the viewing angle of the shooting device.
- the unmanned aerial vehicle can be an aerial photography aircraft or a flying vehicle.
- while the unmanned aerial vehicle 10 is flying in the air, the shooting device 11, which is mounted on the unmanned aerial vehicle 10 through the gimbal 12, collects video.
- the video is then transmitted to the remote control end 20 through the wireless image transmission system, and the remote control end 20 then transmits the image to an intelligent terminal 30 with a display function for display.
- the video captured by the shooting device 11 can also be directly transmitted to the smart terminal 30 for display through the wireless image transmission system.
- Smart terminals can be wearable devices such as smart glasses, goggles, or head-mounted displays, or user devices such as mobile phones, computers, and tablets.
- Smart terminals can include any type of wearable computer or device that incorporates augmented reality (Augmented Reality, AR) or virtual reality (Virtual Reality, VR) technology.
- Users can watch videos shot by unmanned aerial vehicles through smart terminals. For example, by wearing smart glasses to watch videos returned by movable devices, they can have an immersive aerial photography and racing experience. When an aerial photography aircraft or flying drone travels at high speed, the unmanned aerial vehicle needs to transmit a large amount of video data back to the remote controller or smart terminal, and places high demands on real-time or low-latency image or video transmission. How to save transmission resources is therefore a problem that needs to be solved urgently.
- Figure 2 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
- the method 100 includes steps S110 to S130:
- S120 Divide the video into multiple areas according to the information associated with the global motion state between frames of the video, where the multiple areas include ROI areas and non-ROI areas;
- S130 Perform different image processing on the ROI area and the non-ROI area so that the sharpness of the ROI area and the non-ROI area is different.
- the information associated with the global motion state between frames of the video refers to information that reflects the global motion state, and hence the global change, between the current frame and a historical frame of the video, where the historical frame can be any frame before the current frame. If the global motion change represented by this information satisfies the preset change condition, the video is divided into multiple regions. The greater the global motion change between frames, the more drastic the global change in the picture between frames. The more high-speed motion content the video contains, to which the human eye is insensitive, the better suited the video is to division.
- the information associated with the global motion state between frames of the video includes at least one of the following or a combination thereof: global motion information between frames of the video; information associated with the motion state of the target object, where the target object includes at least one of a shooting device and equipment carrying a shooting device.
- the information associated with the global motion state between frames of the video may also include other information that can reflect the global motion state between the current frame and historical frames of the video. This application places no restriction on this.
- the global motion information between video frames includes: global motion vector (Global Motion Vector, GMV) between video frames;
- the information associated with the movement state of the target object includes: the movement speed of the target object and the relative distance between the target object and the subject; or, the movement speed of the target object and the relative height between the target object and the subject.
- the target object is a shooting device
- the information associated with the motion state of the target object is the information associated with the motion state of the shooting device.
- the information associated with the movement state of the target object may be reflected by the movement state associated with the equipment equipped with the shooting device.
- information related to the motion state of the target can be represented by the motion state of the pan/tilt equipped with a shooting device.
- the information associated with the motion state of the target can be represented by the motion state of the movable device.
- When the shooting device is mounted on the movable device using a carrier, information related to the motion state of the target object can be characterized by the motion state of the movable device or of the carrier.
- the information associated with the motion state of the target may be: the flight speed of the aircraft and the relative height between the aircraft and the object.
- the flight speed and altitude information of the aircraft can be obtained from the aircraft's own navigation system, mapped from the user's stick stroke amount, or measured using a motion sensor carried by the carrier, the shooting device or the movable device itself; there is no restriction on this.
- dividing the video into multiple regions based on the information associated with the global motion state between frames of the video includes: if the global motion change represented by that information meets the preset change condition, dividing the video into multiple areas.
- the global motion change represented by the information associated with the global motion state between frames of the video satisfying the preset change condition includes at least one of the following situations:
- Scenario 1 If the absolute value of GMV between frames of the video is greater than the GMV threshold, the video is divided into multiple regions.
- Scenario 2 When the relative height between the target object and the subject remains constant, if the absolute value of the target's movement speed is greater than the movement speed threshold, the video is divided into multiple regions.
- Scenario 3 When the movement speed of the target object remains constant, if the relative height between the target object and the subject is less than the height threshold, the video is divided into multiple regions.
- the step of dividing the video into multiple areas will be triggered.
- it may also include the use of other information that can reflect the global motion state association between the current frame of the video and the historical frame to make a judgment on the division, which is not limited by this application.
- the movement speed of the target object and the relative distance between the target object and the subject serve as one example of information related to the motion state of the target object. In other scenarios, such as the aircraft scenario, the movement speed of the target object and the relative height between the target object and the subject are used instead. The same applies in the following paragraphs.
- the relative height between the target object and the photographed object can also be expressed in other forms, for example, the height between the target object and the ground or the height between the target object and the starting point (such as the take-off point).
- when transmission resources are sufficient, that is, when the transmission condition of the transmission equipment corresponding to the shooting device meets the preset transmission condition, the video need not be divided into ROI areas and non-ROI areas. The image quality of the entire image is then high, and there is no need to sacrifice the image quality of the non-ROI area to improve that of the ROI area.
- when transmission resources are insufficient, that is, when the transmission condition of the transmission equipment corresponding to the shooting device does not meet the preset transmission condition, the video is divided into regions based on information associated with the global motion state between frames of the video.
- the transmission condition can be represented by the transmission code rate.
- the transmission code rate is higher than the code rate threshold, which can be a specific situation where the transmission conditions meet the preset transmission conditions.
- the transmission code rate being lower than the code rate threshold is a specific case in which the transmission condition does not meet the preset transmission condition. It should also be understood that, in addition to the transmission code rate, the determination may use other information that reflects the transmission conditions of the transmission device, and this application is not limited in this respect.
- This embodiment of the present application provides a specific method for determining the division of ROI areas and non-ROI areas.
- the specific steps are as follows: first determine whether the transmission code rate of the transmission equipment corresponding to the shooting device is lower than the code rate threshold; if not, do not divide. If so, further determine whether the absolute value of the GMV between frames of the video is greater than the GMV threshold; if so, divide. If not, further examine the movement speed of the target object and the relative height between the target object and the subject: if the movement speed is greater than the first movement speed threshold and the relative height is less than the first height threshold, divide; otherwise, do not divide.
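- The decision steps above can be sketched as a short function. This is a minimal illustration only: the names `Thresholds` and `should_divide` and all numeric thresholds are hypothetical, and treating the "absolute value of GMV" as the Euclidean magnitude of a 2D vector is an assumption, since the source gives no values or formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical placeholder values; the source specifies none.
    bitrate: float = 8.0e6  # code rate threshold, bits per second
    gmv: float = 16.0       # GMV threshold, pixels per frame
    speed: float = 10.0     # first movement speed threshold, m/s
    height: float = 20.0    # first height threshold, m


def should_divide(bitrate, gmv, speed, height, th=Thresholds()):
    """Return True if the video should be split into ROI and non-ROI areas."""
    # Step 1: the code rate is not below the threshold, so do not divide.
    if bitrate >= th.bitrate:
        return False
    # Step 2: a large global motion vector between frames triggers division.
    gmv_x, gmv_y = gmv
    if (gmv_x ** 2 + gmv_y ** 2) ** 0.5 > th.gmv:  # assumed magnitude
        return True
    # Step 3: otherwise decide from the target object's speed and height.
    return abs(speed) > th.speed and height < th.height
```

- For example, under these assumed thresholds, `should_divide(4e6, (1, 1), 15, 10)` divides the video because the link is constrained and the target object is both fast and low.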
- the method further includes: determining the area of the ROI region and/or the area of the non-ROI region according to the information associated with the global motion state between frames of the video. On the basis of ensuring the user's subjective image quality, the reasonable allocation of transmission resources between ROI areas and non-ROI areas is adaptively adjusted based on the information associated with the global motion status between frames of the video.
- the area of the ROI region is negatively correlated with the global motion change represented by the information associated with the inter-frame global motion state of the video; and/or, the area of the non-ROI area is positively correlated with that global motion change.
- the greater the change in global motion, the more drastic the global change in the picture between frames and the more high-speed motion content in the image to which the human eye is insensitive. The ROI area that the human eye attends to can therefore be set smaller and the non-ROI area larger, which saves more transmission resources overall.
- the regional area adjustment method of the ROI area and/or non-ROI area includes at least one of the following situations:
- Scenario 1 The area of the ROI area is negatively correlated with the absolute value of the GMV between frames of the video (for example, the larger the absolute value, the smaller the area of the ROI area), and/or, the area of the non-ROI area is positively correlated with the absolute value of the GMV between frames of the video (for example, the larger the absolute value, the larger the area of the non-ROI area).
- Scenario 2 When the relative height between the target object and the subject remains constant, the area of the ROI area is negatively correlated with the absolute value of the target object's movement speed, and/or the area of the non-ROI area is positively correlated with the absolute value of the target object's movement speed.
- Scenario 3 When the movement speed of the target object remains constant, the area of the ROI area is positively correlated with the relative height between the target object and the subject, and/or, the area of the non-ROI area is negatively correlated with the relative height between the target object and the subject.
- the operation of dynamically adjusting the area of the ROI area and/or non-ROI area in the video will be triggered.
- other information that can reflect the global motion state between the current frame of the video and the historical frame may also be used to adjust the area of the ROI area and/or non-ROI area. This application places no limit on this.
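- One way to realize the negative correlation between ROI area and global motion is a saturating linear map. The sketch below is an assumption for illustration: the source only states the direction of the correlation, so the function name `roi_area_fraction`, the linear form, and every constant are hypothetical.

```python
def roi_area_fraction(gmv_abs, gmv_max=64.0, largest=0.8, smallest=0.3):
    """Map the absolute GMV to the fraction of the frame covered by the ROI.

    All parameters are hypothetical: gmv_max is the motion level at which the
    ROI stops shrinking; largest and smallest bound the ROI fraction.
    """
    # Normalise the motion magnitude to [0, 1]; larger values saturate at 1.
    t = min(max(gmv_abs / gmv_max, 0.0), 1.0)
    # Linear interpolation: no motion gives the largest ROI,
    # strong motion gives the smallest ROI.
    return largest + t * (smallest - largest)


def non_roi_area_fraction(gmv_abs, **kwargs):
    """The non-ROI fraction is the complement, so it grows with motion."""
    return 1.0 - roi_area_fraction(gmv_abs, **kwargs)
```

- The same shape of mapping could be driven by movement speed (Scenario 2) or, with the direction reversed, by relative height (Scenario 3).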
- the size of the entire image is characterized by the field of view (FOV) size of the shooting device, and the area of the ROI area and/or the non-ROI area is positively correlated with the size of the FOV.
- when transmission resources are sufficient, that is, when the transmission condition of the transmission equipment corresponding to the shooting device meets the preset transmission condition, dynamic adjustment of the area of the ROI region and the non-ROI region of the video need not be performed.
- when transmission resources are insufficient, that is, when the transmission condition of the transmission equipment corresponding to the shooting device does not meet the preset transmission condition, the areas of the ROI region and the non-ROI region are dynamically adjusted based on the information associated with the global motion state between frames of the video.
- the transmission condition can be represented by the transmission code rate.
- the transmission code rate is higher than the code rate threshold, which can be a specific situation where the transmission conditions meet the preset transmission conditions.
- the transmission code rate being lower than the code rate threshold is a specific case in which the transmission condition does not meet the preset transmission condition. It should also be understood that, in addition to the transmission code rate, the determination may use other information that reflects the transmission conditions of the transmission device, and this application is not limited in this respect.
- the embodiment of the present application provides a specific method for dynamically adjusting the area of the ROI area.
- the transmission code rate is used as an example for explanation. It should also be understood that, in addition to the code rate, other information that can reflect the transmission conditions of the transmission device may be used for the adjustment, and this application does not limit this. The method includes at least one of the following situations:
- Scenario 1 The transmission code rate is lower than the first code rate threshold
- Scenario 2 The transmission code rate is higher than the first code rate threshold and lower than the second code rate threshold
- the first code rate threshold is less than the second code rate threshold
- the first GMV threshold is greater than the second GMV threshold
- the first motion speed threshold is greater than the second motion speed threshold
- the first height threshold is greater than the second height threshold. It should also be understood here that there is no limit on how many levels the transmission code rate threshold, GMV threshold, movement speed threshold, and relative height threshold are divided into.
- the area of the ROI region is positively correlated with the transmission resources.
- the smaller the transmission code rate the smaller the area of the ROI region.
- a threshold can be set for the proportion of the ROI area in the entire picture, such as a minimum ratio and a maximum ratio relative to the FOV, to keep the dynamic adjustment within a reasonable range.
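- The positive correlation between ROI area and transmission resources, together with the minimum and maximum ratio limits, might be combined as follows. The proportional rule, the function names, and all constants are assumptions for illustration; the source states only the direction of the correlation and the existence of the limits.

```python
def clamp_roi_ratio(ratio, min_ratio=0.2, max_ratio=0.9):
    """Keep the ROI share of the picture inside the permitted range."""
    return max(min_ratio, min(ratio, max_ratio))


def roi_ratio_for_bitrate(bitrate, full_bitrate=8.0e6,
                          min_ratio=0.2, max_ratio=0.9):
    """Scale the ROI share with the available code rate, then clamp it.

    full_bitrate is a hypothetical code rate at which the ROI reaches its
    maximum share; below it the share shrinks proportionally.
    """
    raw = max_ratio * bitrate / full_bitrate
    return clamp_roi_ratio(raw, min_ratio, max_ratio)
```

- Clamping means a very poor link never shrinks the ROI below the minimum ratio, and a very good link never grows it beyond the maximum ratio.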
- the method 100 further includes: determining the position change of the ROI area according to the information associated with the attitude change of the target object.
- a change in the gimbal's viewing angle can be a change in attitude of at least any axis of Yaw, Pitch or Roll.
- the information related to the attitude change of the target object can be input by the user. For example, in the aircraft scenario, it can be mapped from the user's stick stroke amount: the greater the stroke amount, the larger the corresponding offset of the ROI area center point in the picture. It can also be measured using an attitude sensor carried by the gimbal, the shooting device or the movable device itself.
- an attitude change threshold is preset. If the information associated with the attitude change of the target object satisfies the preset condition, for example, if it is greater than the attitude change threshold, the position change of the ROI area is determined. When the information is less than the attitude change threshold, the change is considered to be caused by unstable control or system errors, such as unexpected jitter; in this case, the positions of the ROI area and/or non-ROI area do not follow the change.
- the information associated with the attitude change of the target object includes: at least one of the target object's attitude change speed, attitude change linear velocity, attitude change angular velocity, attitude change acceleration, and attitude change angular acceleration.
- attitude change angular velocity is used to represent the attitude change angle
- the attitude change angle is decomposed into a horizontal component and a vertical component.
- the components should be understood as vectors including direction and amplitude.
- the horizontal displacement of the ROI area is positively correlated with the horizontal component of the angular velocity of the target's attitude change, and/or the vertical displacement of the ROI area is positively correlated with the vertical component of the angular velocity of the target's attitude change.
- the position change of the ROI area can be characterized by the coordinate change of a certain feature identification point in the ROI area.
- the user's line of sight is usually focused on the center of the wearable device, so the coordinate changes of the center point of the ROI region can also be used to reflect the position changes of the ROI region.
- the center point of the ROI area changes from center point 1 (x1, y1) to center point 2 (x2, y2).
- x1 is added to the offset component in the horizontal direction to get x2
- y1 is added to the offset component in the vertical direction to get y2, where the horizontal offset component and the vertical offset component are positively correlated with the horizontal and vertical components of the information associated with the attitude change of the target object.
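- The center-point update from (x1, y1) to (x2, y2) can be sketched as below. The linear gain, the jitter threshold value, and the name `shift_roi_center` are hypothetical; the source states only that the offsets are positively correlated with the horizontal and vertical components of the attitude change and that changes below a preset threshold are ignored.

```python
def shift_roi_center(center, angular_velocity, gain=2.0, threshold=1.0):
    """Move the ROI center according to the target object's attitude change.

    center: (x1, y1) in pixels.
    angular_velocity: (horizontal, vertical) components, e.g. in deg/s.
    gain: hypothetical pixels of offset per unit of angular velocity.
    threshold: hypothetical attitude change threshold below which the change
               is treated as jitter and the ROI does not follow.
    """
    x1, y1 = center
    wx, wy = angular_velocity
    # Small attitude changes are attributed to jitter or system error.
    if (wx ** 2 + wy ** 2) ** 0.5 <= threshold:
        return (x1, y1)
    # Each offset component is proportional to the matching component of
    # the attitude change, which gives the positive correlation.
    return (x1 + gain * wx, y1 + gain * wy)
```

- A proportional gain is the simplest positively correlated mapping; any monotonically increasing function of each component would satisfy the description equally well.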
- different image processing is performed on the ROI area and the non-ROI area so that the sharpness of the ROI area and the non-ROI area is different.
- the clarity of the non-ROI area is lower than that of the ROI area.
- the sharpness of the ROI area and/or the non-ROI area is related to the transmission conditions corresponding to the transmission equipment corresponding to the shooting device.
- the definition of ROI areas and/or non-ROI areas decreases as the transmission bit rate decreases.
- the ways of dividing the ROI area and the non-ROI area include but are not limited to: rectangle, square, circle, ellipse, triangle or any other suitable shape, which is not limited in this application.
- multiple non-ROI areas may be included outside the ROI area, with the definition of the non-ROI areas changing gradually. For example, there may be a first non-ROI area close to the ROI area and a second non-ROI area far from the ROI area, where the first non-ROI area has higher definition than the second non-ROI area.
- the ROI area may also include multiple ROI areas whose definition changes gradually from outside to inside, with the definition of an outer ROI area lower than that of an inner ROI area.
- perform different image processing on the ROI area and non-ROI area including at least one of the following situations:
- the QP parameters used when encoding pixels in the ROI area are smaller than the QP parameters used when encoding pixels in non-ROI areas.
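- A per-block QP map is one common way to apply a smaller quantization parameter inside the ROI. The sketch below also models the graded non-ROI areas described earlier (a nearer ring with higher definition than the farther area). The function name `build_qp_map`, the rectangular ROI, the ring width, and the QP values are all hypothetical choices for illustration.

```python
def build_qp_map(cols, rows, roi, roi_qp=24, near_qp=32, far_qp=40, margin=2):
    """Return a rows x cols grid of QP values, one per coding block.

    roi: (left, top, right, bottom) in block units, right/bottom exclusive.
    margin: width in blocks of the first non-ROI ring around the ROI.
    A smaller QP means finer quantization and therefore a sharper region.
    """
    left, top, right, bottom = roi
    qp_map = []
    for r in range(rows):
        row = []
        for c in range(cols):
            in_roi = left <= c < right and top <= r < bottom
            in_ring = (left - margin <= c < right + margin
                       and top - margin <= r < bottom + margin)
            if in_roi:
                row.append(roi_qp)    # ROI: smallest QP, highest definition
            elif in_ring:
                row.append(near_qp)   # first non-ROI area, close to the ROI
            else:
                row.append(far_qp)    # second non-ROI area, far from the ROI
        qp_map.append(row)
    return qp_map
```

- How such a map is handed to an encoder depends on the encoder in use and is outside the scope of this sketch.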
- the execution subject of the above method 100 may be: a camera, a handheld pan/tilt, an action camera, a movable device, a mobile phone, a tablet computer, a smart wearable device, a computer, and the like.
- the shooting device, a pan/tilt equipped with the shooting device, or a movable device can also upload the video and the information associated with the global motion state between frames of the video to a cloud server or a third-party device for data processing.
- the embodiments of this application fully consider the insensitivity of the human eye to high-speed motion content, and determine whether to divide the video into multiple areas based on the information associated with the global motion state between frames of the video.
- the multiple areas include ROI areas and non-ROI areas.
- the area of the ROI area and the non-ROI area in the picture is also dynamically and adaptively adjusted based on the information associated with the global motion state between frames of the video, which saves transmission resources effectively while still satisfying the user's subjective image quality.
- changes in the positions of the ROI area and the non-ROI area are also determined based on information associated with the attitude change of the target object, so as to realize self-correction of the positions of the ROI area and the non-ROI area when the attitude of the target object changes.
- the gradual transition in order to avoid the user's subjective visual flickering experience caused by frequent switching between ROI and non-ROI area division and non-division scenes, when it is determined that the video needs to be switched from non-division to division , the gradual transition can be completed within a period of time T1 (multi-frame images). When it is judged that the division needs to be divided into no division, the gradual transition is completed within a period of time T2. Compared with instantaneous flashing, the gradual transition can better satisfy the visual adaptation of the human eye and avoid human eye fatigue.
- T1 and T2 can be the same or different, and can be set according to actual needs.
- the time domain filtering method can be used to switch from not dividing to dividing ROI areas and non-ROI areas in two adjacent frames, and to gradually switch from not dividing to dividing ROI areas and non-ROI areas in consecutive multi-frame images.
- Non-ROI areas can avoid the problem of poor user experience caused by screen jumps.
- An embodiment of the present application also provides yet another video processing method 200.
- Method 200 includes steps S210 to S230:
- S210: Acquire the video captured by the shooting device;
- S220: Determine the area of the ROI area and/or the non-ROI area in the video based on the information associated with the global motion state between frames of the video;
- S230: Perform different image processing on the ROI area and the non-ROI area so that their sharpness differs.
- Method 200 dynamically adjusts the area of the ROI area and/or the non-ROI area in the video. How the dynamic adjustment is performed, and the further operation of determining the position change of the ROI area based on information associated with the attitude change of the target object, follow principles similar to those of method 100 and are not repeated here for brevity. Method 200 places no special restriction on how, or on what basis, the ROI area and the non-ROI area in the video are divided.
- To avoid the flicker the user would perceive from frequent switching between divided and undivided states, when it is determined that the video must switch from undivided to divided, the switch can be completed as a gradual transition over a period T1 (multiple frames). When it is determined that the video must switch from divided to undivided, the gradual transition is completed over a period T2. Compared with an instantaneous switch, a gradual transition better matches the visual adaptation of the human eye and avoids eye fatigue.
- T1 and T2 can be the same or different and can be set as needed.
- For example, temporal filtering can be used so that a switch from undivided to divided ROI and non-ROI areas that would otherwise occur between two adjacent frames is instead carried out gradually over multiple consecutive frames.
- This avoids the poor user experience caused by abrupt jumps in the picture.
- The embodiments of this application take into account the human eye's insensitivity to high-speed motion content and adjust, dynamically and adaptively, the areas occupied in the picture by the ROI area and the non-ROI area based on the information associated with the global motion state between frames of the video, which effectively saves transmission resources while satisfying the user's subjective picture quality.
- In some embodiments, changes in the positions of the ROI area and the non-ROI area are also determined based on information associated with the attitude change of the target object, so that the positions correct themselves when the attitude of the target object changes.
- An embodiment of the present application also provides a video processing apparatus 300.
- Apparatus 300 includes:
- Memory 301, used to store a computer program;
- Processor 302, used to invoke the computer program.
- When the computer program is executed by the processor, the video processing apparatus performs the following operations:
- Different image processing is performed on the ROI area and the non-ROI area so that the sharpness of the ROI area and the non-ROI area differs.
- An embodiment of the present application also provides yet another video processing apparatus 400.
- Apparatus 400 includes:
- Memory 401, used to store a computer program;
- Processor 402, used to invoke the computer program.
- When the computer program is executed by the processor, the video processing apparatus performs the following operations:
- Different image processing is performed on the ROI area and the non-ROI area so that the sharpness of the ROI area and the non-ROI area differs.
- Processor 302 or processor 402 may be a micro-controller unit (MCU), a central processing unit (CPU), a digital signal processor (DSP), or the like.
- Memory 301 or 401 may be a Flash chip, read-only memory (ROM), a magnetic disk, an optical disc, a USB flash drive, a portable hard drive, or the like.
- The processor runs the computer program stored in the memory and, when executing it, implements the operations of the aforementioned video processing method.
- An embodiment of the present application also provides a device, which includes:
- a shooting device mounted on the device.
- Such devices include, but are not limited to, handheld gimbals, action cameras, movable devices, mobile phones, tablet computers, smart wearable devices, computers, and the like.
- Embodiments of the present application also provide a computer-readable storage medium.
- The computer-readable storage medium stores a computer program.
- When the computer program is executed by a processor, the processor implements the steps of the video processing method provided by the above embodiments.
- One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may run the program instructions stored in the storage device to implement the functions (implemented by the processor) of the embodiments of the present application and/or other desired functions, for example to perform the corresponding steps of the video processing method according to the embodiments of the present application. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
Abstract
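A video processing method, apparatus, device, and computer storage medium. The method includes: acquiring a video captured by a shooting device; dividing the video into multiple regions according to information associated with the global motion state between frames of the video, where the multiple regions include a region of interest and a non-region of interest; and performing different image processing on the region of interest and the non-region of interest so that their sharpness differs. The present application can effectively reduce the use of transmission resources while preserving the user's subjective picture-quality experience.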
Description
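The present application relates to the technical field of video processing, and in particular to a video processing method, apparatus, device, and computer storage medium.
In image transmission applications, captured images or video usually need to be transmitted in real time or with low latency, which consumes substantial transmission resources. Traditionally, all regions of a single image frame are encoded with one uniform strategy; when transmission resources are limited, a uniform encoding strategy cannot give the user a clear view of the region of interest (ROI). How to save transmission bandwidth while preserving the user's subjective picture-quality experience is therefore a problem that urgently needs to be solved.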
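Summary of the Invention
To solve the above problem, the present application provides a video processing method, apparatus, device, and computer storage medium that can save transmission bandwidth.
In a first aspect, an embodiment of the present application provides a video processing method, the method including:
acquiring a video captured by a shooting device;
dividing the video into multiple regions according to information associated with the global motion state between frames of the video, where the multiple regions include a region of interest and a non-region of interest;
performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and that of the non-region of interest differ.
In a second aspect, an embodiment of the present application provides a video processing method, the method including:
acquiring a video captured by a shooting device;
determining the area of the region of interest and/or the non-region of interest in the video according to information associated with the global motion state between frames of the video;
performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and that of the non-region of interest differ.
In a third aspect, an embodiment of the present application provides a video processing apparatus, the apparatus including:
a memory configured to store a computer program;
a processor configured to invoke the computer program, where the computer program, when executed by the processor, causes the video processing apparatus to perform the following operations:
acquiring a video captured by a shooting device;
dividing the video into multiple regions according to information associated with the global motion state between frames of the video, where the multiple regions include a region of interest and a non-region of interest;
performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and that of the non-region of interest differ.
In a fourth aspect, an embodiment of the present application provides a video processing apparatus, including:
a memory configured to store a computer program;
a processor configured to invoke the computer program, where the computer program, when executed by the processor, causes the video processing apparatus to perform the following operations:
acquiring a video captured by a shooting device;
determining the area of the region of interest and/or the non-region of interest in the video according to information associated with the global motion state between frames of the video;
performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and that of the non-region of interest differ.
In a fifth aspect, an embodiment of the present application provides a device, the device including:
a shooting device mounted on the device;
and the video processing apparatus according to either the third aspect or the fourth aspect.
In a sixth aspect, an embodiment of the present application provides a computer storage medium storing computer program instructions that, when executed by a processor, are used to perform the video processing method according to either the first aspect or the second aspect.
By implementing the video processing method of the embodiments of the present application, the video is divided, according to information associated with the global motion state between frames of the video, into multiple regions including an ROI area and a non-ROI area, and different image processing is applied to the ROI area and the non-ROI area so that their sharpness differs. This improves the user's subjective picture-quality experience while reducing overall use of transmission resources and improving the timeliness of transmission.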
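To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings used in the embodiments are briefly introduced below.
Fig. 1 is a schematic scene diagram of an unmanned aerial vehicle provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
Fig. 3 is a flowchart of dividing a video into multiple regions provided by an embodiment of the present application.
Fig. 4 is a schematic diagram of a change in the position of the ROI area in a video provided by an embodiment of the present application.
Fig. 5 is a schematic flowchart of yet another video processing method provided by an embodiment of the present application.
Fig. 6 is a schematic block diagram of a video processing apparatus provided by an embodiment of the present application.
Fig. 7 is a schematic block diagram of yet another video processing apparatus provided by an embodiment of the present application.
Description of main reference numerals:
unmanned aerial vehicle 10, remote controller 20, smart terminal 30, shooting device 11, gimbal 12;
memory 301, processor 302;
memory 401, processor 402.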
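The technical solutions in the embodiments of the present application are described below with reference to the drawings. Clearly, the described embodiments are some, not all, of the embodiments of the present application.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "consist of" and/or "include", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups. As used herein, the term "and/or" includes any and all combinations of the associated listed items. "First" and "second" are used only for ease of description; the numerical relationship between the two is not limited, and they may be the same or different. As used herein, A being positively correlated with B means that B increases as A increases, and A being negatively correlated with B means that B decreases as A increases.
What the drawings show is illustrative only; they need not include all content and operations/steps, nor must the operations/steps be performed in the order described. For example, some operations/steps may be decomposed, combined, or partially merged, so the actual order of execution may change according to the actual situation.
The embodiments of the present application are applicable to video transmission/communication scenarios of any device with a video capture function, including but not limited to handheld gimbals, action cameras, movable devices that carry a shooting device directly or indirectly through a carrier, and electronic devices with a shooting function such as mobile phones, tablet computers, smart wearable devices, and computers.
A movable device may be a self-propelled vehicle. The vehicle may have one or more propulsion units that allow it to move within an environment. A movable device can travel on land or underground, on or in water, in the air, in space, or any combination thereof. A movable device may be an aircraft (for example, a rotorcraft or a fixed-wing aircraft), a land-based vehicle, a water-based vehicle, or a space-based vehicle. A movable device may be manned or unmanned.
A carrier may include one or more devices configured to hold the shooting device and/or to allow the shooting device to be adjusted (for example, rotated) relative to the movable device. For example, the carrier may be a gimbal. The carrier may be configured to allow the shooting device to rotate about one or more axes of rotation, including the yaw, pitch, or roll axis. In some scenarios, the carrier may be configured to allow rotation of 360° or more about each axis to allow better control of the shooting device's viewing angle.
Referring to Fig. 1, an unmanned aerial vehicle (UAV), one kind of movable device, is used as an example. The UAV may be an aerial photography drone or an FPV racing drone. In a UAV aerial photography or first-person view (FPV) flight scenario, UAV 10 flies in the air with shooting device 11 mounted on it through gimbal 12. Shooting device 11 captures video, which a wireless image transmission system transmits to remote controller 20; remote controller 20 then forwards the picture to smart terminal 30, which has a display function, for display. In some scenarios, the video captured by shooting device 11 may also be transmitted directly to smart terminal 30 through the wireless image transmission system for display. The smart terminal may be a wearable device such as smart glasses, goggles, or a head-mounted display, or user equipment such as a mobile phone, computer, or tablet. The smart terminal may include any type of wearable computer or device that incorporates augmented reality (AR) or virtual reality (VR) technology. By watching the video captured by the UAV on the smart terminal, for example by wearing smart glasses to watch the video returned by the movable device, the user can enjoy an immersive aerial photography and racing experience. When an aerial photography drone or FPV racing drone flies at high speed, the UAV must return a large amount of video data to the remote controller or smart terminal, and the real-time or low-latency requirements for image or video transmission are strict, so saving transmission resources is a problem that urgently needs to be solved.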
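Referring to Fig. 2, Fig. 2 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
As an example, method 100 includes steps S110 to S130:
S110: Acquire the video captured by the shooting device;
S120: Divide the video into multiple regions according to information associated with the global motion state between frames of the video, where the multiple regions include an ROI area and a non-ROI area;
S130: Perform different image processing on the ROI area and the non-ROI area so that their sharpness differs.
Information associated with the global motion state between frames of the video refers to information that reflects the global motion state between the current frame and a historical frame of the video, that is, the global change between the pictures of the current frame and the historical frame, where the historical frame may be any frame before the current frame. If the global motion change represented by this information satisfies a preset change condition, the video is divided into multiple regions. The larger the inter-frame global motion change, the more drastic the global change between frame pictures, the more high-speed motion content the video contains to which the human eye is insensitive, and the stronger the case for dividing the video.
Optionally, the information associated with the global motion state between frames of the video includes at least one of the following, or a combination thereof: global motion information between frames of the video; and information associated with the motion state of a target object, where the target object includes at least one of the shooting device and a device carrying the shooting device.
It should be noted that, in addition to the above, the information associated with the global motion state between frames of the video may also include other information that reflects the global motion state between the current frame and a historical frame of the video, which is not limited in the present application.
Optionally, the global motion information between frames of the video includes the global motion vector (GMV) between frames of the video.
Optionally, the information associated with the motion state of the target object includes: the motion speed of the target object and the relative distance between the target object and the photographed subject; or the motion speed of the target object and the relative height between the target object and the subject.
In some scenarios, the target object is the shooting device, and the information associated with the motion state of the target object is the information associated with the motion state of the shooting device. In other scenarios, because such information must be easy to obtain, it may be reflected by the motion state of the device that carries the shooting device. For example, with a handheld gimbal, it may be represented by the gimbal that carries the shooting device. With a movable device, when the shooting device is mounted directly on the movable device, the information may be represented by the motion state of the movable device; when the shooting device is mounted on the movable device through a carrier, it may be represented by the motion state of the movable device or of the carrier. When the movable device is an aircraft, the information may be, for example, the flight speed of the aircraft and the relative height between the aircraft and the subject. The flight speed and height of the aircraft can be readily obtained from the aircraft's own navigation system, mapped from the user's stick input, or measured by motion sensors carried by the carrier, the shooting device, or the movable device itself; no limitation is imposed here.
In some embodiments, dividing the video into multiple regions according to the information associated with the global motion state between frames of the video includes: if the global motion change represented by that information satisfies a preset change condition, dividing the video into multiple regions. Optionally, the global motion change satisfying the preset change condition includes at least one of the following cases:
Case 1: If the absolute value of the inter-frame GMV of the video is greater than the GMV threshold, the video is divided into multiple regions.
Case 2: With the relative height between the target object and the subject held constant, if the absolute value of the motion speed of the target object is greater than the motion speed threshold, the video is divided into multiple regions.
Case 3: With the motion speed of the target object held constant, if the relative height between the target object and the subject is less than the height threshold, the video is divided into multiple regions.
It should be noted that satisfying any one or several of the above cases triggers the step of dividing the video into multiple regions. Besides these cases, the decision may also use other information that reflects the global motion state between the current frame and a historical frame of the video, which is not limited in the present application.
It should be noted that the motion speed of the target object and the relative distance between the target object and the subject are used here as an example. In other scenarios, such as an aircraft scenario, the same applies when the motion speed of the target object and the relative height between the target object and the subject serve as the information associated with the motion state of the target object; this holds throughout the rest of the text. The relative height between the target object and the subject may also take other forms, for example the height of the target object above the ground or above a starting point (such as the take-off point).
In some embodiments, when transmission resources are sufficient, that is, when the transmission condition of the transmission device corresponding to the shooting device satisfies a preset transmission condition, the video need not be divided into an ROI area and a non-ROI area: the picture quality of the whole image is already high, and there is no need to sacrifice non-ROI picture quality to improve ROI picture quality. When transmission resources are insufficient, that is, when the transmission condition of the transmission device corresponding to the shooting device does not satisfy the preset transmission condition, the video is divided into multiple regions according to the information associated with the global motion state between frames of the video. Optionally, the transmission condition may be expressed by the transmission bit rate: a bit rate above the bit rate threshold is one specific case of satisfying the preset transmission condition, and a bit rate below the threshold is one specific case of not satisfying it. It should also be understood that the decision may use other information that reflects the transmission conditions of the transmission device, which is not limited in the present application.
Optionally, referring to Fig. 3, an embodiment of the present application provides a specific way of deciding whether to divide the video into an ROI area and a non-ROI area, with the following steps. First, determine whether the transmission bit rate of the transmission device corresponding to the shooting device is lower than the bit rate threshold. If not, do not divide. If so, determine whether the absolute value of the inter-frame GMV is greater than the GMV threshold. If so, divide. If not, determine whether the motion speed of the target object is greater than the first motion speed threshold and the relative height between the target object and the subject is less than the first height threshold. If so, divide; if not, do not divide.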
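The decision flow of Fig. 3 can be sketched in a few lines of Python. This is a minimal illustration only: the function name `should_partition`, the `Thresholds` container, and the units are assumptions made for the example, not names or values taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # All field names and units are hypothetical, chosen for illustration.
    bitrate: float  # bit rate threshold, e.g. Mbit/s
    gmv: float      # threshold on |GMV|, e.g. pixels per frame
    speed: float    # first motion speed threshold, e.g. m/s
    height: float   # first height threshold, e.g. m


def should_partition(bitrate, gmv_abs, speed, rel_height, th):
    """Return True if the frame should be split into ROI and non-ROI areas."""
    # Step 1: the bit rate is not below the threshold, so resources
    # are treated as sufficient and the frame is left undivided.
    if bitrate >= th.bitrate:
        return False
    # Step 2: a large inter-frame global motion vector triggers division.
    if gmv_abs > th.gmv:
        return True
    # Step 3: otherwise divide only when the target moves fast AND low.
    return speed > th.speed and rel_height < th.height


th = Thresholds(bitrate=10.0, gmv=20.0, speed=15.0, height=30.0)
print(should_partition(5.0, 5.0, 20.0, 10.0, th))
```

As the description notes, the order of the checks is not fixed; the sketch simply follows the order given for Fig. 3.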
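It should be noted that, at constant motion speed, the smaller the relative height, the more pronounced the global change between frames; at constant relative height, the greater the motion speed, the more pronounced the global change between frames. In some scenarios, the motion speed threshold and the relative height threshold may each be set in several tiers: the same motion speed may lead to different division decisions because different height threshold conditions are met, and the same relative height may lead to different division decisions because different motion speed threshold conditions are met. The same reasoning applies to the relationship between the target object's motion speed and its relative distance to the subject and is not repeated here.
It should be noted that, in the above embodiments, the order in which the preset transmission condition and the information associated with the global motion state between frames of the video are checked is not limited; when several pieces of such information are combined, the order in which they are checked is not limited either. Besides the specific decision procedure above, other optional decision procedures may be chosen to suit the actual scenario and are not described here.
In some embodiments, after step S110 and before step S130, the method further includes: determining the area of the ROI area and/or the area of the non-ROI area according to the information associated with the global motion state between frames of the video. While preserving the user's subjective picture quality, the allocation of transmission resources between the ROI area and the non-ROI area is adjusted adaptively according to that information.
Optionally, the area of the ROI area is negatively correlated with the global motion change represented by the information associated with the global motion state between frames of the video; and/or the area of the non-ROI area is positively correlated with it. The larger the global motion change, the more drastic the global change between frames and the more high-speed motion content in the image to which the human eye is insensitive; the ROI area that the eye attends to can therefore be made smaller and the non-ROI area correspondingly larger, saving more transmission resources overall. Optionally, the area of the ROI area and/or the non-ROI area is adjusted in at least one of the following cases:
Case 1: The area of the ROI area is negatively correlated with the absolute value of the inter-frame GMV of the video (for example, the larger the absolute value, the smaller the ROI area), and/or the area of the non-ROI area is positively correlated with the absolute value of the inter-frame GMV (for example, the larger the absolute value, the larger the non-ROI area).
Case 2: With the relative height between the target object and the subject held constant, the area of the ROI area is negatively correlated with the absolute value of the target object's motion speed, and/or the area of the non-ROI area is positively correlated with it.
Case 3: With the motion speed of the target object held constant, the area of the ROI area is positively correlated with the relative height between the target object and the subject, and/or the area of the non-ROI area is negatively correlated with it.
It should be noted that satisfying any one or several of the above cases triggers dynamic adjustment of the area of the ROI area and/or the non-ROI area in the video. Besides these cases, the area of the ROI area and/or the non-ROI area may also be adjusted using other information that reflects the global motion state between the current frame and a historical frame of the video, which is not limited in the present application.
It should be noted that, when the information associated with the global motion state between frames of the video remains constant and the size of the whole image is represented by the field of view (FOV) of the shooting device, the area of the ROI area and/or the non-ROI area is positively correlated with the FOV size.
It should be noted that, in the above embodiments, when transmission resources are sufficient, that is, when the transmission condition of the transmission device corresponding to the shooting device satisfies the preset transmission condition, the areas of the ROI area and the non-ROI area need not be adjusted dynamically. When transmission resources are insufficient, that is, when the transmission condition does not satisfy the preset transmission condition, the areas of the ROI area and the non-ROI area are adjusted dynamically according to the information associated with the global motion state between frames of the video. Optionally, the transmission condition may be expressed by the transmission bit rate: a bit rate above the bit rate threshold is one specific case of satisfying the preset transmission condition, and a bit rate below the threshold is one specific case of not satisfying it. It should also be understood that the decision may use other information that reflects the transmission conditions of the transmission device, which is not limited in the present application.
An embodiment of the present application provides a specific way of dynamically adjusting the area of the ROI area according to the transmission conditions of the transmission device. The transmission bit rate is used as an example here; it should be understood that, besides the bit rate, the adjustment may also use other information that reflects the transmission conditions of the transmission device, which is not limited in the present application. The adjustment includes at least one of the following cases:
Case 1: the transmission bit rate is lower than the first bit rate threshold.
1) If the absolute value of the inter-frame GMV is greater than the first GMV threshold, or the motion speed of the target object is greater than the first motion speed threshold and the relative height is less than the first height threshold, the area of the ROI area is determined to be 1/4 of the FOV;
2) If the absolute value of the inter-frame GMV is greater than the second GMV threshold and less than the first GMV threshold, or the motion speed of the target object is less than the first motion speed threshold and greater than the second motion speed threshold and the relative height is less than the second height threshold, the area of the ROI area is determined to be 1/3 of the FOV;
3) Otherwise, the area of the ROI area is determined to be 1/2 of the FOV.
Case 2: the transmission bit rate is higher than the first bit rate threshold and lower than the second bit rate threshold.
1) If the absolute value of the inter-frame GMV is greater than the first GMV threshold, or the motion speed of the target object is greater than the first motion speed threshold and the relative height is less than the first height threshold, the area of the ROI area is determined to be 1/3 of the FOV;
2) If the absolute value of the inter-frame GMV is greater than the second GMV threshold and less than the first GMV threshold, or the motion speed of the target object is less than the first motion speed threshold and greater than the second motion speed threshold and the relative height is less than the second height threshold, the area of the ROI area is determined to be 1/2 of the FOV;
3) Otherwise, the area of the ROI area is determined to be 2/3 of the FOV.
Here, the first bit rate threshold is less than the second bit rate threshold, the first GMV threshold is greater than the second GMV threshold, the first motion speed threshold is greater than the second motion speed threshold, and the first height threshold is greater than the second height threshold. It should also be understood that the number of tiers into which the transmission bit rate, GMV, motion speed, and relative height thresholds are divided is not limited.
Optionally, when the information associated with the global motion state between frames of the video remains constant, the area of the ROI area is positively correlated with the transmission resources; for example, the lower the transmission bit rate, the smaller the area of the ROI area.
Optionally, to keep the ROI area's share of the picture from growing or shrinking without limit, thresholds may be set on that share, such as a minimum and a maximum proportion relative to the FOV, to keep the dynamic adjustment within a reasonable range.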
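The two-tier lookup above, together with the minimum and maximum share, can be written as a small table-driven function. The sketch below is illustrative: the parameter names (`b1`, `g1`, `v1`, `h1` and so on), the default clamp values, and the choice to return the full FOV at or above the second bit rate threshold are assumptions, since the description does not specify them.

```python
from fractions import Fraction


def roi_area_ratio(bitrate, gmv_abs, speed, height, *,
                   b1, b2, g1, g2, v1, v2, h1, h2,
                   min_ratio=Fraction(1, 5), max_ratio=Fraction(1, 1)):
    """Return the ROI area as a fraction of the FOV.

    Assumes b1 < b2, g1 > g2, v1 > v2, h1 > h2, as stated in the text.
    """
    # Motion tier: 0 = strongest global motion, 2 = weakest.
    if gmv_abs > g1 or (speed > v1 and height < h1):
        tier = 0
    elif g2 < gmv_abs < g1 or (v2 < speed < v1 and height < h2):
        tier = 1
    else:
        tier = 2

    if bitrate < b1:        # Case 1: lowest bit rate band
        ratio = (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2))[tier]
    elif bitrate < b2:      # Case 2 (bitrate == b1 is placed here by assumption)
        ratio = (Fraction(1, 3), Fraction(1, 2), Fraction(2, 3))[tier]
    else:                   # Assumption: ample bit rate, no reduction
        ratio = Fraction(1, 1)

    # Keep the ROI share within the configured minimum and maximum.
    return min(max(ratio, min_ratio), max_ratio)


limits = dict(b1=4, b2=8, g1=30, g2=15, v1=20, v2=10, h1=50, h2=25)
print(roi_area_ratio(3, 40, 0, 100, **limits))
```

Using `Fraction` keeps the 1/4, 1/3, 1/2, 2/3 tiers exact; a real encoder would convert the ratio to a block-aligned rectangle.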
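It should be noted that, besides the above, other ways of dynamically adjusting the area of the ROI area and/or the non-ROI area, for example ways that ignore transmission resource limits, may also be chosen to suit the actual scenario and are not described here.
In some embodiments, method 100 further includes: determining the position change of the ROI area according to information associated with the attitude change of the target object.
When the attitude of the target object changes, for example when an aircraft turns or the gimbal's viewing angle changes, the position of the ROI area in the picture is adjusted adaptively to improve the user's visual experience. For example, a change in the gimbal's viewing angle may be an attitude change about at least one of the yaw, pitch, or roll axes. The information associated with the attitude change of the target object may be input by the user; in an aircraft scenario, for example, it may be mapped from the user's stick input, where a larger stick input corresponds to a larger offset of the ROI center point in the picture. Alternatively, it may be measured by an attitude sensor carried by the gimbal, the shooting device, or the movable device itself.
Optionally, to prevent ROI position offsets caused by unsteady control or system error, an attitude change threshold is preset. If the information associated with the attitude change of the target object satisfies a preset condition, for example if it is greater than the attitude change threshold, the position change of the ROI area is determined. If it is less than the attitude change threshold, the change is attributed to unsteady control or system error, such as unwanted jitter, and in that case the positions of the ROI area and/or the non-ROI area do not follow the change.
The horizontal displacement of the ROI area is positively correlated with the horizontal component of the information associated with the attitude change of the target object, and/or the vertical displacement of the ROI area is positively correlated with the vertical component of that information.
The information associated with the attitude change of the target object includes at least one of the target object's attitude change speed, attitude change linear velocity, attitude change angular velocity, attitude change acceleration, and attitude change angular acceleration. For example, when the attitude change angular velocity is used, it is decomposed into a horizontal component and a vertical component, each understood as a vector with direction and magnitude. The horizontal displacement of the ROI area is positively correlated with the horizontal component of the target object's attitude change angular velocity, and/or the vertical displacement of the ROI area is positively correlated with the vertical component of the target object's attitude change angular velocity.
Optionally, referring to Fig. 4, the position change of the ROI area may be represented by the coordinate change of a characteristic marker point of the ROI area. In a UAV flight scenario, for example, the user's gaze usually rests on the center of the wearable device, so the coordinate change of the ROI center point can be used to reflect the position change of the ROI area. In Fig. 4, the ROI center point moves from center point 1 (x1, y1) to center point 2 (x2, y2): x2 is x1 plus the horizontal offset component, and y2 is y1 plus the vertical offset component, where the horizontal and vertical offset components are positively correlated with the horizontal and vertical components of the information associated with the attitude change of the target object.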
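The center-point update and the attitude change threshold can be combined into one short function. The sketch below assumes the attitude information is an angular velocity already split into horizontal and vertical components; the linear `gain`, the `deadband` value, and the clamping of the result to the frame are illustrative assumptions, since the description only requires a positive correlation.

```python
def shift_roi_center(center, omega_h, omega_v, *,
                     gain=4.0, deadband=2.0, frame_size=(1920, 1080)):
    """Move the ROI center (x1, y1) to (x2, y2) from an attitude change.

    omega_h, omega_v: horizontal and vertical components of the attitude
    change angular velocity (hypothetical unit: deg/s).
    gain: pixels of offset per deg/s (hypothetical linear mapping).
    deadband: attitude change threshold below which the ROI stays put.
    """
    x1, y1 = center
    # Small changes are treated as jitter or system error and ignored.
    if (omega_h ** 2 + omega_v ** 2) ** 0.5 <= deadband:
        return center
    # Offsets grow with the corresponding component (positive correlation).
    x2 = x1 + gain * omega_h
    y2 = y1 + gain * omega_v
    # Assumption: keep the center inside the frame.
    w, h = frame_size
    x2 = min(max(x2, 0.0), w - 1)
    y2 = min(max(y2, 0.0), h - 1)
    return (x2, y2)


print(shift_roi_center((960, 540), 10.0, -5.0))
```

Any monotonically increasing mapping could replace the linear gain; a stick-input value could be passed in place of the angular velocity components in the same way.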
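In some embodiments, different image processing is performed on the ROI area and the non-ROI area so that their sharpness differs.
Here, the sharpness of the non-ROI area is lower than that of the ROI area.
When the information associated with the global motion state between frames of the video remains constant, the sharpness of the ROI area and/or the non-ROI area is related to the transmission conditions of the transmission device corresponding to the shooting device. For example, the sharpness of the ROI area and/or the non-ROI area decreases as the transmission bit rate decreases.
The ROI area and the non-ROI area may be delimited by shapes including, but not limited to, a rectangle, square, circle, ellipse, triangle, or any other suitable shape, which is not limited in the present application.
Optionally, multiple non-ROI areas may lie outside the ROI area, with sharpness changing gradually across them: for example, a first non-ROI area close to the ROI area and a second non-ROI area far from the ROI area, where the first non-ROI area is sharper than the second non-ROI area.
Optionally, the ROI area may also include multiple ROI areas that change gradually from outside to inside, with the outer ROI area less sharp than the inner ROI area.
Such gradual changes in sharpness provide a smooth transition that better suits viewing comfort and avoids the discomfort caused by abrupt changes in sharpness.
Optionally, performing different image processing on the ROI area and the non-ROI area includes at least one of the following cases:
blurring the non-ROI area;
sharpening the ROI area and blurring the non-ROI area, with the two jointly optimized;
performing different image processing on the ROI area and the non-ROI area so that the quantization parameter (QP) of the ROI area is smaller than the QP of the non-ROI area.
For example, the QP used when encoding pixels in the ROI area is smaller than the QP used when encoding pixels in the non-ROI area.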
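One way to picture the QP-based case, combined with the graded non-ROI areas described earlier, is a per-block QP map. The sketch below is an assumption-laden illustration: the block grid, the rectangular ROI, the `ring` width, and the QP values 24/30/36 are all hypothetical and would in practice be passed to whatever ROI interface the encoder offers.

```python
def build_qp_map(blocks_w, blocks_h, roi, *,
                 qp_roi=24, qp_near=30, qp_far=36, ring=2):
    """Build a per-block QP map with a lower QP inside the ROI.

    roi: (bx0, by0, bx1, by1) in block units, end-exclusive.
    ring: width in blocks of the first (near) non-ROI area.
    """
    bx0, by0, bx1, by1 = roi
    qp_map = []
    for by in range(blocks_h):
        row = []
        for bx in range(blocks_w):
            # Chebyshev distance from the block to the ROI rectangle
            # (0 means the block lies inside the ROI).
            dx = max(bx0 - bx, 0, bx - (bx1 - 1))
            dy = max(by0 - by, 0, by - (by1 - 1))
            d = max(dx, dy)
            if d == 0:
                row.append(qp_roi)    # ROI: finest quantization
            elif d <= ring:
                row.append(qp_near)   # first non-ROI area: intermediate
            else:
                row.append(qp_far)    # second non-ROI area: coarsest
        qp_map.append(row)
    return qp_map


for row in build_qp_map(8, 6, (3, 2, 5, 4)):
    print(row)
```

A smaller QP means finer quantization and therefore a sharper, more expensive region, which is why the ROI receives the smallest value and the far non-ROI area the largest.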
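It should be noted that any other approach that makes the sharpness of the ROI area and the non-ROI area differ should be understood as falling within the different image processing referred to in the present application.
It should be noted that method 100 may be executed by a device such as a camera, handheld gimbal, action camera, movable device, mobile phone, tablet computer, smart wearable device, or computer. In other possible implementations, the shooting device, the gimbal carrying the shooting device, or the movable device may upload the video and the information associated with the global motion state between frames of the video to a cloud server or third-party device for data processing, receive the result, and then perform different image processing on the ROI area and the non-ROI area according to the returned result. Using the high-speed computing performance of a cloud server or third-party device can improve local processing efficiency.
The embodiments of the present application take into account the human eye's insensitivity to high-speed motion content and determine, based on the information associated with the global motion state between frames of the video, whether to divide the video into multiple regions including an ROI area and a non-ROI area. In some embodiments, the areas occupied in the picture by the ROI area and the non-ROI area are also adjusted dynamically and adaptively based on that information, which effectively saves transmission resources while satisfying the user's subjective picture quality. In some embodiments, changes in the positions of the ROI area and the non-ROI area are also determined based on information associated with the attitude change of the target object, so that the positions correct themselves when the attitude of the target object changes.
Optionally, in some embodiments, to avoid the flicker the user would perceive from frequent switching between divided and undivided ROI/non-ROI states, when it is determined that the video must switch from undivided to divided, the switch may be completed as a gradual transition over a period T1 (multiple frames). When it is determined that the video must switch from divided to undivided, the gradual transition is completed over a period T2. Compared with an instantaneous switch, a gradual transition better matches the visual adaptation of the human eye and avoids eye fatigue. T1 and T2 may be the same or different and can be set as needed. For example, temporal filtering may be used so that a switch from undivided to divided ROI and non-ROI areas that would otherwise occur between two adjacent frames is instead carried out gradually over multiple consecutive frames, avoiding the poor user experience caused by abrupt jumps in the picture.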
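The gradual transition over T1 and T2 can be sketched as a per-frame ramp on a single "partition strength" value. This is one simple form of temporal filtering chosen for illustration; the class name `PartitionFader`, the linear ramp, and the frame counts are assumptions, and the application does not prescribe a particular filter.

```python
class PartitionFader:
    """Ramp the ROI/non-ROI processing strength between 0 and 1.

    t1_frames: frames taken to fade from undivided to divided (T1).
    t2_frames: frames taken to fade from divided to undivided (T2).
    """

    def __init__(self, t1_frames=15, t2_frames=30):
        self.t1 = t1_frames
        self.t2 = t2_frames
        self.strength = 0.0  # 0.0 = undivided, 1.0 = fully divided

    def update(self, want_partition):
        """Advance one frame toward the requested state; return strength."""
        if want_partition:
            self.strength = min(1.0, self.strength + 1.0 / self.t1)
        else:
            self.strength = max(0.0, self.strength - 1.0 / self.t2)
        return self.strength


fader = PartitionFader(t1_frames=4, t2_frames=2)
print([fader.update(True) for _ in range(5)])
```

The returned strength could scale the blur radius or the QP difference applied to the non-ROI area, so that the difference in sharpness builds up or fades out over several frames instead of appearing in one.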
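Referring to Fig. 5, an embodiment of the present application also provides yet another video processing method 200.
As an example, method 200 includes steps S210 to S230:
S210: Acquire the video captured by the shooting device;
S220: Determine the area of the ROI area and/or the non-ROI area in the video according to information associated with the global motion state between frames of the video;
S230: Perform different image processing on the ROI area and the non-ROI area so that their sharpness differs.
Method 200 dynamically adjusts the area of the ROI area and/or the non-ROI area in the video. How the dynamic adjustment is performed, and the further operation in method 200 of determining the position change of the ROI area according to information associated with the attitude change of the target object, follow principles similar to those of method 100 and are not repeated here for brevity. Method 200 places no special restriction on how, or on what basis, the ROI area and the non-ROI area in the video are divided.
Optionally, in some embodiments, to avoid the flicker the user would perceive from frequent switching between divided and undivided ROI/non-ROI states, when it is determined that the video must switch from undivided to divided, the switch may be completed as a gradual transition over a period T1 (multiple frames). When it is determined that the video must switch from divided to undivided, the gradual transition is completed over a period T2. Compared with an instantaneous switch, a gradual transition better matches the visual adaptation of the human eye and avoids eye fatigue. T1 and T2 may be the same or different and can be set as needed. For example, temporal filtering may be used so that a switch from undivided to divided ROI and non-ROI areas that would otherwise occur between two adjacent frames is instead carried out gradually over multiple consecutive frames, avoiding the poor user experience caused by abrupt jumps in the picture.
The embodiments of the present application take into account the human eye's insensitivity to high-speed motion content and adjust, dynamically and adaptively, the areas occupied in the picture by the ROI area and the non-ROI area based on the information associated with the global motion state between frames of the video, which effectively saves transmission resources while satisfying the user's subjective picture quality. In some embodiments, changes in the positions of the ROI area and the non-ROI area are also determined based on information associated with the attitude change of the target object, so that the positions correct themselves when the attitude of the target object changes.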
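Referring to Fig. 6, an embodiment of the present application also provides a video processing apparatus 300, which includes:
memory 301, configured to store a computer program;
processor 302, configured to invoke the computer program, where the computer program, when executed by the processor, causes the video processing apparatus to perform the following operations:
acquiring a video captured by a shooting device;
dividing the video into multiple regions according to information associated with the global motion state between frames of the video, where the multiple regions include an ROI area and a non-ROI area;
performing different image processing on the ROI area and the non-ROI area so that their sharpness differs.
It should be noted that, for the specific implementation of the above operations performed by processor 302, reference may be made to the description of method 100 above, which is not repeated here.
Referring to Fig. 7, an embodiment of the present application also provides yet another video processing apparatus 400, which includes:
memory 401, configured to store a computer program;
processor 402, configured to invoke the computer program, where the computer program, when executed by the processor, causes the video processing apparatus to perform the following operations:
acquiring a video captured by a shooting device;
determining the area of the ROI area and/or the non-ROI area in the video according to information associated with the global motion state between frames of the video;
performing different image processing on the ROI area and the non-ROI area so that their sharpness differs.
It should be noted that, for the specific implementation of the above operations performed by processor 402, reference may be made to the description of method 200 above, which is not repeated here.
Specifically, processor 302 or processor 402 may be a micro-controller unit (MCU), a central processing unit (CPU), a digital signal processor (DSP), or the like.
Specifically, memory 301 or 401 may be a Flash chip, read-only memory (ROM), a magnetic disk, an optical disc, a USB flash drive, a portable hard drive, or the like.
The processor is configured to run the computer program stored in the memory and, when executing the computer program, to implement the operations of the aforementioned video processing method.
The specific principles and implementations of the video processing apparatus provided by the embodiments of the present application are similar to those of the video processing method of the corresponding embodiments above and are not repeated here.
An embodiment of the present application also provides a device, the device including:
a shooting device mounted on the device;
and the video processing apparatus of the above embodiments.
Optionally, the device includes but is not limited to a handheld gimbal, action camera, movable device, mobile phone, tablet computer, smart wearable device, computer, or the like.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to implement the steps of the video processing method provided by the above embodiments.
One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may run the program instructions stored in the storage device to implement the functions (implemented by the processor) of the embodiments of the present application described herein and/or other desired functions, for example to perform the corresponding steps of the video processing method according to the embodiments of the present application. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
It should be understood that the terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the present application.
The above are only specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be subject to the scope of protection of the claims.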
Claims (80)
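- A video processing method, comprising: acquiring a video captured by a shooting device; dividing the video into multiple regions according to information associated with the global motion state between frames of the video, wherein the multiple regions include a region of interest and a non-region of interest; and performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and the sharpness of the non-region of interest are different.
- The method according to claim 1, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: when a transmission condition of a transmission device corresponding to the shooting device does not satisfy a preset transmission condition, dividing the video into multiple regions according to the information associated with the global motion state between frames of the video.
- The method according to claim 1 or 2, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: if the global motion change represented by the information associated with the global motion state between frames of the video satisfies a preset change condition, dividing the video into multiple regions.
- The method according to claim 1, wherein the information associated with the global motion state between frames of the video includes at least one of the following: global motion information between frames of the video; and information associated with the motion state of a target object, wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The method according to claim 4, wherein the global motion information between frames of the video includes a global motion vector between frames of the video; and dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: if the absolute value of the global motion vector between frames of the video is greater than a global motion vector threshold, dividing the video into multiple regions.
- The method according to claim 4, wherein the information associated with the motion state of the target object includes: the motion speed of the target object; and the relative distance or relative height between the target object and a photographed subject.
- The method according to claim 6, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: when the relative distance or relative height between the target object and the subject remains constant, if the absolute value of the motion speed of the target object is greater than a motion speed threshold, dividing the video into multiple regions.
- The method according to claim 6, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: when the motion speed of the target object remains constant, if the relative distance between the target object and the subject is less than a distance threshold or the relative height is less than a height threshold, dividing the video into multiple regions.
- The method according to any one of claims 1 to 6, further comprising: determining the area of the region of interest and/or the area of the non-region of interest according to the information associated with the global motion state between frames of the video.
- The method according to claim 9, wherein the area of the region of interest is negatively correlated with the global motion change represented by the information associated with the global motion state between frames of the video; and/or the area of the non-region of interest is positively correlated with the global motion change represented by the information associated with the global motion state between frames of the video.
- The method according to claim 9, wherein, when the global motion information between frames of the video includes the global motion vector between frames of the video: the area of the region of interest is negatively correlated with the absolute value of the global motion vector between frames of the video, and/or the area of the non-region of interest is positively correlated with the absolute value of the global motion vector between frames of the video.
- The method according to claim 9, wherein, when the information associated with the motion state of the target object includes the motion speed of the target object and the relative distance or relative height between the target object and the subject, and the relative distance/relative height between the target object and the subject remains constant: the area of the region of interest is negatively correlated with the absolute value of the motion speed of the target object, and/or the area of the non-region of interest is positively correlated with the absolute value of the motion speed of the target object.
- The method according to claim 9, wherein, when the information associated with the motion state of the target object includes the motion speed of the target object and the relative distance or relative height between the target object and the subject, and the motion speed of the target object remains constant: the area of the region of interest is positively correlated with the relative distance or relative height between the target object and the subject, and/or the area of the non-region of interest is negatively correlated with the relative distance or relative height between the target object and the subject.
- The method according to claim 9, wherein, when the information associated with the global motion state between frames of the video remains constant, the area of the region of interest and/or the non-region of interest is positively correlated with the field of view of the shooting device.
- The method according to claim 1 or 9, further comprising: determining a position change of the region of interest according to information associated with an attitude change of a target object; wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The method according to claim 15, wherein determining the position change of the region of interest according to the information associated with the attitude change of the target object comprises: if the information associated with the attitude change of the target object satisfies a preset condition, determining the position change of the region of interest.
- The method according to claim 15, wherein the position change of the region of interest includes a horizontal displacement and/or a vertical displacement of the region of interest.
- The method according to claim 17, wherein the horizontal displacement of the region of interest is positively correlated with the horizontal component corresponding to the information associated with the attitude change of the target object, and/or the vertical displacement of the region of interest is positively correlated with the vertical component corresponding to the information associated with the attitude change of the target object.
- The method according to claim 18, wherein the information associated with the attitude change of the target object includes at least one of: the attitude change speed, attitude change linear velocity, attitude change angular velocity, attitude change acceleration, and attitude change angular acceleration of the target object.
- The method according to claim 1, wherein the sharpness of the non-region of interest is lower than the sharpness of the region of interest.
- The method according to claim 20, wherein, when the information associated with the global motion state between frames of the video remains constant, the sharpness of the region of interest and/or the non-region of interest is related to the transmission condition of the transmission device corresponding to the shooting device.
- The method according to claim 21, wherein the transmission condition includes a transmission bit rate, and the sharpness of the region of interest and/or the non-region of interest decreases as the transmission bit rate decreases.
- The method according to claim 20, wherein the region of interest includes a first non-region of interest close to the region of interest and a second non-region of interest far from the region of interest, wherein the sharpness of the first non-region of interest is higher than that of the second non-region of interest.
- The method according to claim 20, wherein performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and the sharpness of the non-region of interest are different comprises: blurring the non-region of interest; or performing different image processing on the region of interest and the non-region of interest so that the quantization parameter of the region of interest is smaller than the quantization parameter of the non-region of interest.
- The method according to claim 24, wherein blurring the non-region of interest comprises: sharpening the region of interest and blurring the non-region of interest.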
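- A video processing method, comprising: acquiring a video captured by a shooting device; determining the area of a region of interest and/or a non-region of interest in the video according to information associated with the global motion state between frames of the video; and performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and the sharpness of the non-region of interest are different.
- The method according to claim 26, wherein determining the area of the region of interest and/or the non-region of interest in the video according to the information associated with the global motion state between frames of the video comprises: when a transmission condition of a transmission device corresponding to the shooting device does not satisfy a preset transmission condition, determining the area of the region of interest and/or the area of the non-region of interest in the video according to the information associated with the global motion state between frames of the video.
- The method according to claim 26 or 27, wherein the information associated with the global motion state between frames of the video includes at least one of the following: global motion information between frames of the video; and information associated with the motion state of a target object, wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The method according to claim 28, wherein the global motion information between frames of the video includes a global motion vector between frames of the video; wherein the area of the region of interest is negatively correlated with the absolute value of the global motion vector between frames of the video, and/or the area of the non-region of interest is positively correlated with the absolute value of the global motion vector between frames of the video.
- The method according to claim 28, wherein the information associated with the motion state of the target object includes: the motion speed of the target object; and the relative distance or relative height between the target object and a photographed subject.
- The method according to claim 30, wherein, when the relative distance or relative height between the target object and the subject remains constant, the area of the region of interest is negatively correlated with the absolute value of the motion speed of the target object, and/or the area of the non-region of interest is positively correlated with the absolute value of the motion speed of the target object.
- The method according to claim 30, wherein, when the motion speed of the target object remains constant, the area of the region of interest is positively correlated with the relative distance or relative height between the target object and the subject, and/or the area of the non-region of interest is negatively correlated with the relative distance or relative height between the target object and the subject.
- The method according to claim 26 or 27, wherein, when the information associated with the global motion state between frames of the video remains constant, the area of the region of interest and/or the non-region of interest is positively correlated with the field of view of the shooting device.
- The method according to claim 26, further comprising: determining a position change of the region of interest according to information associated with an attitude change of a target object; wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The method according to claim 34, wherein determining the position change of the region of interest according to the information associated with the attitude change of the target object comprises: if the information associated with the attitude change of the target object satisfies a threshold condition, determining the position change of the region of interest.
- The method according to claim 34 or 35, wherein the position change of the region of interest includes a horizontal displacement and/or a vertical displacement of the region of interest.
- The method according to claim 36, wherein the horizontal displacement of the region of interest is positively correlated with the horizontal component corresponding to the information associated with the attitude change of the target object, and/or the vertical displacement of the region of interest is positively correlated with the vertical component corresponding to the information associated with the attitude change of the target object.
- The method according to claim 37, wherein the information associated with the attitude change of the target object includes at least one of: the attitude change speed, attitude change linear velocity, attitude change angular velocity, attitude change acceleration, and attitude change angular acceleration of the target object.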
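- A video processing apparatus, comprising: a memory configured to store a computer program; and a processor configured to invoke the computer program, wherein the computer program, when executed by the processor, causes the video processing apparatus to perform the following operations: acquiring a video captured by a shooting device; dividing the video into multiple regions according to information associated with the global motion state between frames of the video, wherein the multiple regions include a region of interest and a non-region of interest; and performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and the sharpness of the non-region of interest are different.
- The apparatus according to claim 39, wherein dividing the video into a region of interest and a non-region of interest according to the information associated with the global motion state between frames of the video comprises: when a transmission condition of a transmission device corresponding to the shooting device does not satisfy a preset transmission condition, dividing the video into multiple regions according to the information associated with the global motion state between frames of the video.
- The apparatus according to claim 39 or 40, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: if the global motion change represented by the information associated with the global motion state between frames of the video satisfies a preset change condition, dividing the video into multiple regions.
- The apparatus according to claim 39, wherein the information associated with the global motion state between frames of the video includes at least one of the following: global motion information between frames of the video; and information associated with the motion state of a target object, wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The apparatus according to claim 42, wherein the global motion information between frames of the video includes a global motion vector between frames of the video; and dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: if the absolute value of the global motion vector between frames of the video is greater than a global motion vector threshold, dividing the video into multiple regions.
- The apparatus according to claim 42, wherein the information associated with the motion state of the target object includes: the motion speed of the target object; and the relative distance or relative height between the target object and a photographed subject.
- The apparatus according to claim 44, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: when the relative distance or relative height between the target object and the subject remains constant, if the absolute value of the motion speed of the target object is greater than a motion speed threshold, dividing the video into multiple regions.
- The apparatus according to claim 44, wherein dividing the video into multiple regions according to the information associated with the global motion state between frames of the video comprises: when the motion speed of the target object remains constant, if the relative distance between the target object and the subject is less than a distance threshold or the relative height is less than a height threshold, dividing the video into multiple regions.
- The apparatus according to any one of claims 39 to 46, wherein the processor is further configured to: determine the area of the region of interest and/or the area of the non-region of interest according to the information associated with the global motion state between frames of the video.
- The apparatus according to claim 47, wherein the area of the region of interest is negatively correlated with the global motion change represented by the information associated with the global motion state between frames of the video, and/or the area of the non-region of interest is positively correlated with the global motion change represented by the information associated with the global motion state between frames of the video.
- The apparatus according to claim 47, wherein, when the global motion information between frames of the video includes the global motion vector between frames of the video: the area of the region of interest is negatively correlated with the absolute value of the global motion vector between frames of the video, and/or the area of the non-region of interest is positively correlated with the absolute value of the global motion vector between frames of the video.
- The apparatus according to claim 47, wherein, when the information associated with the motion state of the target object includes the motion speed of the target object and the relative distance or relative height between the target object and the subject, and the relative distance/relative height between the target object and the subject remains constant: the area of the region of interest is negatively correlated with the absolute value of the motion speed of the target object, and/or the area of the non-region of interest is positively correlated with the absolute value of the motion speed of the target object.
- The apparatus according to claim 47, wherein, when the information associated with the motion state of the target object includes the motion speed of the target object and the relative distance or relative height between the target object and the subject, and the motion speed of the target object remains constant: the area of the region of interest is positively correlated with the relative distance or relative height between the target object and the subject, and/or the area of the non-region of interest is negatively correlated with the relative distance or relative height between the target object and the subject.
- The apparatus according to claim 47, wherein, when the information associated with the global motion state between frames of the video remains constant, the area of the region of interest and/or the non-region of interest is positively correlated with the field of view of the shooting device.
- The apparatus according to claim 39 or 47, wherein the processor is further configured to: determine a position change of the region of interest according to information associated with an attitude change of a target object; wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The apparatus according to claim 53, wherein determining the position change of the region of interest according to the information associated with the attitude change of the target object comprises: if the information associated with the attitude change of the target object satisfies a preset condition, determining the position change of the region of interest.
- The apparatus according to claim 54, wherein the position change of the region of interest includes a horizontal displacement and/or a vertical displacement of the region of interest.
- The apparatus according to claim 55, wherein the horizontal displacement of the region of interest is positively correlated with the horizontal component corresponding to the information associated with the attitude change of the target object, and/or the vertical displacement corresponding to the region of interest is positively correlated with the vertical component of the information associated with the attitude change of the target object.
- The apparatus according to claim 56, wherein the information associated with the attitude change of the target object includes at least one of: the attitude change speed, attitude change linear velocity, attitude change angular velocity, attitude change acceleration, and attitude change angular acceleration of the target object.
- The apparatus according to claim 39, wherein the sharpness of the non-region of interest is lower than the sharpness of the region of interest.
- The apparatus according to claim 58, wherein, when the information associated with the global motion state between frames of the video remains constant, the sharpness of the region of interest and/or the non-region of interest is related to the transmission condition of the transmission device corresponding to the shooting device.
- The method according to claim 59, wherein the transmission condition includes a transmission bit rate, and the sharpness of the region of interest and/or the non-region of interest decreases as the transmission bit rate decreases.
- The apparatus according to claim 58, wherein, when the information associated with the global motion state between frames of the video remains constant, the sharpness of the region of interest and/or the non-region of interest decreases as the transmission bit rate of the transmission device corresponding to the shooting device decreases.
- The apparatus according to claim 58, wherein the region of interest includes a first non-region of interest close to the region of interest and a second non-region of interest far from the region of interest, wherein the sharpness of the first non-region of interest is higher than that of the second non-region of interest.
- The apparatus according to claim 58, wherein performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and the sharpness of the non-region of interest are different comprises: blurring the non-region of interest; or performing different image processing on the region of interest and the non-region of interest so that the quantization parameter of the region of interest is smaller than the quantization parameter of the non-region of interest.
- The apparatus according to claim 63, wherein blurring the non-region of interest comprises: sharpening the region of interest and blurring the non-region of interest.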
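- A video processing apparatus, comprising: a memory configured to store a computer program; and a processor configured to invoke the computer program, wherein the computer program, when executed by the processor, causes the video processing apparatus to perform the following operations: acquiring a video captured by a shooting device; determining the area of a region of interest and/or a non-region of interest in the video according to information associated with the global motion state between frames of the video; and performing different image processing on the region of interest and the non-region of interest so that the sharpness of the region of interest and the sharpness of the non-region of interest are different.
- The apparatus according to claim 65, wherein determining the area of the region of interest and/or the non-region of interest in the video according to the information associated with the global motion state between frames of the video comprises: when a transmission condition of a transmission device corresponding to the shooting device does not satisfy a preset transmission condition, determining the area of the region of interest and/or the area of the non-region of interest in the video according to the information associated with the global motion state between frames of the video.
- The apparatus according to claim 65 or 66, wherein the information associated with the global motion state between frames of the video includes at least one of the following: global motion information between frames of the video; and information associated with the motion state of a target object, wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The apparatus according to claim 67, wherein the global motion information between frames of the video includes a global motion vector between frames of the video; the area of the region of interest is negatively correlated with the absolute value of the global motion vector between frames of the video, and/or the area of the non-region of interest is positively correlated with the absolute value of the global motion vector between frames of the video.
- The apparatus according to claim 67, wherein the information associated with the motion state of the target object includes: the motion speed of the target object; and the relative distance or relative height between the target object and a photographed subject.
- The apparatus according to claim 69, wherein, when the relative distance or relative height between the target object and the subject remains constant, the area of the region of interest is negatively correlated with the absolute value of the motion speed of the target object, and/or the area of the non-region of interest is positively correlated with the absolute value of the motion speed of the target object.
- The apparatus according to claim 69, wherein, when the motion speed of the target object remains constant, the area of the region of interest is positively correlated with the relative distance or relative height between the target object and the subject, and/or the area of the non-region of interest is negatively correlated with the relative distance or relative height between the target object and the subject.
- The apparatus according to claim 65 or 66, wherein, when the information associated with the global motion state between frames of the video remains constant, the area of the region of interest and/or the non-region of interest is positively correlated with the field of view of the shooting device.
- The apparatus according to claim 65, wherein the apparatus is further configured to: determine a position change of the region of interest according to information associated with an attitude change of a target object; wherein the target object includes at least one of the shooting device and a device carrying the shooting device.
- The apparatus according to claim 73, wherein determining the position change of the region of interest according to the information associated with the attitude change of the target object comprises: if the information associated with the attitude change of the target object satisfies a threshold condition, determining the position change of the region of interest.
- The apparatus according to claim 74, wherein the position change of the region of interest includes a horizontal displacement and/or a vertical displacement of the region of interest.
- The apparatus according to claim 75, wherein the horizontal displacement of the region of interest is positively correlated with the horizontal component corresponding to the information associated with the attitude change of the target object, and/or the vertical displacement of the region of interest is positively correlated with the vertical component corresponding to the information associated with the attitude change of the target object.
- The apparatus according to claim 76, wherein the information associated with the attitude change of the target object includes at least any one of: the attitude change speed, attitude change linear velocity, attitude change angular velocity, attitude change acceleration, and attitude change angular acceleration of the target object.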
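- A device, comprising: a shooting device mounted on the device; and the video processing apparatus according to any one of claims 39 to 77.
- The device according to claim 78, wherein the device includes any one of: a mobile phone, a tablet computer, a smart wearable device, a handheld gimbal, and a movable device.
- A computer storage medium storing computer program instructions that, when executed by a processor, are used to perform the video processing method according to any one of claims 1 to 38.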
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/114915 WO2024040535A1 (zh) | 2022-08-25 | 2022-08-25 | Video processing method, apparatus, device, and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024040535A1 (zh) | 2024-02-29 |
Family
ID=90012133
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024040535A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
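CN101277447A (zh) * | 2008-04-15 | 2008-10-01 | Beihang University | Fast inter-frame prediction method for aerial traffic video |
US20120224629A1 (en) * | 2009-12-14 | 2012-09-06 | Sitaram Bhagavathy | Object-aware video encoding strategies |
CN111918066A (zh) * | 2020-09-08 | 2020-11-10 | Beijing ByteDance Network Technology Co., Ltd. | Video encoding method, apparatus, device, and storage medium |
CN114531615A (zh) * | 2020-11-03 | 2022-05-24 | Tencent Technology (Shenzhen) Co., Ltd. | Video data processing method and apparatus, computer device, and storage medium |
CN114554212A (zh) * | 2021-12-31 | 2022-05-27 | SZ DJI Technology Co., Ltd. | Video processing apparatus and method, and computer storage medium |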
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: The EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22956081; Country of ref document: EP; Kind code of ref document: A1 |