WO2021078116A1 - Video processing method and electronic device - Google Patents

Video processing method and electronic device

Info

Publication number
WO2021078116A1
WO2021078116A1 PCT/CN2020/122176 CN2020122176W WO2021078116A1 WO 2021078116 A1 WO2021078116 A1 WO 2021078116A1 CN 2020122176 W CN2020122176 W CN 2020122176W WO 2021078116 A1 WO2021078116 A1 WO 2021078116A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound source
sound
source object
camera
electronic device
Prior art date
Application number
PCT/CN2020/122176
Other languages
French (fr)
Chinese (zh)
Inventor
孙华伟
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Publication of WO2021078116A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules
    • H04N23/62: Control of parameters via user interfaces
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/76: Television signal recording

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed are a video processing method and an electronic device. The method comprises: when a video recording operation is received, starting a camera for image acquisition and starting a microphone for sound acquisition; extracting feature information of photographed objects contained in an acquired image; extracting feature information of sound source objects contained in acquired sound; matching the photographed objects and the sound source objects on the basis of the feature information of the photographed objects and the sound source objects to obtain a matching relationship between the photographed objects and the sound source objects; receiving a selection operation for the photographed objects, and selecting a first photographed object from the photographed objects contained in the acquired image; determining a first sound source object matching the first photographed object from the sound source objects contained in the acquired sound according to the matching relationship; and performing preset first anti-interference processing on a sound track corresponding to a second sound source object contained in the acquired sound, and performing synthesis processing on the sound obtained by the preset first anti-interference processing and the acquired image to obtain a target video.

Description

Video processing method and electronic device
This application claims priority to Chinese patent application No. 201911002660.2, entitled "Video Processing Method and Electronic Device", filed with the State Intellectual Property Office on October 21, 2019, the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of multimedia technology, and in particular to a video processing method and an electronic device.
Background
In recent years, with the rapid development of Internet technology and the upgrading of device hardware, electronic devices have become increasingly feature-rich, and more and more users use them for entertainment, for example for video recording activities such as live video streaming and vlog (video weblog) shooting. At present, some noise is often captured during video recording. In the related art, the noise is mainly filtered out by editing the recorded video afterwards with professional equipment, which is costly and cumbersome.
Summary of the invention
The embodiments of the present application provide a video processing method and an electronic device to solve the technical problem in the related art that video processing is costly and cumbersome to operate.
To solve the above technical problem, the embodiments of the present application are implemented as follows:
In a first aspect, an embodiment of the present application provides a video processing method applied to an electronic device, the method including:
when a video recording operation is received, turning on a camera of the electronic device for image acquisition, and turning on a microphone of the electronic device for sound acquisition;
determining the photographed objects contained in the image acquired by the camera, and extracting feature information of the photographed objects; and determining the sound source objects contained in the sound acquired by the microphone, and extracting feature information of the sound source objects, where different sound source objects correspond to different sound tracks;
matching the photographed objects and the sound source objects on the basis of the feature information of the photographed objects and the feature information of the sound source objects, to obtain a matching relationship between the photographed objects and the sound source objects;
receiving a selection operation for the photographed objects;
in response to the selection operation, selecting a first photographed object from the photographed objects contained in the image acquired by the camera;
determining, according to the matching relationship, a first sound source object that matches the first photographed object from the sound source objects contained in the sound acquired by the microphone;
performing preset first anti-interference processing on the sound track corresponding to a second sound source object contained in the sound acquired by the microphone, and performing synthesis processing on the sound obtained by the preset first anti-interference processing and the image acquired by the camera to obtain a target video, where the second sound source object is any sound source object, among the sound source objects contained in the sound acquired by the microphone, other than the first sound source object.
In a second aspect, an embodiment of the present application further provides an electronic device, the electronic device including:
a starting unit, configured to, when a video recording operation is received, turn on a camera of the electronic device for image acquisition and turn on a microphone of the electronic device for sound acquisition;
a first extraction unit, configured to determine the photographed objects contained in the image acquired by the camera and extract feature information of the photographed objects;
a second extraction unit, configured to determine the sound source objects contained in the sound acquired by the microphone and extract feature information of the sound source objects, where different sound source objects correspond to different sound tracks;
a matching unit, configured to match the photographed objects and the sound source objects on the basis of the feature information of the photographed objects and the feature information of the sound source objects, to obtain a matching relationship between the photographed objects and the sound source objects;
a receiving unit, configured to receive a selection operation for the photographed objects;
a selection unit, configured to select, in response to the selection operation, a first photographed object from the photographed objects contained in the image acquired by the camera;
a determining unit, configured to determine, according to the matching relationship, a first sound source object that matches the first photographed object from the sound source objects contained in the sound acquired by the microphone;
a first processing unit, configured to perform preset first anti-interference processing on the sound track corresponding to a second sound source object contained in the sound acquired by the microphone;
a second processing unit, configured to perform synthesis processing on the sound obtained by the preset first anti-interference processing and the image acquired by the camera to obtain a target video, where the second sound source object is any sound source object, among the sound source objects contained in the sound acquired by the microphone, other than the first sound source object.
In a third aspect, an embodiment of the present application further provides an electronic device including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the above video processing method.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of the above video processing method.
In the embodiments of the present application, a matching relationship can be established during video recording between the photographed objects in the recorded video picture and the sound source objects in the recorded video sound. When the user selects a specific photographed object in the video picture, the specific sound source object matching it is determined from the specific photographed object and the matching relationship. Anti-interference processing is then performed on the sound tracks of the sound source objects other than the specific sound source object, and a target video is generated from the processed sound and the recorded video picture. The user thus obtains the desired, cleaner video without post-editing on professional equipment, which reduces the video processing cost and simplifies the video processing operation.
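The flow summarized above can be illustrated with a minimal Python sketch. It assumes that each sound source object already has its own sound track as a list of samples, that the matching relationship is a plain dictionary, and that the anti-interference processing is a simple gain applied to the non-selected tracks. The function name `mix_with_focus`, the parameter `other_gain`, and the gain-based processing itself are illustrative assumptions, not the processing defined by this application.

```python
from typing import Dict, List


def mix_with_focus(
    tracks: Dict[str, List[float]],
    matching: Dict[str, str],
    selected_object: str,
    other_gain: float = 0.0,
) -> List[float]:
    """Mix per-source tracks, attenuating every track except the one
    matched to the selected photographed object.

    tracks: sound source id -> samples (hypothetical layout)
    matching: photographed object id -> sound source id
    other_gain: assumed attenuation for non-selected tracks (0.0 mutes them)
    """
    focus_source = matching.get(selected_object)  # None if nothing matched
    length = max((len(samples) for samples in tracks.values()), default=0)
    mixed = [0.0] * length
    for source_id, samples in tracks.items():
        gain = 1.0 if source_id == focus_source else other_gain
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


tracks = {"s1": [1.0, 2.0], "s2": [10.0, 10.0, 10.0]}
matching = {"p1": "s1"}
focused = mix_with_focus(tracks, matching, "p1")
```

In this sketch, a selected object with no matched sound source leaves every track attenuated; a real implementation would need to decide how to handle that case.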
Description of the drawings
Fig. 1 is a flowchart of a video processing method according to an embodiment of the present application;
Fig. 2 is an example diagram of the polar coordinates of a recorded object according to an embodiment of the present application;
Fig. 3 is an example diagram of the polar coordinates of a recorded object according to another embodiment of the present application;
Fig. 4 is an application scenario diagram of a video processing method according to an embodiment of the present application;
Fig. 5 is a structural block diagram of an electronic device according to an embodiment of the present application;
Fig. 6 is a schematic diagram of the hardware structure of an electronic device implementing the embodiments of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative effort fall within the protection scope of this application.
With the rapid development of Internet technology and the explosive growth of social networks and short videos, people spend a great deal of time recording video on electronic devices, for example shooting video or streaming live. However, if the recorded video contains noise or the voices of multiple users, the related art requires the recorded video to be edited afterwards with professional equipment to filter out the noise or the other users' voices, which is costly and cumbersome.
To solve the above technical problem, the embodiments of the present application provide a video processing method and an electronic device.
A video processing method provided by an embodiment of the present application is introduced first.
It should be noted that the video processing method provided in the embodiments of this application is applicable to an electronic device. In practical applications, the electronic device may be a mobile terminal such as a smartphone, a tablet computer, or a personal digital assistant, or a computer device such as a notebook computer or a desktop computer, which is not limited in the embodiments of the present application.
Fig. 1 is a flowchart of a video processing method according to an embodiment of the present application. As shown in Fig. 1, the method may include the following steps: step 101, step 102, step 103, step 104, step 105, and step 106, where:
In step 101, when a video recording operation is received, the camera of the electronic device is turned on for image acquisition, and the microphone of the electronic device is turned on for sound acquisition.
In the embodiments of the present application, the video recording operation may be an operation that triggers video shooting or an operation that triggers a live video stream.
In the embodiments of the present application, the user may input the video recording operation on the electronic device manually, for example by tapping the camera icon on the operation interface of the electronic device, or by opening video recording software, entering its interface, and tapping the video recording icon or button on that interface. Alternatively, the user may input the video recording operation by voice command, by gesture, or by shaking the electronic device, which is not limited in the embodiments of the present application.
In the embodiments of the present application, while the electronic device records video, images are acquired by the camera of the electronic device and sound is acquired by the microphone of the electronic device; that is, the camera and the microphone of the electronic device work at the same time.
In step 102, the photographed objects contained in the image acquired by the camera are determined and their feature information is extracted; and the sound source objects contained in the sound acquired by the microphone are determined and their feature information is extracted, where different sound source objects correspond to different sound tracks.
In the embodiments of the present application, a photographed object and a sound source object are, in essence, different manifestations of a recorded object (that is, an object that physically exists) in the video recording scene. Specifically, the photographed object is the manifestation of the recorded object in the video picture, and the sound source object is the manifestation of the recorded object in the video sound.
For example, user D uses a mobile phone for a live video stream. In this case, user D is the recorded object, user D in the live video picture is the photographed object, and user D in the live video sound is the sound source object.
In the embodiments of the present application, the feature information of the photographed objects and the feature information of the sound source objects are used to determine the matching relationship between the photographed objects and the sound source objects, that is, to determine which photographed object and which sound source object belong to the same recorded object.
In one example, the video recording scene includes three recorded objects: user A, user B, and user C. During video recording, suppose the image acquired by the camera includes three photographed objects (photographed object 1, photographed object 2, and photographed object 3) and the sound acquired by the microphone includes four sound source objects (sound source object 1, sound source object 2, sound source object 3, and sound source object 4). The feature information of photographed objects 1 to 3 and of sound source objects 1 to 4 is extracted in order to determine which of photographed objects 1 to 3 and which of sound source objects 1 to 4 belong to user A, which belong to user B, and which belong to user C.
In the embodiments of the present application, the feature information of a photographed object may include the spatial position information of the photographed object relative to the electronic device, in which case the feature information of a sound source object may include the spatial position information of the sound source object relative to the electronic device. Alternatively, the feature information of a photographed object may include the external appearance of the photographed object, in which case the feature information of a sound source object may include the sound track attributes of the sound source object, where the sound track attributes include at least one of the following: timbre, pitch, and volume.
Specifically, when the feature information of a photographed object includes its spatial position information relative to the electronic device, object recognition technology may be used to identify each photographed object contained in the image acquired by the camera, and the spatial position information of each photographed object is then determined from its image depth information. When the feature information of a sound source object includes its spatial position information relative to the electronic device, each sound source object contained in the sound acquired by the microphone may be identified from information such as timbre and pitch in that sound, and the spatial position information of each sound source object is then determined from its sound wave information.
When the feature information of a photographed object includes its external appearance, object recognition technology may be used to identify each photographed object contained in the image acquired by the camera, and face recognition technology is then used to extract the external appearance of each photographed object, for example age and gender. When the feature information of a sound source object includes its sound track attributes, each sound source object contained in the sound acquired by the microphone may be identified from information such as timbre and pitch in that sound, and the sound track attributes of each sound source object are extracted.
In the embodiments of the present application, the spatial position information of a photographed object relative to the electronic device is obtained from the image acquired by the camera of the electronic device, and the spatial position information of a sound source object relative to the electronic device is obtained from the sound acquired by the microphone of the electronic device. Accordingly, the spatial position information of the photographed object relative to the electronic device may include the polar coordinates (x1, α1) of the photographed object in the spatial coordinate system with the camera as the coordinate origin, and the spatial position information of the sound source object relative to the electronic device may include the polar coordinates (y1, β1) of the sound source object in the spatial coordinate system with the microphone as the coordinate origin.
For ease of understanding, the spatial coordinate system with the camera as the coordinate origin and the spatial coordinate system with the microphone as the coordinate origin are drawn in the same figure.
In one example, the recorded object is between the camera and the microphone. As shown in Fig. 2, O1 represents the camera and O2 represents the microphone. The polar coordinates of the recorded object in the spatial coordinate system with O1 as the coordinate origin are (x1, α1), which are the polar coordinates of the photographed object in the spatial coordinate system with the camera as the coordinate origin. The polar coordinates of the recorded object in the spatial coordinate system with O2 as the coordinate origin are (y1, β1), which are the polar coordinates of the sound source object in the spatial coordinate system with the microphone as the coordinate origin. Here, x1 is the distance from the recorded object to the camera, y1 is the distance from the recorded object to the microphone, L is the distance from the microphone to the camera, and α1 and β1 both take values in the range (-90°, 90°).
In another example, the recorded object is to one side of the camera or the microphone. As shown in Fig. 3, O1 represents the camera and O2 represents the microphone. The polar coordinates of the recorded object in the spatial coordinate system with O1 as the coordinate origin are (x1, α1), which are the polar coordinates of the photographed object in the spatial coordinate system with the camera as the coordinate origin. The polar coordinates of the recorded object in the spatial coordinate system with O2 as the coordinate origin are (y1, β1), which are the polar coordinates of the sound source object in the spatial coordinate system with the microphone as the coordinate origin. Here, x1 is the distance from the recorded object to the camera, y1 is the distance from the recorded object to the microphone, L is the distance from the microphone to the camera, and α1 and β1 both take values in the range (-90°, 90°).
In step 103, the photographed objects and the sound source objects are matched on the basis of the feature information of the photographed objects and the feature information of the sound source objects, to obtain the matching relationship between the photographed objects and the sound source objects.
In the embodiments of the present application, if a photographed object matches a sound source object, the two belong to the same recorded object; if they do not match, they do not belong to the same recorded object. The matching relationship between the photographed objects and the sound source objects records which photographed object and which sound source object belong to the same recorded object.
In the embodiments of the present application, when the feature information of the photographed object includes its spatial position information relative to the electronic device and the feature information of the sound source object includes its spatial position information relative to the electronic device, step 103 may specifically include the following step: if the spatial position information of the photographed object relative to the electronic device coincides with, or differs only slightly from, the spatial position information of the sound source object relative to the electronic device, determining that the photographed object matches the sound source object.
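The "coincides or differs only slightly" test can be sketched in Python as a distance check against a tolerance. The sketch assumes that both positions are already expressed as polar coordinates (distance, angle in degrees) in the same coordinate system, that angles are measured from the forward axis of the device, and that a fixed tolerance is acceptable. The function name `positions_match` and the default tolerance `max_distance` are illustrative assumptions, not values given in this application.

```python
import math
from typing import Tuple

Polar = Tuple[float, float]  # (distance, angle in degrees)


def positions_match(photographed: Polar, source: Polar, max_distance: float = 0.3) -> bool:
    """Return True if two positions coincide or differ only slightly.

    Both positions are assumed to share one origin; max_distance is a
    hypothetical tolerance in the same unit as the polar distance.
    """
    def to_xy(distance: float, angle_deg: float) -> Tuple[float, float]:
        angle = math.radians(angle_deg)
        # Angle measured from the forward axis (assumed convention).
        return distance * math.sin(angle), distance * math.cos(angle)

    px, py = to_xy(*photographed)
    sx, sy = to_xy(*source)
    return math.hypot(px - sx, py - sy) <= max_distance


same = positions_match((2.0, 30.0), (2.1, 32.0))
different = positions_match((2.0, 30.0), (2.0, -30.0))
```

Comparing Cartesian distance rather than the raw (distance, angle) pairs avoids having to weigh a length difference against an angle difference.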
More specifically, suppose the feature information of the photographed object is its polar coordinates (x1, α1) in the spatial coordinate system with the camera as the coordinate origin, and the feature information of the sound source object is its polar coordinates (y1, β1) in the spatial coordinate system with the microphone as the coordinate origin. Because (x1, α1) is obtained in the coordinate system with the camera as the origin, (y1, β1) is obtained in the coordinate system with the microphone as the origin, and the camera and the microphone are located at different positions on the electronic device, the deviation caused by the different coordinate origins must be eliminated to ensure the accuracy of the subsequent matching result; that is, the photographed object and the sound source object must be converted into the same coordinate system.
To eliminate the deviation caused by the different coordinate origins, the camera may be used as the unified origin, with the photographed object and the sound source object converted into the coordinate system with the camera as the coordinate origin; or the microphone may be used as the unified origin, with the photographed object and the sound source object converted into the coordinate system with the microphone as the coordinate origin; or a third position other than the camera and the microphone may be used as the unified origin, with the photographed object and the sound source object converted into the coordinate system with the third position as the coordinate origin, which is not limited in the embodiments of the present application.
When the camera is used as the unified origin and the photographed object and the sound source object are converted into the coordinate system with the camera as the coordinate origin, step 103 may specifically include the following steps (not shown in the figure): step 1031, step 1032, and step 1033, where:
In step 1031, when (x1, α1) and (y1, β1) are located between the two coordinate origins, according to (y1, β1) and the preset first coordinate conversion formula
Figure PCTCN2020122176-appb-000001
the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin are calculated, where the two coordinate origins are the camera as one coordinate origin and the microphone as the other, and L is the distance from the microphone to the camera.
In this step, the unknown quantity (x2, α2) is solved from the known quantities (x1, α1) and L and the first coordinate conversion formula.
In step 1032, when (x1, α1) and (y1, β1) are located on the same side of the two coordinate origins, according to (y1, β1) and the preset second coordinate conversion formula
Figure PCTCN2020122176-appb-000002
the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin are calculated.
In this step, the unknown quantity (x2, α2) is solved from the known quantities (x1, α1) and L and the second coordinate conversion formula.
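The two conversion formulas are embedded as images and are not reproduced in this text. Purely as a generic illustration of moving a point from the microphone-origin frame to the camera-origin frame, the following sketch passes through Cartesian coordinates. It is not the patent's formula: it assumes a 2D plane, a microphone located at distance L from the camera along the polar axis, identically oriented axes in both frames, and angles in radians. The name `mic_to_camera_polar` is hypothetical.

```python
import math

def mic_to_camera_polar(y1, beta1, L):
    """Convert polar (y1, beta1) in the microphone frame to the camera frame.

    Assumption (not from the source): the microphone sits at Cartesian
    position (L, 0) in the camera frame and both frames share the same axes.
    """
    # Polar -> Cartesian in the microphone frame.
    mx = y1 * math.cos(beta1)
    my = y1 * math.sin(beta1)
    # Shift the origin from the microphone to the camera.
    cx = mx + L
    cy = my
    # Cartesian -> polar in the camera frame.
    x2 = math.hypot(cx, cy)
    alpha2 = math.atan2(cy, cx)
    return x2, alpha2
```

With L = 0 the two frames coincide and the function returns its input unchanged, which is a quick sanity check on the geometry.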
In step 1033, the matching degree between the photographed object and the sound source object is calculated according to (x1, α1) and (x2, α2). For each photographed object, the sound source object with the highest matching degree with that photographed object is determined as its matching sound source object, and the corresponding matching relationship is obtained.
In one implementation, step 1033 may specifically include the following step:
The distance value between (x1, α1) and (x2, α2) is calculated, and the matching degree between the photographed object and the sound source object is determined according to the distance value, where the distance value is inversely proportional to the matching degree.
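The distance-based matching described above can be sketched as follows. This is a minimal illustration, assuming all coordinates have already been converted to the camera-origin frame and that the distance is the ordinary law-of-cosines distance between two polar points. The names `polar_distance` and `match_objects` and the dictionary layout are hypothetical.

```python
import math

def polar_distance(p, q):
    """Distance between two points given as (radius, angle in radians)."""
    r1, t1 = p
    r2, t2 = q
    # Law of cosines; max() guards against tiny negative rounding errors.
    squared = r1 * r1 + r2 * r2 - 2 * r1 * r2 * math.cos(t1 - t2)
    return math.sqrt(max(0.0, squared))

def match_objects(subjects, sources):
    """Map each photographed-object id to the closest sound-source id.

    `subjects` and `sources` are dicts of id -> (radius, angle), all
    expressed in the same (camera-origin) coordinate system.
    """
    matches = {}
    for subject_id, subject_pos in subjects.items():
        # The smallest distance corresponds to the highest matching degree.
        matches[subject_id] = min(
            sources, key=lambda k: polar_distance(subject_pos, sources[k])
        )
    return matches
```

Because each photographed object independently picks its best source, two photographed objects can end up matched to the same sound source in this sketch; a one-to-one assignment would need an additional constraint that the text does not specify.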
In another implementation, consider that, for the same recorded object, the photographed object is determined from the image collected by the camera and the sound source object is determined from the sound collected by the microphone. Taking a person as the recorded object, the center of image measurement is the person's eyes, while the center of sound measurement is the person's mouth. To ensure the accuracy of the subsequent matching result, the error caused by the difference between the image measurement center and the sound measurement center needs to be eliminated. To do so, an error correction parameter may be introduced and used to correct the error. In this case, step 1033 may specifically include the following steps (not shown in the figures): step 10331, step 10332, and step 10333, where:
In step 10331, (x2, α2) is multiplied by a preset error correction parameter δ to obtain the corrected polar coordinates (δ*x2, δ*α2).
In the embodiments of this application, error correction parameters may be set separately for the photographed object and the sound source object. In practical applications, the error correction parameter may be set by a technician based on experience, or may be obtained by training on a large number of samples. The embodiments of this application do not limit this.
In step 10332, according to (x1, α1), (δ*x2, δ*α2), and the formula for the distance between two points in a polar coordinate system
Figure PCTCN2020122176-appb-000003
the distance value between (x1, α1) and (δ*x2, δ*α2) is calculated.
In step 10333, the matching degree between the photographed object and the sound source object is determined according to the distance value, where the distance value is inversely proportional to the matching degree.
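Steps 10331 to 10333 can be sketched as below. The distance formula in the source is an image, so the standard law-of-cosines form for polar coordinates is assumed here. The mapping from distance to matching degree, 1 / (1 + d), is also an assumption chosen only because it decreases as the distance grows; the source does not give a specific mapping. The function name `corrected_matching_degree` is hypothetical.

```python
import math

def corrected_matching_degree(subject, source, delta):
    """Matching degree between (x1, a1) and the delta-corrected (x2, a2).

    Angles are in radians. `delta` is the preset error correction parameter.
    """
    x1, a1 = subject
    x2, a2 = source
    # Step 10331: scale both components by the correction parameter.
    cx, ca = delta * x2, delta * a2
    # Step 10332: distance between two polar points (assumed standard form).
    squared = x1 * x1 + cx * cx - 2 * x1 * cx * math.cos(a1 - ca)
    distance = math.sqrt(max(0.0, squared))
    # Step 10333: larger distance -> lower matching degree (assumed mapping).
    return 1.0 / (1.0 + distance)
```

A result of 1.0 means the corrected sound source position coincides with the photographed object's position; values approach 0 as the two move apart.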
Because the spatial position information of the photographed object and the spatial position information of the sound source object largely reflect the relative positional relationship between the two, the embodiments of this application match the photographed object and the sound source object using this spatial position information, which helps ensure the accuracy of the matching result.
In the embodiments of this application, when the feature information of the photographed object includes the external appearance of the photographed object and the feature information of the sound source object includes the sound track attribute of the sound source object, step 103 may specifically include the following step: if the external appearance of the photographed object matches the sound track attribute of the sound source object, it is determined that the photographed object matches the sound source object.
In one example, there are two photographed objects whose external appearances are a boy and a girl, and there are two sound source objects whose sound track attributes are a female voice and a male voice. In this case, the photographed object whose external appearance is "boy" is determined to match the sound source object whose sound track attribute is "male voice", and the photographed object whose external appearance is "girl" is determined to match the sound source object whose sound track attribute is "female voice".
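The boy/girl example can be expressed as a simple lookup. The sketch below is illustrative only: the source does not say how appearance labels or sound track attributes are produced, so the labels, the `compatible` table, and the name `match_by_attribute` are all hypothetical.

```python
def match_by_attribute(subjects, sources, compatible):
    """Match photographed objects to sound sources by label compatibility.

    `subjects`: id -> appearance label; `sources`: id -> sound track attribute;
    `compatible`: appearance label -> sound track attribute expected for it.
    """
    matches = {}
    for subject_id, appearance in subjects.items():
        for source_id, track_attribute in sources.items():
            if compatible.get(appearance) == track_attribute:
                # Take the first compatible sound source for this subject.
                matches[subject_id] = source_id
                break
    return matches
```

This works for the two-person example because each label is unique; with two people of the same appearance label, attribute matching alone could not tell them apart.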
In step 104, a selection operation for the photographed objects is received, and in response to the selection operation, a first photographed object is selected from the photographed objects contained in the image collected by the camera.
In the embodiments of this application, when the user wants the final recorded video to include only the sound of one or several photographed objects, the user can input a focus object selection operation on the electronic device. In practical applications, the user can input the focus object selection operation by voice or by manual operation.
In one example, as shown in FIG. 4, a user 42 is recording video with an electronic device 40, and the video recording picture 41 of the electronic device 40 contains three photographed objects. The user 42 can long-press one of the photographed objects in the video recording picture 41 to select it as the target photographed object.
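One way to resolve a long-press into a selected photographed object is a bounding-box hit test, sketched below. The source does not describe how the touch position is mapped to an object, so the bounding-box representation and the name `select_subject` are assumptions.

```python
def select_subject(boxes, tap_x, tap_y):
    """Return the id of the photographed object whose box contains the tap.

    `boxes` maps id -> (left, top, right, bottom) in screen pixels.
    Returns None when the tap lands on no object.
    """
    for subject_id, (left, top, right, bottom) in boxes.items():
        if left <= tap_x < right and top <= tap_y < bottom:
            return subject_id
    return None
```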
In step 105, according to the determined matching relationship, a first sound source object matching the first photographed object is determined from the sound source objects contained in the sound collected by the microphone.
In step 106, preset first anti-interference processing is performed on the sound track corresponding to a second sound source object contained in the sound collected by the microphone, and the sound obtained by the preset first anti-interference processing and the image collected by the camera are synthesized to obtain a target video, where the second sound source object is any sound source object, among those contained in the sound collected by the microphone, other than the first sound source object.
In the embodiments of this application, the preset first anti-interference processing may be performed on the sound tracks other than the one corresponding to the first sound source object (that is, the sound tracks corresponding to the second sound source objects) in the sound collected by the microphone, to obtain audio that contains only, or mainly, the sound track corresponding to the first sound source object. The preset first anti-interference processing may be muting.
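Assuming the sound tracks have already been separated per sound source object, the muting described above amounts to mixing with a zero gain on every track except the selected one. The following sketch illustrates this; the list-of-samples representation, the `residual_gain` parameter, and the name `mix_with_suppression` are hypothetical.

```python
def mix_with_suppression(tracks, keep_ids, residual_gain=0.0):
    """Mix per-source tracks, suppressing every source not in `keep_ids`.

    `tracks` maps source id -> list of float samples. A `residual_gain` of
    0.0 mutes the other sources; a small positive value only attenuates them.
    """
    length = max((len(samples) for samples in tracks.values()), default=0)
    mixed = [0.0] * length
    for source_id, samples in tracks.items():
        gain = 1.0 if source_id in keep_ids else residual_gain
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed
```

A zero `residual_gain` yields audio containing only the first sound source object's track, while a small non-zero value corresponds to the "mainly contains" case in the text.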
In the embodiments of this application, when the target video is synthesized, all photographed objects in the image collected by the camera may be retained, or only the first photographed object may be retained.
When only the first photographed object is retained, step 106 may specifically include the following step:
Preset second anti-interference processing is performed on the image region where a second photographed object contained in the image collected by the camera is located, and the image obtained by the preset second anti-interference processing and the sound obtained by the preset first anti-interference processing are synthesized to obtain the target video, where the second photographed object is any photographed object, among those contained in the image collected by the camera, other than the first photographed object.
In practical applications, the preset second anti-interference processing may include mosaic processing or blur processing.
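As an illustration of mosaic processing on the region of a second photographed object, the sketch below pixelates a rectangle of a grayscale image by replacing each small block with its average value. The nested-list image representation, the rectangle parameters, and the name `mosaic_region` are assumptions; a real device would operate on camera frames through its own imaging pipeline.

```python
def mosaic_region(image, top, left, height, width, block=2):
    """Return a copy of `image` with one rectangle pixelated.

    `image` is a list of rows of integer gray levels. Each `block` x `block`
    tile inside the rectangle is replaced by its integer average.
    """
    out = [row[:] for row in image]
    bottom = min(top + height, len(image))
    right = min(left + width, len(image[0]) if image else 0)
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            cells = [image[y][x] for y in ys for x in xs]
            average = sum(cells) // len(cells)
            for y in ys:
                for x in xs:
                    out[y][x] = average
    return out
```

Pixels outside the rectangle are left untouched, so the first photographed object stays sharp while the other regions are obscured.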
As can be seen from the above embodiment, during video recording, a matching relationship can be established between the photographed objects in the recorded video picture and the sound source objects in the recorded video sound. When the user selects a specific photographed object in the video picture, the specific sound source object matching it is determined according to the specific photographed object and the matching relationship, anti-interference processing is performed on the sound tracks of the sound source objects other than the specific sound source object in the recorded video sound, and the target video is generated from the sound obtained by the anti-interference processing and the recorded video picture. A cleaner video that meets the user's needs can thus be obtained without post-editing on professional equipment, which reduces video processing cost and simplifies video processing operations.
FIG. 5 is a structural block diagram of an electronic device according to an embodiment of this application. As shown in FIG. 5, the electronic device 500 may include: a starting unit 501, a first extraction unit 502, a second extraction unit 503, a matching unit 504, a receiving unit 505, a selection unit 506, a determining unit 507, a first processing unit 508, and a second processing unit 509, where:
the starting unit 501 is configured to, when a video recording operation is received, start the camera of the electronic device for image collection and start the microphone of the electronic device for sound collection;
the first extraction unit 502 is configured to determine the photographed objects contained in the image collected by the camera and extract feature information of the photographed objects;
the second extraction unit 503 is configured to determine the sound source objects contained in the sound collected by the microphone and extract feature information of the sound source objects, where different sound source objects correspond to different sound tracks;
the matching unit 504 is configured to match the photographed objects and the sound source objects based on the feature information of the photographed objects and the feature information of the sound source objects, to obtain a matching relationship between the photographed objects and the sound source objects;
the receiving unit 505 is configured to receive a selection operation for the photographed objects;
the selection unit 506 is configured to, in response to the selection operation, select a first photographed object from the photographed objects contained in the image collected by the camera;
the determining unit 507 is configured to determine, according to the matching relationship, a first sound source object matching the first photographed object from the sound source objects contained in the sound collected by the microphone;
the first processing unit 508 is configured to perform preset first anti-interference processing on the sound track corresponding to a second sound source object contained in the sound collected by the microphone;
the second processing unit 509 is configured to synthesize the sound obtained by the preset first anti-interference processing and the image collected by the camera to obtain a target video, where the second sound source object is any sound source object, among those contained in the sound collected by the microphone, other than the first sound source object.
As can be seen from the above embodiment, during video recording, a matching relationship can be established between the photographed objects in the recorded video picture and the sound source objects in the recorded video sound. When the user selects a specific photographed object in the video picture, the specific sound source object matching it is determined according to the specific photographed object and the matching relationship, anti-interference processing is performed on the sound tracks of the sound source objects other than the specific sound source object in the recorded video sound, and the target video is generated from the sound obtained by the anti-interference processing and the recorded video picture. A cleaner video that meets the user's needs can thus be obtained without post-editing on professional equipment, which reduces video processing cost and simplifies video processing operations.
Optionally, as an embodiment, the feature information of the photographed object includes spatial position information of the photographed object relative to the electronic device, and the feature information of the sound source object includes spatial position information of the sound source object relative to the electronic device.
Optionally, as an embodiment, the spatial position information of the photographed object relative to the electronic device is the polar coordinates (x1, α1) of the photographed object in a spatial coordinate system with the camera as the coordinate origin;
the spatial position information of the sound source object relative to the electronic device is the polar coordinates (y1, β1) of the sound source object in a spatial coordinate system with the microphone as the coordinate origin.
Optionally, as an embodiment, the matching unit 504 may include:
a first calculation subunit, configured to, when (x1, α1) and (y1, β1) are located between the two coordinate origins, calculate, according to (y1, β1) and the preset first coordinate conversion formula
Figure PCTCN2020122176-appb-000004
the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin;
a second calculation subunit, configured to, when (x1, α1) and (y1, β1) are located on the same side of the two coordinate origins, calculate, according to (y1, β1) and the preset second coordinate conversion formula
Figure PCTCN2020122176-appb-000005
the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin, where the two coordinate origins are the camera as one coordinate origin and the microphone as the other, and L is the distance from the microphone to the camera;
a third calculation subunit, configured to calculate the matching degree between the photographed object and the sound source object according to (x1, α1) and (x2, α2), and, for each photographed object, determine the sound source object with the highest matching degree with that photographed object as its matching sound source object, to obtain the corresponding matching relationship.
Optionally, as an embodiment, the third calculation subunit may include:
a coordinate correction module, configured to multiply (x2, α2) by a preset error correction parameter δ to obtain the corrected polar coordinates (δ*x2, δ*α2);
a distance calculation module, configured to calculate, according to (x1, α1), (δ*x2, δ*α2), and the formula for the distance between two points in a polar coordinate system
Figure PCTCN2020122176-appb-000006
the distance value between (x1, α1) and (δ*x2, δ*α2);
a matching degree determining module, configured to determine the matching degree between the photographed object and the sound source object according to the distance value, where the distance value is inversely proportional to the matching degree.
Optionally, as an embodiment, the second processing unit 509 may include:
a video synthesis subunit, configured to perform preset second anti-interference processing on the image region where a second photographed object contained in the image collected by the camera is located, and synthesize the image obtained by the preset second anti-interference processing and the sound obtained by the preset first anti-interference processing to obtain the target video, where the second photographed object is any photographed object, among those contained in the image collected by the camera, other than the first photographed object.
FIG. 6 is a schematic diagram of the hardware structure of an electronic device that implements the embodiments of this application. As shown in FIG. 6, the electronic device 600 includes, but is not limited to, a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. Those skilled in the art will understand that the structure shown in FIG. 6 does not limit the electronic device: the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently. In the embodiments of this application, electronic devices include, but are not limited to, mobile phones, tablet computers, notebook computers, palmtop computers, vehicle-mounted terminals, wearable devices, and pedometers.
The processor 610 is configured to: when a video recording operation is received, start the camera of the electronic device for image collection and start the microphone of the electronic device for sound collection; determine the photographed objects contained in the image collected by the camera and extract feature information of the photographed objects; determine the sound source objects contained in the sound collected by the microphone and extract feature information of the sound source objects, where different sound source objects correspond to different sound tracks; match the photographed objects and the sound source objects based on the feature information of the photographed objects and the feature information of the sound source objects, to obtain a matching relationship between the photographed objects and the sound source objects; receive a selection operation for the photographed objects; in response to the selection operation, select a first photographed object from the photographed objects contained in the image collected by the camera; determine, according to the matching relationship, a first sound source object matching the first photographed object from the sound source objects contained in the sound collected by the microphone; and perform preset first anti-interference processing on the sound track corresponding to a second sound source object contained in the sound collected by the microphone, and synthesize the sound obtained by the preset first anti-interference processing and the image collected by the camera to obtain a target video, where the second sound source object is any sound source object, among those contained in the sound collected by the microphone, other than the first sound source object.
In the embodiments of this application, during video recording, a matching relationship can be established between the photographed objects in the recorded video picture and the sound source objects in the recorded video sound. When the user selects a specific photographed object in the video picture, the specific sound source object matching it is determined according to the specific photographed object and the matching relationship, anti-interference processing is performed on the sound tracks of the sound source objects other than the specific sound source object in the recorded video sound, and the target video is generated from the sound obtained by the anti-interference processing and the recorded video picture. A cleaner video that meets the user's needs can thus be obtained without post-editing on professional equipment, which reduces video processing cost and simplifies video processing operations.
Optionally, as an embodiment, the feature information of the photographed object includes spatial position information of the photographed object relative to the electronic device, and the feature information of the sound source object includes spatial position information of the sound source object relative to the electronic device.
Optionally, as an embodiment, the spatial position information of the photographed object relative to the electronic device is the polar coordinates (x1, α1) of the photographed object in a spatial coordinate system with the camera as the coordinate origin;
the spatial position information of the sound source object relative to the electronic device is the polar coordinates (y1, β1) of the sound source object in a spatial coordinate system with the microphone as the coordinate origin.
Optionally, as an embodiment, matching the photographed objects and the sound source objects based on the feature information of the photographed objects and the feature information of the sound source objects, to obtain the matching relationship between the photographed objects and the sound source objects, includes:
when (x1, α1) and (y1, β1) are located between the two coordinate origins, calculating, according to (y1, β1) and the preset first coordinate conversion formula
Figure PCTCN2020122176-appb-000007
the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin;
when (x1, α1) and (y1, β1) are located on the same side of the two coordinate origins, calculating, according to (y1, β1) and the preset second coordinate conversion formula
Figure PCTCN2020122176-appb-000008
the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin, where the two coordinate origins are the camera as one coordinate origin and the microphone as the other, and L is the distance from the microphone to the camera;
calculating the matching degree between the photographed object and the sound source object according to (x1, α1) and (x2, α2), and, for each photographed object, determining the sound source object with the highest matching degree with that photographed object as its matching sound source object, to obtain the corresponding matching relationship.
Optionally, as an embodiment, calculating the matching degree between the photographed object and the sound source object according to (x1, α1) and (x2, α2), and, for each photographed object, determining the sound source object with the highest matching degree with that photographed object as its matching sound source object, to obtain the corresponding matching relationship, includes:
multiplying (x2, α2) by a preset error correction parameter δ to obtain the corrected polar coordinates (δ*x2, δ*α2);
calculating, according to (x1, α1), (δ*x2, δ*α2), and the formula for the distance between two points in a polar coordinate system
Figure PCTCN2020122176-appb-000009
the distance value between (x1, α1) and (δ*x2, δ*α2);
determining the matching degree between the photographed object and the sound source object according to the distance value, where the distance value is inversely proportional to the matching degree.
Optionally, as an embodiment, synthesizing the sound obtained by the preset first anti-interference processing and the image collected by the camera to obtain the target video includes:
performing preset second anti-interference processing on the image region where a second photographed object contained in the image collected by the camera is located, and synthesizing the image obtained by the preset second anti-interference processing and the sound obtained by the preset first anti-interference processing to obtain the target video, where the second photographed object is any photographed object, among those contained in the image collected by the camera, other than the first photographed object.
应理解的是,本申请实施例中,射频单元601可用于收发信息或通话过程中,信号的接收和发送,具体的,将来自基站的下行数据接收后,给处理器610处理;另外,将上行的数据发送给基站。通常,射频单元601包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器、双工器等。此外,射频单元601还可以通过无线通信系统与网络和其他设备通信。It should be understood that, in the embodiment of the present application, the radio frequency unit 601 can be used for receiving and sending signals in the process of sending and receiving information or talking. Specifically, after receiving the downlink data from the base station, it is sent to the processor 610 for processing; Uplink data is sent to the base station. Generally, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 601 can also communicate with the network and other devices through a wireless communication system.
The electronic device provides the user with wireless broadband Internet access through the network module 602, for example helping the user send and receive emails, browse web pages, and access streaming media.
The audio output unit 603 can convert audio data received by the radio frequency unit 601 or the network module 602, or stored in the memory 609, into an audio signal and output it as sound. The audio output unit 603 may also provide audio output related to a specific function performed by the electronic device 600 (for example, a call signal reception sound or a message reception sound). The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is used to receive audio or video signals. The input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042. The graphics processor 6041 processes the image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The processed image may be displayed on the display unit 606. The image processed by the graphics processor 6041 may be stored in the memory 609 (or another storage medium) or sent via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data. In a telephone call mode, the processed audio data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 601 and then output.
The electronic device 600 further includes at least one sensor 605, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 6061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 6061 and/or the backlight when the electronic device 600 is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used to recognize the posture of the electronic device (for example, switching between landscape and portrait, related games, and magnetometer attitude calibration) and for vibration-recognition functions (for example, a pedometer or tap detection). The sensor 605 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and the like, which are not described further here.
The display unit 606 is used to display information input by the user or information provided to the user. The display unit 606 may include a display panel 6061, which may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The user input unit 607 may be used to receive input numeric or character information and to generate key signal input related to user settings and function control of the electronic device. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. The touch panel 6071, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 6071 with a finger, a stylus, or any other suitable object or accessory). The touch panel 6071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 610, and receives and executes commands sent by the processor 610. The touch panel 6071 can be implemented as a resistive, capacitive, infrared, or surface acoustic wave panel, among other types. In addition to the touch panel 6071, the user input unit 607 may include other input devices 6072. Specifically, the other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick, which are not described further here.
Further, the touch panel 6071 can be overlaid on the display panel 6061. When the touch panel 6071 detects a touch operation on or near it, it passes the operation to the processor 610 to determine the type of the touch event, and the processor 610 then provides corresponding visual output on the display panel 6061 according to the type of the touch event. Although in FIG. 6 the touch panel 6071 and the display panel 6061 are shown as two independent components that implement the input and output functions of the electronic device, in some embodiments the touch panel 6071 and the display panel 6061 can be integrated to implement the input and output functions of the electronic device, which is not limited here.
The interface unit 608 is an interface for connecting an external device to the electronic device 600. For example, the external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device that has an identification module, an audio input/output (I/O) port, a video I/O port, a headphone port, and so on. The interface unit 608 can be used to receive input (for example, data or power) from an external device and transmit the received input to one or more elements within the electronic device 600, or to transfer data between the electronic device 600 and the external device.
The memory 609 can be used to store software programs and various data. The memory 609 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). The data storage area may store data created through use of the mobile phone (such as audio data and a phone book). In addition, the memory 609 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
The processor 610 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 609 and calling the data stored in the memory 609, thereby monitoring the electronic device as a whole. The processor 610 may include one or more processing units. Preferably, the processor 610 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 610.
The electronic device 600 may also include a power supply 611 (such as a battery) that supplies power to the components. Preferably, the power supply 611 may be logically connected to the processor 610 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
In addition, the electronic device 600 includes some functional modules that are not shown, which are not described further here.
Preferably, an embodiment of the present application further provides an electronic device, including a processor 610, a memory 609, and a computer program stored in the memory 609 and executable on the processor 610. When executed by the processor 610, the computer program implements each process of the foregoing video processing method embodiment and achieves the same technical effect; to avoid repetition, details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the foregoing video processing method embodiment and achieves the same technical effect; to avoid repetition, details are not repeated here. The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
It should be noted that, in this specification, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the related art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.
The embodiments of the present application are described above with reference to the accompanying drawings, but the present application is not limited to the specific embodiments described, which are merely illustrative and not restrictive. Guided by the present application, those of ordinary skill in the art can devise many other forms without departing from the purpose of the present application and the scope protected by the claims, and all of these fall within the protection of the present application.

Claims (14)

  1. A video processing method applied to an electronic device, the method comprising:
    when a video recording operation is received, turning on a camera of the electronic device for image acquisition, and turning on a microphone of the electronic device for sound acquisition;
    determining the photographed objects contained in the image acquired by the camera, and extracting feature information of the photographed objects; and determining the sound source objects contained in the sound acquired by the microphone, and extracting feature information of the sound source objects, wherein different sound source objects correspond to different sound tracks;
    matching the photographed objects and the sound source objects based on the feature information of the photographed objects and the feature information of the sound source objects, to obtain a matching relationship between the photographed objects and the sound source objects;
    receiving a selection operation for the photographed objects;
    in response to the selection operation, selecting a first photographed object from the photographed objects contained in the image acquired by the camera;
    determining, according to the matching relationship, a first sound source object that matches the first photographed object among the sound source objects contained in the sound acquired by the microphone;
    performing preset first anti-interference processing on the sound track corresponding to a second sound source object contained in the sound acquired by the microphone, and synthesizing the sound obtained by the preset first anti-interference processing with the image acquired by the camera to obtain a target video, wherein the second sound source object is a sound source object, among the sound source objects contained in the sound acquired by the microphone, other than the first sound source object.
  2. The method according to claim 1, wherein the feature information of the photographed object comprises spatial position information of the photographed object relative to the electronic device, and the feature information of the sound source object comprises spatial position information of the sound source object relative to the electronic device.
  3. The method according to claim 2, wherein the spatial position information of the photographed object relative to the electronic device is the polar coordinates (x1, α1) of the photographed object in a spatial coordinate system with the camera as the coordinate origin;
    the spatial position information of the sound source object relative to the electronic device is the polar coordinates (y1, β1) of the sound source object in a spatial coordinate system with the microphone as the coordinate origin.
  4. The method according to claim 3, wherein matching the photographed objects and the sound source objects based on the feature information of the photographed objects and the feature information of the sound source objects, to obtain the matching relationship between the photographed objects and the sound source objects, comprises:
    when (x1, α1) and (y1, β1) are located between the two coordinate origins, calculating, according to (y1, β1) and a preset first coordinate conversion formula
    Figure PCTCN2020122176-appb-100001
    the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin;
    when (x1, α1) and (y1, β1) are located on the same side of the two coordinate origins, calculating, according to (y1, β1) and a preset second coordinate conversion formula
    Figure PCTCN2020122176-appb-100002
    the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin, wherein the two coordinate origins are the coordinate origin at the camera and the coordinate origin at the microphone, and L is the distance from the microphone to the camera;
    calculating the degree of matching between the photographed object and the sound source object according to (x1, α1) and (x2, α2), and, for each photographed object, determining the sound source object with the highest degree of matching with that photographed object as its matched sound source object, to obtain the corresponding matching relationship.
  5. The method according to claim 4, wherein calculating the degree of matching between the photographed object and the sound source object according to (x1, α1) and (x2, α2), and, for each photographed object, determining the sound source object with the highest degree of matching with that photographed object as its matched sound source object, to obtain the corresponding matching relationship, comprises:
    multiplying (x2, α2) by a preset error correction parameter δ to obtain the corrected polar coordinates (δ*x2, δ*α2);
    calculating, according to (x1, α1), (δ*x2, δ*α2), and the formula for the distance between two points in a polar coordinate system
    Figure PCTCN2020122176-appb-100003
    the distance value between (x1, α1) and (δ*x2, δ*α2);
    determining the degree of matching between the photographed object and the sound source object according to the distance value, wherein the distance value is inversely proportional to the degree of matching.
  6. The method according to claim 1, wherein synthesizing the sound obtained by the preset first anti-interference processing with the image acquired by the camera to obtain the target video comprises:
    performing preset second anti-interference processing on the image area in which a second photographed object contained in the image acquired by the camera is located, and synthesizing the image obtained by the preset second anti-interference processing with the sound obtained by the preset first anti-interference processing to obtain the target video, wherein the second photographed object is a photographed object, among the photographed objects contained in the image acquired by the camera, other than the first photographed object.
  7. An electronic device, the electronic device comprising:
    a starting unit, configured to, when a video recording operation is received, turn on a camera of the electronic device for image acquisition and turn on a microphone of the electronic device for sound acquisition;
    a first extraction unit, configured to determine the photographed objects contained in the image acquired by the camera and extract feature information of the photographed objects;
    a second extraction unit, configured to determine the sound source objects contained in the sound acquired by the microphone and extract feature information of the sound source objects, wherein different sound source objects correspond to different sound tracks;
    a matching unit, configured to match the photographed objects and the sound source objects based on the feature information of the photographed objects and the feature information of the sound source objects, to obtain a matching relationship between the photographed objects and the sound source objects;
    a receiving unit, configured to receive a selection operation for the photographed objects;
    a selection unit, configured to select, in response to the selection operation, a first photographed object from the photographed objects contained in the image acquired by the camera;
    a determining unit, configured to determine, according to the matching relationship, a first sound source object that matches the first photographed object among the sound source objects contained in the sound acquired by the microphone;
    a first processing unit, configured to perform preset first anti-interference processing on the sound track corresponding to a second sound source object contained in the sound acquired by the microphone;
    a second processing unit, configured to synthesize the sound obtained by the preset first anti-interference processing with the image acquired by the camera to obtain a target video, wherein the second sound source object is a sound source object, among the sound source objects contained in the sound acquired by the microphone, other than the first sound source object.
  8. The electronic device according to claim 7, wherein the feature information of the photographed object comprises spatial position information of the photographed object relative to the electronic device, and the feature information of the sound source object comprises spatial position information of the sound source object relative to the electronic device.
  9. The electronic device according to claim 8, wherein the spatial position information of the photographed object relative to the electronic device is the polar coordinates (x1, α1) of the photographed object in a spatial coordinate system with the camera as the coordinate origin;
    the spatial position information of the sound source object relative to the electronic device is the polar coordinates (y1, β1) of the sound source object in a spatial coordinate system with the microphone as the coordinate origin.
  10. The electronic device according to claim 9, wherein the matching unit comprises:
    a first calculation subunit, configured to, when (x1, α1) and (y1, β1) are located between the two coordinate origins, calculate, according to (y1, β1) and a preset first coordinate conversion formula
    Figure PCTCN2020122176-appb-100004
    the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin;
    a second calculation subunit, configured to, when (x1, α1) and (y1, β1) are located on the same side of the two coordinate origins, calculate, according to (y1, β1) and a preset second coordinate conversion formula
    Figure PCTCN2020122176-appb-100005
    the polar coordinates (x2, α2) of the sound source object in the spatial coordinate system with the camera as the coordinate origin, wherein the two coordinate origins are the coordinate origin at the camera and the coordinate origin at the microphone, and L is the distance from the microphone to the camera;
    a third calculation subunit, configured to calculate the degree of matching between the photographed object and the sound source object according to (x1, α1) and (x2, α2), and, for each photographed object, determine the sound source object with the highest degree of matching with that photographed object as its matched sound source object, to obtain the corresponding matching relationship.
  11. The electronic device according to claim 10, wherein the third calculation subunit comprises:
    a coordinate correction module, configured to multiply (x2, α2) by a preset error correction parameter δ to obtain the corrected polar coordinates (δ*x2, δ*α2);
    a distance calculation module, configured to calculate, according to (x1, α1), (δ*x2, δ*α2), and the formula for the distance between two points in a polar coordinate system
    Figure PCTCN2020122176-appb-100006
    the distance value between (x1, α1) and (δ*x2, δ*α2);
    a matching degree determination module, configured to determine the degree of matching between the photographed object and the sound source object according to the distance value, wherein the distance value is inversely proportional to the degree of matching.
  12. The electronic device according to claim 11, wherein the second processing unit comprises:
    a video synthesis subunit, configured to perform preset second anti-interference processing on the image area in which a second photographed object contained in the image acquired by the camera is located, and to synthesize the image obtained by the preset second anti-interference processing with the sound obtained by the preset first anti-interference processing to obtain the target video, wherein the second photographed object is a photographed object, among the photographed objects contained in the image acquired by the camera, other than the first photographed object.
  13. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the video processing method according to any one of claims 1 to 6.
  14. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the video processing method according to any one of claims 1 to 6.
PCT/CN2020/122176 2019-10-21 2020-10-20 Video processing method and electronic device WO2021078116A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911002660.2A CN110740259B (en) 2019-10-21 2019-10-21 Video processing method and electronic equipment
CN201911002660.2 2019-10-21

Publications (1)

Publication Number Publication Date
WO2021078116A1 true WO2021078116A1 (en) 2021-04-29

Family

ID=69270736

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/122176 WO2021078116A1 (en) 2019-10-21 2020-10-20 Video processing method and electronic device

Country Status (2)

Country Link
CN (1) CN110740259B (en)
WO (1) WO2021078116A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114245156A (en) * 2021-11-30 2022-03-25 广州繁星互娱信息科技有限公司 Live broadcast audio adjusting method and device, storage medium and electronic equipment
CN115174959A (en) * 2022-06-21 2022-10-11 咪咕文化科技有限公司 Video 3D sound effect setting method and device
CN116866720A (en) * 2023-09-04 2023-10-10 国网山东省电力公司东营供电公司 Camera angle self-adaptive regulation and control method, system and terminal based on sound source localization

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110740259B (en) * 2019-10-21 2021-06-25 维沃移动通信有限公司 Video processing method and electronic equipment
CN113365012A (en) * 2020-03-06 2021-09-07 华为技术有限公司 Audio processing method and device
CN111641794B (en) * 2020-05-25 2023-03-28 维沃移动通信有限公司 Sound signal acquisition method and electronic equipment
CN111629164B (en) * 2020-05-29 2021-09-14 联想(北京)有限公司 Video recording generation method and electronic equipment
CN111918127B (en) * 2020-07-02 2023-04-07 影石创新科技股份有限公司 Video clipping method and device, computer readable storage medium and camera
CN111863002A (en) * 2020-07-06 2020-10-30 Oppo广东移动通信有限公司 Processing method, processing device and electronic equipment
CN112416229A (en) * 2020-11-26 2021-02-26 维沃移动通信有限公司 Audio content adjusting method and device and electronic equipment
CN114827448A (en) * 2021-01-29 2022-07-29 华为技术有限公司 Video recording method and electronic equipment
CN115225840A (en) * 2021-04-17 2022-10-21 华为技术有限公司 Video recording method and electronic equipment
CN113473057B (en) * 2021-05-20 2023-03-03 华为技术有限公司 Video recording method and electronic equipment
CN113573120B (en) * 2021-06-16 2023-10-27 北京荣耀终端有限公司 Audio processing method, electronic device, chip system and storage medium
CN113676668A (en) * 2021-08-24 2021-11-19 维沃移动通信有限公司 Video shooting method and device, electronic equipment and readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888703A (en) * 2014-03-28 2014-06-25 深圳市中兴移动通信有限公司 Shooting method and camera shooting device with recording enhanced
US8913761B2 (en) * 2009-10-30 2014-12-16 Samsung Electronics Co., Ltd. Sound source recording apparatus and method adaptable to operating environment
CN105578097A (en) * 2015-07-10 2016-05-11 宇龙计算机通信科技(深圳)有限公司 Video recording method and terminal
CN106653041A (en) * 2017-01-17 2017-05-10 北京地平线信息技术有限公司 Audio signal processing equipment and method as well as electronic equipment
CN107004426A (en) * 2014-11-28 2017-08-01 华为技术有限公司 The method and mobile terminal of the sound of admission video recording object
CN107993671A (en) * 2017-12-04 2018-05-04 南京地平线机器人技术有限公司 Sound processing method, device and electronic equipment
CN109506568A (en) * 2018-12-29 2019-03-22 苏州思必驰信息科技有限公司 A kind of sound localization method and device based on image recognition and speech recognition
CN110740259A (en) * 2019-10-21 2020-01-31 维沃移动通信有限公司 Video processing method and electronic equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010154260A (en) * 2008-12-25 2010-07-08 Victor Co Of Japan Ltd Voice recognition device
CN103916723B (en) * 2013-01-08 2018-08-10 联想(北京)有限公司 A kind of sound collection method and a kind of electronic equipment
CN105245811B (en) * 2015-10-16 2018-03-27 广东欧珀移动通信有限公司 A kind of kinescope method and device
CN106791442B (en) * 2017-01-20 2019-11-15 维沃移动通信有限公司 A kind of image pickup method and mobile terminal
CN108871310A (en) * 2017-05-12 2018-11-23 中华映管股份有限公司 Thermal image positioning system and localization method
CN109683135A (en) * 2018-12-28 2019-04-26 科大讯飞股份有限公司 A kind of sound localization method and device, target capturing system
CN110225256B (en) * 2019-06-28 2021-02-09 Oppo广东移动通信有限公司 Device imaging method and device, storage medium and electronic device
CN110213492B (en) * 2019-06-28 2021-03-02 Oppo广东移动通信有限公司 Device imaging method and device, storage medium and electronic device

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8913761B2 (en) * 2009-10-30 2014-12-16 Samsung Electronics Co., Ltd. Sound source recording apparatus and method adaptable to operating environment
CN103888703A (en) * 2014-03-28 2014-06-25 深圳市中兴移动通信有限公司 Shooting method and camera shooting device with recording enhanced
CN107004426A (en) * 2014-11-28 2017-08-01 华为技术有限公司 The method and mobile terminal of the sound of admission video recording object
CN105578097A (en) * 2015-07-10 2016-05-11 宇龙计算机通信科技(深圳)有限公司 Video recording method and terminal
CN106653041A (en) * 2017-01-17 2017-05-10 北京地平线信息技术有限公司 Audio signal processing equipment and method as well as electronic equipment
CN107993671A (en) * 2017-12-04 2018-05-04 南京地平线机器人技术有限公司 Sound processing method, device and electronic equipment
CN109506568A (en) * 2018-12-29 2019-03-22 苏州思必驰信息科技有限公司 A kind of sound localization method and device based on image recognition and speech recognition
CN110740259A (en) * 2019-10-21 2020-01-31 维沃移动通信有限公司 Video processing method and electronic equipment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114245156A (en) * 2021-11-30 2022-03-25 广州繁星互娱信息科技有限公司 Live broadcast audio adjusting method and device, storage medium and electronic equipment
CN115174959A (en) * 2022-06-21 2022-10-11 咪咕文化科技有限公司 Video 3D sound effect setting method and device
CN115174959B (en) * 2022-06-21 2024-01-30 咪咕文化科技有限公司 Video 3D sound effect setting method and device
CN116866720A (en) * 2023-09-04 2023-10-10 国网山东省电力公司东营供电公司 Camera angle self-adaptive regulation and control method, system and terminal based on sound source localization
CN116866720B (en) * 2023-09-04 2023-11-28 国网山东省电力公司东营供电公司 Camera angle self-adaptive regulation and control method, system and terminal based on sound source localization

Also Published As

Publication number Publication date
CN110740259B (en) 2021-06-25
CN110740259A (en) 2020-01-31

Similar Documents

Publication Publication Date Title
WO2021078116A1 (en) Video processing method and electronic device
WO2021036536A1 (en) Video photographing method and electronic device
WO2021136134A1 (en) Video processing method, electronic device, and computer-readable storage medium
WO2021104197A1 (en) Object tracking method and electronic device
WO2021104236A1 (en) Method for sharing photographing parameter, and electronic apparatus
CN108989672B (en) Shooting method and mobile terminal
WO2020108261A1 (en) Photographing method and terminal
CN111010610B (en) Video screenshot method and electronic equipment
WO2021104227A1 (en) Photographing method and electronic device
WO2021190428A1 (en) Image capturing method and electronic device
WO2019196929A1 (en) Video data processing method and mobile terminal
WO2020238831A1 (en) Photographing method and terminal
WO2020156123A1 (en) Information processing method and terminal device
WO2021036623A1 (en) Display method and electronic device
CN109819167B (en) Image processing method and device and mobile terminal
WO2020011080A1 (en) Display control method and terminal device
WO2021190387A1 (en) Detection result output method, electronic device, and medium
WO2021036659A1 (en) Video recording method and electronic apparatus
WO2021104159A1 (en) Display control method and electronic device
WO2022252823A1 (en) Method and apparatus for generating live video
WO2021129818A1 (en) Video playback method and electronic device
CN108881721B (en) Display method and terminal
WO2021190390A1 (en) Focusing method, electronic device, storage medium and program product
WO2021129771A1 (en) Application sharing method, first electronic device, and computer-readable storage medium
WO2021104226A1 (en) Photographing method and electronic device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20878310

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20878310

Country of ref document: EP

Kind code of ref document: A1