CN117425930A - Data processing method and device, storage medium and vehicle

Data processing method and device, storage medium and vehicle

Info

Publication number
CN117425930A
Authority
CN
China
Prior art keywords
information
user
image
noise reduction
determining
Prior art date
Legal status
Pending
Application number
CN202280005449.XA
Other languages
Chinese (zh)
Inventor
袁麓
胡溪玮
刘杨
李腾
仲旭
邱小军
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN117425930A

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K - SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to a data processing method, a data processing device, a storage medium, and a vehicle. The method comprises the following steps: acquiring first image information, wherein the first image information comprises position information of a user; and determining ear position information of the user according to the first image information, wherein the ear position information is used for determining sound waves for noise reduction. According to embodiments of the present application, image information is acquired to determine the user's ear position information, and noise reduction is performed using that information, so the noise reduction process is more targeted and the noise reduction effect is better. Meanwhile, by determining the user's ear position information, the ear position can be determined dynamically so that noise is reduced for the user.

Description

Data processing method and device, storage medium and vehicle
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a data processing method, a data processing device, a storage medium, and a vehicle.
Background
Noise reduction can lower the noise energy that people perceive and reduce the noise interference they experience. Current noise reduction methods generally fall into two categories: passive and active. Passive noise reduction attenuates vehicle noise by physical means, while active noise reduction typically uses active noise cancellation (ANC) technology, in which a speaker generates an audio signal that suppresses the noise signal; when the two signals meet and superpose, they cancel each other out, finally achieving the noise reduction goal.
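For illustration only, the following minimal Python sketch (not part of the disclosed method; the sample rate and tone are assumptions) shows the superposition principle that ANC relies on: an ideal anti-phase signal cancels the noise exactly.

```python
import numpy as np

fs = 8000                                  # sample rate in Hz (assumed)
t = np.arange(0, 0.1, 1 / fs)              # 100 ms of samples
noise = 0.5 * np.sin(2 * np.pi * 200 * t)  # a 200 Hz tone standing in for noise

anti_noise = -noise                        # ideal anti-phase signal
residual = noise + anti_noise              # superposition at the listener

print(np.max(np.abs(residual)))            # prints 0.0: perfect cancellation
```

In practice the anti-noise must be produced by a real speaker through an acoustic path, which is why adaptive filtering, rather than simple negation, is used.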
In active noise reduction, the noise reduction effect is generally best at the microphone (which may also be referred to as an error microphone) and becomes worse as the distance from the microphone increases. Schemes for improving the active noise reduction effect are therefore worth studying.
Disclosure of Invention
In view of the above, a data processing method, a data processing apparatus, a storage medium, and a vehicle are provided to improve the noise reduction effect and the user experience.
In a first aspect, embodiments of the present application provide a data processing method. The method comprises the following steps: acquiring first image information, wherein the first image information comprises position information of a user; and determining the position information of the user according to the first image information, wherein the position information of the user is used for determining sound waves for noise reduction.
According to embodiments of the present application, image information is acquired to determine the position information of the user, and noise reduction is performed using that position information, so the noise reduction effect is better and the user's position can be determined dynamically to reduce noise for the user.
In a first possible implementation manner of the data processing method according to the first aspect, the position information of the user comprises ear position information of the user.
According to embodiments of the present application, the user's position information includes the user's ear position information, so the ear position information can be determined by acquiring image information and noise reduction can be performed using it, making the noise reduction process more targeted and the noise reduction effect better. Meanwhile, by tracking the user's ear position information, the ear position can be determined dynamically to reduce noise for the user.
In a second possible implementation manner of the data processing method according to the first aspect as such or according to the first possible implementation manner of the first aspect, the position information of the user includes head position information of the user, the ear position information includes earhole position information, and determining the ear position information of the user from the first image information includes: determining the earhole position information of the user according to the first image information.
According to the embodiment of the application, more targeted and more accurate noise reduction can be realized by determining the position information of the earhole of the user.
It should be understood that, in addition to the ear hole position information, the ear position information in the embodiment of the present application may also include information of other positions of the left ear and/or the right ear of the user, and may also include information of positions of areas near the left ear and/or the right ear of the user.
In a third possible implementation manner of the data processing method according to the first aspect or the first or the second possible implementation manner of the first aspect, the sound wave for noise reduction includes a sound wave for noise reduction of a region corresponding to a position of the user.
According to the embodiment of the application, the noise reduction effect can be more targeted by reducing the noise of the area corresponding to the user position.
In a fourth possible implementation manner of the data processing method according to the third possible implementation manner of the first aspect, the region corresponding to the position of the user includes one or more of the following: a region corresponding to the position of the user's head, a region corresponding to the position of the user's ear, and a region corresponding to the position of the user's earhole.
According to the embodiment of the application, the area near the corresponding position is included in the area, so that the noise reduction area can be adjusted according to the need, the noise reduction process is more flexible, and the noise reduction effect is better.
In a fifth possible implementation form of the data processing method according to the first aspect as such or according to the first or second or third or fourth possible implementation form of the first aspect, the first image information comprises two-dimensional information for indicating a shape of a face of the user.
According to the embodiment of the application, the image information comprises the two-dimensional information indicating the shape of the face, so that more accurate noise reduction can be realized according to personalized user characteristics.
In a sixth possible implementation manner of the data processing method according to the first aspect or the first or the second or the third or the fourth or the fifth possible implementation manner of the first aspect, acquiring the first image information includes: acquiring first image information acquired by an image sensor, the image sensor including one or more of the following: a camera, a depth sensor, a lidar.
The camera can be a color camera, a black-and-white camera, an infrared camera and the like.
It should be understood that the image sensor in the embodiments of the present application may also be implemented by other sensors, such as millimeter wave radar, ultrasonic radar, and the like.
According to the embodiment of the application, the mode of acquiring the first image information is diversified, so that hardware can be flexibly deployed according to the needs, and the cost is reduced.
In a seventh possible implementation manner of the data processing method according to the fifth or sixth possible implementation manner of the first aspect, the method further includes: determining head model information, the head model information including three-dimensional information indicating a shape of a face in the head model; and determining the ear position information of the user based on the first image information includes: determining the ear position information of the user according to the internal parameters of the image sensor, the two-dimensional information, and the three-dimensional information.
According to embodiments of the present application, the ear position information of the user is determined from the head model information by using the three-dimensional information it contains, so the ear position can be determined accurately and dynamically as the user's head moves and rotates, noise is reduced for the user, hardware cost is lowered, and the ear position changes caused by head translation and rotation in real scenarios can be estimated.
In an eighth possible implementation form of the data processing method according to the fifth or sixth or seventh possible implementation form of the first aspect, the two-dimensional information comprises at least three sets of two-dimensional points, the three-dimensional information comprising at least three sets of three-dimensional points corresponding to the at least three sets of two-dimensional points.
The two-dimensional points may be any three or more groups of two-dimensional points in the first image information, and the three-dimensional points may be three or more groups of three-dimensional points corresponding to the three or more groups of two-dimensional points in the head model information.
According to embodiments of the present application, by acquiring at least three sets of two-dimensional points and the corresponding at least three sets of three-dimensional points, the user's head pose can be solved for (a perspective-n-point solution generally requires at least three point correspondences), enabling noise reduction and providing a better noise reduction experience for users.
In a ninth possible implementation form of the data processing method according to the seventh or eighth possible implementation form of the first aspect, the head model information comprises preset model information.
According to the embodiment of the application, the noise reduction cost can be saved by using the preset model information.
In a tenth possible implementation form of the data processing method according to the seventh or eighth or ninth possible implementation form of the first aspect, determining the head model information comprises: acquiring point cloud data information acquired by a depth sensor; and determining the head model information according to the first image information and the point cloud data information.
According to the embodiment of the application, the head model information is determined by fusing the point cloud data information and the first image information, so that the determined head model information is more close to the actual head information of the user, and therefore the ear position of the user can be more accurately positioned, and better noise reduction effect is achieved.
In an eleventh possible implementation form of the data processing method according to the seventh or eighth or ninth or tenth possible implementation form of the first aspect, determining the head model information comprises: acquiring second image information acquired by a camera, wherein the second image information comprises information of the position of a lateral head of a user; and determining the model information of the head according to the second image information, the preset model information and the internal parameters of the camera.
According to embodiments of the present application, the user cooperates with the acquisition of the second image information, so the determined head model information is closer to the user's actual head, the ear position can be located more accurately, and a better noise reduction effect is achieved.
In a twelfth possible implementation form of the data processing method according to the eleventh possible implementation form of the first aspect, the information of the lateral head position of the user comprises ear position information of the user.
According to the embodiment of the application, the ear position information of the user is obtained, so that the finally determined ear position is more accurate, and the noise reduction effect is better.
In a thirteenth possible implementation manner of the data processing method according to the seventh or eighth or ninth or tenth or eleventh or twelfth possible implementation manner of the first aspect, the image sensor includes a first image sensor and a second image sensor, and determining the human head model information includes: acquiring third image information acquired by a first image sensor and fourth image information acquired by a second image sensor; and determining the model information of the head according to the third image information, the fourth image information, the preset model information, the internal parameters of the first image sensor, the internal parameters of the second image sensor and the external parameters between the first image sensor and the second image sensor.
According to the embodiment of the application, the head model information of the user can be more comprehensively and accurately determined by collecting the third image information and the fourth image information, so that the determined head model information is closer to the actual head information of the user, and therefore the ear position of the user can be more accurately positioned, and better noise reduction effect is achieved.
In a fourteenth possible implementation manner of the data processing method according to the first aspect or the first or second or third or fourth or fifth or sixth or seventh or eighth or ninth or tenth or eleventh or twelfth or thirteenth possible implementation manner of the first aspect, the method further includes: displaying the area where noise reduction is performed.
According to the embodiment of the application, the current noise reduction condition can be known in real time by displaying the noise reduction region.
In a fifteenth possible implementation manner of the data processing method according to the fourteenth possible implementation manner of the first aspect, displaying the area where noise reduction is performed includes: displaying the region with noise reduction according to the first image information; and/or displaying the area for noise reduction with preset information.
According to the embodiment of the application, the area for noise reduction is displayed according to the first image information and/or with the preset information, so that a user can more accurately know the current noise reduction condition, and the display mode is more vivid.
In a second aspect, embodiments of the present application provide a data processing apparatus. The apparatus comprises: an acquisition module configured to acquire first image information, wherein the first image information comprises position information of a user; and a first determining module configured to determine the position information of the user according to the first image information, wherein the position information of the user is used for determining sound waves for noise reduction.
In a first possible implementation form of the data processing apparatus according to the second aspect, the position information of the user comprises ear position information of the user.
In a second possible implementation manner of the data processing apparatus according to the second aspect or the first possible implementation manner of the second aspect, the position information of the user includes head position information of the user, the ear position information includes ear hole position information, and the first determining module is configured to: and determining the position information of the earhole of the user according to the first image information.
In a third possible implementation manner of the data processing apparatus according to the second aspect or the first or second possible implementation manner of the second aspect, the sound wave for noise reduction includes a sound wave for noise reduction of a region corresponding to a position of the user.
In a fourth possible implementation manner of the data processing apparatus according to the third possible implementation manner of the second aspect, the region corresponding to the position of the user includes one or more of the following: a region corresponding to the position of the user's head, a region corresponding to the position of the user's ear, and a region corresponding to the position of the user's earhole.
In a fifth possible implementation form of the data processing apparatus according to the second aspect as such or according to the first or second or third or fourth possible implementation form of the second aspect, the first image information comprises two-dimensional information for indicating a shape of a face of the user.
In a sixth possible implementation manner of the data processing apparatus according to the second aspect or the first or the second or the third or the fourth or the fifth possible implementation manner of the second aspect, the obtaining module is configured to: acquiring first image information acquired by an image sensor, the image sensor including one or more of: camera, depth sensor, laser radar.
In a seventh possible implementation manner of the data processing apparatus according to the fifth or sixth possible implementation manner of the second aspect, the apparatus further includes: a second determination module for determining head model information including three-dimensional information indicating a shape of a face in the head model; the first determining module is used for: and determining ear position information of the user according to the internal parameters, the two-dimensional information and the three-dimensional information of the image sensor.
In an eighth possible implementation form of the data processing apparatus according to the fifth or sixth or seventh possible implementation form of the second aspect, the two-dimensional information comprises at least three sets of two-dimensional points, the three-dimensional information comprising at least three sets of three-dimensional points corresponding to the at least three sets of two-dimensional points.
In a ninth possible implementation manner of the data processing apparatus according to the seventh or eighth possible implementation manner of the second aspect, the head model information includes preset model information.
In a tenth possible implementation form of the data processing apparatus according to the seventh or eighth or ninth possible implementation form of the second aspect, the second determining module is configured to: acquiring point cloud data information acquired by a depth sensor; and determining the head model information according to the first image information and the point cloud data information.
In an eleventh possible implementation form of the data processing apparatus according to the seventh or eighth or ninth or tenth possible implementation form of the second aspect, the second determining module is configured to: acquiring second image information acquired by a camera, wherein the second image information comprises information of the position of a lateral head of a user; and determining the model information of the head according to the second image information, the preset model information and the internal parameters of the camera.
In a twelfth possible implementation form of the data processing apparatus according to the eleventh possible implementation form of the second aspect, the information of the lateral head position of the user comprises ear position information of the user.
In a thirteenth possible implementation manner of the data processing apparatus according to the seventh or eighth or ninth or tenth or eleventh or twelfth possible implementation manner of the second aspect, the image sensor includes a first image sensor and a second image sensor, and the second determining module is configured to: acquiring third image information acquired by a first image sensor and fourth image information acquired by a second image sensor; and determining the model information of the head according to the third image information, the fourth image information, the preset model information, the internal parameters of the first image sensor, the internal parameters of the second image sensor and the external parameters between the first image sensor and the second image sensor.
In a fourteenth possible implementation manner of the data processing apparatus according to the second aspect or the first or the second or the third or the fourth or the fifth or the sixth or the seventh or the eighth or the ninth or the tenth or the eleventh or the twelfth or the thirteenth possible implementation manner of the second aspect, the apparatus further includes: and the display module is used for displaying the noise reduction area.
In a fifteenth possible implementation manner of the data processing apparatus according to the fourteenth possible implementation manner of the second aspect, the display module is configured to: displaying the region with noise reduction according to the first image information; and/or displaying the area for noise reduction with preset information.
In a third aspect, embodiments of the present application provide a data processing apparatus, including: a processor and a memory; the memory is used for storing programs; the processor is configured to execute a program stored in the memory to cause the apparatus to implement the data processing method of the first aspect or one or more of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a terminal device, which may perform the data processing method of the first aspect or one or several of the multiple possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon program instructions, characterized in that the program instructions when executed by a computer cause the computer to implement the data processing method of the first aspect or one or more of the possible implementations of the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product comprising program instructions which, when executed by a computer, cause the computer to implement the data processing method of the first aspect or one or more of the possible implementations of the first aspect.
In a seventh aspect, embodiments of the present application provide a vehicle comprising a processor for performing the data processing method of the first aspect or one or more of the plurality of possible implementations of the first aspect.
These and other aspects of the application will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the present application and together with the description, serve to explain the principles of the present application.
Fig. 1 (a) shows a schematic diagram of an application scenario according to an embodiment of the present application.
Fig. 1 (b) shows a schematic diagram of an application scenario according to an embodiment of the present application.
Fig. 2 shows a flow chart of a data processing method according to an embodiment of the present application.
Fig. 3 (a) shows a schematic diagram of determining two-dimensional keypoints according to an embodiment of the application.
Fig. 3 (b) shows a schematic diagram of determining two-dimensional keypoints according to an embodiment of the application.
Fig. 3 (c) shows a schematic diagram of determining two-dimensional keypoints according to an embodiment of the application.
Fig. 4 shows a flow chart of a data processing method according to an embodiment of the present application.
Fig. 5 (a) shows a schematic diagram of determining ear position information according to an embodiment of the present application.
Fig. 5 (b) shows a schematic diagram of determining ear position information according to an embodiment of the present application.
Fig. 5 (c) shows a schematic diagram of determining ear position information according to an embodiment of the present application.
Fig. 6 shows a schematic representation of the position of an earhole in a model of a human head according to an embodiment of the present application.
Fig. 7 shows a block diagram of a data processing apparatus according to an embodiment of the present application.
Fig. 8 shows a block diagram of an electronic device according to an embodiment of the present application.
Fig. 9 shows a block diagram of an electronic device according to an embodiment of the present application.
Fig. 10 shows a block diagram of an electronic device according to an embodiment of the present application.
Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Various exemplary embodiments, features and aspects of the present application will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits have not been described in detail as not to unnecessarily obscure the present application.
In the active noise reduction method, the noise reduction effect is generally best at the microphone (which may also be referred to as an error microphone) and becomes worse as the distance from the microphone increases.
In order to solve the technical problems described above, the present application provides a data processing method, which can determine ear position information of a user according to image information, so that noise reduction can be performed according to the ear position information.
Fig. 1 (a) and 1 (b) show schematic diagrams of application scenarios according to an embodiment of the present application. As shown in fig. 1 (a) and fig. 1 (b), in one possible application scenario, the data processing method of the embodiments of the present application may be used to reduce noise heard by a driver in a vehicle. The data processing system of the embodiments of the present application may be provided on a vehicle, including speakers, sensors, and a processor.
A speaker (see fig. 1 (b)) may be used to emit sound waves for noise reduction that cancel or partially cancel noise near the driver's ear position, so that the noise heard by the driver in the vehicle is reduced. There may be one or more speakers. A left speaker may be used to emit sound waves that reduce noise near the user's left ear position, and a right speaker may be used to emit sound waves that reduce noise near the user's right ear position.
The sensor may include an image sensor and a microphone, among others.
There may be one or more image sensors (see fig. 1 (a)), which may include a camera, a depth sensor, a lidar, and the like. The camera may be an infrared camera, a color camera, a black-and-white camera, and the like. The image sensor may be used to capture information about the position of the user's earhole; for example, an image including the user's head may be captured by a camera. In an in-vehicle scenario, the camera may also be a camera of a driver monitoring system (DMS) or of a cockpit monitoring system (CMS), which is not limited in this application.
A microphone (see fig. 1 (b)) may be provided near the ear position of a driver in the vehicle to collect a residual signal, and may include a condenser microphone, a moving-coil microphone, a laser microphone, and the like. The residual signal may be used to indicate the residual noise that the driver hears after the sound waves emitted by the speaker cancel out the noise near the ear. For example, if there are a plurality of drivers in the vehicle, a plurality of microphones may be provided for them to collect the corresponding residual signals. A plurality of microphones may also be provided for each driver; for example, the microphone on the left side of driver A may collect the residual noise heard by driver A's left ear, and the microphone on the right side of driver A may collect the residual noise heard by driver A's right ear.
The processor may be built into the vehicle (or the vehicle's audio system) as an onboard computing unit, such as a digital signal processor (DSP) chip. The processor may determine the ear position information of a user (i.e., a driver in the vehicle) based on the image information collected by the image sensor, and determine the sound waves for noise reduction using this information together with the residual signal collected by the microphone. Alternatively, the processor may be located externally, for example in a cloud server. The server and the vehicle may communicate over a wireless connection, for example via mobile communication technologies such as 2G, 3G, 4G, or 5G, or via wireless communication modes such as Wi-Fi, Bluetooth, frequency modulation (FM), data radio stations, and satellite communication. Through this communication, the server can collect the information acquired by the sensors, perform the computation, and transmit the results back to the corresponding vehicle.
Optionally, a display device (see fig. 1 (a)) may be further included in the data processing system, where the display device may include a display screen, a projection, etc. for displaying the noise reduction region.
It should be noted that, the data processing method in the embodiment of the present application may be used in other scenes that need noise reduction besides the vehicle-mounted scene shown in fig. 1 (a) and 1 (b), which is not limited in this application.
With reference to figs. 2-6, the data processing method according to embodiments of the present application is described in detail below on the basis of the data processing system described above:
fig. 2 shows a flow chart of a data processing method according to an embodiment of the present application. The method may be used in the data processing system described above. As shown in fig. 2, the method may include:
step S201, first image information is acquired.
The first image information may include one frame of image acquired by the image sensor, or multiple frames, where the multiple frames may be consecutive or non-consecutive. The first image information may be an image acquired by the image sensor, or information obtained from such an image. The image sensor includes one or more of the following: a camera, a depth sensor, a lidar. The camera may include an infrared camera and a color camera.
The first image information includes position information of a user, and the position information of the user can be determined according to one or more frames of images acquired by the image sensor. Wherein the user's location information may include the user's head location information.
The head position information may include the head position in a camera coordinate system, which may be a coordinate system whose origin is the optical center of an image sensor (e.g., a camera). For example, the first image information may include a two-dimensional image acquired by a single camera; the two-dimensional image may include part or all of the user's head, and the head in the image may also be a lateral (side) view, for example when the head is in a laterally turned posture. A head may be detected in the image using a head detector algorithm to determine the head position information.
Wherein the first image information may include two-dimensional information indicating a shape of a face of the user.
The two-dimensional information may include two-dimensional keypoints of the image of the user's head and may be used to indicate the positions of key areas of the user's face, such as the eyebrows, eyes, nose, mouth, and face contour. Fig. 3 (a), 3 (b), and 3 (c) show schematic diagrams of determining two-dimensional keypoints according to an embodiment of the application. For example, the first image information may be as shown in fig. 3 (a), and the detected image corresponding to the head (i.e., the head image) as shown in fig. 3 (b). From the head image, the two-dimensional keypoints (each point indicated in fig. 3 (c)) may be determined using a face keypoint detection method. The face keypoint detection method may be based on cascaded pose regression (CPR), on an active appearance model (AAM), on a constrained local model (CLM), or on deep learning, among others, which is not limited in this application. The two-dimensional keypoints may be keypoints in the camera coordinate system.
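For illustration, a hedged sketch of two-dimensional face keypoint extraction follows; MediaPipe Face Mesh is used purely as an example detector, since the application does not prescribe any particular library, and the input file name is a placeholder.

```python
import cv2
import mediapipe as mp

# "head_image.jpg" is a placeholder for a frame from the image sensor.
image = cv2.imread("head_image.jpg")
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as face_mesh:
    results = face_mesh.process(rgb)

if results.multi_face_landmarks:
    h, w = image.shape[:2]
    # Landmarks are normalized; convert them to pixel coordinates so they
    # can serve as the two-dimensional keypoints in the camera image.
    keypoints_2d = [(lm.x * w, lm.y * h)
                    for lm in results.multi_face_landmarks[0].landmark]
```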
Step S202, determining ear position information of a user according to the first image information.
Wherein the ear position information is used to determine sound waves for noise reduction. The ear position information of the user may be ear position information of the left ear and/or the right ear of the user in the world coordinate system, and the ear position information of the user may include, for example, coordinates of any position of the left ear and/or the right ear of the user, or coordinates of a position of an area near the left ear and/or the right ear of the user, which is not limited in this application.
The ear position information of the user (which may be the ear position information in the world coordinate system) may be determined from the position information of the user included in the first image information, and noise reduction may then be performed. For example, after the user's ear position information is determined, a transfer function from the ear position to the microphone may be determined (for example, obtained from a pre-established transfer function library containing transfer functions between the microphone and different positions in the vehicle). Then, with an active noise reduction method, a control signal for noise reduction is determined by a filter according to the transfer function, the residual signal collected by the microphone, and the noise signal (which may be referred to as a reference signal), and the speaker emits a corresponding anti-phase sound wave according to the control signal to cancel the noise and achieve the noise reduction effect, as sketched below.
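For illustration, the following sketch shows one common way such a control signal could be computed: a single-channel filtered-x LMS (FxLMS) update. The filter length, step size, and secondary-path estimate are assumptions, and the application itself does not prescribe this specific algorithm.

```python
import numpy as np

class FxLMS:
    """Single-channel FxLMS sketch: w is the control filter, s_hat an
    assumed FIR estimate of the speaker-to-error-mic (secondary) path."""
    def __init__(self, length=64, mu=1e-3, s_hat=(0.9, 0.1)):
        self.w = np.zeros(length)        # control filter coefficients
        self.mu = mu                     # step size (assumed)
        self.s_hat = np.asarray(s_hat)   # secondary-path estimate (assumed)
        self.x = np.zeros(length)        # reference-signal buffer
        self.fx = np.zeros(length)       # filtered-reference buffer

    def step(self, x_n, e_n):
        """x_n: reference noise sample; e_n: residual picked up by the
        error microphone. Returns the anti-noise sample for the speaker."""
        self.x = np.roll(self.x, 1); self.x[0] = x_n
        y_n = self.w @ self.x                         # anti-noise output
        fx_n = self.s_hat @ self.x[:len(self.s_hat)]  # x through s_hat
        self.fx = np.roll(self.fx, 1); self.fx[0] = fx_n
        self.w -= self.mu * e_n * self.fx             # LMS coefficient update
        return y_n
```

Called once per audio sample, `step` adapts the control filter so that the residual at the error microphone shrinks over time.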
According to embodiments of the present application, image information is acquired to determine the user's ear position information, and noise reduction is performed using it, making the noise reduction process more targeted and the noise reduction effect better. Meanwhile, by tracking the user's ear position information, the ear position can be determined dynamically to reduce noise for the user.
In order to determine the noise reduction region more accurately, the ear position information may include earhole position information, and step S202 includes: determining the earhole position information of the user according to the first image information.
The earhole position information of the user may include, for example, coordinates in the world coordinate system of the apex of the user's left ear and/or right ear, or of points such as the center point or the lowest point of the left ear and/or right ear.
Therefore, more specific and more accurate noise reduction can be realized.
The sound wave for noise reduction may include a sound wave for noise reduction of an area corresponding to the position of the user.
According to the embodiment of the application, the noise reduction effect can be more targeted by reducing the noise of the area corresponding to the user position.
The region corresponding to the location of the user includes one or more of the following: a region corresponding to the position of the user's head, a region corresponding to the position of the user's ear, and a region corresponding to the position of the user's earhole. The region corresponding to the position of the user may also include regions other than the above, which is not limited in this application.
The region may include the area near the corresponding position; for example, the region corresponding to the position of the user's head may include the area where the head is located and the area near it, and the region corresponding to the position of the user's ear may include the area where the ear is located and the area near it.
According to the embodiment of the application, the area near the corresponding position is included in the area, so that the noise reduction area can be adjusted according to the need, the noise reduction process is more flexible, and the noise reduction effect is better.
Hereinafter, a method of determining ear position information from first image information according to the present application will be described in detail with reference to fig. 4. Referring to fig. 4, a flow chart of a data processing method according to an embodiment of the present application is shown. As shown in fig. 4, the method further includes:
step S401, determining head model information.
The head model information includes three-dimensional information indicating the shape of the face in the head model. The head model may be a three-dimensional model, and the three-dimensional information may include three-dimensional keypoints on the head model in a head coordinate system. The head coordinate system may take the midpoint of the line connecting the two ears as its origin, or another position on or off the user's head. The three-dimensional keypoints correspond to the two-dimensional keypoints and indicate three-dimensional positions in the head coordinate system.
Referring to fig. 5 (a), 5 (b), and 5 (c), schematic diagrams of determining ear position information according to an embodiment of the present application are shown. The three-dimensional keypoints correspond to the two-dimensional keypoints: once a two-dimensional keypoint is determined (fig. 5 (a)), the corresponding three-dimensional keypoint (fig. 5 (c)) may be found in the determined head model (fig. 5 (b)). As shown in figs. 5 (a)-(c), the ear position information of the user in the world coordinate system may be determined using the two-dimensional keypoints in the head image and the three-dimensional keypoints in the head model.
Four methods for determining the head model information are described in detail below:
if the user cannot cooperate with the system to collect the information of the head model, a preset head model can be used to determine the information of the head model, see below.
For example, the head model information may include preset model information.
The model may be composed of a plurality of vertices and a plurality of triangular patches, and the number of the vertices and the triangular patches may be preset.
Thus, noise reduction costs can be saved.
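For illustration, the preset model's data layout might look as follows; the vertex and triangle counts are purely assumed, not values from the application.

```python
import numpy as np

# A triangle mesh as commonly used for preset head models: a fixed set
# of 3D vertices plus faces given as triples of vertex indices.
vertices = np.zeros((3448, 3))            # assumed vertex count, (x, y, z)
faces = np.zeros((6736, 3), dtype=int)    # triangles as vertex-index triples
```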
If a depth sensor is further deployed in the data processing system, the data information of the acquired point cloud can be fused with the first image information, so that personalized head model information can be determined more accurately, see below.
For example, this step S401 includes:
acquiring point cloud data information acquired by a depth sensor;
the point cloud data information may include, among other things, a two-dimensional point cloud image (which may be low resolution) acquired by a depth sensor.
And determining the head model information according to the first image information and the point cloud data information.
The first image information may include a two-dimensional image acquired by a camera, for example, a high-resolution gray-scale image acquired by an infrared camera.
The two-dimensional point cloud image and the grayscale image may be fused to obtain a high-resolution point cloud image, i.e., each pixel of the grayscale image gains corresponding depth information from the point cloud image; the fusion may use image alignment. Based on the high-resolution point cloud image, the user's head may be detected using a head detection algorithm, and face reconstruction may be performed on the head region of the point cloud image using a regression network to obtain the parameters of the head model (which may include parameters indicating the shape and size of the head). The head model may be a 3D morphable model (3DMM), so the head model information can be determined.
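For illustration, a hedged sketch of the alignment-based fusion step follows; the function name, intrinsic matrices, and extrinsics are assumptions, not values from the application.

```python
import numpy as np

def align_depth_to_gray(depth, K_d, K_g, R, T, gray_shape):
    """Give each grayscale pixel a depth value: back-project depth pixels
    to 3D (intrinsics K_d), move them into the grayscale camera's frame
    (extrinsics R, T), and re-project with the grayscale intrinsics K_g."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    valid = z > 0
    pix = np.stack([us.ravel(), vs.ravel(), np.ones(h * w)])[:, valid]
    pts_d = np.linalg.inv(K_d) @ pix * z[valid]     # 3D points, depth frame
    pts_g = R @ pts_d + T.reshape(3, 1)             # 3D points, gray frame
    proj = K_g @ pts_g
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    aligned = np.zeros(gray_shape)
    ok = (pts_g[2] > 0) & (u >= 0) & (u < gray_shape[1]) \
         & (v >= 0) & (v < gray_shape[0])
    aligned[v[ok], u[ok]] = pts_g[2, ok]            # depth per gray pixel
    return aligned
```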
According to the embodiment of the application, the head model information is determined by fusing the point cloud data information and the first image information, so that the determined head model information is more close to the actual head information of the user, and therefore the ear position of the user can be more accurately positioned, and better noise reduction effect is achieved.
If only a camera is deployed in the data processing system, the user can acquire the second image information in cooperation to more accurately determine personalized human head model information, see below.
For example, this step S401 includes:
acquiring second image information acquired by a camera;
The second image information includes information of the lateral head position of the user, and may include an image captured by a single camera, which may be one or more frames. The user may be prompted, via the in-vehicle display device or the speaker, to turn the head cooperatively so that an image including the lateral view of the user's head is obtained; the lateral view may correspond to a head rotation of about 45-60 degrees.
To allow more accurate positioning of the user's ear, the information of the user's lateral head position may include the user's ear position information; that is, the side view of the head may include all or part of the user's ears.
And determining the model information of the head according to the second image information, the preset model information and the internal parameters of the camera.
From the image included in the second image information, the side view of the head may be detected using a head detection algorithm, and two-dimensional keypoints indicating the face shape may be determined in the corresponding region using a face keypoint detection method. The preset model information may be, for example, the preset head model information described above, and the three-dimensional keypoints in the head model may be determined accordingly. The head pose information of the user is then determined from the two-dimensional keypoints in the side-view image, the three-dimensional keypoints in the head model, and the internal parameters of the camera (i.e., the camera that acquired the second image information); the head pose information may include a rotation matrix R and a translation vector T indicating the user's head pose. The head pose may be determined with a perspective-n-point (PnP) algorithm, an efficient perspective-n-point (EPnP) algorithm, or the like: the three-dimensional keypoints in the head model are represented as linear combinations of four non-coplanar control points, and the coordinates of the four control points in the camera coordinate system are solved using the two-dimensional keypoints in the side-view image and the camera's internal parameters, yielding the coordinates of all three-dimensional keypoints in the camera coordinate system and hence the head pose. The head pose information may be used to indicate the transformation between the head coordinate system and the camera coordinate system. Methods other than these algorithms may also be used, which is not limited in this application.
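For illustration, a hedged sketch of the pose-solving step using OpenCV's EPnP solver follows; the keypoint coordinates and camera intrinsics are placeholder values chosen to be roughly consistent with an assumed pose.

```python
import cv2
import numpy as np

# Placeholder 3D keypoints in the head coordinate system (metres) and
# their 2D pixel positions, roughly consistent with an assumed pose
# (R = identity, T = (0, 0, 0.8) m) and the intrinsics K below.
pts_3d = np.array([[0.00, 0.00, 0.00], [0.03, 0.02, 0.01],
                   [-0.03, 0.02, 0.01], [0.00, -0.04, 0.02],
                   [0.05, 0.05, 0.05]], dtype=np.float64)
pts_2d = np.array([[320.0, 240.0], [349.6, 259.8], [290.4, 259.8],
                   [320.0, 201.0], [367.1, 287.1]], dtype=np.float64)
K = np.array([[800.0, 0.0, 320.0],   # assumed camera intrinsics
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(pts_3d, pts_2d, K, None,
                              flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)  # rotation matrix R; tvec is the translation T
```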
After the user's head pose information is determined, the three-dimensional coordinate point of the earhole (or ear) among the three-dimensional keypoints of the head model may be identified. Using the camera's internal parameters and the head pose information, this three-dimensional point is projected to a two-dimensional point in the side-view image and compared with the two-dimensional point of the corresponding earhole (or ear) detected there. The parameters of the head model (including shape, size, and the like; the model may be a 3DMM) are then optimized by loss optimization (e.g., a gradient descent algorithm) to obtain the optimized model parameters and determine the head model information.
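For illustration, the reprojection comparison could look like the following sketch; the pose, intrinsics, and ear coordinates are placeholders, and in practice the resulting loss would drive the shape-parameter update.

```python
import cv2
import numpy as np

rvec = np.zeros(3)                     # assumed head pose: no rotation
tvec = np.array([0.0, 0.0, 0.8])       # assumed head pose: 0.8 m away
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
ear_3d = np.array([[[0.07, 0.0, 0.0]]])        # model earhole, head frame
ear_2d_detected = np.array([392.0, 241.0])     # keypoint found in the image

proj, _ = cv2.projectPoints(ear_3d, rvec, tvec, K, None)
loss = np.sum((proj.reshape(2) - ear_2d_detected) ** 2)  # reprojection loss
```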
According to the embodiment of the application, the user is matched with the acquisition of the second image information, so that the determined human head model information is closer to the actual human head information of the user, and therefore the ear position of the user can be positioned more accurately, and better noise reduction effect is achieved.
In a vehicle-mounted scene, if cameras respectively used for a driver monitoring system DMS and a cabin monitoring system CMS are deployed in a vehicle, personalized head model information can be more accurately determined through image information respectively acquired by the two cameras, and the method is described below.
The image sensor may include a first image sensor and a second image sensor, and step S401 includes:
acquiring third image information acquired by a first image sensor and fourth image information acquired by a second image sensor;
wherein the first image sensor may be a camera for the DMS, the second image sensor may be a camera for the CMS, the third image information may include a frame of two-dimensional image acquired by the camera for the DMS, and the fourth image information may include a frame of two-dimensional image acquired by the camera for the CMS. It should be understood that the first image sensor and the second image sensor in the embodiments of the present application may also be implemented by other vehicle-mounted cameras, or by a depth sensor, a laser radar, or the like.
And determining the model information of the head according to the third image information, the fourth image information, the preset model information, the internal parameters of the first image sensor, the internal parameters of the second image sensor and the external parameters between the first image sensor and the second image sensor.
The external parameters between the first image sensor and the second image sensor may represent a geometric transformation relationship between a camera coordinate system corresponding to the first image sensor and a camera coordinate system corresponding to the second image sensor.
For the image in the third image information, a head may be detected using a head detection algorithm, and two-dimensional keypoints indicating the shape of the user's face (which may be called first two-dimensional keypoints) may be determined in the head region using a face keypoint detection method. Similarly, for the image in the fourth image information, a head may be detected and two-dimensional keypoints (which may be called second two-dimensional keypoints) determined. The preset model information may be, for example, the preset head model information described above, and the three-dimensional keypoints in the head model may be determined accordingly.
Based on the first two-dimensional keypoints, the three-dimensional keypoints in the head model, and the internal parameters of the first image sensor, the PnP algorithm described above may be used to determine the user's head pose information corresponding to the third image information, which may include a rotation matrix R and a translation vector T indicating the head pose. From the three-dimensional coordinate point of the earhole (or ear) in the head coordinate system of the head model and the head pose information, the three-dimensional coordinate point of the earhole (or ear) in the camera coordinate system of the first image sensor (which may be called the first three-dimensional coordinate point) is determined.
Similarly, based on the second two-dimensional keypoints, the three-dimensional keypoints in the head model, and the internal parameters of the second image sensor, the PnP algorithm may be used to determine the user's head pose information corresponding to the fourth image information. From the three-dimensional coordinate point of the earhole (or ear) in the head coordinate system and this head pose information, the three-dimensional coordinate point of the earhole (or ear) in the camera coordinate system of the second image sensor (which may be called the second three-dimensional coordinate point) is determined.
The first three-dimensional coordinate point may be converted into the camera coordinate system of the second image sensor according to the geometric transformation between the two camera coordinate systems and compared with the second three-dimensional coordinate point; for example, the mean square error between the two points may serve as the loss function, and the parameters of the head model (including shape, size, and the like; the model may be a 3DMM) are optimized by loss optimization (e.g., a gradient descent algorithm) to obtain the optimized model parameters and determine the head model information.
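For illustration, the cross-camera consistency loss could be computed as in the following sketch; the extrinsics and point coordinates are placeholders.

```python
import numpy as np

R_12 = np.eye(3)                    # assumed rotation, camera 1 to camera 2
T_12 = np.array([-0.5, 0.0, 0.0])   # assumed translation (metres)
p1 = np.array([0.07, 0.02, 0.85])   # first 3D ear point (camera-1 frame)
p2 = np.array([-0.42, 0.02, 0.86])  # second 3D ear point (camera-2 frame)

p1_in_2 = R_12 @ p1 + T_12          # map point into camera-2 frame
mse = np.mean((p1_in_2 - p2) ** 2)  # loss for refining the 3DMM parameters
```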
It should be noted that, alternatively, the second three-dimensional coordinate point may be converted into the camera coordinate system of the first image sensor and compared with the first three-dimensional coordinate point to optimize the parameters of the head model, which is not limited in this application.
It should be noted that the head model information may be determined according to methods other than the above four methods, which is not limited in this application.
According to the embodiment of the application, the head model information of the user can be more comprehensively and accurately determined by collecting the third image information and the fourth image information, so that the determined head model information is closer to the actual head information of the user, and therefore the ear position of the user can be more accurately positioned, and better noise reduction effect is achieved.
The above step S202 may further include:
In step S402, ear position information of the user is determined according to the internal parameters of the image sensor, the two-dimensional information, and the three-dimensional information.
In the case where the head model information is determined in step S401, the three-dimensional information may be determined in the head model information according to the two-dimensional key points in the two-dimensional information. Based on the two-dimensional key points in the two-dimensional information, the three-dimensional key points in the three-dimensional information, and the internal parameters of the image sensor (i.e., the image sensor that acquired the first image information), the above-described PNP algorithm may be used to determine the head pose information of the user corresponding to the first image information, which may include a rotation matrix R and a translation vector T indicating the head pose of the user.
Because the head pose information indicates the conversion relationship between the head coordinate system and the camera coordinate system, the ear position information under the head coordinate system (i.e., the three-dimensional coordinate point corresponding to the ear in the human head model) can be determined in the human head model information and then converted, using the head pose information, into ear position information under the camera coordinate system of the image sensor. The ear position information in the camera coordinate system can in turn be converted into ear position information in the world coordinate system by using the external parameters of the image sensor, so that noise reduction can be performed.
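As a short sketch of this chain of transformations, with illustrative names ((R, T) is the head pose from the PNP step, and (R_ext, t_ext) are the external parameters mapping camera coordinates to world coordinates):

```python
import numpy as np

def ear_position_world(ear_head, R, T, R_ext, t_ext):
    # Head coordinate system -> camera coordinate system, using the head pose.
    ear_cam = R @ ear_head + T
    # Camera coordinate system -> world coordinate system, using the extrinsics.
    return R_ext @ ear_cam + t_ext
```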
The three-dimensional coordinate point corresponding to the earhole in the head coordinate system can be determined in the human head model included in the human head model information, so that the earhole position information under the camera coordinate system can be determined. Referring to fig. 6, a schematic diagram of the location of an earhole in a human head model according to an embodiment of the present application is shown. The position indicated by the point in fig. 6 is the earhole position of the right ear in the human head model.
According to the embodiment of the application, by determining the human head model information and using the three-dimensional information in it to determine the ear position information of the user, the ear position can be determined accurately and dynamically as the user's head moves and rotates, so that noise is reduced for the user, the hardware cost of the process is lowered, and the ear position changes caused by the translation and rotation of the user's head in an actual scene can be estimated.
To satisfy the requirements of the PNP-based pose estimation used for noise reduction, the two-dimensional information may include at least three sets of two-dimensional points, and the three-dimensional information may include at least three sets of three-dimensional points corresponding to the at least three sets of two-dimensional points; the PNP algorithm needs at least three 2D-3D correspondences to solve for the head pose.
The two-dimensional points may be any three or more sets of two-dimensional points in the first image information, for example, any three or more sets of two-dimensional key points in the head image; the three-dimensional points may be the three or more sets of three-dimensional points corresponding to those two-dimensional points in the human head model information, for example, the three or more sets of three-dimensional key points corresponding to the two-dimensional key points in the human head model.
According to the embodiment of the application, by acquiring the at least three sets of two-dimensional points and the corresponding at least three sets of three-dimensional points, the pose estimation required for noise reduction can be performed, providing a better noise reduction experience for users.
Referring back to fig. 2, in order to allow the user to know the current noise reduction area in real time, the method may further include:
In step S203, the noise reduction region is displayed.
The region where noise reduction is performed may be displayed in the form of text, an image, a video, or the like, which is not limited in this application. For example, the area where noise reduction is currently being performed may be displayed on a display device.
According to the embodiment of the application, by displaying the noise reduction region, the user can learn the current noise reduction status in real time.
The displaying the noise reduction area may include:
displaying the noise reduction area according to the first image information; and/or
displaying the noise reduction area according to preset information.
For example, in the case where noise reduction is currently performed at the ear position of the user, the area indicating the ear position of the user in the first image information may be marked (for example, with a box), and the marked image may be displayed on the display device. Text indicating the noise reduction region may also be displayed on the display device, for example: "The current noise reduction region is: the binaural position of the driver."
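A minimal sketch of such a display, assuming OpenCV drawing primitives, a hypothetical pixel-space ear position, and illustrative text:

```python
import cv2

def draw_noise_reduction_region(image, ear_px, half_size=40):
    # ear_px: hypothetical ear position already projected into pixel coordinates.
    x, y = int(ear_px[0]), int(ear_px[1])
    # Mark the area being noise-reduced with a box.
    cv2.rectangle(image, (x - half_size, y - half_size),
                  (x + half_size, y + half_size), (0, 255, 0), 2)
    # Add a text label describing the current noise reduction region.
    cv2.putText(image, "Noise reduction: driver ear region", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    return image
```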
The user may also be notified of the noise reduction region in other ways, for example, by audio broadcast, which is not limited in this application.
In this way, the user can know the current noise reduction status more accurately, and the display is more intuitive.
Based on the same inventive concept as the above method embodiments, the present application further provides a data processing apparatus, where the data processing apparatus is configured to execute the technical solutions described in the above method embodiments, for example, the steps of the data processing methods shown in figs. 2-6 above.
Fig. 7 shows a block diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 7, the apparatus includes:
an acquiring module 701, configured to acquire first image information, where the first image information includes location information of a user;
the first determining module 702 is configured to determine ear position information of a user according to the first image information, where the ear position information is used to determine sound waves for noise reduction.
According to the embodiment of the application, the image information is acquired to determine the ear position information of the user, and the ear position information is used to perform noise reduction, so that the noise reduction process is more targeted and the noise reduction effect is better. Meanwhile, by tracking the ear position information of the user, the ear position can be determined dynamically to reduce noise for the user.
Optionally, the position information of the user may include head position information of the user, the ear position information includes earhole position information, and the first determining module 702 is configured to determine the earhole position information of the user according to the first image information.
According to the embodiment of the application, more targeted and more accurate noise reduction can be realized by determining the position information of the earhole of the user.
Alternatively, the sound wave for noise reduction may include a sound wave for noise reduction of an area corresponding to the position of the user.
According to the embodiment of the application, the noise reduction effect can be more targeted by reducing the noise of the area corresponding to the user position.
Optionally, the area corresponding to the location of the user may include one or more of the following: the region corresponding to the position of the user's head, the region corresponding to the position of the user's ear, and the region corresponding to the position of the user's earhole.
According to the embodiment of the application, since the area may include the area near the corresponding position, the area where noise reduction is performed can be adjusted as needed, so that the noise reduction process is more flexible and the noise reduction effect is better.
Alternatively, the first image information may include two-dimensional information indicating a shape of a face of the user.
According to the embodiment of the application, the image information comprises the two-dimensional information indicating the shape of the face, so that more accurate noise reduction can be realized according to personalized user characteristics.
Optionally, the acquiring module 701 may be configured to acquire first image information collected by an image sensor, where the image sensor includes one or more of the following: a camera, a depth sensor, and a laser radar.
According to the embodiment of the application, the ways of acquiring the first image information are diverse, so that the hardware can be deployed flexibly as needed and costs are reduced.
Optionally, the apparatus may further include a second determining module for determining human head model information, the human head model information including three-dimensional information indicating the shape of the face in the human head model; the first determining module is then configured to determine the ear position information of the user according to the internal parameters of the image sensor, the two-dimensional information, and the three-dimensional information.
According to the embodiment of the application, by determining the human head model information and using the three-dimensional information in it to determine the ear position information of the user, the ear position can be determined accurately and dynamically as the user's head moves and rotates, so that noise is reduced for the user, the hardware cost of the process is lowered, and the ear position changes caused by the translation and rotation of the user's head in an actual scene can be estimated.
Alternatively, the two-dimensional information may include at least three sets of two-dimensional points, and the three-dimensional information may include at least three sets of three-dimensional points corresponding to the at least three sets of two-dimensional points.
According to the embodiment of the application, by acquiring the at least three sets of two-dimensional points and the corresponding at least three sets of three-dimensional points, the pose estimation required for noise reduction can be performed, providing a better noise reduction experience for users.
Alternatively, the head model information may include preset model information.
According to the embodiment of the application, the noise reduction cost can be saved by using the preset model information.
Optionally, the second determining module may be configured to: acquire point cloud data information collected by a depth sensor; and determine the head model information according to the first image information and the point cloud data information.
According to the embodiment of the application, the head model information is determined by fusing the point cloud data information and the first image information, so that the determined head model information is closer to the actual head of the user, the ear position of the user can be located more accurately, and a better noise reduction effect is achieved.
Optionally, the second determining module may be configured to: acquire second image information collected by a camera, where the second image information includes information of the lateral head position of the user; and determine the human head model information according to the second image information, the preset model information, and the internal parameters of the camera.
According to the embodiment of the application, with the user's cooperation in collecting the second image information, the determined human head model information is closer to the actual head of the user, so that the ear position of the user can be located more accurately and a better noise reduction effect is achieved.
Alternatively, the information of the lateral head position of the user may include ear position information of the user.
According to the embodiment of the application, the ear position information of the user is obtained, so that the finally determined ear position is more accurate, and the noise reduction effect is better.
Optionally, the image sensor may include a first image sensor and a second image sensor, and the second determining module may be configured to: acquire third image information collected by the first image sensor and fourth image information collected by the second image sensor; and determine the human head model information according to the third image information, the fourth image information, the preset model information, the internal parameters of the first image sensor, the internal parameters of the second image sensor, and the external parameters between the first image sensor and the second image sensor.
According to the embodiment of the application, by collecting the third image information and the fourth image information, the head model information of the user can be determined more comprehensively and accurately, so that the determined head model information is closer to the actual head of the user, the ear position of the user can be located more accurately, and a better noise reduction effect is achieved.
Optionally, the apparatus may further include a display module for displaying the noise reduction area.
According to the embodiment of the application, by displaying the noise reduction region, the user can learn the current noise reduction status in real time.
Optionally, the display module may be further configured to: display the noise reduction area according to the first image information; and/or display the noise reduction area according to preset information.
According to the embodiment of the application, by displaying the noise reduction area according to the first image information and/or the preset information, the user can know the current noise reduction status more accurately, and the display is more intuitive.
Fig. 8 shows a block diagram of an electronic device according to an embodiment of the present application. It should be understood that the electronic device 800 may be a terminal, such as a car or a car machine, or may be a chip built into the terminal, and may implement the steps of the data processing methods shown in figs. 2-6 or the functions of the modules of the data processing apparatus shown in fig. 7. As shown in fig. 8, the electronic device 800 includes a processor 801 and an interface circuit 802 coupled to the processor. It should be appreciated that although only one processor and one interface circuit are shown in fig. 8, the electronic device 800 may include other numbers of processors and interface circuits.
Wherein the interface circuit 802 is used to communicate with other components of the terminal, such as a memory or other processor. The processor 801 is configured to interact with other components via interface circuitry 802. The interface circuit 802 may be an input/output interface of the processor 801.
The processor 801 may be a processor in a vehicle-mounted device such as a car machine, or may be a processing device sold separately.
For example, the processor 801 reads computer programs or instructions in a memory coupled thereto through the interface circuit 802, and decodes and executes the computer programs or instructions. The corresponding programs or instructions, when interpreted and executed by the processor 801, enable the electronic device 800 to implement aspects of the data processing methods provided in the embodiments of the present application.
Optionally, these programs or instructions are stored in a memory external to the electronic device 800. When the above-described program or instructions are decoded and executed by the processor 801, part or all of the content of the above-described program or instructions is temporarily stored in the memory.
Optionally, these programs or instructions are stored in a memory internal to the electronic device 800. When a program or instructions are stored in a memory inside the electronic device 800, the electronic device 800 may be provided in a terminal of an embodiment of the present application.
Optionally, some of the content of these programs or instructions is stored in a memory external to the electronic device 800, and other portions of the content of these programs or instructions is stored in a memory internal to the electronic device 800.
Fig. 9 shows a block diagram of an electronic device according to an embodiment of the present application. The electronic device may be a car or a car machine, or may be a chip built into a terminal, and may implement the steps of the data processing methods shown in figs. 2-6 or the functions of the modules of the data processing apparatus shown in fig. 7. As shown in fig. 9, the electronic device 900 includes a processor 901 and a memory 902 coupled to the processor. It should be understood that although only one processor and one memory are shown in fig. 9, the electronic device 900 may include other numbers of processors and memories.
Wherein the memory 902 is for storing a computer program or computer instructions. These computer programs or instructions, when executed by the processor 901, may cause the electronic device 900 to implement steps in the data processing methods of embodiments of the present application.
Fig. 10 shows a block diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device 1000 may be the above-mentioned terminal, for example, a car or a car machine, or may be a chip built into the terminal, and may perform the data processing method shown in any one of figs. 2-6. The electronic device 1000 includes at least one processor 1801, at least one memory 1802, and at least one communication interface 1803. The electronic device may further include common components such as an antenna, which are not described in detail herein.
The respective constituent elements of the electronic apparatus 1000 are specifically described below with reference to fig. 10.
The processor 1801 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the above program solutions. The processor 1801 may include one or more processing units; for example, the processor 1801 may include an application processor (application processor, AP), a modem processor, a graphics processing unit (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural-network processing unit (neural-network processing unit, NPU), etc. The different processing units may be separate devices or may be integrated in one or more processors.
The communication interface 1803 is used for communicating with other electronic devices or communication networks, such as Ethernet, a radio access network (RAN), a core network, a wireless local area network (wireless local area network, WLAN), etc.
The memory 1802 may be, but is not limited to, a read-only memory (Read-Only Memory, ROM) or other type of static storage device that can store static information and instructions, a random access memory (Random Access Memory, RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be stand-alone and coupled to the processor via a bus, or may be integrated with the processor.
The memory 1802 is used for storing application program codes for executing the above schemes, and is controlled to be executed by the processor 1801. The processor 1801 is used to execute application code stored in the memory 1802.
As an example, in connection with the data processing apparatus shown in fig. 7, the acquiring module 701 in fig. 7 described above may be implemented by the communication interface 1803 in fig. 10, and the first determining module 702 in fig. 7 described above may be implemented by the processor 1801 in fig. 10.
Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present application. The electronic device 1100 may be the above-described terminal, such as a car or a car machine, or may be a chip built into the terminal, and may perform the data processing method shown in any one of figs. 2-6. The electronic device 1100 includes a sensor 1101 and a processing unit 1102 coupled to the sensor 1101. It should be understood that although only one sensor and one processing unit are shown in fig. 11, the electronic device 1100 may include other numbers of sensors and processing units.
The sensor 1101 may be an image sensor, for example a camera, a depth sensor, a laser radar, etc., and may be used to collect the first image information. The processing unit 1102 may be configured to determine, according to the first image information, the position information of the user, which may include the ear position information of the user, so that the sound waves for noise reduction in the noise reduction area can be determined according to the position information to achieve the noise reduction effect.
Optionally, the electronic device may further include a display unit 1103, which may be coupled to the processing unit 1102 and configured to display the noise reduction area after the processing unit 1102 determines it, so that the noise reduction area can be visualized and the user experience is improved.
It should be understood that the electronic device in the embodiments of the present application may be implemented by software, for example, by the above-mentioned computer programs or instructions; the corresponding computer programs or instructions may be stored in a memory inside the terminal, and the above functions are implemented by a processor reading them from the memory. Alternatively, the electronic device in the embodiments of the present application may be implemented by hardware, in which case the processing unit 1102 is a processor.
The foregoing embodiments each have their own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: a portable computer disk, a hard disk, a random access memory (Random Access Memory, RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a static random-access memory (Static Random-Access Memory, SRAM), a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), a digital versatile disc (Digital Video Disc, DVD), a memory stick, a floppy disk, a mechanical coding device such as a punch card or an in-groove protrusion structure having instructions stored thereon, and any suitable combination of the foregoing.
The computer readable program instructions or code described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present application may be assembly instructions, instruction set architecture (Instruction Set Architecture, ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (Local Area Network, LAN) or a wide area network (Wide Area Network, WAN), or may be connected to an external computer (for example, through the internet using an internet service provider). In some embodiments, aspects of the present application are implemented by personalizing electronic circuitry, such as programmable logic circuitry, a field-programmable gate array (Field-Programmable Gate Array, FPGA), or a programmable logic array (Programmable Logic Array, PLA), with state information of the computer readable program instructions.
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by hardware (for example, circuits or application-specific integrated circuits (Application Specific Integrated Circuit, ASIC)) that performs the corresponding functions or acts, or by combinations of hardware and software, such as firmware.
Although the present application has been described herein in connection with various embodiments, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed application, from a review of the figures, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The embodiments of the present application have been described above; the foregoing description is exemplary rather than exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (20)

  1. A method of data processing, the method comprising:
    acquiring first image information, wherein the first image information comprises position information of a user;
    and determining ear position information of the user according to the first image information, wherein the ear position information is used for determining sound waves for noise reduction.
  2. The method of claim 1, wherein the user's location information comprises the user's head location information, the ear location information comprises earhole location information, and wherein determining the user's ear location information from the first image information comprises:
    and determining the position information of the earhole of the user according to the first image information.
  3. A method according to claim 1 or 2, wherein the sound waves for noise reduction comprise sound waves for noise reduction of a region corresponding to the user's location.
  4. A method according to claim 3, wherein the region to which the user's location corresponds comprises one or more of the following: the region corresponding to the position of the head of the user, the region corresponding to the position of the ear of the user and the region corresponding to the position of the earhole of the user.
  5. The method according to any one of claims 1-4, wherein the first image information comprises two-dimensional information indicating a shape of a face of the user.
  6. The method of any of claims 1-5, wherein the acquiring the first image information comprises:
    acquiring first image information acquired by an image sensor, the image sensor comprising one or more of: camera, depth sensor, laser radar.
  7. The method according to claim 5 or 6, characterized in that the method further comprises:
    determining human head model information, the human head model information comprising three-dimensional information for indicating a shape of a human face in the human head model;
    the determining ear position information of the user according to the first image information includes:
    and determining ear position information of the user according to the internal reference of the image sensor, the two-dimensional information and the three-dimensional information.
  8. The method of any of claims 5-7, wherein the two-dimensional information comprises at least three sets of two-dimensional points, and the three-dimensional information comprises at least three sets of three-dimensional points corresponding to the at least three sets of two-dimensional points.
  9. The method according to claim 7 or 8, wherein the model information of the head of a person comprises preset model information.
  10. The method according to any one of claims 7-9, wherein said determining human head model information comprises:
    acquiring point cloud data information acquired by a depth sensor;
    and determining the head model information according to the first image information and the point cloud data information.
  11. The method according to any one of claims 7-10, wherein said determining human head model information comprises:
    acquiring second image information acquired by a camera, wherein the second image information comprises information of the side head position of the user;
    and determining the model information of the head according to the second image information, the preset model information and the internal parameters of the camera.
  12. The method of claim 11, wherein the information of the user's lateral head position includes ear position information of the user.
  13. The method of any of claims 7-12, wherein the image sensor comprises a first image sensor and a second image sensor, and wherein determining the human head model information comprises:
    acquiring third image information acquired by a first image sensor and fourth image information acquired by a second image sensor;
    and determining the model information of the head according to the third image information, the fourth image information, the preset model information, the internal parameters of the first image sensor, the internal parameters of the second image sensor and the external parameters between the first image sensor and the second image sensor.
  14. The method according to any one of claims 1-13, further comprising:
    and displaying the area where noise reduction is performed.
  15. The method of claim 14, wherein displaying the region in which noise reduction is performed comprises:
    displaying the noise reduction area according to the first image information; and/or
    And displaying the noise reduction area according to preset information.
  16. A data processing apparatus, the apparatus comprising:
    the device comprises an acquisition module, a display module and a display module, wherein the acquisition module is used for acquiring first image information, and the first image information comprises position information of a user;
    and the first determining module is used for determining ear position information of the user according to the first image information, wherein the ear position information is used for determining sound waves for noise reduction.
  17. A data processing apparatus, comprising: a processor and a memory;
    the memory is used for storing programs;
    the processor is configured to execute a program stored in the memory, to cause the apparatus to implement the method of any one of claims 1-14.
  18. A computer readable storage medium having stored thereon program instructions, which when executed by a computer cause the computer to implement the method of any of claims 1-15.
  19. A computer program product comprising program instructions which, when executed by a computer, cause the computer to carry out the method of any one of claims 1 to 15.
  20. A vehicle comprising a processor for performing the method of any of claims 1-15.
CN202280005449.XA 2022-05-17 2022-05-17 Data processing method and device, storage medium and vehicle Pending CN117425930A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/093287 WO2023220920A1 (en) 2022-05-17 2022-05-17 Data processing method, apparatus, storage medium and vehicle

Publications (1)

Publication Number Publication Date
CN117425930A (en) 2024-01-19

Family

ID=88834440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280005449.XA Pending CN117425930A (en) 2022-05-17 2022-05-17 Data processing method and device, storage medium and vehicle

Country Status (2)

Country Link
CN (1) CN117425930A (en)
WO (1) WO2023220920A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080304677A1 (en) * 2007-06-08 2008-12-11 Sonitus Medical Inc. System and method for noise cancellation with motion tracking capability
CN108352155A (en) * 2015-09-30 2018-07-31 惠普发展公司,有限责任合伙企业 Inhibit ambient sound
WO2021157614A1 (en) * 2020-02-05 2021-08-12 豊通ケミプラス株式会社 Noise reduction device and noise reduction method
CN112331173B (en) * 2020-10-26 2024-02-23 通力科技股份有限公司 In-vehicle noise reduction method, controller, in-vehicle pillow and computer readable storage medium

Also Published As

Publication number Publication date
WO2023220920A1 (en) 2023-11-23

Similar Documents

Publication Publication Date Title
US11989350B2 (en) Hand key point recognition model training method, hand key point recognition method and device
JP6944136B2 (en) Image processing device and image processing method
US9574874B2 (en) System and method for reconstructing 3D model
WO2018150933A1 (en) Image processing device and image processing method
JP6944137B2 (en) Image processing device and image processing method
US11605179B2 (en) System for determining anatomical feature orientation
CN111160309B (en) Image processing method and related equipment
US10943335B2 (en) Hybrid tone mapping for consistent tone reproduction of scenes in camera systems
JPWO2019039282A1 (en) Image processing device and image processing method
CN112927362A (en) Map reconstruction method and device, computer readable medium and electronic device
CN110706339B (en) Three-dimensional face reconstruction method and device, electronic equipment and storage medium
JP6944133B2 (en) Image processing device and image processing method
CN112598780B (en) Instance object model construction method and device, readable medium and electronic equipment
WO2019118089A1 (en) Multi-modal far field user interfaces and vision-assisted audio processing
CN113936085A (en) Three-dimensional reconstruction method and device
JP6930541B2 (en) Image processing device and image processing method
US20190096073A1 (en) Histogram and entropy-based texture detection
CN117425930A (en) Data processing method and device, storage medium and vehicle
CN108846817B (en) Image processing method and device and mobile terminal
JP6977725B2 (en) Image processing device and image processing method
CN113192072B (en) Image segmentation method, device, equipment and storage medium
JPWO2019087513A1 (en) Information processing equipment, information processing methods and programs
US11240482B2 (en) Information processing device, information processing method, and computer program
CN116563817B (en) Obstacle information generation method, obstacle information generation device, electronic device, and computer-readable medium
WO2022009552A1 (en) Information processing device, information processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination