CN114157905B

CN114157905B - Television sound adjusting method and device based on image recognition and television

Info

Publication number: CN114157905B
Application number: CN202111382511.0A
Authority: CN
Inventors: 苏运全; 李蛟龙
Original assignee: Shenzhen Konka Electronic Technology Co Ltd
Current assignee: Shenzhen Konka Electronic Technology Co Ltd
Priority date: 2021-11-22
Filing date: 2021-11-22
Publication date: 2023-12-05
Anticipated expiration: 2041-11-22
Also published as: CN114157905A

Abstract

The embodiment of the invention provides a television sound adjusting method and device based on image recognition, and a television, wherein the method comprises the following steps: acquiring an image of a user currently watching a television, and acquiring user information based on the image; determining target sound parameters of a sound system of the television according to the user information; and adjusting the corresponding sound parameters of the sound system according to the target sound parameters. The invention can automatically adjust the sound parameters of the television in real time according to the user information of the user currently watching the television without manual adjustment of the user, thereby simplifying the operation of the user and improving the use experience of the user.

Description

Television sound adjusting method and device based on image recognition and television

Technical Field

The invention relates to the field of televisions, in particular to a television sound adjusting method and device based on image recognition and a television.

Background

The television is one of the most important devices for leisure and entertainment in the living room; by offering a variety of programs, it serves household members of all ages.

However, hearing differs between age groups, so users of different ages place different demands on the television's sound system, and the sound system needs to take users of different ages into account.

Traditionally, a television manufacturer tunes three to five sound modes in advance according to the frequency response and arrangement of the built-in speakers, and fixes these modes in the television menu so that the user can switch between them. This approach is inconvenient in several ways. For example, each sound mode must be understood before it can be used, yet some elderly or very young users may be unable to tell the sound modes apart correctly. In addition, whenever the group of people watching the television changes, the sound mode must be adjusted manually again, which makes operation cumbersome and the user experience poor.

Disclosure of Invention

In view of the above, the present invention aims to provide a television sound adjusting method and device based on image recognition, and a television, so as to alleviate the above-mentioned problems.

The embodiment of the invention provides a television sound adjusting method based on image recognition, which comprises the following steps:

acquiring an image of a user currently watching a television, and acquiring user information based on the image;

determining target sound parameters of a sound system of the television according to the user information;

and adjusting the corresponding sound parameters of the sound system according to the target sound parameters.

Preferably, the user information includes the total number of users and position information of each user; the position information is represented by a coordinate, the origin of the coordinate system in which the coordinate lies is the center point of the image, the horizontal axis of the coordinate system runs along the width direction of the image, and the vertical axis of the coordinate system runs along the height direction of the image;

determining target sound parameters of a sound system of the television according to the user information, wherein the target sound parameters comprise:

determining a target channel balance value of a sound system of the television according to the position information of each user and the adjusting range of the channel balance value of the sound system;

and determining the left channel gain and the right channel gain of the sound system according to the target channel balance value.

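Preferably, the calculation formula of the target channel balance value S is:

$$S = \frac{s}{N \cdot W} \sum_{i=1}^{N} x_i$$

wherein:

N is the total number of users; W is half of the image resolution width; (-s, s) is the adjusting range of the channel balance value; and $x_1, x_2, \ldots, x_N$ are the abscissa values of the coordinates of the respective users.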

Preferably, each target channel balance value corresponds to a set of left and right channel gains, and when the target channel balance value is greater than 0, the right channel gain is greater than the left channel gain, and when the target channel balance value is less than 0, the right channel gain is less than the left channel gain.

Preferably, the user information further includes an age group of each user;

determining target sound parameters of a sound system of the television according to the user information, and further comprising:

according to the age groups of each user, obtaining the proportion of each age group;

the gains in the treble, midrange and bass of the sound system are determined according to the proportions of the age groups.

Preferably, the age groups comprise an elderly group, a young and middle-aged group, and a child group;

determining the gains of the sound system in the treble, the midrange and the bass according to the proportions of the age groups specifically comprises:

determining the currently dominant age group according to the proportion of each age group;

acquiring an adjustment value corresponding to the dominant age group;

and determining the gains of the sound system in the treble, the midrange and the bass according to the adjustment value.

Preferably, when the dominant age group is the elderly group, the treble gain is greater than the midrange gain and greater than the bass gain;

when the dominant age group is the child group, the treble gain is less than the midrange gain and less than the bass gain.

The embodiment of the invention also provides a television sound adjusting device based on image recognition, which comprises:

a user information acquisition unit, configured to acquire an image of a user currently watching a television and to acquire user information based on the image;

a target sound parameter determining unit, configured to determine a sound system target sound parameter of the television according to the user information;

and the adjusting unit is used for adjusting the corresponding sound parameters of the sound system according to the target sound parameters.

The embodiment of the invention also provides a television, which comprises:

an image capture module that captures an image of a user who is watching television;

a processor configured to:

acquiring user information based on the image;

determining a left channel gain and a right channel gain of the sound system according to the target channel balance value; the calculation formula of the target channel balance value S is as follows:

$$S = \frac{s}{N \cdot W} \sum_{i=1}^{N} x_i$$

wherein:

N is the total number of users; W is half of the resolution width of the image; (-s, s) is the adjusting range of the channel balance value; and $x_1, x_2, \ldots, x_N$ are the abscissa values of the coordinates of the respective users.

In the embodiments of the invention, after the user information is obtained from the image of the users in front of the television, the target sound parameters corresponding to the users currently watching the television are determined according to the user information, and the sound system of the television is automatically adjusted according to the target sound parameters, so that the adjusted sound parameters better meet the needs of the user, or of most of the users, currently watching the television. Because the user information is acquired continuously, the sound parameters can be adjusted automatically in real time according to the number, positions, and other attributes of the users currently watching the television, without manual adjustment, which simplifies operation and improves the user experience.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a television sound adjustment method based on image recognition according to a first embodiment of the present invention.

Fig. 2 is a schematic diagram of an image captured by a camera according to an embodiment of the present invention.

Fig. 3 is a schematic structural diagram of a television sound adjusting device based on image recognition according to a second embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

For a better understanding of the technical solution of the present invention, the following detailed description of the embodiments of the present invention refers to the accompanying drawings.

The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

The invention is described in further detail below with reference to the attached drawings and detailed description:

Referring to Fig. 1, a first embodiment of the present invention provides a television sound adjusting method based on image recognition, which includes the following steps:

S101, acquiring an image of a user currently watching a television, and acquiring user information based on the image.

In this embodiment, the image of the user currently watching the television may be captured by a camera module (such as a camera), where the camera module may be an internal camera module or an external camera module of the television, and this embodiment is not limited specifically.

In this embodiment, the camera module detects and captures the scene within its field of view in real time and generates corresponding images, so that user information about the users currently watching the television can be obtained from the images.

The processing that extracts the user information from the image may be performed by the television or by the camera module.

When the image is processed by the television, the image captured by the camera can be transmitted to the television via the UVC protocol, and the television then obtains the user information by executing a corresponding image recognition algorithm.

UVC stands for USB Video Class, a protocol standard defined for USB video capture devices. It was jointly introduced by Microsoft and several other device manufacturers and has become one of the USB.org standards.

When the image is processed by the camera module, the camera module captures the image, runs its built-in algorithm on the image to obtain the user information, and then transmits the user information to the television for further processing.

Of course, part of the processing may be performed by a television and part of the processing may be performed by a camera, and the present invention is not limited specifically. In addition, when the image is processed, each frame of image can be processed, or the image can be processed once every preset time period, and all the schemes are within the protection scope of the invention.
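As a minimal sketch of the second option (processing once every preset time period rather than on every frame), the following Python class throttles a recognition callback. The names `PeriodicProcessor`, `process`, and `period_s` are hypothetical and do not come from the embodiment; a period of 0 degenerates to processing every frame.

```python
import time
from typing import Any, Callable, Optional


class PeriodicProcessor:
    """Runs `process` on a frame at most once per `period_s` seconds."""

    def __init__(
        self,
        process: Callable[[Any], Any],
        period_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._process = process
        self._period_s = period_s
        self._clock = clock              # injectable so the timing can be tested
        self._last_run: Optional[float] = None

    def on_frame(self, frame: Any) -> Optional[Any]:
        """Process the frame if the period has elapsed; otherwise return None."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._period_s:
            return None                  # too soon: skip this frame
        self._last_run = now
        return self._process(frame)
```

Injecting the clock keeps the throttle independent of wall-clock time, which makes the skipping behavior easy to check.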

In this embodiment, the user information may include the total number of users, position information, gender information, age group, and so on. This information can be obtained with existing image processing algorithms, which are not described further herein.

S102, determining target sound parameters of a sound system of the television according to the user information.

S103, adjusting corresponding sound parameters of the sound system according to the target sound parameters.

In this embodiment, the sound system may be a sound system embedded in a television, or may be a sound system externally connected to the television, which is not particularly limited in the present invention.

In this embodiment, the sound parameters include, for example, volume level, left channel gain, right channel gain, treble gain, midrange gain, and bass gain; the specific parameters depend on the actual situation and are not specifically limited by the invention.

In this embodiment, after obtaining the user information, the television determines the target sound parameters corresponding to the users currently watching the television according to the user information, and automatically adjusts the sound system of the television according to the target sound parameters, so that the adjusted sound parameters better meet the needs of the user, or of most of the users, currently watching the television. Because the user information is acquired continuously, the sound parameters can be adjusted automatically in real time according to the number, positions, and other attributes of the users currently watching the television, without manual adjustment, which simplifies operation and improves the user experience.
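Steps S101 to S103 can be sketched as a simple three-stage pipeline. This is only an illustrative outline: the `recognize`, `determine_targets`, and `apply_parameters` callables and the `UserInfo` fields are assumptions standing in for the image recognition algorithm, the parameter calculation, and the sound system interface, none of which the embodiment specifies.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class UserInfo:
    """Hypothetical per-user record extracted from the image."""
    x: float          # abscissa in the image-centered coordinate system
    y: float          # ordinate in the image-centered coordinate system
    age_group: str    # e.g. "child", "middle_aged", "elderly"


def adjust_sound(
    image: Any,
    recognize: Callable[[Any], List[UserInfo]],
    determine_targets: Callable[[List[UserInfo]], Dict[str, float]],
    apply_parameters: Callable[[Dict[str, float]], None],
) -> Dict[str, float]:
    """Run one pass of the S101 -> S102 -> S103 flow and return the targets."""
    users = recognize(image)             # S101: image -> user information
    targets = determine_targets(users)   # S102: user information -> target parameters
    apply_parameters(targets)            # S103: push the targets to the sound system
    return targets
```

Passing the three stages in as callables mirrors the statement that the processing may run on the television, on the camera module, or be split between them.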

In order to facilitate an understanding of the invention, some preferred embodiments of the invention are described further below.

In a preferred embodiment, the user information includes the total number of users and position information of each user; step S102 specifically includes:

S1021, determining a target channel balance value of the television according to the position information of each user and the adjusting range of the channel balance value of the television;

S1022, determining the left channel gain and the right channel gain of the television according to the target channel balance value.

In this embodiment, the total number of users may be obtained by a face recognition algorithm. To obtain the position information of each user, as shown in Fig. 2, a coordinate system is established in the image with the center point of the image as the origin; the horizontal axis of the coordinate system runs along the width direction of the image (positive to the right, negative to the left), and the vertical axis runs along the height direction of the image (positive upwards, negative downwards).

As shown in Fig. 2, assume that the resolution of the image is 1080×720 (1080 pixels in the width direction and 720 pixels in the height direction). The image contains three users A, B, and C, whose position information is (-500, 180), (-280, 180), and (280, -60) respectively. The position of a user can be represented by a specific reference point on the user's head, for example the midpoint of the line connecting the two eyes, depending on the actual situation.
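Image libraries commonly report pixel positions with the origin at the top-left corner and rows increasing downward. Assuming that convention (the embodiment does not state how the detector reports positions), a reference point can be converted to the image-centered coordinate system described above as follows; the function name `to_centered` is hypothetical.

```python
from typing import Tuple


def to_centered(col: float, row: float, width: int, height: int) -> Tuple[float, float]:
    """Convert top-left-origin pixel coordinates to image-centered coordinates.

    x is positive to the right of the image center; y is positive above it.
    """
    x = col - width / 2      # shift the origin horizontally to the center
    y = height / 2 - row     # shift vertically and flip so that up is positive
    return (x, y)
```

Under this assumption, in the 1080×720 image the pixel at column 40, row 180 maps to (-500, 180), the position given for user A.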

As can be seen from Fig. 2, of the three users, two are on the left side of the television and one is on the right side. To give most users a better sound experience, the direction of the sound can therefore be shifted by adjusting the channel balance.

Specifically, the channel balance S represents the difference between the gains of the left and right channels in a stereo playback system; if the imbalance is too large, the sound image localization of the stereo playback shifts. Typically, the stereo balance of a high-quality sound system should be less than 1 dB. This embodiment deliberately uses the offset produced by this difference to shift the sound toward the location where more users are.

In this embodiment, the adjustment range (-s, s) of the channel balance value of the sound system depends on the actual capability of the sound system; if the range is too large, problems such as sound breakup may occur. For example, s may be 50.

The channel balance S can be calculated by the following formula:

$$S = \frac{s}{N \cdot W} \sum_{i=1}^{N} x_i$$

wherein: N is the total number of users; W is half of the resolution width of the image; s is the bound of the adjustment range (-s, s); and $x_1, x_2, \ldots, x_N$ are the abscissa values of the coordinates of the respective users. Each target channel balance value corresponds to a set of left channel gain and right channel gain: when the target channel balance value is greater than 0, the right channel gain is greater than the left channel gain, and when the target channel balance value is less than 0, the right channel gain is less than the left channel gain.

From the values in Fig. 2, the current channel balance is calculated as S = -15.43. A negative S indicates that the users on the left are dominant, so the sound can be shifted appropriately to the left. For example, the left channel gain can be set to 15 dB and the right channel gain to 0 dB; after the final mix, the sound is biased to the left, and the users perceive it as coming from the left. Of course, the specific channel balance S and the gain values of the left and right channels depend on the power amplifier capability and the speaker materials of the particular television, and are not described further herein.
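The arithmetic for Fig. 2 can be checked with a short Python sketch. It assumes that S is the mean abscissa scaled by s/W, with s = 50 and W = 540 (half of 1080); this form is an assumption, chosen because it reproduces the value S = -15.43 quoted above. The function name and the clamping to the adjustment range are illustrative additions.

```python
from typing import Sequence


def channel_balance(xs: Sequence[float], half_width: float, s_max: float) -> float:
    """Return the balance S = s_max * mean(xs) / half_width, clamped to [-s_max, s_max].

    xs holds each user's abscissa in the image-centered coordinate system.
    """
    if not xs:
        return 0.0                       # nobody detected: keep the channels centered
    mean_x = sum(xs) / len(xs)
    balance = s_max * mean_x / half_width
    return max(-s_max, min(s_max, balance))


# Users A, B, and C from Fig. 2 in a 1080-pixel-wide image (W = 540), with s = 50.
S = channel_balance([-500, -280, 280], half_width=540, s_max=50)
print(round(S, 2))  # -15.43
```

Because every abscissa lies within [-W, W], the mean is already within the range and the clamp only guards against out-of-frame detections.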

In summary, in this embodiment, the channel balance is determined according to the position information of the user and the total number of people, and then the gains of the left and right channels are determined according to the channel balance, so as to adjust the propagation direction of the sound system, so that the sound is shifted to the position where more users are located, and the experience of the user is improved.
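The embodiment leaves the mapping from the channel balance to the left and right channel gains to the specific television. One possible mapping, shown purely as an assumption, boosts the favored channel by an amount proportional to |S| and leaves the other channel at 0 dB; `db_per_unit` and `max_boost_db` are hypothetical tuning parameters.

```python
from typing import Tuple


def balance_to_gains(
    balance: float,
    db_per_unit: float = 1.0,
    max_boost_db: float = 20.0,
) -> Tuple[float, float]:
    """Map a channel balance value to (left_gain_db, right_gain_db).

    A negative balance favors the left channel and a positive one the right.
    """
    boost = min(abs(balance) * db_per_unit, max_boost_db)  # cap to protect the amplifier
    if balance > 0:
        return (0.0, boost)
    if balance < 0:
        return (boost, 0.0)
    return (0.0, 0.0)
```

With the default parameters, a balance of -15.43 yields roughly 15 dB on the left and 0 dB on the right, in line with the example gains given for Fig. 2.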

Preferably, the user information further includes an age group of each user;

step S102 further includes:

S1023, acquiring the proportion of each age group according to the age group of each user;

S1024, determining the gains of the television in the treble, midrange, and bass according to the proportion of each age group.

In this embodiment, the treble, midrange, and bass gains of a high-quality sound system are relatively balanced, so little adjustment is needed for young and middle-aged users. However, the perceived balance varies with the age of the listener, especially for the elderly and for children. The elderly are less sensitive to high frequencies, and some cannot hear sounds in the 12 kHz to 20 kHz range at all, whereas the opposite holds for children. It is therefore necessary to adjust the sound mode according to the user information.

In this embodiment, for example, the age group may be divided into three stages of an elderly stage, a young and middle aged stage, and a childhood stage. Step S1024 specifically includes:

First, the currently dominant age group is determined according to the proportion of each age group.

TABLE 1

z	Children	Middle-aged people	Elderly people
N(z)/N	h	j	k

As shown in Table 1, z represents the identified age group of a user, and N(z)/N represents the proportion of users in each age group.
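The proportions N(z)/N in Table 1 can be computed directly from the per-user age groups. In the sketch below, the group labels and the function name are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Sequence

AGE_GROUPS = ("child", "middle_aged", "elderly")  # hypothetical labels for z


def age_group_proportions(age_groups: Sequence[str]) -> Dict[str, float]:
    """Return N(z)/N for each age group z, given one label per detected user."""
    total = len(age_groups)
    if total == 0:
        return {group: 0.0 for group in AGE_GROUPS}
    counts = Counter(age_groups)         # missing groups count as 0
    return {group: counts[group] / total for group in AGE_GROUPS}
```

For two elderly users and one child, this yields 1/3 for children, 0 for middle-aged people, and 2/3 for elderly people.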

And secondly, acquiring an adjustment value corresponding to the dominant age group.

TABLE 2

Dominant age group	Children	Middle-aged people	Elderly people
G	-10	0	+10

As shown in Table 2, from the proportions of the age groups, this embodiment can determine the dominant age group and acquire the adjustment value (i.e., the G value) corresponding to it.

The dominant age group may simply be the age group with the highest proportion, or it may be required to have a proportion larger than the sum of the proportions of the other age groups; this depends on the actual situation and is not specifically limited by the invention.
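The two notions of dominance mentioned above (highest proportion, or a proportion larger than all the others combined) can be sketched as one function with a flag. The function name, the flag, and the choice to return `None` when no group qualifies are assumptions for illustration.

```python
from typing import Dict, Optional


def dominant_age_group(
    proportions: Dict[str, float],
    require_majority: bool = False,
) -> Optional[str]:
    """Return the dominant age group, or None if no group qualifies.

    With require_majority=False the group with the highest proportion wins
    (ties resolve to the first such group in dictionary order).
    With require_majority=True the winner must also exceed the sum of the others.
    """
    if not proportions:
        return None
    best = max(proportions, key=lambda group: proportions[group])
    share = proportions[best]
    if share == 0:
        return None                      # no users detected
    others = sum(proportions.values()) - share
    if require_majority and share <= others:
        return None
    return best
```

When the stricter rule returns `None`, a caller might fall back to a neutral setting; the embodiment does not prescribe what happens in that case.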

It will also be appreciated that the adjustment values may be set as desired; the values -10, 0, and 10 mentioned above are merely examples and should not be construed as limiting the invention.

And finally, determining the gains of the television in the high-pitched, medium-pitched and low-pitched according to the adjustment value.

In this embodiment, each adjustment value G corresponds to a set of treble, midrange, and bass gains (FA(G), FB(G), FC(G)). For example, when G is +10, the treble, midrange, and bass gains may be set to +3 dB, +1 dB, and -2 dB; after the final mix, the midrange and treble are relatively boosted and the sound is brighter, which suits elderly viewers. When G is -10, the treble, midrange, and bass gains may be set to -2 dB, +1 dB, and +3 dB; after the final mix, the bass is relatively boosted, which suits child viewers. When G is 0, the treble, midrange, and bass gains may be set to the same value (for example, all 0 dB), which gives a relatively balanced sound.
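Using the example values from this paragraph, the preset mapping from the dominant age group to G and from G to (FA(G), FB(G), FC(G)) can be sketched as two lookup tables. The dictionary names and group labels are hypothetical, and the numbers are the examples given in the text rather than required values.

```python
from typing import Dict, Tuple

# Example adjustment value G for each dominant age group.
ADJUSTMENT_BY_GROUP: Dict[str, int] = {"elderly": 10, "middle_aged": 0, "child": -10}

# Example (treble, midrange, bass) gains in dB for each adjustment value G.
GAINS_BY_ADJUSTMENT: Dict[int, Tuple[float, float, float]] = {
    10: (3.0, 1.0, -2.0),    # brighter sound for elderly viewers
    0: (0.0, 0.0, 0.0),      # balanced sound
    -10: (-2.0, 1.0, 3.0),   # relatively more bass for child viewers
}


def gains_for_group(dominant_group: str) -> Tuple[float, float, float]:
    """Look up (treble_db, midrange_db, bass_db) for the dominant age group."""
    g = ADJUSTMENT_BY_GROUP[dominant_group]
    return GAINS_BY_ADJUSTMENT[g]
```

Keeping the two tables separate reflects the two-stage description: the age group selects G, and G selects the gains, so either stage can later be replaced by a function.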

It should be noted that the adjustment value G is not limited to a fixed value; it may also be the value of a function that takes N(z)/N as a factor, which is not described further here.

It should also be noted that the relationship between the adjustment value G and the treble, midrange, and bass gains may be a preset mapping or a functional relationship; the function is not limited to a linear function and may be another function, which is not described further here.
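The embodiment does not give a concrete function for either relationship. As one hypothetical possibility, G could be made proportional to the difference between the elderly and child proportions, and the gains could be interpolated linearly between the balanced setting and the example endpoint presets. Both functions below are assumptions for illustration only.

```python
from typing import Tuple


def continuous_adjustment(child_share: float, elderly_share: float,
                          g_max: float = 10.0) -> float:
    """Hypothetical G as a function of N(z)/N: +g_max if all users are elderly,
    -g_max if all are children, and 0 when the two shares are equal."""
    return g_max * (elderly_share - child_share)


def interpolate_gains(g: float, g_max: float = 10.0) -> Tuple[float, float, float]:
    """Piecewise-linear (treble_db, midrange_db, bass_db) between the flat setting
    at G = 0 and the example presets at G = +g_max and G = -g_max."""
    t = max(-1.0, min(1.0, g / g_max))           # normalize and clamp to [-1, 1]
    endpoint = (3.0, 1.0, -2.0) if t >= 0 else (-2.0, 1.0, 3.0)
    weight = abs(t)
    return (weight * endpoint[0], weight * endpoint[1], weight * endpoint[2])
```

For two elderly users and one child, `continuous_adjustment(1/3, 2/3)` gives G of about 3.33, so the treble boost is about 1 dB rather than the full 3 dB of the fixed preset.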

In order to facilitate an understanding of the present invention, a practical example will be described below to illustrate an application of the embodiments of the present invention.

Assume that three people are within the camera's field of view, positioned along a line parallel to the horizontal axis of the television: two elderly people and one child. The two elderly people are 2 m to the right of the central axis of the screen, and the child is 1 m to the left of it. The user information of the three people is acquired through the camera and packaged, and the key data format is as follows: (3, n₁(2, 0, elderly), n₂(2, 0, elderly), n₃(-1, 0, child)). Here 3 represents the total number of users, and n₁(2, 0, elderly) indicates that the first user is at coordinates (2, 0) and belongs to the elderly age group.
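The packaged format described above (a user count followed by one record per user) can be sketched with a small data class. `UserRecord` and `pack_users` are hypothetical names, and the tuple layout is only one way to realize the format.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class UserRecord:
    """One detected user: coordinates plus the recognized age group."""
    x: float
    y: float
    age_group: str


def pack_users(users: Sequence[UserRecord]) -> Tuple:
    """Pack users as (count, (x, y, age_group), (x, y, age_group), ...)."""
    return (len(users), *((u.x, u.y, u.age_group) for u in users))


packet = pack_users([
    UserRecord(2, 0, "elderly"),
    UserRecord(2, 0, "elderly"),
    UserRecord(-1, 0, "child"),
])
print(packet)  # (3, (2, 0, 'elderly'), (2, 0, 'elderly'), (-1, 0, 'child'))
```

Leading with the count lets a receiver validate the packet length before reading the individual records.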

The channel balance formula gives S = 1·s/W, where W is half of the image resolution width; S is positive, which indicates that most users are on the right side of the television. The sound mode adjusting module therefore adjusts the right channel gain and the left channel gain so that more of the sound is concentrated on the right, where most of the users are; how much larger the right channel gain is than the left can be preset in advance. From the users' age group information, the proportions are 1/3 for children, 0 for middle-aged people, and 2/3 for elderly people, that is, h = 1/3, j = 0, and k = 2/3 in the notation of Table 1. Since k ≥ h + j, the elderly group is dominant and G is +10, so the treble, midrange, and bass gains FA(G), FB(G), and FC(G) are set to +3 dB, +1 dB, and -2 dB respectively. This boosts the midrange and treble and compensates for the reduced mid- and high-frequency sensitivity of elderly ears.

Referring to fig. 3, a second embodiment of the present invention further provides a television sound adjusting device based on image recognition, which includes:

a user information obtaining unit 210, configured to obtain, from an image captured by a camera, user information of the users currently watching the television;

a target sound parameter determining unit 220, configured to determine a target sound parameter of the television according to the user information;

the adjusting unit 230 is configured to adjust the corresponding sound parameters of the television according to the target sound parameters, so that the adjusted sound parameters can meet the hearing requirements of most users currently watching the television.

The third embodiment of the present invention also provides a television set, which includes:

a camera module capturing an image of a user watching television;

and a processor connected with the camera module and configured to:

acquiring user information based on the image;

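determining a left channel gain and a right channel gain of the sound system according to a target channel balance value S, which is calculated as:

$$S = \frac{s}{N \cdot W} \sum_{i=1}^{N} x_i$$

wherein:

N is the total number of users; W is half of the resolution width of the image; (-s, s) is the adjusting range of the channel balance value; and $x_1, x_2, \ldots, x_N$ are the abscissa values of the coordinates of the respective users.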

The fourth embodiment of the present invention also provides a computer readable storage medium storing a computer program, where the computer program can be executed by a processor of a device where the readable storage medium is located, so as to implement the television sound adjustment method based on image recognition as described above.

In the embodiments provided in the present invention, it should be understood that the disclosed method may be implemented in other manners. The apparatus and method embodiments described above are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A television sound adjustment method based on image recognition, comprising:

acquiring an image of a user currently watching a television, and acquiring user information based on the image; wherein the user information includes an age group of each user; the age group comprises an elderly section, a middle-aged and young section and a child section;

determining target sound parameters of a sound system of the television according to the user information, comprising: obtaining the proportion of each age group according to the age group of each user; and determining the gains of the sound system in the treble, midrange and bass according to the proportion of each age group; wherein determining the gains of the sound system in the treble, midrange and bass according to the proportion of each age group specifically comprises: determining the currently dominant age group according to the proportion of each age group; acquiring an adjustment value corresponding to the dominant age group; and determining the gains of the sound system in the treble, midrange and bass according to the adjustment value; wherein each adjustment value G corresponds to a set of treble, midrange and bass gains; when the dominant age group is the elderly group, the treble gain is greater than the midrange gain and greater than the bass gain; when the dominant age group is the child group, the treble gain is less than the midrange gain and less than the bass gain;

2. The image recognition-based television sound adjustment method according to claim 1, wherein the user information includes a total number of users and position information of each user; the position information is represented by a coordinate, the origin of the coordinate system in which the coordinate lies is the center point of the image, the horizontal axis of the coordinate system runs along the width direction of the image, and the vertical axis of the coordinate system runs along the height direction of the image;
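The coordinate system described in claim 2 can be sketched as a conversion from ordinary pixel coordinates. This is an illustration under stated assumptions: the function name `to_centered_coords` is hypothetical, the input is assumed to use a top-left origin with the y axis pointing down, and the sign of the vertical axis (positive upward here) is an assumption, since the claim only states that the axis runs along the image height.

```python
def to_centered_coords(px, py, width, height):
    """Convert a pixel position to coordinates whose origin is the image center.

    px, py: pixel position, assumed measured from the top-left corner.
    width, height: image resolution in pixels.
    """
    x = px - width / 2   # horizontal axis along the image width
    y = height / 2 - py  # vertical axis along the image height (up is positive, by assumption)
    return x, y
```

With a 1920 x 1080 image, the pixel at (960, 540) maps to the origin, and a user detected at the right edge has an abscissa equal to half the image width.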

3. The image recognition-based television sound adjustment method according to claim 2, wherein the calculation formula of the target channel balance value S is:

wherein:

N is the total number of users, and W is half of the resolution width of the image; x1, x2, …, xN are the abscissa values of the coordinates of the respective users.

4. The image recognition-based television sound adjustment method of claim 3, wherein each target channel balance value corresponds to a set of left channel gain and right channel gain, and the right channel gain is greater than the left channel gain when the target channel balance value is greater than 0, and the right channel gain is less than the left channel gain when the target channel balance value is less than 0.
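The formula for S is not reproduced in this text, so the following Python sketch assumes one plausible form that is consistent with the stated symbols: the mean of the users' abscissa values divided by W, which keeps S between -1 and 1. The function names, the linear mapping from S to gains, and the `max_db` parameter are hypothetical; only the sign relationship between S and the two channel gains comes from claim 4.

```python
def balance_value(xs, w):
    """Assumed form of S: mean abscissa normalized by half the image width W."""
    if not xs:
        # Assumption: with no detected users, keep the channels balanced.
        return 0.0
    return sum(xs) / (len(xs) * w)


def channel_gains(s, max_db=6.0):
    """Hypothetical linear mapping from S to (left_gain, right_gain) in dB.

    S > 0 gives a right gain above the left gain; S < 0 gives the opposite.
    """
    right = s * max_db
    left = -s * max_db
    return left, right
```

Under this assumed form, two users placed symmetrically about the image center give S = 0 and equal channel gains, while a single user at the right edge gives S = 1.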

5. A television sound adjusting device based on image recognition, comprising:

a user information acquisition unit, configured to acquire an image of a user currently watching a television and to acquire user information based on the image; wherein the user information includes an age group of each user; the age groups comprise an elderly group, a young and middle-aged group and a child group;

a target sound parameter determining unit, configured to determine target sound parameters of a sound system of the television according to the user information, comprising: obtaining the proportion of each age group according to the age group of each user; and determining the treble, midrange and bass gains of the sound system according to the proportion of each age group; wherein determining the treble, midrange and bass gains of the sound system according to the proportion of each age group specifically comprises: determining the currently dominant age group according to the proportion of each age group; acquiring an adjustment value corresponding to the dominant age group; and determining the treble, midrange and bass gains of the sound system according to the adjustment value; when the dominant age group is the elderly group, the treble gain is greater than the midrange gain and greater than the bass gain; when the dominant age group is the child group, the treble gain is smaller than the midrange gain and smaller than the bass gain;

6. A television set, comprising:

a camera module configured to capture an image of a user watching the television;

and a processor connected with the camera module and configured to:

acquiring user information based on the image; wherein the user information includes an age group of each user; the age groups comprise an elderly group, a young and middle-aged group and a child group;

determining target sound parameters of a sound system of the television according to the user information, comprising: obtaining the proportion of each age group according to the age group of each user; and determining the treble, midrange and bass gains of the sound system according to the proportion of each age group; wherein determining the treble, midrange and bass gains of the sound system according to the proportion of each age group specifically comprises: determining the currently dominant age group according to the proportion of each age group; acquiring an adjustment value corresponding to the dominant age group; and determining the treble, midrange and bass gains of the sound system according to the adjustment value; when the dominant age group is the elderly group, the treble gain is greater than the midrange gain and greater than the bass gain; when the dominant age group is the child group, the treble gain is smaller than the midrange gain and smaller than the bass gain;

7. The television set according to claim 6, wherein the user information includes a total number of users and position information of each user; the position information is represented by a coordinate, the origin of the coordinate system in which the coordinate lies is the center point of the image, the horizontal axis of the coordinate system runs along the width direction of the image, and the vertical axis of the coordinate system runs along the height direction of the image;

determining a left channel gain and a right channel gain of the sound system according to the target channel balance value;

the calculation formula of the target channel balance value S is as follows:

wherein: