WO2023040547A1 - Volume adjustment method and apparatus, terminal, and computer-readable storage medium - Google Patents

Volume adjustment method and apparatus, terminal, and computer-readable storage medium

Info

Publication number
WO2023040547A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
information
distance
volume
processor
Prior art date
Application number
PCT/CN2022/112705
Other languages
French (fr)
Chinese (zh)
Inventor
吴文飞
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2023040547A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72439 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for image or video messaging
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers

Definitions

  • the present application relates to the technical field of volume adjustment, and more specifically, to a volume adjustment method, a volume adjustment device, a terminal, and a non-volatile computer-readable storage medium.
  • the user often adjusts the volume by pressing the volume adjustment button on the terminal.
  • when the distance between the user and the terminal changes and only the button is provided to adjust the volume, the user cannot obtain the best playback volume.
  • Embodiments of the present application provide a volume adjustment method, a volume adjustment device, a terminal, and a non-volatile computer-readable storage medium.
  • the volume adjustment method in the embodiment of the present application includes acquiring a face image, the face image including shaking information; calculating the distance between the face and the electronic device according to the face image; and when the shaking information is within a first preset range, adjusting the playback volume according to the distance.
  • the volume adjustment device in the embodiment of the present application includes an acquisition module, a calculation module and an adjustment module.
  • the acquisition module is used to obtain a face image, and the face image includes shaking information.
  • the calculating module is used for calculating the distance between the human face and the electronic device according to the human face image.
  • the adjustment module is configured to adjust the playback volume according to the distance when the shaking information is within a first preset range.
  • the terminal in the embodiment of the present application includes a processor.
  • the processor is used to acquire a human face image, the human face image including shaking information; calculate the distance between the human face and the electronic device according to the human face image; and when the shaking information is within a first preset range, adjust the playback volume according to the distance.
  • the non-volatile computer-readable storage medium of the embodiment of the present application contains a computer program.
  • when the computer program is executed by a processor, the processor is made to perform the following volume adjustment method: acquiring a face image, the face image including shaking information; calculating the distance between the human face and the electronic device according to the face image; and when the shaking information is within a first preset range, adjusting the playback volume according to the distance.
  • FIG. 1 is a schematic flowchart of a volume adjustment method in some embodiments of the present application.
  • FIG. 2 is a schematic diagram of a volume adjustment device in some embodiments of the present application.
  • FIG. 3 is a schematic plan view of a terminal in some embodiments of the present application.
  • FIG. 4 is a schematic diagram of a scene of a volume adjustment method in some embodiments of the present application.
  • FIG. 5 is a schematic flowchart of a volume adjustment method in some embodiments of the present application.
  • FIG. 6 is a schematic diagram of a scene of a volume adjustment method in some embodiments of the present application.
  • FIG. 7 and FIG. 8 are schematic flowcharts of volume adjustment methods in some embodiments of the present application.
  • FIG. 9 is a schematic scene diagram of a volume adjustment method in some embodiments of the present application.
  • FIG. 13 is a schematic diagram of a connection state between a non-volatile computer-readable storage medium and a processor in some embodiments of the present application.
  • the volume adjustment method in the embodiment of the present application includes acquiring a face image, the face image including shaking information; calculating the distance between the face and the electronic device according to the face image; and when the shaking information is within a first preset range, adjusting the playback volume according to the distance.
  • the volume adjustment method includes acquiring a plurality of consecutive frames of the human face images within a first predetermined duration; judging whether the difference between the position coordinates of the face in any two of the consecutive frames of face images is within the first preset range; and if so, determining that the shaking information is within the first preset range.
  • the face image further includes angle information, and the adjusting the playback volume according to the distance includes: when the shaking information is within the first preset range and the angle information is within a second preset range, adjusting the playback volume according to the distance.
  • the volume adjustment method further includes acquiring a plurality of consecutive frames of the human face image within a second predetermined duration; judging whether the angle information of the human face in the consecutive frames of the human face image is within the second preset range; and if so, determining that the angle information is within the second preset range.
  • before calculating the distance between the human face and the electronic device according to the human face image, the volume adjustment method further includes receiving an input operation to set the priorities of the faces of a plurality of different users; and acquiring the first face information of the face with the highest priority in the face image as the target face information; the calculating the distance between the human face and the electronic device according to the human face image includes calculating the distance between the human face and the electronic device according to the target face information.
  • the acquiring the first face information of the face with the highest priority in the face image as the target face information includes identifying the second face information of one or more faces in the face image; comparing the one or more pieces of second face information with the pre-stored face information in the preset face database, to obtain the second face information that matches the pre-stored face information as the first face information; and acquiring the first face information of the face with the highest priority as the target face information.
  • the pre-stored face information is generated according to the face images of different users under different lighting conditions.
  • the pre-stored face information is generated according to the face images of different users at different shooting angles.
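  • the priority-based selection described above can be sketched roughly as follows. This is only an illustrative sketch: the names `PRIORITY_BY_USER` and `select_target_face`, the use of user ids in place of face information, and the convention that a larger number means a higher priority are all assumptions, not part of the application. The face-matching step itself is assumed to have already produced a user id for each detected face.

```python
from typing import Optional

# Hypothetical pre-stored database: user id -> priority (larger means higher priority).
PRIORITY_BY_USER = {"alice": 3, "bob": 1, "carol": 2}


def select_target_face(detected_user_ids: list[str],
                       priority_by_user: dict[str, int]) -> Optional[str]:
    """Return the enrolled user with the highest priority among the detected faces.

    `detected_user_ids` stands in for the result of matching each detected face
    (second face information) against the pre-stored face information. Faces
    without a match are absent from `priority_by_user` and are ignored.
    """
    # Matched faces play the role of the "first face information".
    matched = [uid for uid in detected_user_ids if uid in priority_by_user]
    if not matched:
        return None
    # The highest-priority matched face becomes the target face.
    return max(matched, key=lambda uid: priority_by_user[uid])
```

  • with this sketch, a frame containing "bob", "carol" and an unknown face would select "carol", and a frame containing only unknown faces would select no target.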
  • the volume adjustment method further includes setting an initial volume at an initial distance according to an input operation, and associating the initial distance with the initial size of the face in the face image collected at the initial distance; the calculating the distance between the human face and the electronic device according to the human face image includes calculating the distance according to the initial distance, the initial size, and the average of the face sizes in multiple frames of the face images.
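  • one possible way to compute the distance from the initial distance, the initial size and the average face size is sketched below. The application does not state the formula; the sketch assumes a simple pinhole-camera relationship in which the apparent face size is inversely proportional to the distance. The function name `estimate_distance` and its parameters are hypothetical.

```python
from statistics import mean


def estimate_distance(initial_distance: float,
                      initial_size: float,
                      frame_sizes: list[float]) -> float:
    """Estimate the face-to-device distance from the average face size over frames.

    Assumption (not stated in the source): apparent size is inversely
    proportional to distance, so distance = initial_distance * initial_size / size.
    """
    if not frame_sizes:
        raise ValueError("at least one face size is required")
    average_size = mean(frame_sizes)
    if average_size <= 0:
        raise ValueError("face sizes must be positive")
    return initial_distance * initial_size / average_size
```

  • averaging the size over several frames, as the text describes, reduces the influence of a single noisy detection on the estimated distance.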
  • the adjusting the playback volume according to the distance includes determining an adjustment volume according to the initial distance, the distance, and the initial volume; and adjusting the playback volume according to the adjustment volume.
  • the volume adjustment device in the embodiment of the present application includes an acquisition module, a calculation module and an adjustment module.
  • the acquisition module is used to obtain a face image, and the face image includes shaking information.
  • the calculating module is used for calculating the distance between the human face and the electronic device according to the human face image.
  • the adjustment module is configured to adjust the playback volume according to the distance when the shaking information is within a first preset range.
  • the terminal in the embodiment of the present application includes a processor.
  • the processor is used to acquire a human face image, the human face image including shaking information; calculate the distance between the human face and the electronic device according to the human face image; and when the shaking information is within a first preset range, adjust the playback volume according to the distance.
  • the processor is configured to acquire multiple consecutive frames of the human face images within a first predetermined duration; determine whether the difference between the position coordinates of the face in any two of the consecutive frames of face images is within the first preset range; and if so, determine that the shaking information is within the first preset range.
  • the face image further includes angle information, and the processor is configured to adjust the playback volume according to the distance when the shaking information is within the first preset range and the angle information is within a second preset range.
  • the processor is configured to acquire multiple consecutive frames of the human face images within a second predetermined duration; determine whether the angle information of the human faces in the consecutive frames of the human face images is within the second preset range; and if so, determine that the angle information is within the second preset range.
  • before the processor calculates the distance between the human face and the electronic device according to the human face image, the processor is configured to: receive an input operation to set the priorities of the faces of a plurality of different users; acquire the first face information of the face with the highest priority in the face image as the target face information; and calculate the distance between the human face and the electronic device according to the target face information.
  • the processor is configured to identify the second face information of one or more faces in the face image; compare the one or more pieces of second face information with the pre-stored face information in the preset face database, to obtain the second face information that matches the pre-stored face information as the first face information; and acquire the first face information of the face with the highest priority as the target face information.
  • the processor is configured to generate the prestored face information according to the face images of different users under different lighting conditions.
  • the processor is configured to generate the prestored face information according to the face images of different users at different shooting angles.
  • the processor is configured to set an initial volume at an initial distance according to an input operation, and associate the initial distance with the initial size of the face in the face image collected at the initial distance; and calculate the distance according to the initial distance, the initial size, and the average of the face sizes in multiple frames of the face images.
  • the processor is configured to determine an adjusted volume according to the initial distance, the distance, and the initial volume; and adjust the playback volume according to the adjusted volume.
  • an embodiment of the present application provides a volume adjustment method.
  • the volume adjustment method includes steps:
  • an embodiment of the present application provides a volume adjustment device 10 .
  • the volume adjustment device 10 includes an acquisition module 11 , a calculation module 12 and an adjustment module 13 .
  • the volume adjustment method in the embodiments of the present application can be applied to the volume adjustment device 10 .
  • the acquisition module 11 is used to execute step 101
  • the calculation module 12 is used to execute step 102
  • the adjustment module 13 is used to execute step 103. That is, the acquisition module 11 is used to obtain a face image, and the face image includes shaking information.
  • the calculation module 12 is used for calculating the distance between the human face and the electronic device according to the human face image.
  • the adjustment module 13 is configured to adjust the playback volume according to the distance when the shaking information is within the first preset range.
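  • the division of work among the three modules can be pictured with the following minimal sketch. It is not the structure of the device 10 itself: the class names `FaceImage` and `VolumeAdjustmentDevice`, the callables passed in, and the idea of carrying a precomputed `shake_in_range` flag on the image are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FaceImage:
    # Hypothetical container: face size in pixels and whether shake is in range.
    face_size: float
    shake_in_range: bool


@dataclass
class VolumeAdjustmentDevice:
    acquire: Callable[[], FaceImage]                  # acquisition module (step 101)
    calculate_distance: Callable[[FaceImage], float]  # calculation module (step 102)
    volume_for_distance: Callable[[float], float]     # used by the adjustment module (step 103)
    playback_volume: float = 0.0

    def run_once(self) -> float:
        """Run steps 101 to 103 once and return the resulting playback volume."""
        image = self.acquire()
        distance = self.calculate_distance(image)
        # Adjust only when the shaking information is within the first preset range.
        if image.shake_in_range:
            self.playback_volume = self.volume_for_distance(distance)
        return self.playback_volume
```

  • when the face is shaking, `run_once` leaves the playback volume unchanged, which mirrors the gating condition of step 103.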
  • the embodiment of the present application further provides a terminal 100 .
  • the terminal 100 includes a processor 30 .
  • the volume adjustment method in the embodiments of the present application may be applied to the terminal 100 .
  • the processor 30 is configured to execute step 101 , step 102 and step 103 . That is, the processor 30 is used to acquire a face image, and the face image includes shake information; calculate the distance between the face and the electronic device according to the face image; and adjust the playback volume according to the distance when the shake information is within a first preset range.
  • the terminal 100 further includes a housing 40 .
  • the terminal 100 may be a mobile phone, a tablet computer, a display device, a notebook computer, an automated teller machine, a gate, a smart watch, a head-mounted display device, a game console, and the like. As shown in FIG. 3, the embodiment of the present application is described by taking a mobile phone as an example of the terminal 100. It can be understood that the specific form of the terminal 100 is not limited to a mobile phone.
  • the housing 40 can also be used to install functional modules such as a display device, an imaging device, a power supply device, and a communication device of the terminal 100, so that the housing 40 provides protection for the functional modules such as dustproof, dropproof, and waterproof.
  • the processor 30 first needs to determine, according to the shaking information in the face image, whether the face (that is, the user) is shaking, that is, whether the shaking information is within the first preset range.
  • the first preset range may be a position where the face is not shaken.
  • the first preset range may also be the maximum range that the human face can allow shaking, that is, when the range is exceeded, the processor determines that the human face shakes.
  • the shaking information may include the position of the human face, and the first preset range may be a preset position at which the human face is not shaking; the processor may judge whether the face shakes by determining whether the position of the human face in the face image is at the preset position. For example, when the processor judges that the position of the human face is at the preset position, the processor determines that the shaking information is within the first preset range, that is, the human face is not shaking; when the processor judges that the position of the human face is not at the preset position, the processor determines that the shaking information is not within the first preset range, that is, the human face is shaking.
  • before adjusting the playback volume of the terminal 100, the processor 30 acquires multiple frames of face images, detects only the faces in those images, and compares whether the position of the face changes greatly across the frames. From this comparison it can determine whether the shaking information is within the first preset range, and thus whether the face shakes.
  • when the processor 30 finds, by comparing the positions of the face in the multiple frames of face images, that the position changes greatly (that is, the position difference of the face across the frames is outside the first preset range), the processor 30 determines that the shaking information is not within the first preset range and that the face shakes. When the processor 30 finds that the position of the face does not change, or changes only slightly (that is, the position difference of the face across the frames is within the first preset range), the processor 30 determines that the shaking information is within the first preset range and that the face does not shake.
  • the processor 30 can calculate the current distance between the face and the electronic device (ie, the terminal 100 ) according to the face image.
  • a mapping relationship between the size of the human face and the distance can be preset in the terminal 100; that is, the size of the human face reflects the distance between the human face and the terminal 100, so that the processor 30 can obtain the distance between the face and the electronic device from the size of the face in the face image.
  • the terminal 100 can preset the face images corresponding to distances of 0.5 meters and 1 meter between the human face and the electronic device as face image P1 and face image P2, respectively. The size of the face differs between the two images: the face in face image P2 is smaller than the face in face image P1.
  • when the processor 30 acquires a face image, it can compare the face in that image with the face sizes in face image P1 and face image P2, respectively.
  • when the size of the face in the acquired face image is the same as the size of the face in face image P1, the processor 30 can conclude that the distance between the face and the electronic device is 0.5 meters.
  • when the size of the face in the acquired face image is the same as the size of the face in face image P2, the processor 30 can conclude that the distance between the face and the electronic device is 1 meter.
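  • the comparison against the preset face images P1 and P2 can be sketched as a lookup in a small reference table. The 0.5 meter and 1 meter distances come from the text; the pixel widths, the names `REFERENCE_SIZES` and `lookup_distance`, and the choice to pick the closest reference size rather than require an exact match are assumptions for illustration only.

```python
# Hypothetical reference table: face width in pixels -> distance in meters.
# The 0.5 m and 1 m distances come from the text; the pixel widths are made up.
REFERENCE_SIZES = {240.0: 0.5, 120.0: 1.0}


def lookup_distance(face_size: float,
                    reference_sizes: dict[float, float] = REFERENCE_SIZES) -> float:
    """Return the distance whose reference face size is closest to `face_size`."""
    closest_size = min(reference_sizes, key=lambda ref: abs(ref - face_size))
    return reference_sizes[closest_size]
```

  • choosing the closest reference is a practical relaxation, since a measured face size will rarely equal a stored size exactly; a denser table or interpolation would give finer distance estimates.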
  • after the processor 30 determines that the human face is not shaking, that is, that the shaking information is within the first preset range, and calculates the distance between the human face and the electronic device, the processor 30 obtains the volume corresponding to the distance and adjusts the playback volume of the terminal 100 accordingly.
  • a mapping relationship between a predetermined distance and a predetermined volume set by the user can be stored in the terminal 100 in advance.
  • the processor 30 can compare the distance with the predetermined distance to obtain the change ratio of the distance relative to the predetermined distance, and then calculate the product of the change ratio and the predetermined volume to obtain the volume corresponding to the current distance. The processor 30 can then adjust the playback volume of the terminal 100 according to this volume, that is, adjust the playback volume of the terminal 100 to this volume.
  • the processor 30 can compare the current distance with the predetermined distance to obtain the change ratio of the distance, and then use the relationship between sound pressure and distance, together with the change ratio, to obtain the amount by which the playback volume of the terminal 100 theoretically needs to be adjusted relative to the predetermined volume at the current distance, so as to adjust the playback volume of the terminal 100.
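  • the ratio-based calculation described above (change ratio multiplied by the predetermined volume) can be sketched as follows. The function name `volume_for_distance`, the clamping to a maximum volume, and the default `max_volume` of 100 are assumptions added for illustration; the application only describes the ratio and the product.

```python
def volume_for_distance(distance: float,
                        predetermined_distance: float,
                        predetermined_volume: float,
                        max_volume: float = 100.0) -> float:
    """Scale the predetermined volume by the distance change ratio.

    The result is clamped to [0, max_volume]; the clamp is an assumption,
    not something the source specifies.
    """
    if predetermined_distance <= 0:
        raise ValueError("predetermined_distance must be positive")
    change_ratio = distance / predetermined_distance
    return max(0.0, min(max_volume, change_ratio * predetermined_volume))
```

  • for example, with a predetermined volume of 30 at 0.5 meters, moving to 1 meter doubles the ratio and gives a volume of 60. Whether a linear scale is appropriate depends on how the terminal maps volume steps to sound pressure, which the text leaves open.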
  • the volume adjustment method, the volume adjustment device 10 and the terminal 100 of the embodiment of the present application adjust the playback volume according to the distance between the face and the electronic device when the shaking information of the face image is within the first preset range, that is, when the face is not shaking. Thus, it can be ensured that the playback volume is not adjusted when the user shakes unconsciously while using the terminal 100, which improves the accuracy of judging whether to adjust the playback volume, so that the user can obtain the best volume experience.
  • the volume adjustment method of the embodiment of the present application also includes steps:
  • 501 Obtain a face image, where the face image includes shaking information
  • the acquisition module 11 is used to execute step 501, step 503, step 504 and step 505, the calculation module 12 is used to execute step 502, and the adjustment module 13 is used to execute step 506. That is, the acquisition module 11 is used to obtain a face image, and the face image includes shaking information; obtain multiple consecutive frames of human face images within the first predetermined duration; judge whether, among the consecutive frames of human face images, the difference between the position coordinates of the faces in any two frames is within a first preset range; and if so, determine that the shaking information is within the first preset range.
  • the calculation module 12 is used for calculating the distance between the human face and the electronic device according to the human face image.
  • the adjustment module 13 is configured to adjust the playback volume according to the distance when the shaking information is within the first preset range.
  • the processor 30 is configured to execute step 501 , step 502 , step 503 , step 504 , step 505 and step 506 . That is, the processor 30 obtains a face image, and the face image includes shaking information; calculates the distance between the face and the electronic device according to the face image; obtains continuous multi-frame face images in the first predetermined duration; judges the continuous multi-frame face images , whether the difference between the position coordinates of the faces in any two frames of face images is within a first preset range; and if so, determining that the shaking information is within a first preset range. When the shaking information is within the first preset range, the playback volume is adjusted according to the distance.
  • step 501 is executed in the same manner as above-mentioned step 101
  • step 502 is executed in the same manner as above-mentioned step 102
  • step 506 is executed in the same manner as above-mentioned step 103, which will not be repeated here.
  • the processor 30 will acquire multiple consecutive frames of human face images within the first predetermined duration, so as to determine whether the shaking information is within the first preset range. Whether the shaking information is within the first preset range also reflects whether the face shakes within the first predetermined duration.
  • the shaking of the human face may be the shaking that occurs on the terminal 100 when the user operates the terminal 100, or the shaking that occurs unconsciously by the user himself, that is, the shaking of the human face is the relative shaking between the terminal 100 and the user, and is not limited to Jitter generated by the user itself.
  • for example, the first predetermined duration is 1 second, and the processor 30 acquires 5 consecutive frames of human face images within 1 second.
  • the processor 30 then compares the position coordinates of the faces in any two of these frames of face images to obtain the coordinate difference, for example the difference between the position coordinates of the face in the first frame and the second frame, the difference between the position coordinates of the face in the first frame and the fifth frame, or the difference between the position coordinates of the face in the second frame and the fourth frame.
  • the difference of the position coordinates of the face in any two frames of face images can be the difference between the position coordinates of the center point of the face in the two frames, or the difference between the position coordinates of facial feature points (such as eye feature points, mouth feature points and nose feature points) in the two frames.
  • the processor 30 compares whether the difference of the position coordinates is within the first preset range to determine whether the human face shakes.
  • the first preset range represents the maximum value that allows the positions of the faces in any two frames of face images to change.
  • for example, the processor 30 can calculate the difference between the position coordinates of the mouth corner feature point Q1 in FIG. 6(a) and the position coordinates of the mouth corner feature point Q2 in FIG. 6(b), to obtain the difference of the position coordinates in any two frames of face images.
  • if the coordinates of Q1 are (1, 1.5) and the coordinates of Q2 are (1, 2), the difference between the position coordinates of Q1 and Q2 is (0, 0.5).
  • if the first preset range is (0.5, 0.5), that is, the maximum distance by which the position of the face in any two frames of face images is allowed to change on the X-axis and on the Y-axis is 0.5 units, the difference between the position coordinates of Q1 and Q2 is within the first preset range, which means that the face does not shake in the multiple frames of face images within the first predetermined duration. If the first preset range is (0, 0.25), the difference between the position coordinates of Q1 and Q2 is not within the first preset range, which means that the face shakes in the multiple frames of face images within the first predetermined duration.
  • the processor 30 compares whether the absolute value of the difference between the position coordinates is within the first preset range. For example, if the first preset range is (1, 1) and the processor 30 determines that the absolute value of the difference between the position coordinates is (2, 2), which is not within the first preset range (1, 1), then the difference between the position coordinates of the faces in any two frames of face images is not within the first preset range, that is, the face shakes in the multiple frames of face images within the first predetermined duration.
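  • the pairwise comparison of face positions across frames can be sketched as below. The function name `shake_within_range` and the representation of the first preset range as a pair of per-axis maxima are assumptions; the inclusive comparison follows the worked example in the text, where a difference of 0.5 counts as within a range of (0.5, 0.5).

```python
from itertools import combinations

Point = tuple[float, float]


def shake_within_range(positions: list[Point], preset_range: Point) -> bool:
    """True if, for every pair of frames, |dx| and |dy| stay within the preset range.

    `positions` holds one face position (for example a mouth corner feature
    point) per frame; `preset_range` is (max |dx|, max |dy|).
    """
    max_dx, max_dy = preset_range
    for (x1, y1), (x2, y2) in combinations(positions, 2):
        # Compare the absolute value of the coordinate difference on each axis.
        if abs(x2 - x1) > max_dx or abs(y2 - y1) > max_dy:
            return False
    return True
```

  • with Q1 = (1, 1.5) and Q2 = (1, 2) from the text, the check passes for a preset range of (0.5, 0.5) and fails for (0, 0.25), matching the two cases described above.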
  • when the processor 30 judges that, among the consecutive frames of face images, the difference between the position coordinates of the faces in any two frames is within the first preset range, the processor 30 determines that the face has not shaken.
  • when the processor 30 judges that, among the consecutive frames of face images, the difference between the position coordinates of the faces in any two frames is not within the first preset range, the processor 30 determines that the face shakes. At this time, it means that the user does not wish to adjust the playback volume of the terminal 100, and the processor 30 will not adjust the playback volume of the terminal 100 either.
  • the face image also includes angle information
  • the volume adjustment method of the embodiment of the present application also includes the steps:
  • the acquisition module 11 is used to execute step 701
  • the calculation module 12 is used to execute step 702
  • the adjustment module 13 is used to execute step 703 . That is, the acquisition module 11 is used to acquire a face image, and the face image includes shaking information.
  • the calculation module 12 is used for calculating the distance between the human face and the electronic device according to the human face image.
  • the adjustment module 13 is configured to adjust the playback volume according to the distance when the shaking information is within the first preset range and the angle information is within the second preset range.
  • the processor 30 is further configured to execute step 701, step 702 and step 703. That is, the processor 30 is used to obtain a face image, and the face image includes shaking information; calculate the distance between the face and the electronic device according to the face image; and when the shaking information is within a first preset range and the angle information is within a second preset range, adjust the playback volume according to the distance.
  • step 701 and step 702 are performed in the same way as the above-mentioned step 101 and step 102 respectively, and will not be repeated here.
  • the distance between the user's face and the terminal 100 (electronic device) will also change, and at this time, the user does not need to adjust the playback volume of the terminal 100 .
  • when the processor 30 adjusts the playback volume according to the distance between the human face and the electronic device, it also needs to determine whether the angle information in the face image is within the second preset range.
  • the processor 30 will adjust the playback volume of the terminal 100 only when the angle information is in the second preset range and the shaking information is in the first preset range.
  • the second preset range may include a preset angle between the human face and the terminal and a corresponding preset orientation.
  • the preset angle may be 70 degrees, which means that, relative to the terminal 100, the angle threshold for turning the head left, turning the head right, raising the head and lowering the head is 70 degrees in each case.
  • the angle information includes the angle and orientation between the face in the face image and the terminal.
  • the processor 30 may determine whether to adjust the playback volume according to the distance by judging whether the angle between the target face image and the terminal 100 in the multiple frames of face images is within the second preset range. For example, when the angle between the target face image and the terminal 100 is smaller than the preset angle, the processor 30 determines that the angle between the target face image and the terminal 100 is within the second preset range; when the angle between the target face image and the terminal 100 is greater than the preset angle, the processor 30 determines that the angle between the target face image and the terminal 100 is not within the second preset range.
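  • the angle check and the combined gating condition can be sketched as follows. The 70 degree threshold comes from the text; representing the angle information as a (yaw, pitch) pair in degrees per frame, treating an angle equal to the threshold as within range, and the names `angle_within_range` and `should_adjust_volume` are assumptions for illustration.

```python
def angle_within_range(frame_angles: list[tuple[float, float]],
                       preset_angle: float = 70.0) -> bool:
    """True if yaw and pitch stay within +/- preset_angle degrees in every frame.

    Yaw covers turning the head left or right; pitch covers raising or
    lowering the head. Both are measured relative to the terminal.
    """
    return all(abs(yaw) <= preset_angle and abs(pitch) <= preset_angle
               for yaw, pitch in frame_angles)


def should_adjust_volume(shake_ok: bool, angle_ok: bool) -> bool:
    """Adjust only when both the shake check and the angle check pass."""
    return shake_ok and angle_ok
```

  • a frame in which the head is turned 85 degrees away from the terminal would fail the angle check, so the playback volume would stay unchanged even if the face position is steady.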
  • the volume adjustment method of the embodiment of the present application also includes steps:
  • the acquisition module 11 is further configured to execute step 801, step 803, step 804 and step 805, the calculation module 12 is configured to execute step 802, and the adjustment module 13 is configured to execute step 806. That is, the acquisition module 11 is used to obtain a face image, the face image including shaking information; obtain multiple consecutive frames of face images within the second predetermined duration; judge whether the angle information of the face in the consecutive frames of face images is within the second preset range; and if so, determine that the angle information is within the second preset range.
  • the calculation module 12 is used for calculating the distance between the human face and the electronic device according to the human face image.
  • the adjustment module 13 is configured to adjust the playback volume according to the distance when the shaking information is within the first preset range and the angle information is within the second preset range.
  • the processor 30 is further configured to execute step 801, step 802, step 803, step 804, step 805 and step 806. That is, the processor 30 obtains a face image, the face image including shaking information; calculates the distance between the face and the electronic device according to the face image; obtains multiple consecutive frames of face images within the second predetermined duration; judges whether the angle information of the face in the consecutive frames of face images is within the second preset range; if so, determines that the angle information is within the second preset range; and, when the shaking information is within the first preset range and the angle information is within the second preset range, adjusts the playback volume according to the distance.
  • step 801 is executed in the same manner as above-mentioned step 701
  • step 802 is executed in the same manner as above-mentioned step 702
  • step 806 is executed in the same manner as above-mentioned step 703, which will not be repeated here.
  • the processor 30 also acquires multiple consecutive frames of human face images within a second predetermined time period, and determines whether the angle information of the human face in the continuous multiple frames of human face images is within a second preset range. Whether the angle information is within the second preset range can also reflect whether the angle of the face within the second predetermined time period is valid.
  • the second predetermined duration may be greater than the first predetermined duration, may also be shorter than the first predetermined duration, and may also be equal to the first predetermined duration.
  • the second preset range is a specific angle representing the orientation.
  • the second preset range may be 70 degrees, which means that, relative to the terminal 100, the angle threshold for the face turning left, turning right, raising the head and lowering the head is 70 degrees. If the processor 30 acquires 5 frames of face images, it judges whether the angle of the face in each of the five frames is less than 70 degrees; if so, it determines that the angle information of the face is within the second preset range, that is, the angle of the face is valid, which means that the user wishes to adjust the playback volume of the terminal 100.
  • FIG. 8 shows a face image P in which the user's head is turned to the right
  • the processor 30 can determine, from the degree to which the user's head is turned to the right in the face image P, whether the angle between the face and the terminal is within the second preset range, and thereby whether the face angle is valid. If the second preset range is 60 degrees and the processor 30 judges that the angle of the user's head turn in FIG. 8 is 80 degrees, then the angle between the human face and the terminal is not within the second preset range, and the face angle is determined to be invalid. If the second preset range is 60 degrees, the processor 30 judges that the angle of the user’s right head in FIG. .
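The angle-validity check described above can be sketched as follows. This is only an illustration: it assumes that each frame yields a yaw and pitch estimate in degrees relative to the terminal, and the names `FaceAngle` and `angle_is_valid` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FaceAngle:
    """Hypothetical per-frame head pose, in degrees relative to the terminal."""
    yaw: float    # negative = turned left, positive = turned right
    pitch: float  # negative = head lowered, positive = head raised


def angle_is_valid(frames: Iterable[FaceAngle], threshold_deg: float = 70.0) -> bool:
    """Return True only if the face stays under the threshold in every frame."""
    frame_list = list(frames)
    if not frame_list:
        # Assumption: with no frames there is no valid face angle.
        return False
    return all(
        abs(f.yaw) < threshold_deg and abs(f.pitch) < threshold_deg
        for f in frame_list
    )
```

With a 60-degree threshold, a frame whose yaw is 80 degrees makes the whole sequence invalid, which mirrors the example in the text.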
  • the processor 30 can determine simultaneously whether the shaking information is within the first preset range and whether the angle information is within the second preset range; the processor 30 can also first determine whether the shaking information is within the first preset range and then determine whether the angle information is within the second preset range; or the processor 30 can first determine whether the angle information is within the second preset range and then determine whether the shaking information is within the first preset range.
  • when the processor 30 determines simultaneously whether the shaking information is within the first preset range and whether the angle information is within the second preset range, the processor 30 will not adjust the playback volume of the terminal 100 if either the shaking information is not within the first preset range or the angle information is not within the second preset range. When the processor 30 makes the two determinations one after the other and the earlier determination does not meet its condition, the processor 30 will not perform the later determination. For example, after the processor 30 first determines that the shaking information is not within the first preset range, it will not go on to determine whether the angle information is within the second preset range. Thus, the workload of the processor 30 can be reduced.
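The sequential variant, in which the second determination is skipped once the first fails, maps naturally onto short-circuit evaluation. The sketch below is illustrative only; `should_adjust_volume` and its two callable parameters are hypothetical names.

```python
from typing import Callable


def should_adjust_volume(
    shaking_ok: Callable[[], bool],
    angle_ok: Callable[[], bool],
) -> bool:
    """Evaluate the two conditions in order, skipping the second if the first fails."""
    # `and` short-circuits: angle_ok() is never called when shaking_ok() returns False.
    return shaking_ok() and angle_ok()
```

Passing the checks as callables rather than precomputed booleans is what saves the work: the angle check only runs when it can still change the outcome. Swapping the two arguments gives the other ordering mentioned in the text.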
  • the volume adjustment method of the embodiment of the present application also includes steps:
  • 1001 receiving an input operation to set the priority of faces of multiple different users
  • the volume adjustment device 10 further includes a setting module 14, the setting module 14 is used to execute step 1001 and step 1002, the acquisition module 11 is used to execute step 1003, the calculation module 12 is used to execute step 1004, and the adjustment module 13 It is used to execute step 1005. That is, the setting module 14 is used to receive an input operation to set the priorities of faces of multiple different users; obtain the first face information of the face with the highest priority in the face image as the target face information.
  • the obtaining module 11 is used for obtaining a face image, and the face image includes face shaking information.
  • the calculation module 12 is used for calculating the distance between the human face and the electronic device according to the target human face information.
  • the adjustment module 13 is configured to adjust the playback volume according to the distance when the shaking information is within the first preset range.
  • the processor 30 is configured to execute step 1001, step 1002, step 1003, step 1004 and step 1005. That is, the processor 30 receives an input operation to set the priorities of the faces of multiple different users; obtains the first face information of the face with the highest priority in the face image as the target face information; obtains the face image, the face image including face shaking information; calculates the distance between the face and the electronic device according to the target face information; and adjusts the playback volume according to the distance when the shaking information is within the first preset range.
  • step 1003 and step 1005 are performed in the same way as the above-mentioned step 101 and step 103 respectively, and will not be repeated here.
  • multiple users may input their own faces in the terminal 100, and the processor 30 may receive an input operation, that is, receive the faces of multiple users.
  • the owner of the terminal 100 can set the priority of the faces of multiple different users through the terminal 100.
  • for example, if the owner of the terminal 100 has entered the faces of three users, including the owner's own face, the owner can set his or her own face as the first priority and the faces of the remaining two users as the second priority and the third priority.
  • the processor 30 may use the first face information of the face with the highest priority among the acquired face images as the target face information.
  • suppose the terminal 100 is provided with faces of three priorities in total, namely a face with the first priority, a face with the second priority, and a face with the third priority. When the processor 30 obtains multiple consecutive frames of face images, it first looks for the face with the first priority; if there is no face with the first priority, it looks for the face with the second priority; and if there is no face with the second priority, it looks for the face with the third priority. It should be noted that, if the face image contains the faces of the first, second and third priorities, the processor 30 will select the first face information of the first-priority (that is, highest-priority) face as the target face information. If the face image contains none of the faces of the first, second and third priorities, the consecutive frames of face images are invalid, and the processor 30 will not execute the volume adjustment method of the embodiment of the present application.
  • the processor 30 can use the target face information to calculate the distance between the face and the electronic device (ie, the terminal 100 ). It can be understood that when the face image contains multiple faces, the processor 30 will first determine the face with the highest priority among the multiple faces, so as to use the face information with the highest priority as the target face information, and When the processor 30 calculates the distance between the human face and the electronic device according to the human face image, it calculates the distance between the human face with the highest priority among the multiple human faces and the electronic device.
  • in this way, the processor 30 adjusts the playback volume only for the owner of the terminal 100, which prevents other faces contained in the acquired multi-frame face images from affecting the adjustment and thereby ensures the accuracy with which the processor 30 adjusts the playback volume.
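The priority lookup described above can be sketched as a small selection function. This is a sketch under stated assumptions: faces are represented by hypothetical string identifiers, priority 1 is treated as the highest, and `select_target_face` is an illustrative name rather than anything defined in the application.

```python
from typing import Mapping, Optional, Sequence


def select_target_face(
    detected_ids: Sequence[str],
    priorities: Mapping[str, int],
) -> Optional[str]:
    """Pick the detected face with the highest priority (1 = highest).

    Returns None when no detected face has a configured priority; the caller
    would then treat the frames as invalid and skip volume adjustment.
    """
    known = [face_id for face_id in detected_ids if face_id in priorities]
    if not known:
        return None
    return min(known, key=lambda face_id: priorities[face_id])
```

Taking the minimum over the faces actually present has the same effect as searching first for priority 1, then 2, then 3, without a separate pass for each level.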
  • step 1002, acquiring the first face information of the face with the highest priority in the face image as the target face information, also includes the steps:
  • the setting module 14 is used to execute step 1101, step 1102 and step 1103. That is, the setting module 14 is used to identify the second face information of one or more faces in the face image; compare the one or more pieces of second face information with the pre-stored face information in the preset face library to obtain the second face information matching the pre-stored face information as the first face information; and obtain the first face information whose face has the highest priority as the target face information.
  • the processor 30 is configured to execute step 1101, step 1102 and step 1103. That is, the processor 30 is used to identify the second face information of one or more faces in the face image; compare the one or more pieces of second face information with the pre-stored face information in the preset face library to obtain the second face information matching the pre-stored face information as the first face information; and obtain the first face information whose face has the highest priority as the target face information.
  • a preset face library may be set in the terminal 100, and the preset face library includes pre-stored face information.
  • the processor 30 can recognize the human face information of all the human faces in the human face images, and use the human face information as the second human face information. It should be noted that when the face image contains multiple faces, the processor 30 may acquire face information of the multiple faces to obtain multiple second face information.
  • the pre-stored face information in the preset face library can be generated according to the face images of different users under different lighting conditions, or according to the face images of different users at different shooting angles.
  • the processor 30 can remind the user to operate under the same lighting conditions as those of the pre-stored face information, or at the same shooting angle as that of the pre-stored face information, to ensure the accuracy of adjusting the playback volume.
  • the processor 30 can compare the second face information with the pre-stored face information to find the second face information that matches (that is, is consistent with) the pre-stored face information, and use that second face information as the first face information.
  • when the processor 30 finds second face information that matches multiple pieces of pre-stored face information, multiple pieces of first face information are obtained.
  • the processor 30 can then find the first face information with the highest priority, according to the priorities of the different faces, as the target face information. That is to say, the processor 30 determines whether the face shakes and whether the angle of the face is valid only for the first face information with the highest priority, and calculates the distance between the face and the electronic device (that is, the terminal 100) according to that first face information, in order to perform the corresponding adjustment of the playback volume.
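The text does not say how face information is represented or what counts as a match. As one possible illustration only, the sketch below assumes that each piece of face information is a numeric feature vector and that two vectors match when their cosine similarity reaches a threshold; the representation, the threshold value and the names `cosine_similarity` and `match_against_library` are all assumptions.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_against_library(
    second_face_infos: Sequence[Vector],
    library: Dict[str, Vector],
    threshold: float = 0.9,
) -> List[str]:
    """Return the library user ids matched by at least one detected face."""
    matched = []
    for user_id, stored in library.items():
        if any(cosine_similarity(info, stored) >= threshold
               for info in second_face_infos):
            matched.append(user_id)
    return matched
```

The returned identifiers play the role of the first face information; the one with the highest priority among them would then become the target face information.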
  • the volume adjustment method of the embodiment of the present application also includes steps:
  • the volume adjustment device 10 further includes an association module 15, the association module 15 is used to execute step 1202, the acquisition module 11 is used to execute step 1201, the calculation module 12 is used to execute step 1203, and the adjustment module 13 is used to execute Step 1204 and Step 1205. That is, the acquiring module 11 acquires a face image, and the face image includes shaking information.
  • the association module 15 is used to set the initial volume at the initial distance according to the input operation, and associate the initial distance with the initial size of the face in the face image collected at the initial distance.
  • the calculating module 12 calculates the current distance according to the initial distance, the initial size and the average size of the faces in multiple frames of face images.
  • the adjustment module 13 is used for determining the adjusted volume according to the initial distance, the current distance and the initial volume; and adjusting the playback volume according to the adjusted volume.
  • the processor 30 is used to execute step 1201, step 1202, step 1203, step 1204 and step 1205. That is, the processor 30 is used to acquire a face image, the face image including shaking information; set, according to the input operation, the initial volume at the initial distance, and associate the initial distance with the initial size of the face in the face image collected at the initial distance; calculate the current distance according to the initial distance, the initial size and the average size of the faces in multiple frames of face images; determine the adjusted volume according to the initial distance, the current distance and the initial volume; and adjust the playback volume according to the adjusted volume.
  • step 1201 is executed in the same way as the above step 101, and will not be repeated here.
  • the user can operate according to the instructions of the terminal 100 to set an appropriate distance from the terminal 100 and an optimal playback volume of the terminal 100.
  • for example, the distance between the user and the terminal 100 is 0.5 meters, and the optimal playback volume of the terminal 100 is 50 decibels.
  • the processor 30 takes the distance and the playback volume as the initial distance and the initial volume, respectively.
  • the processor 30 may also acquire the current face image of the user at the initial distance.
  • the processor 30 can associate the initial distance with the initial size of the face in the face image, for example, the initial distance corresponds to the initial size.
  • when the processor 30 calculates the distance between the human face and the electronic device according to the target face information, it can first calculate the average size of the face in the multiple frames of face images, and then calculate the current distance according to the following formula (1).
  • L1 is the current distance
  • S1 is the average size of the face in the multi-frame face image
  • S0 is the initial size of the face in the face image
  • L0 is the initial distance.
  • the face images corresponding to S1 and S0 are the face images at the L1 distance and the face images at the L0 distance respectively, and they are not the same face image.
  • when S1 is equal to S0, L1 is equal to L0.
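Formula (1) itself is not reproduced in this text. The sketch below therefore rests on an assumption: a pinhole-camera model in which the apparent linear size of the face (for example its width in pixels) is inversely proportional to distance, so that L1 = L0 · S0 / S1. This agrees with the stated property that L1 equals L0 when S1 equals S0, but the exact form used in the application may differ, for instance if size means area. The function name and units are illustrative.

```python
from statistics import mean
from typing import Sequence


def estimate_current_distance(
    initial_distance_m: float,         # L0
    initial_size_px: float,            # S0
    face_sizes_px: Sequence[float],    # per-frame sizes, averaged into S1
) -> float:
    """Estimate L1 from L0, S0 and the average face size S1 (assumed model)."""
    if not face_sizes_px:
        raise ValueError("at least one face size is required")
    average_size = mean(face_sizes_px)  # S1
    if average_size <= 0 or initial_size_px <= 0:
        raise ValueError("face sizes must be positive")
    # Assumed inverse proportionality between apparent size and distance.
    return initial_distance_m * initial_size_px / average_size
```

Averaging over several frames before dividing smooths out frame-to-frame noise in the detected face size.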
  • after the processor 30 calculates the distance between the human face and the electronic device, the following formula (2), obtained from the relationship between sound pressure and distance, gives the volume change required to adjust the playback volume of the terminal 100 from the initial volume to an appropriate playback volume at the current distance.
  • the adjusted volume corresponding to the current distance can be obtained according to the following formula (3).
  • V1 = V0 + ΔV (3)
  • ΔV is the volume change required to adjust the playback volume of the terminal 100 from the initial volume to an appropriate playback volume at the current distance
  • V1 is the corresponding adjusted volume at the current distance
  • V0 is the initial volume
  • the processor 30 can adjust the playing volume according to the volume V1, that is, adjust the playing volume of the terminal 100 to V1.
  • V0 is the optimal playback volume of the terminal 100 at the preset initial distance, so the adjusted volume V1 that the processor 30 calculates according to the above formulas (1), (2) and (3) is also the optimal playback volume of the terminal 100 at the current distance, which ensures a better experience for the user.
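Formula (2) is not reproduced in this text. As a hedged illustration, the sketch below assumes the free-field inverse-distance law for a point source, under which the sound pressure level changes by 20·log10(L1/L0) decibels between distances L0 and L1, and takes ΔV to be the amount that compensates for that change. Only formula (3), V1 = V0 + ΔV, comes from the text; the function name is hypothetical.

```python
import math


def adjusted_volume_db(
    initial_volume_db: float,    # V0
    initial_distance_m: float,   # L0
    current_distance_m: float,   # L1
) -> float:
    """Return V1 = V0 + delta_v, with delta_v from an assumed inverse-distance law."""
    if initial_distance_m <= 0 or current_distance_m <= 0:
        raise ValueError("distances must be positive")
    # Assumption standing in for formula (2): compensate the level change
    # of 20*log10(L1/L0) dB predicted by the inverse-distance law.
    delta_v = 20.0 * math.log10(current_distance_m / initial_distance_m)
    return initial_volume_db + delta_v  # formula (3)
```

Under this assumption, with the example values of 0.5 meters and 50 decibels, moving to 1.0 meter raises the target volume by about 6 decibels, and moving to 0.25 meters lowers it by about the same amount.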
  • the embodiment of the present application also provides a non-volatile computer-readable storage medium 200 containing a computer program 201 .
  • when the computer program 201 is executed by one or more processors 30, the one or more processors 30 are caused to execute the volume adjustment method in any one of the above-mentioned embodiments.
  • the processors 30 are made to perform the following volume adjustment method:
  • 501 Obtain a face image, where the face image includes shaking information
  • the processors 30 are made to perform the following volume adjustment method:
  • 1001 receiving an input operation to set the priority of faces of multiple different users

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)
  • Image Analysis (AREA)

Abstract

A volume adjustment method, a volume adjustment apparatus (10), a terminal (100), and a non-volatile computer-readable storage medium (200). The volume adjustment method comprises: acquiring a face image, the face image comprising jitter information (101); calculating the distance between a face and an electronic device according to the face image (102); and when the jitter information is in a first preset range, adjusting playback volume according to the distance (103).

Description

Volume adjustment method and apparatus, terminal, and computer-readable storage medium
Priority information
This application claims priority to and the benefit of Patent Application No. 202111088747.3, filed with the China National Intellectual Property Administration on September 16, 2021, which is incorporated herein by reference in its entirety.
Technical field
The present application relates to the technical field of volume adjustment, and more specifically, to a volume adjustment method, a volume adjustment device, a terminal, and a non-volatile computer-readable storage medium.
Background
At present, in a speaker scenario, a user usually adjusts the volume by pressing the volume adjustment keys on the terminal. When the distance between the user and the terminal changes, only key-based volume adjustment is available, and the user cannot obtain the optimal playback volume.
Summary
Embodiments of the present application provide a volume adjustment method, a volume adjustment device, a terminal, and a non-volatile computer-readable storage medium.
The volume adjustment method of the embodiments of the present application includes: acquiring a face image, the face image including shaking information; calculating the distance between the face and an electronic device according to the face image; and adjusting the playback volume according to the distance when the shaking information is within a first preset range.
The volume adjustment device of the embodiments of the present application includes an acquisition module, a calculation module and an adjustment module. The acquisition module is used to acquire a face image, the face image including shaking information. The calculation module is used to calculate the distance between the face and an electronic device according to the face image. The adjustment module is used to adjust the playback volume according to the distance when the shaking information is within a first preset range.
The terminal of the embodiments of the present application includes a processor. The processor is used to acquire a face image, the face image including shaking information; calculate the distance between the face and an electronic device according to the face image; and adjust the playback volume according to the distance when the shaking information is within a first preset range.
The non-volatile computer-readable storage medium of the embodiments of the present application contains a computer program. When the computer program is executed by one or more processors, the processors are caused to perform the following volume adjustment method: acquiring a face image, the face image including shaking information; calculating the distance between the face and an electronic device according to the face image; and adjusting the playback volume according to the distance when the shaking information is within a first preset range.
Additional aspects and advantages of the embodiments of the present application will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the embodiments of the present application.
Brief description of the drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the description of the embodiments in conjunction with the following drawings, in which:
FIG. 1 is a schematic flowchart of a volume adjustment method in some embodiments of the present application;
FIG. 2 is a schematic diagram of a volume adjustment device in some embodiments of the present application;
FIG. 3 is a schematic plan view of a terminal in some embodiments of the present application;
FIG. 4 is a schematic diagram of a scene of a volume adjustment method in some embodiments of the present application;
FIG. 5 is a schematic flowchart of a volume adjustment method in some embodiments of the present application;
FIG. 6 is a schematic diagram of a scene of a volume adjustment method in some embodiments of the present application;
FIG. 7 and FIG. 8 are schematic flowcharts of volume adjustment methods in some embodiments of the present application;
FIG. 9 is a schematic diagram of a scene of a volume adjustment method in some embodiments of the present application;
FIG. 10 to FIG. 12 are schematic flowcharts of volume adjustment methods in some embodiments of the present application;
FIG. 13 is a schematic diagram of the connection state between a non-volatile computer-readable storage medium and a processor in some embodiments of the present application.
Detailed description
Embodiments of the present application are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the embodiments of the present application, and should not be construed as limiting the embodiments of the present application.
The volume adjustment method of the embodiments of the present application includes: acquiring a face image, the face image including shaking information; calculating the distance between the face and an electronic device according to the face image; and adjusting the playback volume according to the distance when the shaking information is within a first preset range.
In some embodiments, the volume adjustment method includes: acquiring multiple consecutive frames of the face image within a first predetermined duration; judging whether the difference between the position coordinates of the face in any two of the consecutive frames of the face image is within the first preset range; and if so, determining that the shaking information is within the first preset range.
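The pairwise check in this embodiment can be sketched as follows. The sketch assumes that the face position in each frame is a single (x, y) coordinate in pixels and that the first preset range is a maximum allowed offset on each axis; both are illustrative assumptions, as is the name `shaking_within_range`.

```python
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]


def shaking_within_range(positions: Sequence[Point], max_offset_px: float) -> bool:
    """True if every pair of frames differs by at most max_offset_px on each axis."""
    return all(
        abs(x1 - x2) <= max_offset_px and abs(y1 - y2) <= max_offset_px
        for (x1, y1), (x2, y2) in combinations(positions, 2)
    )
```

Comparing every pair of frames, rather than only adjacent frames, also catches slow drift across the first predetermined duration. With fewer than two frames there are no pairs to compare, so the function returns True; a caller may prefer to treat that case as invalid.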
在某些实施方式中,所述人脸图像还包括角度信息,所述根据所述距离调整播放音量,包括在所述抖动信息处于所述第一预设范围且所述角度信息处于第二预设范围时,根据所述距离调整播放音量。In some implementations, the face image further includes angle information, and the adjusting the playback volume according to the distance includes when the shaking information is in the first preset range and the angle information is in a second preset range. When setting the range, adjust the playback volume according to the distance.
在某些实施方式中,所述音量调节方法还包括获取第二预定时长内的连续多帧所述人脸图像;判断连续多帧所述人脸图像中,所述人脸的角度信息是否均处于所述第二预设范围;及若是,则确定所述角度信息处于所述第二预设范围。In some implementations, the volume adjustment method further includes acquiring a plurality of consecutive frames of the human face image within a second predetermined duration; judging whether the angle information of the human face in the continuous multiple frames of the human face image within the second preset range; and if so, determining that the angle information is within the second preset range.
在某些实施方式中,在所述根据所述人脸图像计算所述人脸和电子设备的距离之前,所述音量调节方法还包括接收输入操作,以设置多个不同用户的所述人脸的优先级;及获取所述人脸图像中,所述优先级最高的所述人脸的第一人脸信息,以作为所述目标人脸信息;所述根据所述人脸图像计算所述人脸和电子设备的距离,包括根据所述目标人脸信息计算所述人脸和电子设备的距离。In some implementations, before calculating the distance between the human face and the electronic device according to the human face image, the volume adjustment method further includes receiving an input operation to set the human face of a plurality of different users priority; and acquiring the first face information of the face with the highest priority in the face image as the target face information; calculating the The distance between the human face and the electronic device includes calculating the distance between the human face and the electronic device according to the target human face information.
在某些实施方式中,所述获取所述人脸图像中,优先级最高的所述人脸的第一人脸信息,以作为所述目标人脸信息,包括识别所述人脸图像中的一个或多个所述人脸的所述第二人脸信息;将一个或多个所述第二人脸信息与预设的人脸库中的预存人脸信息进行比对,以获取与所述预存人脸信息匹配的所述第二人脸信息,作为所述第一人脸信息;获取所述人脸的所述优先级最高的所述第一人脸信息,以作为所述目标人脸信息。In some implementations, the acquiring the first face information of the face with the highest priority in the face image as the target face information includes identifying the face in the face image The second face information of one or more of the faces; comparing the one or more of the second face information with the pre-stored face information in the preset face database to obtain the same The second face information matched with the pre-stored face information is used as the first face information; the first face information with the highest priority of the face is obtained as the target person face information.
在某些实施方式中,所述预存人脸信息根据不同所述用户在不同光照条件下的所述人脸图像生成。In some implementations, the pre-stored face information is generated according to the face images of different users under different lighting conditions.
在某些实施方式中,所述预存人脸信息根据不同所述用户在不同拍摄角度下的所述人脸图像生成。In some implementations, the pre-stored face information is generated according to the face images of different users at different shooting angles.
在某些实施方式中,所述音量调节方法还包括根据输入操作,设置初始距离下的初始音量,并关联所述初始距离和所述初始距离下采集的所述人脸图像中所述人脸的初始尺寸;所述根据所述人脸图像计算所述人脸和电子设备的距离,包括根据所述初始距离、所述初始尺寸和多帧所述人脸图像中所述人脸的尺寸的平均尺寸,计算所述距离。In some implementations, the volume adjustment method further includes setting an initial volume at an initial distance according to an input operation, and associating the initial distance with the face in the face image collected at the initial distance The initial size of the human face; the calculation of the distance between the human face and the electronic device according to the human face image includes the calculation according to the initial distance, the initial size and the size of the human face in the multiple frames of the human face image average size, calculate the distance.
在某些实施方式中,所述根据所述距离调整播放音量,包括根据所述初始距离、所述距离和所述初始音量确定调整音量;及根据所述调整音量调整所述播放音量。In some implementations, the adjusting the playback volume according to the distance includes determining an adjustment volume according to the initial distance, the distance, and the initial volume; and adjusting the playback volume according to the adjustment volume.
The volume adjustment apparatus of the embodiments of the present application includes an acquisition module, a calculation module, and an adjustment module. The acquisition module is configured to acquire a face image, the face image including shake information. The calculation module is configured to calculate the distance between the face and an electronic device according to the face image. The adjustment module is configured to adjust the playback volume according to the distance when the shake information is within a first preset range.
The terminal of the embodiments of the present application includes a processor. The processor is configured to: acquire a face image, the face image including shake information; calculate the distance between the face and an electronic device according to the face image; and adjust the playback volume according to the distance when the shake information is within a first preset range.
In some implementations, the processor is configured to: acquire multiple consecutive frames of the face image within a first predetermined duration; determine whether, among the consecutive frames, the difference between the position coordinates of the face in any two frames is within the first preset range; and if so, determine that the shake information is within the first preset range.
In some implementations, the face image further includes angle information, and the processor is configured to adjust the playback volume according to the distance when the shake information is within the first preset range and the angle information is within a second preset range.
In some implementations, the processor is configured to: acquire multiple consecutive frames of the face image within a second predetermined duration; determine whether the angle information of the face is within the second preset range in every one of the consecutive frames; and if so, determine that the angle information is within the second preset range.
In some implementations, before calculating the distance between the face and the electronic device according to the face image, the processor is configured to: receive an input operation to set priorities for the faces of a plurality of different users; acquire, from the face image, first face information of the face with the highest priority as target face information; and calculate the distance between the face and the electronic device according to the target face information.
In some implementations, the processor is configured to: identify second face information of one or more faces in the face image; compare the one or more pieces of second face information with pre-stored face information in a preset face library to obtain the second face information that matches the pre-stored face information, as the first face information; and take the first face information of the face with the highest priority as the target face information.
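The priority-based selection described above can be sketched as follows. This is only an illustration under stated assumptions: face matching against the face library is abstracted to comparing identifiers, a larger number is taken to mean a higher priority, and the names `select_target_face` and `enrolled_priority` are hypothetical rather than taken from the application.

```python
from typing import Optional


def select_target_face(detected_ids: list[str],
                       enrolled_priority: dict[str, int]) -> Optional[str]:
    """Return the matched face with the highest priority, or None if none match."""
    # Keep only detected faces that match an entry in the preset face library.
    matched = [face_id for face_id in detected_ids if face_id in enrolled_priority]
    if not matched:
        return None
    # Assumption: a larger number means a higher user-assigned priority.
    return max(matched, key=lambda face_id: enrolled_priority[face_id])
```

Faces that are detected but not enrolled are ignored, so an unknown person in the frame does not affect which face is used for the distance calculation.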
In some implementations, the processor is configured to generate the pre-stored face information from face images of different users captured under different lighting conditions.
In some implementations, the processor is configured to generate the pre-stored face information from face images of different users captured at different shooting angles.
In some implementations, the processor is configured to: set, according to an input operation, an initial volume at an initial distance, and associate the initial distance with an initial size of the face in the face image captured at the initial distance; and calculate the distance according to the initial distance, the initial size, and the average size of the face across multiple frames of the face image.
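One way to combine the initial distance, the initial size, and the average size is sketched below. The text does not give the formula; the sketch assumes that the apparent face size is inversely proportional to distance (a simple pinhole-camera approximation). The function name and the units (metres, pixels) are hypothetical.

```python
def estimate_distance(initial_distance_m: float,
                      initial_size_px: float,
                      frame_sizes_px: list[float]) -> float:
    """Estimate the current face-to-device distance from averaged face sizes."""
    if not frame_sizes_px:
        raise ValueError("at least one frame is required")
    # Averaging over several frames smooths out per-frame detection noise.
    average_size_px = sum(frame_sizes_px) / len(frame_sizes_px)
    if average_size_px <= 0:
        raise ValueError("face size must be positive")
    # Assumption: size is inversely proportional to distance,
    # so distance = initial_distance * initial_size / average_size.
    return initial_distance_m * initial_size_px / average_size_px
```

Under this assumption, a face that appears half as large as it did at the initial distance is estimated to be twice as far away.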
In some implementations, the processor is configured to: determine an adjustment volume according to the initial distance, the distance, and the initial volume; and adjust the playback volume according to the adjustment volume.
Referring to FIG. 1, an embodiment of the present application provides a volume adjustment method. The volume adjustment method includes the following steps:
101: acquire a face image, the face image including shake information;
102: calculate the distance between the face and the electronic device according to the face image; and
103: when the shake information is within a first preset range, adjust the playback volume according to the distance.
Referring to FIG. 2, an embodiment of the present application provides a volume adjustment apparatus 10. The volume adjustment apparatus 10 includes an acquisition module 11, a calculation module 12, and an adjustment module 13. The volume adjustment method of the embodiments of the present application can be applied to the volume adjustment apparatus 10. The acquisition module 11 is configured to perform step 101, the calculation module 12 is configured to perform step 102, and the adjustment module 13 is configured to perform step 103. That is, the acquisition module 11 is configured to acquire a face image, the face image including shake information. The calculation module 12 is configured to calculate the distance between the face and the electronic device according to the face image. The adjustment module 13 is configured to adjust the playback volume according to the distance when the shake information is within the first preset range.
Referring to FIG. 3, an embodiment of the present application further provides a terminal 100. The terminal 100 includes a processor 30. The volume adjustment method of the embodiments of the present application can be applied to the terminal 100. The processor 30 is configured to perform step 101, step 102, and step 103. That is, the processor 30 is configured to: acquire a face image, the face image including shake information; calculate the distance between the face and the electronic device according to the face image; and adjust the playback volume according to the distance when the shake information is within the first preset range.
The terminal 100 further includes a housing 40. The terminal 100 may be a mobile phone, a tablet computer, a display device, a notebook computer, a teller machine, a turnstile, a smart watch, a head-mounted display device, a game console, or the like. As shown in FIG. 3, the embodiments of the present application are described using a mobile phone as an example of the terminal 100; it can be understood that the specific form of the terminal 100 is not limited to a mobile phone. The housing 40 may also be used to mount functional modules of the terminal 100, such as a display device, an imaging device, a power supply device, and a communication device, so that the housing 40 protects the functional modules against dust, drops, and water.
Specifically, before adjusting the playback volume of the terminal 100, the processor 30 first determines, from the shake information in the face image, whether the face (i.e., the user) in the face image is within the first preset range. The first preset range may be the position the face occupies when it is not shaking. Alternatively, the first preset range may be the maximum extent of shaking that is tolerated: when that extent is exceeded, the processor 30 determines that the face is shaking.
In one embodiment, the shake information may include the position of the face and the preset position the face occupies when it is not shaking (i.e., the first preset range). The processor 30 may determine whether the face is shaking by checking whether the position of the face in the face image is at the preset position. For example, when the processor 30 determines that the face is at the preset position, it determines that the face is within the first preset range, i.e., the face is not shaking; when the processor 30 determines that the face is not at the preset position, it determines that the face is not within the first preset range, i.e., the face is shaking.
In another embodiment, before adjusting the playback volume of the terminal 100, the processor 30 acquires multiple frames of the face image and detects only the face in each frame. By comparing the frames to see whether the position of the face has changed significantly, the processor 30 can determine whether the shake information is within the first preset range, and thus whether the face is shaking. That is, when the comparison shows that the position of the face has changed significantly across the frames (i.e., the difference between the face positions in the frames falls outside the first preset range), the processor 30 determines that the shake information is not within the first preset range and that the face is shaking. When the comparison shows that the position of the face has not changed, or has changed only slightly (i.e., the difference between the face positions in the frames falls within the first preset range), the processor 30 determines that the shake information is within the first preset range and that the face is not shaking. Next, the processor 30 can calculate the current distance between the face and the electronic device (i.e., the terminal 100) according to the face image. Specifically, a mapping between face size and distance can be preset in the terminal 100; that is, the size of the face reflects the distance between the face and the terminal 100, so the processor 30 can obtain the distance between the face and the electronic device from the face image.
Taking FIG. 4 as an example, the terminal 100 may be preset with a face image P1 and a face image P2 that correspond to distances of 0.5 meters and 1 meter, respectively, between the face and the electronic device. The face sizes in the face image P1 and the face image P2 differ: the face in the face image P2 is smaller than the face in the face image P1. After the processor 30 acquires a face image, it can therefore compare the face size in that image with the face sizes in the face image P1 and the face image P2. When the face in the acquired face image is the same size as the face in the face image P1, the processor 30 concludes that the distance between the face and the electronic device is 0.5 meters. When the face in the acquired face image is the same size as the face in the face image P2, the processor 30 concludes that the distance between the face and the electronic device is 1 meter.
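The comparison against the preset reference images can be sketched as a lookup. The pixel sizes below are invented for illustration, and choosing the nearest reference size (rather than requiring an exact match, as in the example above) is an assumption made so that the lookup also works for in-between sizes.

```python
def lookup_distance(face_size_px: float,
                    reference_sizes_px: dict[float, float]) -> float:
    """Return the preset distance whose reference face size is closest.

    reference_sizes_px maps a distance in metres to the face size in pixels
    observed at that distance (hypothetical calibration data).
    """
    if not reference_sizes_px:
        raise ValueError("no reference images configured")
    # Assumption: pick the nearest reference size instead of an exact match.
    return min(reference_sizes_px,
               key=lambda distance: abs(reference_sizes_px[distance] - face_size_px))


# Hypothetical stand-ins for face images P1 (0.5 m) and P2 (1 m).
REFERENCE_SIZES_PX = {0.5: 200.0, 1.0: 100.0}
```

With only two reference images the result is coarse; a real mapping would presumably hold more entries or interpolate between them.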
Finally, after the processor 30 has determined that the face is not shaking (i.e., the shake information is within the first preset range) and has calculated the distance between the face and the electronic device, the processor 30 obtains the volume corresponding to that distance and adjusts the playback volume of the terminal 100 accordingly.
For example, a mapping between a predetermined distance and a predetermined volume, both set by the user, may be stored in the terminal 100 in advance. After calculating the distance between the face and the electronic device, the processor 30 compares that distance with the predetermined distance to obtain the ratio of the current distance to the predetermined distance, and then multiplies this ratio by the predetermined volume to obtain the volume corresponding to the current distance. The processor 30 then adjusts the playback volume of the terminal 100 to that volume.
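The ratio-based calculation in this example can be written out as a short sketch. The clamping to a valid volume range is an added assumption (the text does not say what happens when the product exceeds the maximum), and the function and parameter names are hypothetical.

```python
def scaled_volume(distance_m: float,
                  predetermined_distance_m: float,
                  predetermined_volume: float,
                  max_volume: float = 100.0) -> float:
    """Scale the predetermined volume by the ratio of current to predetermined distance."""
    if predetermined_distance_m <= 0:
        raise ValueError("predetermined distance must be positive")
    # Change ratio of the current distance relative to the predetermined distance.
    ratio = distance_m / predetermined_distance_m
    volume = ratio * predetermined_volume
    # Assumption: keep the result inside the range the device supports.
    return max(0.0, min(max_volume, volume))
```

For instance, with a predetermined volume of 40 at 0.5 meters, a user at 1 meter would get a volume of 80 under this linear rule.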
As another example, when a mapping between a predetermined distance and a predetermined volume set by the user is stored in the terminal 100, the processor 30, after calculating the distance between the face and the electronic device, compares the current distance with the predetermined distance to obtain the ratio by which the distance has changed. Using this ratio and the relationship between sound pressure and distance, the processor 30 obtains the amount by which the playback volume of the terminal 100 theoretically needs to change relative to the predetermined volume at the current distance, and adjusts the playback volume of the terminal 100 accordingly.
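The text does not specify which sound pressure relationship is used. As one plausible reading, the sketch below assumes the free-field inverse-distance law for a point source, under which sound pressure is inversely proportional to distance and the level changes by 20·log10(d/d0) decibels. The function name is hypothetical.

```python
import math


def level_compensation_db(distance_m: float,
                          predetermined_distance_m: float) -> float:
    """Gain in dB needed so the listener hears the same level as at the predetermined distance."""
    if distance_m <= 0 or predetermined_distance_m <= 0:
        raise ValueError("distances must be positive")
    # Assumption: free-field point source, pressure proportional to 1/distance,
    # so the level drops by 20*log10(d/d0) dB and must be raised by the same amount.
    return 20.0 * math.log10(distance_m / predetermined_distance_m)
```

Under this assumption, doubling the distance calls for roughly 6 dB more output, and halving it for roughly 6 dB less. Real rooms add reflections, so the free-field figure is only a starting point.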
At present, when the playback volume of the terminal 100 is adjusted automatically based solely on changes in the distance between the user and the terminal 100, the decision on whether to adjust the playback volume is inaccurate, and the user does not get the best sound experience.
The volume adjustment method, the volume adjustment apparatus 10, and the terminal 100 of the embodiments of the present application adjust the playback volume according to the distance between the face and the electronic device only when the shake information of the face image is within the first preset range, i.e., when the face is not shaking. As a result, if the user shakes unintentionally while using the terminal 100, the playback volume is not adjusted. This makes the decision on whether to adjust the playback volume more accurate and gives the user the best volume experience.
Referring to FIG. 2, FIG. 3, and FIG. 5, the volume adjustment method of the embodiments of the present application further includes the following steps:
501: acquire a face image, the face image including shake information;
502: calculate the distance between the face and the electronic device according to the face image;
503: acquire multiple consecutive frames of the face image within a first predetermined duration;
504: determine whether, among the consecutive frames, the difference between the position coordinates of the face in any two frames is within the first preset range;
505: if so, determine that the shake information is within the first preset range; and
506: when the shake information is within the first preset range, adjust the playback volume according to the distance.
In some implementations, the acquisition module 11 is configured to perform step 501, step 503, step 504, and step 505, the calculation module 12 is configured to perform step 502, and the adjustment module 13 is configured to perform step 506. That is, the acquisition module 11 is configured to: acquire a face image, the face image including shake information; acquire multiple consecutive frames of the face image within the first predetermined duration; determine whether, among the consecutive frames, the difference between the position coordinates of the face in any two frames is within the first preset range; and if so, determine that the shake information is within the first preset range. The calculation module 12 is configured to calculate the distance between the face and the electronic device according to the face image. The adjustment module 13 is configured to adjust the playback volume according to the distance when the shake information is within the first preset range.
In some implementations, the processor 30 is configured to perform step 501, step 502, step 503, step 504, step 505, and step 506. That is, the processor 30 is configured to: acquire a face image, the face image including shake information; calculate the distance between the face and the electronic device according to the face image; acquire multiple consecutive frames of the face image within the first predetermined duration; determine whether, among the consecutive frames, the difference between the position coordinates of the face in any two frames is within the first preset range; if so, determine that the shake information is within the first preset range; and adjust the playback volume according to the distance when the shake information is within the first preset range.
Step 501 is performed in the same way as step 101 above, step 502 in the same way as step 102 above, and step 506 in the same way as step 103 above; they are not described again here.
Specifically, the processor 30 acquires multiple consecutive frames of the face image within the first predetermined duration and determines, from the difference between the position coordinates of the face in any two of those frames, whether the shake information is within the first preset range. Whether the shake information is within the first preset range in turn indicates whether the face shook during the first predetermined duration. The shaking of the face may be caused by the terminal 100 shaking while the user operates it, or by the user shaking unintentionally. In other words, the shaking of the face is the relative shaking between the terminal 100 and the user, and is not limited to shaking of the user.
For example, suppose the first predetermined duration is 1 second and the processor 30 acquires 5 consecutive frames of the face image within that second. Because every frame uses the same coordinate system, the position coordinates of the face in any two of the 5 frames can be compared to obtain a coordinate difference: for example, the difference between the face positions in frame 1 and frame 2, in frame 1 and frame 5, or in frame 2 and frame 4. The difference between the position coordinates of the face in any two frames may be the difference between the position coordinates of the face center point in the two frames, or the difference between the position coordinates of a facial feature point (such as an eye, mouth, or nose feature point) in the two frames.
Next, the processor 30 determines whether the face is shaking by checking whether the difference between the position coordinates is within the first preset range. Here, the first preset range represents the maximum change allowed in the position of the face between any two frames.
As shown in FIG. 6, FIG. 6(a) and FIG. 6(b) are any two frames of the face image. The processor 30 can calculate the difference between the position coordinates of the mouth-corner feature point Q1 in FIG. 6(a) and those of the mouth-corner feature point Q2 in FIG. 6(b) to obtain the difference between the position coordinates in the two frames. For example, if the coordinates of Q1 are (1, 1.5) and the coordinates of Q2 are (1, 2), the difference between the position coordinates of Q1 and Q2 is (0, 0.5). If the first preset range is (0.5, 0.5), i.e., the position of the face may change by at most 0.5 units along each of the X axis and the Y axis between any two frames, the difference between Q1 and Q2 is within the first preset range, which means that the face did not shake across the frames acquired within the first predetermined duration. If the first preset range is (0, 0.25), the difference between Q1 and Q2 is not within the first preset range, which means that the face shook across the frames acquired within the first predetermined duration.
It should be noted that when the difference between the position coordinates of the face in any two frames is negative, the processor 30 checks whether the absolute value of the difference is within the first preset range. For example, if the first preset range is (1, 1) and the difference between the position coordinates of the face in two frames is (-2, -2), the processor 30 determines that the absolute value of the difference, (2, 2), is not within the first preset range (1, 1). The difference between the position coordinates of the face in the two frames is therefore not within the first preset range, i.e., the face shook across the frames acquired within the first predetermined duration.
In summary, when the processor 30 determines that, among the consecutive frames of the face image, the difference between the position coordinates of the face in any two frames is within the first preset range, the processor 30 determines that the face is not shaking. When the processor 30 determines that the difference between the position coordinates of the face in any two frames is not within the first preset range, the processor 30 determines that the face is shaking. In that case, the user does not intend to adjust the playback volume of the terminal 100, and the processor 30 does not adjust it.
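The pairwise comparison described above can be sketched as follows, using the same numbers as the Q1 and Q2 example. The representation of the first preset range as a pair of per-axis maxima follows the text; treating a difference exactly equal to the maximum as still within range is consistent with the (0.5, 0.5) example. The function name `is_steady` is hypothetical.

```python
from itertools import combinations


def is_steady(positions: list[tuple[float, float]],
              max_delta: tuple[float, float]) -> bool:
    """Return True if every pair of frames differs by at most max_delta per axis."""
    max_dx, max_dy = max_delta
    # Compare every pair of frames, not only neighbouring ones.
    for (x1, y1), (x2, y2) in combinations(positions, 2):
        # Absolute values are used, so negative differences are handled too.
        if abs(x2 - x1) > max_dx or abs(y2 - y1) > max_dy:
            return False
    return True
```

With five frames this checks ten pairs, which is cheap; for long windows, comparing the per-axis minimum and maximum would give the same answer with less work.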
Referring to FIG. 2, FIG. 3, and FIG. 7, in some implementations the face image further includes angle information, and the volume adjustment method of the embodiments of the present application further includes the following steps:
701: acquire a face image, the face image including shake information;
702: calculate the distance between the face and the electronic device according to the face image; and
703: when the shake information is within the first preset range and the angle information is within a second preset range, adjust the playback volume according to the distance.
In some implementations, the acquisition module 11 is configured to perform step 701, the calculation module 12 is configured to perform step 702, and the adjustment module 13 is configured to perform step 703. That is, the acquisition module 11 is configured to acquire a face image, the face image including shake information. The calculation module 12 is configured to calculate the distance between the face and the electronic device according to the face image. The adjustment module 13 is configured to adjust the playback volume according to the distance when the shake information is within the first preset range and the angle information is within the second preset range.
In some implementations, the processor 30 is further configured to perform step 701, step 702, and step 703. That is, the processor 30 is configured to: acquire a face image, the face image including shake information; calculate the distance between the face and the electronic device according to the face image; and adjust the playback volume according to the distance when the shake information is within the first preset range and the angle information is within the second preset range.
Step 701 and step 702 are performed in the same way as step 101 and step 102 above, respectively, and are not described again here.
In some cases, when the user turns, raises, or lowers the head, the distance between the user's face and the terminal 100 (the electronic device) also changes, even though the user does not need the playback volume of the terminal 100 to be adjusted.
Therefore, to ensure that the processor 30 decides accurately whether to adjust the playback volume, when adjusting the playback volume according to the distance between the face and the electronic device, the processor 30 also determines whether the angle information in the face image is within the second preset range. The processor 30 adjusts the playback volume of the terminal 100 only when the angle information is within the second preset range and the shake information is within the first preset range.
The second preset range may include a preset angle between the face and the terminal and a corresponding preset orientation. For example, the preset angle may be 70 degrees, meaning that, relative to the terminal 100, the angle threshold for turning the head to the left, turning it to the right, raising it, and lowering it is 70 degrees in each case. The angle information includes the angle and the orientation of the face in the face image relative to the terminal.
Specifically, the processor 30 may determine whether to adjust the playback volume according to the distance by checking whether, in the multiple frames of the face image, the angle between the target face and the terminal 100 is within the second preset range. For example, when the angle between the target face and the terminal 100 is smaller than the preset angle, the processor 30 determines that the angle is within the second preset range; when the angle between the target face and the terminal 100 is larger than the preset angle, the processor 30 determines that the angle is not within the second preset range.
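A minimal sketch of this angle check is given below. The angle information is modelled as an orientation name plus an angle in degrees, which matches the description of angle and orientation; the direction names, the dictionary layout, and the handling of an angle exactly equal to the threshold (treated as within range) are assumptions, since the text only covers the strictly smaller and strictly larger cases.

```python
# Hypothetical per-orientation thresholds; the text gives 70 degrees as an example.
PRESET_ANGLES_DEG = {"left": 70.0, "right": 70.0, "up": 70.0, "down": 70.0}


def angle_in_second_range(orientation: str,
                          angle_deg: float,
                          preset_angles_deg: dict[str, float] = PRESET_ANGLES_DEG) -> bool:
    """Return True if the face angle for this orientation does not exceed its threshold."""
    limit = preset_angles_deg[orientation]
    # Assumption: an angle exactly at the threshold still counts as within range.
    return angle_deg <= limit
```

Keeping one threshold per orientation leaves room for asymmetric limits, for example a tighter limit for looking down than for turning sideways.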
Referring to FIG. 2, FIG. 3, and FIG. 8, the volume adjustment method of the embodiments of the present application further includes the following steps:
801: acquire a face image, the face image including shake information;
802: calculate the distance between the face and the electronic device according to the face image;
803: acquire multiple consecutive frames of the face image within a second predetermined duration;
804: determine whether the angle information of the face is within the second preset range in every one of the consecutive frames;
805: if so, determine that the angle information is within the second preset range; and
806: when the shake information is within the first preset range and the angle information is within the second preset range, adjust the playback volume according to the distance.
In some embodiments, the acquisition module 11 is further configured to execute step 801, step 803, step 804 and step 805, the calculation module 12 is configured to execute step 802, and the adjustment module 13 is configured to execute step 806. That is, the acquisition module 11 is configured to acquire a face image, the face image including shake information; acquire multiple consecutive frames of face images within the second predetermined duration; judge whether the angle information of the face in each of the consecutive frames is within the second preset range; and if so, determine that the angle information is within the second preset range. The calculation module 12 is configured to calculate the distance between the face and the electronic device according to the face image. The adjustment module 13 is configured to adjust the playback volume according to the distance when the shake information is within the first preset range and the angle information is within the second preset range.
In some embodiments, the processor 30 is further configured to execute step 801, step 802, step 803, step 804, step 805 and step 806. That is, the processor 30 acquires a face image, the face image including shake information; calculates the distance between the face and the electronic device according to the face image; acquires multiple consecutive frames of face images within the second predetermined duration; judges whether the angle information of the face in each of the consecutive frames is within the second preset range; if so, determines that the angle information is within the second preset range; and, when the shake information is within the first preset range and the angle information is within the second preset range, adjusts the playback volume according to the distance.
Step 801 is executed in the same manner as step 701 above, step 802 in the same manner as step 702 above, and step 806 in the same manner as step 703 above; the details are not repeated here.
Specifically, the processor 30 also acquires multiple consecutive frames of face images within the second predetermined duration and judges whether the angle information of the face in those frames is within the second preset range. Whether the angle information is within the second preset range also reflects whether the angle of the face during the second predetermined duration is valid. The second predetermined duration may be longer than, shorter than, or equal to the first predetermined duration.
The second preset range is a specific angle representing the orientation. For example, the second preset range may be 70 degrees, which means that the angle threshold for the face tilting left, tilting right, raising, or lowering relative to the terminal 100 is 70 degrees. If the processor 30 acquires 5 frames of face images, the processor 30 judges, for each of the 5 frames, whether the angle of the face is less than 70 degrees. When every angle is less than 70 degrees, the processor 30 determines that the angle information of the face within the second predetermined duration is within the second preset range and that the angle of the face is valid, which indicates that the user wishes to adjust the playback volume of the terminal 100.
As shown in Fig. 9, Fig. 9 is a face image P of the user tilting the head to the right. The processor 30 may determine whether the face angle is valid according to the degree of the rightward head tilt in the face image P, that is, according to whether the angle of the face relative to the terminal is within the second preset range. If the second preset range is 60 degrees and the processor 30 determines that the angle of the user's rightward head tilt in Fig. 9 is 80 degrees, the angle of the face relative to the terminal is not within the second preset range, and the processor 30 determines that the face angle is invalid. If the second preset range is 60 degrees and the processor 30 determines that the angle of the user's rightward head tilt in Fig. 9 is 50 degrees, the angle of the face relative to the terminal is within the second preset range, and the processor 30 determines that the face angle is valid.
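The angle check of steps 803 to 805 can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `angle_is_valid`, the use of signed angles in degrees, and the treatment of an empty frame window as invalid are all assumptions made for the example.

```python
from typing import Sequence


def angle_is_valid(angles_deg: Sequence[float], threshold_deg: float = 70.0) -> bool:
    """Return True only if the face angle in every frame is below the threshold.

    angles_deg holds one head angle per consecutive frame (hypothetical input;
    the text does not say how the angle is estimated).
    """
    # Assumption: with no frames there is no evidence, so report "invalid".
    if not angles_deg:
        return False
    # abs() lets one threshold cover left/right or up/down deflection alike.
    return all(abs(angle) < threshold_deg for angle in angles_deg)


# The two cases from the text, with a 60-degree second preset range.
valid_at_50 = angle_is_valid([50.0], threshold_deg=60.0)
valid_at_80 = angle_is_valid([80.0], threshold_deg=60.0)
```

With a 60-degree threshold, a 50-degree tilt is reported as valid and an 80-degree tilt as invalid, matching the two cases described for the face image P.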
It should be noted that the processor 30 may determine whether the shake information is within the first preset range and whether the angle information is within the second preset range at the same time; the processor 30 may also first determine whether the shake information is within the first preset range and then determine whether the angle information is within the second preset range; or the processor 30 may first determine whether the angle information is within the second preset range and then determine whether the shake information is within the first preset range.
When the processor 30 performs both determinations at the same time, the processor 30 does not adjust the playback volume of the terminal 100 if either the shake information is not within the first preset range or the angle information is not within the second preset range. When the processor 30 performs the two determinations one after the other and the first determination does not meet its condition, the processor 30 does not perform the second one. For example, once the processor 30 determines that the shake information is not within the first preset range, it does not go on to determine whether the angle information is within the second preset range. This reduces the workload of the processor 30.
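The sequential variant described above maps naturally onto short-circuit evaluation. The sketch below is only an illustration under assumed names (`should_adjust_volume`, `check_shake`, `check_angle` are hypothetical); it records which checks actually run so the skipped second check is visible.

```python
from typing import Callable, List


def should_adjust_volume(first_check: Callable[[], bool],
                         second_check: Callable[[], bool]) -> bool:
    """Run the checks in order; skip the second when the first fails."""
    # Python's `and` short-circuits: second_check() is never called
    # when first_check() returns False.
    return first_check() and second_check()


calls: List[str] = []


def check_shake() -> bool:
    calls.append("shake")
    return False  # pretend the shake information is outside the first preset range


def check_angle() -> bool:
    calls.append("angle")
    return True


result = should_adjust_volume(check_shake, check_angle)
```

Because the shake check fails first, the angle check is never invoked, which is the workload saving the text refers to. Swapping the two arguments gives the angle-first ordering.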
Referring to Fig. 2, Fig. 3 and Fig. 10, the volume adjustment method of the embodiments of the present application further includes the following steps:
1001: receive an input operation to set the priorities of the faces of multiple different users;
1002: acquire the first face information of the highest-priority face in the face image as the target face information;
1003: acquire a face image, the face image including face shake information;
1004: calculate the distance between the face and the electronic device according to the target face information; and
1005: when the shake information is within the first preset range, adjust the playback volume according to the distance.
In some embodiments, the volume adjustment apparatus 10 further includes a setting module 14. The setting module 14 is configured to execute step 1001 and step 1002, the acquisition module 11 is configured to execute step 1003, the calculation module 12 is configured to execute step 1004, and the adjustment module 13 is configured to execute step 1005. That is, the setting module 14 is configured to receive an input operation to set the priorities of the faces of multiple different users, and to acquire the first face information of the highest-priority face in the face image as the target face information. The acquisition module 11 is configured to acquire a face image, the face image including face shake information. The calculation module 12 is configured to calculate the distance between the face and the electronic device according to the target face information. The adjustment module 13 is configured to adjust the playback volume according to the distance when the shake information is within the first preset range.
In some embodiments, the processor 30 is configured to execute step 1001, step 1002, step 1003, step 1004 and step 1005. That is, the processor 30 receives an input operation to set the priorities of the faces of multiple different users; acquires the first face information of the highest-priority face in the face image as the target face information; acquires a face image, the face image including face shake information; calculates the distance between the face and the electronic device according to the target face information; and, when the shake information is within the first preset range, adjusts the playback volume according to the distance.
Step 1003 and step 1005 are executed in the same manner as step 101 and step 103 above, respectively; the details are not repeated here.
Specifically, before the processor 30 acquires the face image, multiple users may enroll their own faces in the terminal 100, and the processor 30 receives the input operation, that is, it receives the faces of the multiple users.
Next, the owner of the terminal 100 may set the priorities of the faces of the different users through the terminal 100. For example, if the owner has enrolled the faces of three users, including the owner's own face, the owner may assign the first priority to his or her own face and the second and third priorities to the faces of the other two users.
After the priorities of the users' faces have been set, the processor 30 may take the first face information of the highest-priority face in the acquired face image as the target face information.
For example, suppose the terminal 100 has faces at three priority levels: a first-priority face, a second-priority face and a third-priority face. After acquiring multiple consecutive frames of face images, the processor 30 first looks for the first-priority face; if it is absent, the processor 30 looks for the second-priority face; and if that is also absent, the processor 30 looks for the third-priority face. It should be noted that if the face image contains the first-priority, second-priority and third-priority faces at the same time, the processor 30 selects the first face information of the first-priority (that is, highest-priority) face as the target face information. If the face images contain none of the first-priority, second-priority or third-priority faces, the consecutive frames of face images are invalid, and the processor 30 does not execute the volume adjustment method of the embodiments of the present application.
Finally, after obtaining the target face information in the face image, the processor 30 can use the target face information to calculate the distance between that face and the electronic device (that is, the terminal 100). It can be understood that when the face image contains multiple faces, the processor 30 first identifies the highest-priority face among them and takes its face information as the target face information; when the processor 30 then calculates the distance between the face and the electronic device according to the face image, it calculates the distance between the highest-priority face and the electronic device.
In this way, the processor 30 adjusts the playback volume only for the owner of the terminal 100. This prevents other faces that appear in the acquired frames from affecting the adjustment, thereby ensuring that the processor 30 adjusts the playback volume accurately.
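The priority lookup described above can be sketched as follows. This is a hedged illustration only: the user identifiers, the `priorities` mapping (1 meaning highest priority) and the function name `select_target_face` are hypothetical and do not come from the application.

```python
from typing import Dict, Iterable, Optional


def select_target_face(detected_ids: Iterable[str],
                       priorities: Dict[str, int]) -> Optional[str]:
    """Return the enrolled face with the highest priority, or None.

    priorities maps an enrolled user id to its rank (1 = first priority).
    """
    # Ignore faces that were never enrolled and therefore have no priority.
    enrolled = [face_id for face_id in detected_ids if face_id in priorities]
    if not enrolled:
        # No enrolled face in the frames: treat them as invalid and do not
        # run the volume adjustment.
        return None
    # The smallest rank number is the highest priority.
    return min(enrolled, key=lambda face_id: priorities[face_id])


priorities = {"owner": 1, "user_b": 2, "user_c": 3}
target = select_target_face(["user_c", "stranger", "user_b"], priorities)
```

Here the first-priority face is absent, so the lookup falls through to the second-priority face, as in the walk-through above.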
Referring to Fig. 2, Fig. 3 and Fig. 11, in some embodiments, step 1002 (acquiring the first face information of the highest-priority face in the face image as the target face information) further includes the following steps:
1101: identify the second face information of one or more faces in the face image;
1102: compare the one or more pieces of second face information with the pre-stored face information in a preset face library to obtain the second face information that matches the pre-stored face information, as the first face information; and
1103: acquire the first face information whose face has the highest priority as the target face information.
In some embodiments, the setting module 14 is configured to execute step 1101, step 1102 and step 1103. That is, the setting module 14 is configured to identify the second face information of one or more faces in the face image; compare the one or more pieces of second face information with the pre-stored face information in the preset face library to obtain the second face information that matches the pre-stored face information, as the first face information; and acquire the first face information whose face has the highest priority as the target face information.
In some embodiments, the processor 30 is configured to execute step 1101, step 1102 and step 1103. That is, the processor 30 is configured to identify the second face information of one or more faces in the face image; compare the one or more pieces of second face information with the pre-stored face information in the preset face library to obtain the second face information that matches the pre-stored face information, as the first face information; and acquire the first face information whose face has the highest priority as the target face information.
Specifically, before the processor 30 acquires the first face information of the highest-priority face in the face image, a preset face library containing pre-stored face information may be provided in the terminal 100. After acquiring multiple frames of face images, the processor 30 can identify the face information of all faces in the face images and take that face information as the second face information. It should be noted that when the face image contains multiple faces, the processor 30 acquires the face information of each of them, obtaining multiple pieces of second face information.
The pre-stored face information in the preset face library may be generated from face images of different users under different lighting conditions, or from face images of different users at different shooting angles.
Thus, when the user needs to adjust the playback volume of the terminal 100, the processor 30 may remind the user to operate under the same lighting conditions as the pre-stored face information, or at the same shooting angle as the pre-stored face information, thereby ensuring that the playback volume is adjusted accurately.
Next, the processor 30 compares the second face information with the pre-stored face information to find the second face information that matches (that is, is consistent with) the pre-stored face information, and takes that second face information as the first face information. When the comparison yields multiple pieces of second face information that match pre-stored face information, multiple pieces of first face information are obtained.
Finally, the processor 30 finds, according to the priorities of the different faces, the first face information with the highest priority and takes it as the target face information. That is, the processor 30 determines whether the face is shaking and whether the angle of the face is valid only for the highest-priority first face information, and calculates the distance between that face and the electronic device (that is, the terminal 100) according to the highest-priority first face information, so as to perform the corresponding playback volume adjustment.
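Steps 1101 to 1103 can be sketched as a match-then-rank pipeline. The text does not say how face information is represented or compared, so everything specific here is an assumption made for illustration: face information is modeled as a feature vector, "matching" is modeled as cosine similarity above a hypothetical `min_similarity` threshold, and the names `cosine_similarity` and `pick_target` are invented.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_target(second_infos: List[Vector],
                library: Dict[str, Tuple[Vector, int]],
                min_similarity: float = 0.9) -> Optional[str]:
    """Return the highest-priority user whose stored face matches a detected face.

    library maps a user id to (pre-stored vector, priority), 1 = highest.
    """
    matched: List[Tuple[int, str]] = []
    # Step 1102: detected faces that match a pre-stored entry become
    # "first face information".
    for info in second_infos:
        for user, (stored, priority) in library.items():
            if cosine_similarity(info, stored) >= min_similarity:
                matched.append((priority, user))
    # Step 1103: the lowest rank number wins; no match means no target.
    return min(matched)[1] if matched else None


library = {"owner": ((1.0, 0.0), 1), "guest": ((0.0, 1.0), 2)}
target_user = pick_target([(0.0, 1.0), (1.0, 0.1)], library)
```

Both detected faces match an enrolled user in this example, and the owner is chosen because the owner's priority is higher.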
Referring to Fig. 2, Fig. 3 and Fig. 12, the volume adjustment method of the embodiments of the present application further includes the following steps:
1201: acquire a face image, the face image including shake information;
1202: according to an input operation, set an initial volume at an initial distance, and associate the initial distance with the initial size of the face in the face image captured at the initial distance;
1203: calculate the current distance according to the initial distance, the initial size, and the average size of the face across multiple frames of face images;
1204: determine an adjusted volume according to the initial distance, the current distance and the initial volume; and
1205: adjust the playback volume according to the adjusted volume.
In some embodiments, the volume adjustment apparatus 10 further includes an association module 15. The association module 15 is configured to execute step 1202, the acquisition module 11 is configured to execute step 1201, the calculation module 12 is configured to execute step 1203, and the adjustment module 13 is configured to execute step 1204 and step 1205. That is, the acquisition module 11 acquires a face image, the face image including shake information. The association module 15 is configured to set, according to an input operation, an initial volume at an initial distance, and to associate the initial distance with the initial size of the face in the face image captured at the initial distance. The calculation module 12 calculates the current distance according to the initial distance, the initial size, and the average size of the face across multiple frames of face images. The adjustment module 13 is configured to determine an adjusted volume according to the initial distance, the current distance and the initial volume, and to adjust the playback volume according to the adjusted volume.
In some embodiments, the processor 30 is configured to execute step 1201, step 1202, step 1203, step 1204 and step 1205. That is, the processor 30 is configured to acquire a face image, the face image including shake information; set, according to an input operation, an initial volume at an initial distance, and associate the initial distance with the initial size of the face in the face image captured at the initial distance; calculate the current distance according to the initial distance, the initial size, and the average size of the face across multiple frames of face images; determine an adjusted volume according to the initial distance, the current distance and the initial volume; and adjust the playback volume according to the adjusted volume.
Step 1201 is executed in the same manner as step 101 above; the details are not repeated here.
Specifically, before the processor 30 calculates the distance between the face and the electronic device according to the target face information, the user may follow the prompts of the terminal 100 to set a suitable distance from the terminal 100 and the optimal playback volume of the terminal 100. For example, the user is 0.5 meters from the terminal 100 and the optimal playback volume of the terminal 100 is 50 decibels. The processor 30 then takes this distance and this playback volume as the initial distance and the initial volume, respectively. The processor 30 may also acquire the user's current face image at the initial distance. The processor 30 can thus associate the initial distance with the initial size of the face in that face image, so that the initial distance corresponds to the initial size.
Next, when the processor 30 calculates the distance between the face and the electronic device according to the target face information, it may first calculate the average size of the face across the multiple frames of face images and then obtain the current distance according to formula (1) below.
L1 = L0 * (S1 / S0)  (1)
Here, L1 is the current distance, S1 is the average size of the face across the multiple frames of face images, S0 is the initial size of the face in the face image, and L0 is the initial distance. It can be understood that S1 and S0 come from different face images, captured at distance L1 and at distance L0 respectively; L1 equals L0 only when S1 equals S0.
After the processor 30 has calculated the distance between the face and the electronic device, formula (2) below follows from the relationship between sound pressure and distance. It gives the volume change required, at the current distance, to adjust the playback volume of the terminal 100 from the initial volume to a suitable playback volume.
△V = 20 * log10(L1 / L0)  (2)
With the volume change △V known, the adjusted volume corresponding to the current distance is obtained from formula (3) below.
V1 = V0 + △V  (3)
Here, △V is the volume change required, at the current distance, to adjust the playback volume of the terminal 100 from the initial volume to a suitable playback volume, V1 is the adjusted volume corresponding to the current distance, and V0 is the initial volume.
Finally, the processor 30 adjusts the playback volume according to the adjusted volume V1, that is, it sets the playback volume of the terminal 100 to V1. Because V0 is the optimal playback volume of the terminal 100 at the preset initial distance, the adjusted volume V1 that the processor 30 calculates from formulas (1), (2) and (3) is likewise the optimal playback volume of the terminal 100 at the current distance, which ensures a good user experience.
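Formulas (1) to (3) can be chained in a few lines. The sketch below follows the formulas exactly as written in the text, including the direction of the ratio S1/S0 in formula (1); the function names, the choice of pixels as the size unit, and the base-10 logarithm (the usual decibel convention) are assumptions for illustration.

```python
import math
from statistics import mean
from typing import Sequence


def current_distance(l0: float, s0: float, face_sizes: Sequence[float]) -> float:
    """Formula (1): L1 = L0 * (S1 / S0), with S1 the mean face size over the frames."""
    s1 = mean(face_sizes)
    return l0 * (s1 / s0)


def adjusted_volume(v0: float, l0: float, l1: float) -> float:
    """Formulas (2) and (3): V1 = V0 + 20 * log10(L1 / L0)."""
    delta_v = 20.0 * math.log10(l1 / l0)
    return v0 + delta_v


# Initial calibration from the text: 0.5 m and 50 dB. The face sizes are
# made-up numbers (hypothetical pixel measurements).
l1 = current_distance(l0=0.5, s0=100.0, face_sizes=[200.0, 200.0, 200.0])
v1 = adjusted_volume(v0=50.0, l0=0.5, l1=l1)
```

With these inputs the computed distance doubles from 0.5 m to 1.0 m, so the volume change is 20 * log10(2), roughly 6 dB, and the adjusted volume is about 56 dB. When the current distance equals the initial distance, the change is zero and V1 equals V0.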
Referring to Fig. 13, the embodiments of the present application further provide a non-volatile computer-readable storage medium 200 containing a computer program 201. When the computer program 201 is executed by one or more processors 30, the one or more processors 30 are caused to execute the volume adjustment method of any of the above embodiments.
For example, when the computer program 201 is executed by the one or more processors 30, the processor 30 is caused to execute the following volume adjustment method:
101: acquire a face image, the face image including shake information;
102: calculate the distance between the face and the electronic device according to the face image; and
103: when the shake information is within the first preset range, adjust the playback volume according to the distance.
As another example, when the computer program 201 is executed by the one or more processors 30, the processor 30 is caused to execute the following volume adjustment method:
501: acquire a face image, the face image including shake information;
502: calculate the distance between the face and the electronic device according to the face image;
503: acquire multiple consecutive frames of face images within a first predetermined duration;
504: judge whether the difference between the position coordinates of the face in any two of the consecutive frames of face images is within the first preset range;
505: if so, determine that the shake information is within the first preset range; and
506: when the shake information is within the first preset range, adjust the playback volume according to the distance.
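The pairwise comparison in step 504 can be sketched as follows. The text only says that the difference between the position coordinates of the face in any two frames must be within the first preset range; measuring that difference as a Euclidean distance between 2D face positions, using an inclusive upper bound `max_offset`, and requiring at least two frames are assumptions made for this illustration.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]


def shake_in_range(positions: Sequence[Point], max_offset: float) -> bool:
    """Return True if every pair of frames differs by at most max_offset.

    positions holds one (x, y) face position per consecutive frame
    (hypothetical representation of the position coordinates).
    """
    # Assumption: a single frame cannot show stability, so report False.
    if len(positions) < 2:
        return False
    # Step 504: compare any two frames, not only neighbouring ones.
    return all(math.dist(p, q) <= max_offset
               for p, q in combinations(positions, 2))


steady = shake_in_range([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)], max_offset=3.0)
shaky = shake_in_range([(0.0, 0.0), (10.0, 0.0)], max_offset=3.0)
```

Checking every pair rather than only consecutive frames catches slow drift that stays small from frame to frame but grows large over the whole window.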
As another example, when the computer program 201 is executed by the one or more processors 30, the processor 30 is caused to execute the following volume adjustment method:
701: acquire a face image, the face image including shake information;
702: calculate the distance between the face and the electronic device according to the face image; and
703: when the shake information is within the first preset range and the angle information is within the second preset range, adjust the playback volume according to the distance.
As another example, when the computer program 201 is executed by the one or more processors 30, the processor 30 is caused to execute the following volume adjustment method:
801: acquire a face image, the face image including shake information;
802: calculate the distance between the face and the electronic device according to the face image;
803: acquire multiple consecutive frames of face images within a second predetermined duration;
804: judge whether the angle information of the face in each of the consecutive frames of face images is within the second preset range;
805: if so, determine that the angle information is within the second preset range; and
806: when the shake information is within the first preset range and the angle information is within the second preset range, adjust the playback volume according to the distance.
又例如,计算机程序201被一个或多个处理器30执行时,使得处理器30执行以下音量调节方法:For another example, when the computer program 201 is executed by one or more processors 30, the processors 30 are made to perform the following volume adjustment method:
1001:接收输入操作,以设置多个不同用户的人脸的优先级;1001: receiving an input operation to set the priority of faces of multiple different users;
1002:获取人脸图像中,优先级最高的人脸的第一人脸信息,以作为目标人脸信息;1002: Obtain the first face information of the face with the highest priority in the face image as the target face information;
1003:获取人脸图像,人脸图像包括人脸抖动信息;1003: Obtain a face image, where the face image includes face shaking information;
1004:根据目标人脸信息计算人脸和电子设备的距离;及1004: Calculate the distance between the face and the electronic device according to the target face information; and
1005:在抖动信息处于第一预设范围时,根据距离调整播放音量。1005: When the shaking information is within the first preset range, adjust the playback volume according to the distance.
As a further example, when the computer program 201 is executed by one or more processors 30, the processors 30 are caused to perform the following volume adjustment method:
1101: Identify second face information of one or more faces in the face image;
1102: Compare the one or more pieces of second face information with pre-stored face information in a preset face library, and take the second face information that matches the pre-stored face information as first face information; and
1103: Acquire the first face information of the face with the highest priority as the target face information.
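Steps 1101 to 1103 can be sketched as a match-then-rank procedure. The application does not state how face information is represented or compared in this passage, so the sketch assumes face information is a feature vector, that matching uses cosine similarity against a threshold, and that a lower number means a higher priority. The names `match_faces` and `pick_target`, and the threshold of 0.9, are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_faces(detected: List[Vector], library: Dict[str, Vector],
                threshold: float = 0.9) -> List[Tuple[str, Vector]]:
    """Step 1102: keep each detected face (second face information) that
    matches a pre-stored entry; the kept faces are the first face information."""
    matches = []
    for face in detected:
        best_user, best_score = None, threshold
        for user, stored in library.items():
            score = cosine_similarity(face, stored)
            if score >= best_score:
                best_user, best_score = user, score
        if best_user is not None:
            matches.append((best_user, face))
    return matches


def pick_target(matches: List[Tuple[str, Vector]],
                priorities: Dict[str, int]) -> Optional[Tuple[str, Vector]]:
    """Step 1103: among matched faces, return the one whose user has the
    highest priority (assumption: lower number = higher priority)."""
    ranked = [m for m in matches if m[0] in priorities]
    return min(ranked, key=lambda m: priorities[m[0]]) if ranked else None
```

Faces that match no library entry are dropped before ranking, so an unregistered person in the frame does not influence the volume.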
As yet another example, when the computer program 201 is executed by one or more processors 30, the processors 30 are caused to perform the following volume adjustment method:
1201: Acquire a face image, the face image including shake information;
1202: According to an input operation, set an initial volume at an initial distance, and associate the initial distance with the initial size of the face in the face image captured at the initial distance;
1203: Calculate the distance according to the initial distance, the initial size, and the average of the face sizes in multiple frames of the face image;
1204: Determine an adjusted volume according to the initial distance, the current distance, and the initial volume; and
1205: Adjust the playback volume according to the adjusted volume.
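Steps 1203 and 1204 can be illustrated numerically. The exact formulas are not restated in this passage, so the sketch makes two assumptions: the apparent face size is inversely proportional to distance (a pinhole-camera approximation), and the adjusted volume scales linearly with distance and is clamped to a maximum. The names `estimate_distance`, `adjusted_volume`, and `max_volume` are hypothetical.

```python
from statistics import mean
from typing import Sequence


def estimate_distance(initial_distance: float, initial_size: float,
                      face_sizes: Sequence[float]) -> float:
    """Step 1203 (assumed model): size is inversely proportional to distance,
    so distance = initial_distance * initial_size / average_size."""
    average_size = mean(face_sizes)
    if average_size <= 0:
        raise ValueError("average face size must be positive")
    return initial_distance * initial_size / average_size


def adjusted_volume(initial_distance: float, distance: float,
                    initial_volume: float, max_volume: float = 100.0) -> float:
    """Step 1204 (assumed rule): scale the initial volume by the ratio of the
    current distance to the initial distance, clamped to [0, max_volume]."""
    volume = initial_volume * distance / initial_distance
    return max(0.0, min(max_volume, volume))
```

Under these assumptions, a face calibrated at 30 cm with a width of 200 pixels that now averages 100 pixels is estimated to be 60 cm away, and an initial volume of 40 becomes 80. Averaging the size over several frames, as step 1203 specifies, damps frame-to-frame detection noise.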
In the description of this specification, references to the terms "certain embodiments", "in one example", "exemplarily", and the like mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of those embodiments or examples.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
Although embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and shall not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.

Claims (22)

  1. A volume adjustment method, characterized by comprising:
    acquiring a face image, the face image including shake information;
    calculating a distance between the face and an electronic device according to the face image; and
    adjusting a playback volume according to the distance when the shake information is within a first preset range.
  2. The volume adjustment method according to claim 1, characterized by further comprising:
    acquiring multiple consecutive frames of the face image within a first predetermined duration;
    determining whether the difference between the position coordinates of the face in any two of the consecutive frames of the face image is within the first preset range; and
    if so, determining that the shake information is within the first preset range.
  3. The volume adjustment method according to claim 1, characterized in that the face image further includes angle information, and adjusting the playback volume according to the distance comprises:
    adjusting the playback volume according to the distance when the shake information is within the first preset range and the angle information is within a second preset range.
  4. The volume adjustment method according to claim 3, characterized by further comprising:
    acquiring multiple consecutive frames of the face image within a second predetermined duration;
    determining whether the angle information of the face is within the second preset range in all of the consecutive frames of the face image; and
    if so, determining that the angle information is within the second preset range.
  5. The volume adjustment method according to claim 1, characterized by further comprising, before calculating the distance between the face and the electronic device according to the face image:
    receiving an input operation to set priorities of the faces of multiple different users; and
    acquiring first face information of the face with the highest priority in the face image as target face information;
    calculating the distance between the face and the electronic device according to the face image comprises:
    calculating the distance between the face and the electronic device according to the target face information.
  6. The volume adjustment method according to claim 5, characterized in that acquiring the first face information of the face with the highest priority in the face image as the target face information comprises:
    identifying second face information of one or more of the faces in the face image;
    comparing the one or more pieces of second face information with pre-stored face information in a preset face library, and taking the second face information that matches the pre-stored face information as the first face information;
    acquiring the first face information of the face with the highest priority as the target face information.
  7. The volume adjustment method according to claim 6, characterized in that the pre-stored face information is generated from the face images of different users under different lighting conditions.
  8. The volume adjustment method according to claim 6, characterized in that the pre-stored face information is generated from the face images of different users at different shooting angles.
  9. The volume adjustment method according to claim 1, characterized in that the volume adjustment method further comprises:
    according to an input operation, setting an initial volume at an initial distance, and associating the initial distance with an initial size of the face in the face image captured at the initial distance;
    calculating the distance between the face and the electronic device according to the face image comprises:
    calculating the distance according to the initial distance, the initial size, and the average of the sizes of the face in multiple frames of the face image.
  10. The volume adjustment method according to claim 9, characterized in that adjusting the playback volume according to the distance comprises:
    determining an adjusted volume according to the initial distance, the distance, and the initial volume; and
    adjusting the playback volume according to the adjusted volume.
  11. A volume adjustment apparatus, characterized by comprising:
    an acquisition module configured to acquire a face image, the face image including shake information;
    a calculation module configured to calculate a distance between the face and an electronic device according to the face image; and
    an adjustment module configured to adjust a playback volume according to the distance when the shake information is within a first preset range.
  12. A terminal, characterized by comprising a processor, the processor being configured to:
    acquire a face image, the face image including shake information;
    calculate a distance between the face and an electronic device according to the face image; and
    adjust a playback volume according to the distance when the shake information is within a first preset range.
  13. The terminal according to claim 12, characterized in that the processor is configured to:
    acquire multiple consecutive frames of the face image within a first predetermined duration;
    determine whether the difference between the position coordinates of the face in any two of the consecutive frames of the face image is within the first preset range; and
    if so, determine that the shake information is within the first preset range.
  14. The terminal according to claim 12, characterized in that the face image further includes angle information, and the processor is configured to adjust the playback volume according to the distance when the shake information is within the first preset range and the angle information is within a second preset range.
  15. The terminal according to claim 14, characterized in that the processor is configured to:
    acquire multiple consecutive frames of the face image within a second predetermined duration;
    determine whether the angle information of the face is within the second preset range in all of the consecutive frames of the face image; and
    if so, determine that the angle information is within the second preset range.
  16. The terminal according to claim 12, characterized in that, before the processor calculates the distance between the face and the electronic device according to the face image, the processor is configured to:
    receive an input operation to set priorities of the faces of multiple different users; and
    acquire first face information of the face with the highest priority in the face image as target face information;
    calculate the distance between the face and the electronic device according to the target face information.
  17. The terminal according to claim 16, characterized in that the processor is configured to:
    identify second face information of one or more of the faces in the face image;
    compare the one or more pieces of second face information with pre-stored face information in a preset face library, and take the second face information that matches the pre-stored face information as the first face information;
    acquire the first face information of the face with the highest priority as the target face information.
  18. The terminal according to claim 17, characterized in that the processor is configured to generate the pre-stored face information from the face images of different users under different lighting conditions.
  19. The terminal according to claim 17, characterized in that the processor is configured to generate the pre-stored face information from the face images of different users at different shooting angles.
  20. The terminal according to claim 12, characterized in that the processor is configured to:
    according to an input operation, set an initial volume at an initial distance, and associate the initial distance with an initial size of the face in the face image captured at the initial distance;
    calculate the distance according to the initial distance, the initial size, and the average of the sizes of the face in multiple frames of the face image.
  21. The terminal according to claim 20, characterized in that the processor is configured to:
    determine an adjusted volume according to the initial distance, the distance, and the initial volume; and
    adjust the playback volume according to the adjusted volume.
  22. A non-volatile computer-readable storage medium containing a computer program which, when executed by a processor, causes the processor to perform the volume adjustment method according to any one of claims 1 to 10.
PCT/CN2022/112705 2021-09-16 2022-08-16 Volume adjustment method and apparatus, terminal, and computer-readable storage medium WO2023040547A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111088747.3 2021-09-16
CN202111088747.3A CN113965641B (en) 2021-09-16 2021-09-16 Volume adjusting method and device, terminal and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2023040547A1 (en) 2023-03-23

Family

ID=79461763

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/112705 WO2023040547A1 (en) 2021-09-16 2022-08-16 Volume adjustment method and apparatus, terminal, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN113965641B (en)
WO (1) WO2023040547A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113965641B (en) * 2021-09-16 2023-03-28 Oppo广东移动通信有限公司 Volume adjusting method and device, terminal and computer readable storage medium

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101022682A (en) * 2006-02-13 2007-08-22 明基电通股份有限公司 Method for adjusting gain value of sound signal in gain adjustment system and audio system
CN103491230A (en) * 2013-09-04 2014-01-01 三星半导体(中国)研究开发有限公司 Mobile terminal capable of automatically adjusting volume and fonts and automatic adjusting method thereof
CN103517201A (en) * 2012-06-22 2014-01-15 纬创资通股份有限公司 Sound playing method capable of automatically adjusting volume and electronic equipment
CN104703090A (en) * 2013-12-05 2015-06-10 北京东方正龙数字技术有限公司 Automatic adjustment pick-up equipment based on face recognition and automatic adjustment method
CN105163240A (en) * 2015-09-06 2015-12-16 珠海全志科技股份有限公司 Playing device and sound effect adjusting method
CN106331371A (en) * 2016-09-14 2017-01-11 维沃移动通信有限公司 Volume adjustment method and mobile terminal
CN106792177A (en) * 2016-12-28 2017-05-31 海尔优家智能科技(北京)有限公司 A kind of TV control method and system
CN107343076A (en) * 2017-08-18 2017-11-10 广东欧珀移动通信有限公司 Volume adjusting method, device, storage medium and mobile terminal
CN107506171A (en) * 2017-08-22 2017-12-22 深圳传音控股有限公司 Audio-frequence player device and its effect adjusting method
WO2020057419A1 (en) * 2018-09-18 2020-03-26 西安中兴新软件有限责任公司 Audio control method and device, and terminal
CN111294706A (en) * 2020-01-16 2020-06-16 珠海格力电器股份有限公司 Voice electrical appliance control method and device, storage medium and voice electrical appliance
CN112019929A (en) * 2019-05-31 2020-12-01 腾讯科技(深圳)有限公司 Volume adjusting method and device
CN112380972A (en) * 2020-11-12 2021-02-19 四川长虹电器股份有限公司 Volume adjusting method applied to television scene
US10956122B1 (en) * 2020-04-01 2021-03-23 Motorola Mobility Llc Electronic device that utilizes eye position detection for audio adjustment
CN112995551A (en) * 2021-02-05 2021-06-18 海信视像科技股份有限公司 Sound control method and display device
CN113157246A (en) * 2021-06-25 2021-07-23 深圳小米通讯技术有限公司 Volume adjusting method and device, electronic equipment and storage medium
CN113965641A (en) * 2021-09-16 2022-01-21 Oppo广东移动通信有限公司 Volume adjusting method and device, terminal and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3134850B1 (en) * 2014-04-22 2023-06-14 Snap-Aid Patents Ltd. Method for controlling a camera based on processing an image captured by other camera
CN106303819A (en) * 2015-06-05 2017-01-04 青岛海尔智能技术研发有限公司 A kind of method controlling volume of electronic device and electronic equipment
CN110392298B (en) * 2018-04-23 2021-09-28 腾讯科技(深圳)有限公司 Volume adjusting method, device, equipment and medium
CN109218614B (en) * 2018-09-21 2021-02-26 深圳美图创新科技有限公司 Automatic photographing method of mobile terminal and mobile terminal
CN109639893A (en) * 2018-12-14 2019-04-16 Oppo广东移动通信有限公司 Play parameter method of adjustment, device, electronic equipment and storage medium
CN109710080B (en) * 2019-01-25 2021-12-03 华为技术有限公司 Screen control and voice control method and electronic equipment
CN111026263B (en) * 2019-11-26 2021-10-15 维沃移动通信有限公司 Audio playing method and electronic equipment
CN111897510A (en) * 2020-07-30 2020-11-06 成都新潮传媒集团有限公司 Volume adjusting method and device of multimedia equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN113965641A (en) 2022-01-21
CN113965641B (en) 2023-03-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22868923

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22868923

Country of ref document: EP

Kind code of ref document: A1