CN108766460B - Voice-based interaction method and system

Info

Publication number: CN108766460B
Application number: CN201810462653.XA
Authority: CN
Inventors: 陈志鹏
Original assignee: Zhejiang Koubei Network Technology Co Ltd
Current assignee: Zhejiang Koubei Network Technology Co Ltd
Priority date: 2018-05-15
Filing date: 2018-05-15
Publication date: 2020-07-10
Anticipated expiration: 2038-05-15
Also published as: CN108766460A; WO2019218749A1

Abstract

The invention discloses an interaction method and system based on voice, relating to the field of electronic information, wherein the method comprises the following steps: determining a voice volume value in the current sampling time, and judging whether the current sampling time is the first sampling time; if so, determining the volume level corresponding to the voice volume value in the current sampling time as an initial level, and executing interactive operation corresponding to the initial level; if not, determining the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and executing the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time. According to the method, the influence of the sensitivity difference of the user equipment and the distance between the sound source and the equipment on the voice volume level collected by the client can be avoided.

Description

Voice-based interaction method and system

Technical Field

The invention relates to the field of electronic information, in particular to an interaction method and system based on voice.

Background

Applications on many platforms (Windows, iOS, Android) now offer voice interaction. To give the user timely feedback on a voice interaction and to make it more engaging, these voice interaction interfaces usually display an interactive animation driven by the real-time recording. For example, singing software such as 'nationality karaoke' and 'singing bar' shows an audio prompt while the user sings, and the voice search of the "pay for App" shows a sound animation while the user records.

In addition, as technology develops, many service scenarios that need to identify the volume have appeared, such as resource allocation activities in which users draw a lottery to receive a red packet. In such a red-packet lottery scenario, to make the activity more entertaining, the event host wants to achieve the effect that 'the louder the sound, the bigger the red packet' and, on the recording interface, 'the louder the sound, the faster the animation'. However, in the course of implementing the present invention, the inventors found at least the following problem in the prior art: the volume identified from a mobile phone recording is strongly affected by the microphone sensitivity and by the distance between the sound source and the microphone. If the volume acquired by the client is used directly, some clients always receive a very small bonus in resource allocation activities, or the animation displayed on their interface is always very slow.

Disclosure of Invention

In view of the above, the present invention has been developed to provide a voice-based interaction method and system that overcome, or at least partially solve, the above-mentioned problems.

According to an aspect of the present invention, there is provided a voice-based interaction method, including: determining a voice volume value in the current sampling time, and judging whether the current sampling time is the first sampling time;

if so, determining the volume level corresponding to the voice volume value in the current sampling time as an initial level, and executing interactive operation corresponding to the initial level;

if not, determining the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and executing the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time.

Optionally, the step of determining, according to the volume level corresponding to the voice volume value in the last sampling time and the amount of change of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, the volume level corresponding to the voice volume value in the current sampling time specifically includes:

determining the volume level corresponding to the voice volume value in the last sampling time as a reference volume level;

if the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is a positive number, increasing at least one volume level on the basis of the reference volume level to obtain a volume level corresponding to the voice volume value in the current sampling time;

and if the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is a negative number, reducing at least one volume level on the basis of the reference volume level to obtain the volume level corresponding to the voice volume value in the current sampling time.
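The level update described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `next_level`, the integer level range 0 to 5, the step of exactly one level, the clamping at the lowest and highest level, and the behaviour of keeping the level unchanged when the variation is zero are all hypothetical choices that the text does not prescribe.

```python
def next_level(prev_level, prev_volume, cur_volume,
               min_level=0, max_level=5, step=1):
    """Return the volume level for the current sampling time.

    prev_level is the reference volume level, i.e. the level that was
    assigned to the last sampling time.
    """
    delta = cur_volume - prev_volume  # variation relative to the last sampling time
    if delta > 0:
        level = prev_level + step     # louder than before: go up
    elif delta < 0:
        level = prev_level - step     # quieter than before: go down
    else:
        level = prev_level            # assumption: no change keeps the level
    # Assumption: levels are clamped to the configured range.
    return max(min_level, min(max_level, level))
```

Only the sign of the variation is used here, so the absolute magnitude reported by a particular microphone does not enter the result.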

Optionally, if a variation of the voice volume value in the current sampling time with respect to the voice volume value in the previous sampling time is a positive number, the step of increasing at least one volume level on the basis of the reference volume level specifically includes: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, increasing at least one volume level on the basis of the reference volume level;

if the variation of the voice volume value in the current sampling time with respect to the voice volume value in the previous sampling time is a negative number, the step of decreasing at least one volume level on the basis of the reference volume level specifically includes: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, reducing at least one volume level on the basis of the reference volume level.
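The threshold check can be illustrated with a short sketch. It assumes that, for a negative variation, it is the magnitude of the variation that is compared with the preset variation threshold, and that a variation within the threshold leaves the level unchanged; the function name, the level range and the one-level step are hypothetical.

```python
def next_level_with_threshold(prev_level, prev_volume, cur_volume,
                              threshold, min_level=0, max_level=5):
    """Change the level only when the variation exceeds the threshold."""
    delta = cur_volume - prev_volume
    # Assumption: small fluctuations in either direction are ignored.
    if abs(delta) <= threshold:
        return prev_level
    step = 1 if delta > 0 else -1
    # Assumption: the result is clamped to the configured level range.
    return max(min_level, min(max_level, prev_level + step))
```

A threshold of this kind keeps the interactive operation from flickering between levels when the volume only jitters slightly between two sampling times.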

Optionally, the voice volume value in a sampling time is determined according to an average volume value, a maximum volume value, and/or a minimum volume value of the voice input content received in that sampling time.
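As an illustration, the voice volume value of one sampling time could be derived from the raw readings collected in that time as sketched below. The function name `window_volume` and the `mode` parameter are hypothetical, and the sketch picks a single statistic at a time, although the text also allows combinations.

```python
from statistics import mean


def window_volume(samples, mode="average"):
    """Reduce the readings of one sampling time to a single volume value."""
    if not samples:
        raise ValueError("no readings were collected in this sampling time")
    if mode == "average":
        return mean(samples)
    if mode == "max":
        return max(samples)
    if mode == "min":
        return min(samples)
    raise ValueError(f"unknown mode: {mode}")
```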

Optionally, before the method is executed, the method further includes:

setting a plurality of volume levels sequentially arranged from high to low, and respectively setting the operation type and/or operation content of the interactive operation corresponding to each volume level.

Optionally, the operation type of the interactive operation includes: an interactive animation type, and/or a resource configuration type;

the operation content corresponding to the interactive animation type includes: the animation type, animation change speed and/or animation duration of the interactive animation;

the operation content corresponding to the resource configuration type includes: the type of resource configured, and/or the amount of the resource.
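One possible way to hold this configuration is a table keyed by volume level, sketched below. All class names, field names and concrete values are hypothetical and serve only to show how an operation type and its operation content might be attached to each level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Animation:
    kind: str          # animation type
    speed: float       # animation change speed (arbitrary unit)
    duration_ms: int   # animation duration


@dataclass(frozen=True)
class ResourceGrant:
    resource_type: str  # type of resource configured
    amount: float       # amount of the resource


@dataclass(frozen=True)
class LevelOperation:
    # A level may carry an animation, a resource grant, or both.
    animation: Optional[Animation] = None
    resource: Optional[ResourceGrant] = None


# Hypothetical configuration, listed from the highest level to the lowest.
LEVEL_OPERATIONS = {
    2: LevelOperation(Animation("wave", 2.0, 400), ResourceGrant("coupon", 10.0)),
    1: LevelOperation(Animation("wave", 1.5, 600), ResourceGrant("coupon", 5.0)),
    0: LevelOperation(Animation("wave", 1.0, 800)),
}


def operation_for(level):
    """Look up the interactive operation configured for a volume level."""
    return LEVEL_OPERATIONS[level]
```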

Optionally, before the method is executed, the method further includes: setting at least one volume level of the plurality of volume levels to the initial level.

According to another aspect of the present invention, there is provided a voice-based interaction method, including:

receiving voice input content for realizing interactive operation;

determining a voice volume value of the voice input content in the current sampling time, and judging whether the current sampling time is the first sampling time;

if so, determining the volume level corresponding to the voice volume value in the current sampling time as an initial level, executing the interactive operation corresponding to the initial level, and displaying an interactive interface corresponding to the interactive operation corresponding to the initial level;

if not, determining the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, executing the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time, and displaying the interactive interface corresponding to the interactive operation corresponding to the volume level.

Optionally, the step of receiving the voice input content for implementing the interactive operation specifically includes: receiving voice input content for realizing interactive operation through a preset interactive entrance; wherein the interactive entrance comprises: an entrance for implementing a resource configuration activity, an entrance for displaying an interactive animation;

the interactive interface corresponding to the interactive operation corresponding to the initial level and/or the interactive interface corresponding to the interactive operation corresponding to the volume level comprise: resource allocation interface, interactive animation interface.

Optionally, before the method is executed, the method further includes:

According to still another aspect of the present invention, there is provided a voice-based interactive system, including:

the determining module is suitable for determining the voice volume value in the current sampling time and judging whether the current sampling time is the first sampling time;

the first execution module is adapted to, if so, determine the volume level corresponding to the voice volume value in the current sampling time as an initial level, and execute the interactive operation corresponding to the initial level;

and the second execution module is adapted to, if not, determine the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and execute the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time.

Optionally, the second execution module is specifically adapted to:

Optionally, the second execution module is specifically adapted to: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, increasing at least one volume level on the basis of the reference volume level;

the second execution module is specifically adapted to: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, reducing at least one volume level on the basis of the reference volume level.

Optionally, the system further comprises a first setting module adapted to set a plurality of volume levels sequentially arranged from high to low, and to respectively set the operation type and/or the operation content of the interactive operation corresponding to each volume level.

Optionally, wherein the system further comprises a second setting module adapted to:

setting at least one volume level of the plurality of volume levels to the initial level.

According to still another aspect of the present invention, there is provided a voice-based interactive system, including: a receiving module, a determining module, a first executing module, a second executing module, and a presenting module, wherein,

the receiving module is suitable for receiving voice input content for realizing interactive operation;

the determining module is suitable for determining the voice volume value of the voice input content in the current sampling time and judging whether the current sampling time is the first sampling time;

the first execution module is adapted to determine, if yes, a volume level corresponding to the voice volume value within the current sampling time as an initial level, execute an interactive operation corresponding to the initial level, and then the presentation module is adapted to present an interactive interface corresponding to the interactive operation corresponding to the initial level;

and the second execution module is adapted to, if not, determine the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and execute the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time, whereupon the presentation module is adapted to present the interactive interface corresponding to the interactive operation corresponding to the volume level.

Optionally, the receiving module is specifically adapted to: receive voice input content for realizing interactive operation through a preset interactive entrance; wherein the interactive entrance comprises: an entrance for implementing a resource configuration activity, an entrance for displaying an interactive animation;

Optionally, the second execution module is specifically adapted to:

judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, increasing at least one volume level on the basis of the reference volume level;

judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, reducing at least one volume level on the basis of the reference volume level.

Optionally, wherein the system further comprises a first setting module adapted to:

According to still another aspect of the present invention, there is provided an electronic apparatus including: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the voice-based interaction method.

According to still another aspect of the present invention, there is provided another electronic apparatus including: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

According to yet another aspect of the present invention, a computer storage medium is provided, in which at least one executable instruction is stored, and the executable instruction causes a processor to perform operations corresponding to the voice-based interaction method as described above.

According to yet another aspect of the present invention, another computer storage medium is provided, in which at least one executable instruction is stored, and the executable instruction causes a processor to perform operations corresponding to the voice-based interaction method as described above.

According to the voice-based interaction method and system provided by the invention, the voice volume value in the current sampling time is determined, and it is judged whether the current sampling time is the first sampling time. If so, the volume level corresponding to the voice volume value in the current sampling time is determined as the initial level, and the interactive operation corresponding to the initial level is executed. Otherwise, the volume level corresponding to the voice volume value in the current sampling time is determined according to the volume level corresponding to the voice volume value in the last sampling time and the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and the interactive operation corresponding to that volume level is executed. In this way, the client obtains the volume level for the current sampling time solely by comparing the user's voice volume value in the current sampling time with the same user's voice volume value in the last sampling time, and executes the corresponding interactive operation according to that level. This avoids the influence of device sensitivity differences and of the distance between the sound source and the device on the volume level collected by the client, and improves the user experience in various dynamic interactive activities.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flow chart illustrating a voice-based interaction method according to an embodiment of the present invention;

FIG. 2 is a flow chart of a voice-based interaction method according to a second embodiment of the present invention;

FIG. 3 illustrates an exemplary graphical depiction of custom volume levels versus time and prize corresponding ranges for a sound animation;

FIG. 4a is a flowchart of a voice-based interaction method according to another embodiment of the present invention;

FIG. 4b is a schematic flow chart corresponding to the voice-based interaction method provided by the present invention;

FIG. 5 is a block diagram of a voice-based interactive system according to a third embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

Example one

Fig. 1 shows a flowchart of a voice-based interaction method according to an embodiment of the present invention. As shown in fig. 1, the method includes:

Step S110: determining the voice volume value in the current sampling time, and judging whether the current sampling time is the first sampling time.

The voice volume value may be an original value acquired through the system interface, a processed decibel value, or another type of value that can represent the volume. In short, the present invention does not limit the concrete representation form of the voice volume value, and any value that can represent the voice volume falls within the protection scope of the present invention. The current sampling time may be any one of a series of consecutive time periods, i.e. a sampling interval. Specifically, the duration of each sampling interval may be set in advance; for example, each second or every 0.5 seconds may be taken as one sampling interval. The duration may also take other values, which those skilled in the art can choose according to the accuracy and sensitivity required when acquiring the voice volume value: a relatively small value if high accuracy and sensitivity are desired, and a relatively large value otherwise. After the voice volume value in the current sampling time is determined, it is judged whether the current sampling time is the first sampling time.
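The division of a recording into sampling intervals can be sketched as follows. The sketch assumes that readings arrive as (timestamp, value) pairs with timestamps in seconds measured from the start of the recording; the function name `split_into_intervals` and the 0.5 second default are illustrative only.

```python
def split_into_intervals(readings, interval_s=0.5):
    """Group (timestamp_s, value) readings into consecutive sampling intervals.

    Returns a list of lists ordered by time; the first inner list holds the
    readings of the first sampling time that contains any reading.
    Intervals without readings are skipped (an assumption of this sketch).
    """
    buckets = {}
    for timestamp, value in readings:
        index = int(timestamp // interval_s)  # which interval the reading falls in
        buckets.setdefault(index, []).append(value)
    return [buckets[i] for i in sorted(buckets)]


readings = [(0.0, 1), (0.4, 2), (0.5, 3), (1.2, 4)]
print(split_into_intervals(readings))  # [[1, 2], [3], [4]]
```

A shorter interval reacts more quickly to changes in the voice, while a longer one smooths them out, which matches the trade-off described above.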

Step S120: if so, determining the volume level corresponding to the voice volume value in the current sampling time as an initial level, and executing the interactive operation corresponding to the initial level.

Specifically, a plurality of volume levels arranged in order from high to low may be set in advance before step S110 is performed, and the operation type and/or the operation content of the interactive operation corresponding to each volume level may be set, respectively. One or more of the various volume levels described above may then be set to the initial level. The operation type can be an interactive animation type and/or a resource configuration type. Accordingly, the operation content may be an animation type, an animation change speed, and/or an animation duration of the interactive animation. The operation content corresponding to the resource configuration type comprises the following steps: the type of resource configured, and/or the amount of the resource. If the current sampling time is judged to be the first sampling time, the volume level corresponding to the voice volume value in the current sampling time can be determined as the initial level, and the interactive operation corresponding to the initial level is performed.

Step S130: if not, determining the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and executing the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time.

Specifically, the volume level corresponding to the voice volume value at the last sampling time may be determined as a reference volume level, and the voice volume value at the current sampling time may be compared with the voice volume value at the last sampling time, and if the voice volume value at the current sampling time is greater than the voice volume value at the last sampling time, at least one volume level may be increased on the basis of the reference volume level, resulting in a volume level corresponding to the voice volume value at the current sampling time. If the voice volume value in the current sampling time is smaller than the voice volume value in the last sampling time, at least one volume level can be reduced on the basis of the reference volume level, and the volume level corresponding to the voice volume value in the current sampling time is obtained. And after the volume level corresponding to the voice volume value in the current sampling time is determined, performing interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time.

By executing step S130, the client, when changing the interactive operation in response to the collected voice, is not affected by the quality of the device used by the user or by the distance between the user and the device: it determines the volume level corresponding to the voice volume value in the current sampling time only by comparing that value with the voice volume value in the previous sampling time, and then executes the interactive operation corresponding to that volume level.
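The device independence claimed here can be checked with a small simulation. It rests on a simplifying assumption that is not stated in the text: a device's sensitivity and the distance to the sound source are modelled as a constant positive gain applied to every volume value. The function name, the initial level 1 and the level range 0 to 5 are hypothetical.

```python
def levels_for(volumes, initial_level=1, min_level=0, max_level=5):
    """Assign a level to each sampling time using only relative changes."""
    levels = []
    for i, volume in enumerate(volumes):
        if i == 0:
            level = initial_level               # first sampling time
        else:
            delta = volume - volumes[i - 1]
            step = (delta > 0) - (delta < 0)    # +1, 0 or -1
            level = max(min_level, min(max_level, levels[-1] + step))
        levels.append(level)
    return levels


voice = [30, 45, 60, 50, 70]                    # the same utterance
sensitive_mic = [v * 2.0 for v in voice]        # assumed high-gain device
weak_mic = [v * 0.25 for v in voice]            # assumed low-gain or distant device

print(levels_for(sensitive_mic))  # [1, 2, 3, 2, 3]
print(levels_for(weak_mic))       # [1, 2, 3, 2, 3]
```

Because a positive gain never changes the sign of the variation between two sampling times, both devices produce the same level sequence, whereas mapping the absolute values directly to levels would not.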

According to the voice-based interaction method provided in this embodiment, the voice volume value in the current sampling time is determined, and it is judged whether the current sampling time is the first sampling time. If so, the volume level corresponding to the voice volume value in the current sampling time is determined as the initial level, and the interactive operation corresponding to the initial level is executed. Otherwise, the volume level corresponding to the voice volume value in the current sampling time is determined according to the volume level corresponding to the voice volume value in the previous sampling time and the variation of the voice volume value in the current sampling time relative to the voice volume value in the previous sampling time, and the interactive operation corresponding to that volume level is executed. In this way, the client obtains the volume level for the current sampling time solely by comparing the user's voice volume value in the current sampling time with the same user's voice volume value in the previous sampling time, and executes the corresponding interactive operation according to that level. This avoids the influence of device sensitivity differences and of the distance between the sound source and the device on the volume level collected by the client, and improves the user experience in various dynamic interactive activities.

Example two

Fig. 2 shows a flowchart of a voice-based interaction method according to a second embodiment of the present invention. As shown in fig. 2, the method includes:

Step S210: setting a plurality of volume levels sequentially arranged from high to low, and respectively setting the operation type and/or operation content of the interactive operation corresponding to each volume level.

The operation types of the interactive operation include an interactive animation type and/or a resource configuration type. The interactive animation type refers to an interactive animation driven by the real-time recording, for example an audio prompt shown on a singing software interface in response to the voice, or, in a resource configuration activity, an animation on the client interface whose speed changes with the voice volume. The resource configuration type refers to interactive operations that allocate prizes or other resources, such as grabbing red packets by voice or tray shaking. Correspondingly, the operation content corresponding to the interactive animation type includes: the animation type, animation change speed, and/or animation duration of the interactive animation. The operation content corresponding to the resource configuration type includes: the type of resource configured, and/or the amount of the resource. The resources allocated by the resource allocation activity can be various resources such as coupons, cash, electronic tickets and commodity redemption codes.

The volume level may be set according to the volume value; for example, it may be positively correlated with the volume value, so that the volume level increases as the volume value increases. Optionally, taking the resource configuration type as an example, the volume level may also be set according to the time corresponding to the sound animation; for example, it may be inversely related to that time, so that the volume level decreases as the time corresponding to the sound animation increases. Here the sound animation is the animation displayed on the client interface that changes with the collected sound. Fig. 3 is an exemplary diagram of the correspondence between the custom volume levels, the time corresponding to the sound animation, and the corresponding bonus range. As shown in fig. 3, for custom volume level 5 the time corresponding to the sound animation is 400 ms and the bonus range is 10-14.99; for custom volume level 4 the time is 600 ms and the bonus range is 8-9.99; for custom volume level 3 the time is 800 ms and the bonus range is 6-7.99. The correspondence for the other custom volume levels can be found in fig. 3 and is not described in detail herein. It should be noted that fig. 3 is only an exemplary diagram: the correspondence between the custom volume levels, the time corresponding to the sound animation and the bonus range is not limited to the one described above, and those skilled in the art may set other correspondences according to how entertaining the resource allocation activity is meant to be. The volume levels and the operation types and/or operation contents of the interactive operations corresponding to each volume level can be built into the client application or dynamically issued by the server.
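The three rows of fig. 3 that are quoted in the text can be written down as a lookup table, as in the sketch below. Only levels 5, 4 and 3 are included because the remaining rows are not given in the text. The names `FIG3_EXAMPLE` and `draw_bonus`, and the idea of drawing the bonus uniformly at random within the range, are assumptions made for illustration; the text does not say how a bonus is chosen within its range.

```python
import random

# Rows quoted from fig. 3: level -> animation time and bonus range.
FIG3_EXAMPLE = {
    5: {"animation_ms": 400, "bonus_range": (10.00, 14.99)},
    4: {"animation_ms": 600, "bonus_range": (8.00, 9.99)},
    3: {"animation_ms": 800, "bonus_range": (6.00, 7.99)},
}


def animation_time_ms(level):
    """Time corresponding to the sound animation for a custom volume level."""
    return FIG3_EXAMPLE[level]["animation_ms"]


def draw_bonus(level, rng=random):
    """Pick a bonus inside the range configured for the level.

    Assumption: the bonus is drawn uniformly and rounded to two decimals.
    """
    low, high = FIG3_EXAMPLE[level]["bonus_range"]
    return round(rng.uniform(low, high), 2)
```

A higher level has a shorter animation time, so the animation appears faster, and a higher bonus range, which corresponds to the intended effect that a louder voice yields a faster animation and a bigger red packet.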

After setting the plurality of volume levels sequentially arranged from high to low, the operation type and/or the operation content of the interactive operation corresponding to each volume level are set respectively. By executing the contents in this step, after each volume level within the sampling time is determined in the following step, the interactive operation corresponding to each volume level can be executed according to the comparison relationship.

Step S220: at least one volume level of the plurality of volume levels is set to an initial level.

To ensure that the volume recognition of the client is not limited by the user equipment itself and is not influenced by the distance between the sound source and the equipment, at least one of the plurality of volume levels may be set as an initial level in this step. The initial level may be any one or more of the above volume levels; for example, custom volume level 0 or custom volume level 1 may be set as the initial level, and other custom volume levels may also be set as initial levels.

Step S230: determining the voice volume value in the current sampling time, and judging whether the current sampling time is the first sampling time.

The sampling time may be a sampling time period or a sampling interval within a continuous time period. Specifically, the duration of each sampling interval may be divided in advance; for example, each second or every 0.5 seconds may be determined as one sampling interval. The duration of the sampling interval may also take other values, which is not limited herein. The sampling time can be set in the client application or dynamically issued by the server. After the client collects the voice input by the user in the sampling time, the voice volume value in the current sampling time is determined, and whether the current sampling time is the first sampling time is judged.
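
As a rough illustration of dividing a continuous recording into sampling intervals, the sketch below cuts a list of samples into consecutive chunks of fixed duration. The function name `split_into_intervals` and the parameter `sample_rate_hz` are assumptions made for the example; the original text does not prescribe how the intervals are produced.

```python
from typing import List, Sequence


def split_into_intervals(samples: Sequence[float],
                         sample_rate_hz: int,
                         interval_s: float) -> List[List[float]]:
    """Split a continuous recording into consecutive sampling intervals."""
    per_interval = int(sample_rate_hz * interval_s)
    if per_interval <= 0:
        raise ValueError("an interval must cover at least one sample")
    # The last interval may be shorter if the recording stops mid-interval.
    return [list(samples[i:i + per_interval])
            for i in range(0, len(samples), per_interval)]
```

The first chunk returned corresponds to the first sampling time; every later chunk is handled by the "not the first sampling time" branch.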

The voice volume value in the sampling time is determined according to the average volume value, the maximum volume value and/or the minimum volume value of the voice input content received in the sampling time. The volume value of the voice input content may be an original value obtained from the system interface, a processed decibel value, or another type of numerical value that can represent the volume; in short, the present invention does not limit the concrete representation form of the voice volume value. Specifically, volume, also called loudness or intensity, refers to the subjective feeling of the human ear on the magnitude of the heard sound, and its objective evaluation scale is the amplitude of the sound. This sensation is derived from the pressure, i.e. sound pressure, generated when an object vibrates. The object vibrates through different media, conducting its vibrational energy away. In order to quantify the perception of sound as a monitorable indicator, sound pressure is classified into "levels", i.e. sound pressure levels, which objectively represent the intensity of the sound in decibels (dB). The decibel is a unit for expressing the ratio of two quantities of the same kind and is mainly used to measure sound intensity. The calculation formula is: 20 × log10(amplitude/REFERENCE), where amplitude is the monitored sound pressure value (in pascals) and REFERENCE is the reference sound pressure value (typically 20 micropascals, i.e. the lowest sound pressure that can be perceived by the human ear). In Android applications, the system interface "MediaRecorder.getMaxAmplitude()" can be used to obtain the original amplitude value, which is then substituted into the formula to find the corresponding decibel value. In addition, tone/audio frequency refers to the frequency of the sound generated by the vibration of an object, measured in hertz (Hz); the range of vibration frequencies audible to the human ear is about 20-20000 Hz.
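
The conversion and the per-interval reduction described above can be sketched as follows. This is a minimal example, not the patented implementation: it uses the standard sound-pressure-level factor of 20, leaves the choice of `reference` to the caller (for raw device readings the appropriate reference is device-specific and is an assumption), and the function names are hypothetical.

```python
import math
from typing import Sequence


def to_decibels(amplitude: float, reference: float) -> float:
    """Convert an amplitude to decibels relative to `reference`."""
    if amplitude <= 0 or reference <= 0:
        raise ValueError("amplitude and reference must be positive")
    return 20.0 * math.log10(amplitude / reference)


def interval_volume(amplitudes: Sequence[float], mode: str = "max") -> float:
    """Reduce the amplitudes of one sampling interval to a single value."""
    if not amplitudes:
        raise ValueError("empty sampling interval")
    if mode == "max":
        return max(amplitudes)
    if mode == "min":
        return min(amplitudes)
    if mode == "average":
        return sum(amplitudes) / len(amplitudes)
    raise ValueError(f"unknown mode: {mode}")
```

For instance, an amplitude ten times the reference corresponds to 20 dB, and an amplitude equal to the reference corresponds to 0 dB.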

Step S240: if so, determining the volume level corresponding to the voice volume value in the current sampling time as an initial level, and executing the interactive operation corresponding to the initial level.

If the current sampling time is judged to be the first sampling time, the volume level corresponding to the voice volume value in the current sampling time is determined as the initial level, and the interactive operation corresponding to the initial level is performed according to the operation type and/or the operation content of the interactive operation corresponding to each volume level set in step S210.

Step S250: if not, determining the volume level corresponding to the voice volume value in the last sampling time as the reference volume level.

If the current sampling time is judged not to be the first sampling time, the volume level corresponding to the voice volume value in the last sampling time period can be determined as the reference volume level, the voice volume value in the current sampling time is compared with the voice volume value in the last sampling time, and then the volume level corresponding to the voice volume value in the current sampling time is determined according to the comparison result and the reference volume level.

Step S251: if the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is positive, increasing at least one volume level on the basis of the reference volume level to obtain a volume level corresponding to the voice volume value in the current sampling time, and executing interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time.

Specifically, if the voice volume value in the current sampling time is greater than the voice volume value in the previous sampling time, at least one volume level may be increased on the basis of the reference volume level to obtain the volume level corresponding to the voice volume value in the current sampling time, and the interactive operation corresponding to that volume level is executed. To prevent the volume level from changing too frequently, which would make the interaction change too frequently, a variation threshold may be preset: at least one volume level is increased on the basis of the reference volume level only if the variation of the voice volume value in the current sampling time relative to the voice volume value in the previous sampling time is greater than the preset variation threshold. The variation threshold may be determined according to how frequently the interactive operation is expected to change with the voice: decreasing the threshold makes the interactive operation change more often, and increasing the threshold makes it change less often. Specific values may be set by those skilled in the art according to actual situations and are not limited herein.

Further, in order to prevent frequent adjustment of the volume level corresponding to the voice volume value in the current sampling time due to the minute fluctuation, and to more accurately adjust the volume level corresponding to the voice volume value in the current sampling time according to the voice volume value, a volume step value may be set in advance, and the number of increased volume levels may be determined according to a result of comparing the amount of change in the voice volume value with the volume step value.
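
One possible reading of the variation threshold and the volume step value is sketched below. The rule used here (no change at or below the threshold, otherwise one level per full step with a minimum of one level) is an assumption for illustration; the text only states that the number of levels is determined by comparing the variation with the step value. The name `levels_to_raise` is hypothetical.

```python
def levels_to_raise(delta: float, threshold: float, step: float) -> int:
    """Number of volume levels to add for a positive volume change `delta`."""
    if step <= 0:
        raise ValueError("step must be positive")
    if delta <= threshold:
        return 0  # fluctuation too small: keep the reference volume level
    # Assumed rule: one level per full step, but at least one level.
    return max(1, int(delta // step))
```

With a threshold of 2 and a step of 3, a rise of 1 leaves the level unchanged, a rise of 2.5 adds one level, and a rise of 7 adds two levels.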

Step S252: if the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is negative, at least one volume level is reduced on the basis of the reference volume level, the volume level corresponding to the voice volume value in the current sampling time is obtained, and interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time is executed.

Specifically, if the voice volume value in the current sampling time is smaller than the voice volume value in the previous sampling time, at least one volume level may be decreased on the basis of the reference volume level to obtain the volume level corresponding to the voice volume value in the current sampling time, and the interactive operation corresponding to that volume level is executed. To prevent the volume level from changing too frequently, which would make the dynamic interaction change too frequently, a variation threshold may be preset: at least one volume level is reduced on the basis of the reference volume level only if the magnitude of the variation of the voice volume value in the current sampling time relative to the voice volume value in the previous sampling time is greater than the preset variation threshold. The variation threshold may be determined according to how frequently the interactive operation is expected to change with the voice: decreasing the threshold makes the interactive operation change more often, and increasing the threshold makes it change less often. Specific values may be set by those skilled in the art according to actual situations and are not limited herein.

Further, in order to prevent frequent adjustment of the volume level corresponding to the voice volume value in the current sampling time due to the minute fluctuation, and to more accurately adjust the volume level corresponding to the voice volume value in the current sampling time according to the voice volume value, a volume step value may be set in advance, and the number of the reduced volume levels may be determined according to a result of comparing the amount of change in the voice volume value with the volume step value.
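
Steps S250 to S252 can be combined into a single update function, sketched below under stated assumptions: the magnitude of the variation is compared with the threshold in both directions, the number of levels is one per full step (at least one), and the result is clamped to `min_level` and `max_level`. The clamping bounds and all names are hypothetical and are not taken from the original text.

```python
def next_level(reference_level: int,
               previous_volume: float,
               current_volume: float,
               threshold: float,
               step: float,
               min_level: int,
               max_level: int) -> int:
    """Derive the current volume level from the reference volume level."""
    if step <= 0:
        raise ValueError("step must be positive")
    delta = current_volume - previous_volume
    magnitude = abs(delta)
    if magnitude <= threshold:
        return reference_level            # minute fluctuation: no change
    count = max(1, int(magnitude // step))
    if delta > 0:
        candidate = reference_level + count   # step S251: louder
    else:
        candidate = reference_level - count   # step S252: quieter
    # Assumed: never leave the configured range of volume levels.
    return max(min_level, min(max_level, candidate))
```

Note that the function never looks at the absolute volume, only at the change relative to the previous sampling time.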

According to the voice-based interaction method provided by the second embodiment, a plurality of volume levels sequentially arranged from high to low are first set, the operation type and/or operation content of the interactive operation corresponding to each volume level are set respectively, and at least one of the volume levels is set as an initial level. The voice volume value in the current sampling time is then determined, and it is judged whether the current sampling time is the first sampling time. If so, the volume level corresponding to the voice volume value in the current sampling time is determined as the initial level, and the interactive operation corresponding to the initial level is performed. If not, the volume level corresponding to the voice volume value in the previous sampling time is determined as a reference volume level; if the variation of the voice volume value in the current sampling time relative to the voice volume value in the previous sampling time is a positive number, at least one volume level is increased on the basis of the reference volume level, and if the variation is a negative number, at least one volume level is reduced on the basis of the reference volume level, so as to obtain the volume level corresponding to the voice volume value in the current sampling time, and the interactive operation corresponding to that volume level is performed. In this way, the volume level that the client determines for the sound volume value input by the user is not interfered with by objective factors such as device sensitivity or the distance between the sound source and the device. Instead, the volume level is determined only from the changes in the sound volume value input by the user, so that each user is in effect compared with himself or herself. The interactive operation corresponding to the volume level is then executed, so that every user has the opportunity to complete the animation at the fastest speed or obtain the highest bonus in the resource configuration activity.
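
The device-independence argument can be illustrated numerically. The sketch below assumes that volume values are expressed in decibels and that a more sensitive device simply scales every amplitude by a constant gain; under that assumption the gain adds a constant offset in decibels, which cancels when consecutive sampling times are compared. The gain value, the amplitudes and the function name are made up for the example.

```python
import math
from typing import List


def level_sequence(volumes_db: List[float], initial_level: int = 0) -> List[int]:
    """Initial level for the first sample, then +1 or -1 by the sign of the change."""
    levels: List[int] = []
    for i, volume in enumerate(volumes_db):
        if i == 0:
            levels.append(initial_level)
            continue
        delta = volume - volumes_db[i - 1]
        change = (delta > 0) - (delta < 0)  # +1, 0 or -1
        levels.append(levels[-1] + change)
    return levels


amplitudes = [100.0, 200.0, 150.0, 300.0]
gain = 4.0  # hypothetical: the second device is four times as sensitive
device_a = [20.0 * math.log10(a) for a in amplitudes]
device_b = [20.0 * math.log10(a * gain) for a in amplitudes]
```

Both `level_sequence(device_a)` and `level_sequence(device_b)` yield the same levels, although every absolute value measured on the second device is higher.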

In addition, fig. 4a shows a flowchart of a method of a voice-based interaction method according to another embodiment of the present invention, as shown in fig. 4a, including:

Step S410: receiving voice input content for realizing an interactive operation.

Specifically, voice input content for realizing interactive operation can be received through a preset interaction portal; wherein the interaction portal comprises: an entry for implementing resource configuration activities (e.g., a red-envelope grabbing entry), an entry for displaying interactive animations, and the like.

Step S420: determining the voice volume value of the voice input content in the current sampling time, and judging whether the current sampling time is the first sampling time.

Since the voice input content input by the user usually lasts for a period of time, step S420 and the following steps may be executed at preset time intervals in order to detect the voice volume value of the voice input content in each period. The smaller the preset time interval, the closer the effect is to real time. The preset time interval can be set by a person skilled in the art in combination with the real-time requirements and the performance parameters of the terminal device. In addition, the preset time interval may be equal to the sampling period. For example, if the preset time interval and the sampling period are both 1 second, then in the initial stage the voice volume value in the 1st second (i.e. the current sampling time) is obtained through step S420; accordingly, the current sampling time is determined to be the first sampling time, and step S430 is executed. For another example, in the subsequent stage, the voice volume value in the nth second (i.e. the current sampling time, where n is greater than 1) is obtained through step S420; accordingly, it is determined that the current sampling time is not the first sampling time, and step S440 is executed.

Step S430: if so, determining the volume level corresponding to the voice volume value in the current sampling time as an initial level, executing the interactive operation corresponding to the initial level, and displaying the interactive interface corresponding to the interactive operation corresponding to the initial level.

For the initial level and the corresponding interactive operation and the interactive interface, reference may be made to the description of the corresponding steps in the second embodiment, which is not repeated herein.

Step S440: if not, determining the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, executing the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time, and displaying the interactive interface corresponding to the interactive operation corresponding to the volume level.

Specifically, the volume level corresponding to the voice volume value in the last sampling time is determined as a reference volume level; if the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is a positive number, increasing at least one volume level on the basis of the reference volume level to obtain a volume level corresponding to the voice volume value in the current sampling time; and if the variable quantity of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is a negative number, reducing at least one volume level on the basis of the reference volume level to obtain the volume level corresponding to the voice volume value in the current sampling time. Optionally, if a variation of the voice volume value in the current sampling time with respect to the voice volume value in the previous sampling time is a positive number, the step of increasing at least one volume level on the basis of the reference volume level specifically includes: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, increasing at least one volume level on the basis of the reference volume level; if the variation of the voice volume value in the current sampling time with respect to the voice volume value in the previous sampling time is a negative number, the step of decreasing at least one volume level on the basis of the reference volume level specifically includes: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, reducing at least one volume level on the basis of the reference volume level. 
Wherein the voice volume value in the sampling time is determined according to the average volume value, the maximum volume value and/or the minimum volume value of the voice input content received in the sampling time.

The details of step S440 may refer to the description of the corresponding steps in the second embodiment, and are not repeated herein.

In addition, the interactive interface corresponding to the interactive operation corresponding to the initial level and/or the interactive interface corresponding to the interactive operation corresponding to the volume level includes: a resource allocation interface or an interactive animation interface. For example, in the resource allocation activity, the higher the volume level, the larger the number of resources contained in the corresponding resource allocation interface and the more valuable their type; conversely, the lower the volume level, the smaller the number of resources and the less valuable their type. For another example, in the interactive animation activity, the higher the volume level, the faster the interactive animation in the corresponding interactive animation interface changes and the shorter its duration; conversely, the lower the volume level, the slower the interactive animation changes and the longer its duration.

Optionally, before step S410, a plurality of volume levels sequentially arranged from high to low are further set, and the operation type and/or the operation content of the interactive operation corresponding to each volume level are respectively set. Wherein the operation type of the interactive operation comprises: an interactive animation type, and/or a resource configuration type; the operation content corresponding to the interactive animation type comprises the following steps: the animation type, animation change speed and/or animation duration of the interactive animation; the operation content corresponding to the resource configuration type comprises the following steps: the type of resource configured, and/or the amount of the resource.
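
The operation types and operation contents listed above could be represented with a few small data types, as in the hedged sketch below. All class and field names are hypothetical, and the units of `change_speed` are left unspecified because the original text does not define them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OperationType(Enum):
    INTERACTIVE_ANIMATION = "interactive_animation"
    RESOURCE_CONFIGURATION = "resource_configuration"


@dataclass(frozen=True)
class AnimationContent:
    animation_type: str
    change_speed: float  # assumed unit-less speed factor
    duration_ms: int


@dataclass(frozen=True)
class ResourceContent:
    resource_type: str
    amount: float


@dataclass(frozen=True)
class InteractiveOperation:
    operation_type: OperationType
    animation: Optional[AnimationContent] = None  # used for the animation type
    resource: Optional[ResourceContent] = None    # used for the resource type
```

A volume level would then map to one `InteractiveOperation`, for example `InteractiveOperation(OperationType.RESOURCE_CONFIGURATION, resource=ResourceContent("bonus", 10.0))`.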

A person skilled in the art may combine or variously modify this embodiment and the second embodiment, and details of implementation of each step in this embodiment may refer to the description of the corresponding part in the second embodiment, which is not described herein again.

Fig. 4b shows a schematic flowchart corresponding to the voice-based interaction method provided by the present invention. To describe the technical solution of the present invention in more detail, the specific steps of the method are discussed below taking fig. 4b as an example. Step S401: recording starts, and the client starts to collect the user's recording. Step S402: the initial sound level of the user is assumed to be the initial level L0. Step S403: the maximum decibel value d of the sound within the sampling time is obtained. Step S404: the decibel value is compared with the previous decibel value. Step S405: if the decibel value is greater than the previous decibel value, the sound level is increased by one or more levels. Step S406: if the decibel value is less than the previous decibel value, the sound level is decreased by one or more levels, down to the minimum level at most. Step S407: the interactive operation corresponding to the current sound level is executed; for example, the completion time of a single animation may be updated to the time corresponding to the current sound level. Step S408: it is judged whether the recording operation has ended; if so, the process ends, and if not, the above steps are repeated.
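
The flow of fig. 4b can be sketched as a short loop. This is an illustrative sketch only: it changes the level by exactly one per sampling time, treats the end of the input stream as the end of the recording, and clamps to hypothetical `min_level` and `max_level` bounds; the callback `on_level` stands in for whatever interactive operation the client executes.

```python
from typing import Callable, Iterable, List, Optional


def run_interaction(max_decibels: Iterable[float],
                    on_level: Callable[[int], None],
                    initial_level: int = 0,
                    min_level: int = 0,
                    max_level: int = 5) -> List[int]:
    """Walk through the fig. 4b flow for a finite stream of per-interval maxima."""
    levels: List[int] = []
    level = initial_level                 # S402: assume the initial level L0
    previous: Optional[float] = None
    for d in max_decibels:                # S403: maximum decibel in the sampling time
        if previous is not None:          # S404: compare with the previous value
            if d > previous:              # S405: louder, raise the level
                level = min(max_level, level + 1)
            elif d < previous:            # S406: quieter, not below the minimum
                level = max(min_level, level - 1)
        on_level(level)                   # S407: execute the interactive operation
        levels.append(level)
        previous = d
    return levels                         # S408: stream exhausted, recording ended
```

For example, feeding the values 50, 60, 55, 40, 30, 70 with the default bounds produces the levels 0, 1, 0, 0, 0, 1.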

EXAMPLE III

Fig. 5 is a schematic structural diagram illustrating a voice-based interactive system according to a third embodiment of the present invention, where the system includes:

a determining module 53, adapted to determine a voice volume value in a current sampling time, and determine whether the current sampling time is a first sampling time;

a first executing module 54, adapted to determine, if yes, a volume level corresponding to the voice volume value in the current sampling time as an initial level, and execute an interactive operation corresponding to the initial level;

and the second executing module 55 is adapted to, if not, determine the volume level corresponding to the voice volume value in the current sampling time according to the volume level corresponding to the voice volume value in the last sampling time and the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time, and execute the interactive operation corresponding to the volume level corresponding to the voice volume value in the current sampling time.

Optionally, the second executing module 55 is specifically adapted to:

the second execution module 55 is specifically adapted to: judging whether the variation of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is larger than a preset variation threshold value or not; if so, reducing at least one volume level on the basis of the reference volume level.

Optionally, wherein the system further comprises: the first setting module 51 is adapted to set a plurality of volume levels sequentially arranged from high to low, and set an operation type and/or an operation content of an interactive operation corresponding to each volume level, respectively.

Optionally, wherein the system further comprises a second setting module 52 adapted to:

The specific structure and operation principle of each module described above may refer to the description of the corresponding part in the method embodiment, and are not described herein again.

In addition, the present invention also provides another voice-based interactive system, which differs from the system shown in fig. 5 in that it further includes a receiving module and a presenting module. Accordingly, the system comprises: a receiving module, a determining module, a first executing module, a second executing module, and a presenting module, wherein,

the receiving module is connected with the determining module and is suitable for receiving voice input content for realizing interactive operation;

the presenting module is connected with the first executing module and the second executing module, respectively.

Optionally, the receiving module is specifically adapted to: receiving voice input content for realizing interactive operation through a preset interaction portal; wherein the interaction portal comprises: an entry for implementing a resource configuration activity, an entry for displaying an interactive animation.

Optionally, the second execution module is specifically adapted to:

Optionally, the voice volume value in the sampling time is determined according to an average volume value, a maximum volume value, and/or a minimum volume value of the voice input content received in the sampling time.

Optionally, the system further comprises a first setting module adapted to:

Optionally, the operation type of the interactive operation includes: an interactive animation type, and/or a resource configuration type;

EXAMPLE IV

An embodiment of the present application provides a non-volatile computer storage medium, where the computer storage medium stores at least one executable instruction, and the computer executable instruction may execute the voice-based interaction method in any method embodiment.

The executable instructions may be specifically configured to cause the processor to:

determining a voice volume value in the current sampling time, and judging whether the current sampling time is the first sampling time;

Additionally, the executable instructions may be further operable to cause the processor to: receiving voice input content for realizing interactive operation;

EXAMPLE V

Fig. 6 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.

As shown in fig. 6, the electronic device may include: a processor 602, a communication interface 606, a memory 604, and a communication bus 608.

Wherein:

the processor 602, communication interface 606, and memory 604 communicate with one another via a communication bus 608.

A communication interface 606 for communicating with network elements of other devices, such as clients or other servers.

The processor 602 is configured to execute the program 610, and may specifically perform relevant steps in the above-described voice-based interaction method embodiment.

In particular, program 610 may include program code comprising computer operating instructions.

The processor 602 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The electronic device comprises one or more processors, which can be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.

And a memory 604 for storing a program 610. Memory 604 may comprise high-speed RAM memory and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.

The program 610 may specifically be used to cause the processor 602 to perform the following operations:

Additionally, the program 610 may also be used to cause the processor 602 to:

receiving voice input content for realizing interactive operation;

The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in a voice-based interactive system according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.

Claims

1. A voice-based interaction method, comprising:

2. The method according to claim 1, wherein the step of determining the volume level corresponding to the voice volume value at the current sampling time according to the volume level corresponding to the voice volume value at the last sampling time and the change amount of the voice volume value at the current sampling time relative to the voice volume value at the last sampling time specifically comprises:

3. The method according to claim 2, wherein the step of increasing at least one volume level on the basis of the reference volume level if the change amount of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is positive comprises: judging whether the change amount of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is greater than a preset change threshold; if so, increasing at least one volume level on the basis of the reference volume level;
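
By way of non-limiting illustration, the level update recited in claims 2 and 3 might be sketched in Python as follows. The function name `next_level`, the default threshold, the level range, and the handling of a decrease in volume (lowering one level when the drop exceeds the same threshold) are assumptions made for this sketch and are not recited in the claims.

```python
from typing import Optional


def next_level(
    prev_level: Optional[int],
    prev_volume: Optional[float],
    volume: float,
    initial_level: int = 1,   # assumed initial level for the first sampling time
    threshold: float = 5.0,   # assumed preset change threshold (arbitrary units)
    min_level: int = 1,       # assumed lowest volume level
    max_level: int = 5,       # assumed highest volume level
) -> int:
    """Return the volume level for the current sampling time."""
    # First sampling time: there is no previous level, so use the initial level.
    if prev_level is None or prev_volume is None:
        return initial_level

    # Change of the current volume value relative to the last sampling time.
    delta = volume - prev_volume

    if delta > threshold:
        # Positive change above the threshold: raise one level from the
        # reference (previous) level, capped at the highest level.
        return min(prev_level + 1, max_level)
    if delta < -threshold:
        # Assumption: a symmetric rule lowers one level on a large drop.
        return max(prev_level - 1, min_level)
    # Change within the threshold: keep the previous level.
    return prev_level
```

Because only the change between consecutive sampling times is used after the first sample, a quiet microphone and a sensitive one would follow the same sequence of levels for the same relative loudness changes.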

4. The method according to any of claims 1-3, wherein the voice volume value within the sampling time is determined according to an average volume value, a maximum volume value, and/or a minimum volume value of the voice input content received within the sampling time.
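
By way of non-limiting illustration, the voice volume value for one sampling time might be derived from the raw volume readings as in the Python sketch below. The function name `window_volume` and the `mode` parameter are hypothetical, and the sketch selects a single statistic at a time; how several statistics would be combined is not specified here and is left open.

```python
from statistics import mean
from typing import Sequence


def window_volume(samples: Sequence[float], mode: str = "average") -> float:
    """Reduce the volume readings of one sampling time to a single value."""
    if not samples:
        raise ValueError("no volume readings in this sampling time")
    if mode == "average":
        return float(mean(samples))   # average volume value
    if mode == "max":
        return float(max(samples))    # maximum volume value
    if mode == "min":
        return float(min(samples))    # minimum volume value
    raise ValueError(f"unknown mode: {mode}")
```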

5. The method of claim 1, further comprising, prior to performing the method:

6. The method of claim 5, wherein the type of operation of the interaction comprises: an interactive animation type, and/or a resource configuration type;

7. The method of claim 5 or 6, further comprising, prior to performing the method:

8. A voice-based interaction method, comprising:

receiving voice input content for realizing interactive operation;

9. The method of claim 8, wherein the step of receiving voice input content for realizing interactive operation specifically comprises: receiving the voice input content for realizing interactive operation through a preset interaction entry; wherein the interaction entry comprises: an entry for implementing a resource configuration activity, an entry for displaying an interactive animation;

10. The method according to claim 8 or 9, wherein the step of determining the volume level corresponding to the voice volume value at the current sampling time according to the volume level corresponding to the voice volume value at the last sampling time and the amount of change of the voice volume value at the current sampling time relative to the voice volume value at the last sampling time specifically comprises:

11. The method according to claim 10, wherein the step of increasing at least one volume level on the basis of the reference volume level if the change amount of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is positive comprises: judging whether the change amount of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is greater than a preset change threshold; if so, increasing at least one volume level on the basis of the reference volume level;

12. The method according to claim 8 or 9, wherein the voice volume value within the sampling time is determined according to an average volume value, a maximum volume value, and/or a minimum volume value of the voice input content received within the sampling time.

13. The method of claim 8 or 9, further comprising, prior to performing the method:

14. The method of claim 13, wherein the type of operation of the interaction comprises: an interactive animation type, and/or a resource configuration type;

15. A voice-based interactive system comprising:

16. The system of claim 15, wherein the second execution module is specifically adapted to:

17. The system of claim 16, wherein the second execution module is specifically adapted to: judge whether the change amount of the voice volume value in the current sampling time relative to the voice volume value in the last sampling time is greater than a preset change threshold; if so, increase at least one volume level on the basis of the reference volume level;

18. The system according to any of claims 15-17, wherein the voice volume value within the sampling time is determined according to an average volume value, a maximum volume value, and/or a minimum volume value of the voice input content received within the sampling time.

19. The system of claim 15, wherein the system further comprises: a first setting module adapted to set a plurality of volume levels arranged in order from high to low, and to set, for each volume level, the operation type and/or the operation content of the corresponding interactive operation.
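
By way of non-limiting illustration, the configuration produced by such a first setting module might be represented as a small lookup table, sketched below in Python. The names `OperationType`, `LevelConfig`, `LEVEL_TABLE`, and `operation_for`, the number of levels, and the operation contents are all hypothetical; only the two operation types (interactive animation and resource configuration) come from the claims.

```python
from dataclasses import dataclass
from enum import Enum


class OperationType(Enum):
    INTERACTIVE_ANIMATION = "interactive_animation"
    RESOURCE_CONFIGURATION = "resource_configuration"


@dataclass(frozen=True)
class LevelConfig:
    level: int                      # volume level
    operation_type: OperationType   # type of the interactive operation
    operation_content: str          # hypothetical description of the content


# Volume levels listed from high to low; the contents are placeholders.
LEVEL_TABLE = (
    LevelConfig(3, OperationType.RESOURCE_CONFIGURATION, "allocate large resource"),
    LevelConfig(2, OperationType.INTERACTIVE_ANIMATION, "play large animation"),
    LevelConfig(1, OperationType.INTERACTIVE_ANIMATION, "play small animation"),
)


def operation_for(level: int) -> LevelConfig:
    """Look up the interactive operation configured for a volume level."""
    for config in LEVEL_TABLE:
        if config.level == level:
            return config
    raise KeyError(f"no operation configured for level {level}")
```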

20. The system of claim 19, wherein the type of operation of the interaction comprises: an interactive animation type, and/or a resource configuration type;

21. The system according to claim 19 or 20, wherein the system further comprises a second setup module adapted to:

22. A voice-based interactive system comprising: a receiving module, a determining module, a first executing module, a second executing module, and a presenting module, wherein,

23. The system of claim 22, wherein the receiving module is specifically adapted to: receive voice input content for realizing interactive operation through a preset interaction entry; wherein the interaction entry comprises: an entry for implementing a resource configuration activity, an entry for displaying an interactive animation;

24. The system according to claim 22 or 23, wherein the second execution module is specifically adapted to:

25. The system of claim 24, wherein the second execution module is specifically adapted to:

26. The system according to claim 22 or 23, wherein the voice volume value within the sampling time is determined according to an average volume value, a maximum volume value, and/or a minimum volume value of the voice input content received within the sampling time.

27. The system according to claim 22 or 23, wherein the system further comprises a first setup module adapted to:

28. The system of claim 27, wherein the type of operation of the interaction comprises: an interactive animation type, and/or a resource configuration type;

29. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;

the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the voice-based interaction method of any one of claims 1-7.

30. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;

the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the voice-based interaction method of any one of claims 8-14.

31. A computer storage medium having stored therein at least one executable instruction that causes a processor to perform operations corresponding to the voice-based interaction method of any one of claims 1-7.

32. A computer storage medium having stored therein at least one executable instruction that causes a processor to perform operations corresponding to the voice-based interaction method of any of claims 8-14.