CN113314095A - Processing method, mobile terminal and storage medium - Google Patents

Publication number
CN113314095A
Authority
CN
China
Prior art keywords
speech rate
user
determining
target
grade
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110496312.6A
Other languages
Chinese (zh)
Inventor
刘欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Microphone Holdings Co Ltd
Shenzhen Transsion Holdings Co Ltd
Original Assignee
Shenzhen Microphone Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Microphone Holdings Co Ltd
Priority to CN202110496312.6A
Publication of CN113314095A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G10L13/047 Architecture of speech synthesisers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G10L21/043 Time compression or expansion by changing speed
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Child & Adolescent Psychology (AREA)
  • Telephone Function (AREA)

Abstract

The application discloses a processing method, a mobile terminal and a storage medium. The method comprises the following steps: acquiring a target speech rate level; and controlling a human-computer interaction application to perform voice broadcast according to the target speech rate level. The method provided by the application enables automatic and intelligent adjustment of the speech rate of voice broadcast.

Description

Processing method, mobile terminal and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a processing method, a mobile terminal, and a storage medium.
Background
Text-to-speech (TTS) broadcasting technology converts text content into audio content and plays it back, and is widely applied in scenarios where information is unsuitable for, or cannot be obtained by, visual means. TTS broadcasting technology can convert text content in real time and, under the action of a dedicated intelligent voice controller, outputs speech with a smooth rhythm, so that listeners perceive the information as natural rather than as the flat, stiff output of a machine voice.
In some implementations, in order to improve the user experience of voice broadcast, the presentation of the TTS voice broadcast may be changed, for example to a male voice, a female voice, or a local language. In the course of conceiving and implementing the present application, the inventors found at least the following problem: there is no existing application for adjusting the speech rate of TTS voice broadcast, which affects the user experience.
The foregoing description is provided for general background information and is not admitted to be prior art.
Disclosure of Invention
In view of the above technical problem, the present application provides a processing method, a mobile terminal, and a storage medium, which enable automatic and intelligent adjustment of the speech rate of voice broadcast.
In order to solve the above technical problem, the present application provides a processing method applied to a mobile terminal, including:
acquiring a target speech rate level;
and controlling a human-computer interaction application to perform voice broadcast according to the target speech rate level.
Optionally, the obtaining the target speech rate level includes: acquiring a speech rate level set by the user, and taking the speech rate level set by the user as the target speech rate level.
Optionally, the obtaining of the speech rate level set by the user includes: displaying a speech rate adjustment interface, wherein the speech rate adjustment interface comprises at least one speech rate level option; and when a selection operation on the at least one speech rate level option is detected, determining the speech rate level corresponding to the selected option as the speech rate level set by the user.
Optionally, the obtaining of the speech rate level set by the user includes: displaying a speech rate adjustment interface, wherein the speech rate adjustment interface comprises a speech rate adjustment slider; and determining the speech rate level set by the user according to a sliding operation on the speech rate adjustment slider.
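As an illustration of the slider variant, the following minimal sketch maps a normalized slider position to a discrete speech rate level. The number of levels (five), the normalized range [0.0, 1.0], and the function name `slider_to_level` are assumptions made for this example and are not specified in the application.

```python
def slider_to_level(position: float, num_levels: int = 5) -> int:
    """Map a slider position in [0.0, 1.0] to a speech rate level in 1..num_levels.

    The level count and the normalized range are assumptions for illustration.
    """
    if num_levels < 1:
        raise ValueError("num_levels must be at least 1")
    # Clamp so that a drag past either end of the bar still yields a valid level.
    clamped = min(max(position, 0.0), 1.0)
    # Split the bar into num_levels equal segments; the right end belongs to the top level.
    return min(int(clamped * num_levels) + 1, num_levels)


if __name__ == "__main__":
    for pos in (0.0, 0.5, 1.0):
        print(pos, slider_to_level(pos))
```

Equal-width segments are only one possible choice; a real interface could equally snap the slider to tick marks.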
Optionally, the obtaining the target speech rate level includes: and when the speech rate intelligent adjusting function is detected to be started, determining a target speech rate grade according to the obtained speech rate adjusting reference parameter.
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring an age parameter of a user; and determining the target speech rate grade according to the age parameter.
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring text information of voice to be broadcasted, wherein optionally, the text information includes any one or more of the following conditions: text length, text keywords, text content; determining a target speech rate grade according to the text information; optionally, the target speech rate level is one or more speech rate levels.
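The text-based variant can be sketched as follows, under stated assumptions. The sketch assigns one speech rate level per sentence, so that a single broadcast may use more than one level: sentences containing a keyword are slowed down, and long texts are otherwise read slightly faster. The level values, the length threshold, the keyword list, and the function name `plan_levels` are all hypothetical and are not taken from the application.

```python
import re

# Hypothetical values chosen for illustration only.
DEFAULT_LEVEL = 3
SLOW_LEVEL = 2
FAST_LEVEL = 4
LONG_TEXT_CHARS = 200


def plan_levels(text, keywords=("address", "phone", "code")):
    """Return a list of (sentence, speech_rate_level) pairs for the text."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Text length sets the base level: long texts are read slightly faster.
    base = FAST_LEVEL if len(text) > LONG_TEXT_CHARS else DEFAULT_LEVEL
    plan = []
    for sentence in sentences:
        lowered = sentence.lower()
        # Keyword sentences are slowed down so that details are easier to follow.
        has_keyword = any(keyword in lowered for keyword in keywords)
        plan.append((sentence, SLOW_LEVEL if has_keyword else base))
    return plan


if __name__ == "__main__":
    for sentence, level in plan_levels("Your code is 1234. Have a nice day."):
        print(level, sentence)
```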
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring the emotional state of a user; and determining a target speech rate grade according to the emotional state of the user.
Optionally, the obtaining of the emotional state of the user includes: acquiring voice information of a user in a man-machine interaction process; and determining the emotional state of the user according to the voice information of the user.
Optionally, the obtaining of the emotional state of the user includes: acquiring face image data of a user in a human-computer interaction process; an emotional state of the user is determined from the facial image data of the user.
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring environmental parameters in a human-computer interaction process; and determining the target speech rate grade according to the environment parameter.
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter further includes: processing the content to be broadcast according to the speech rate adjustment reference parameter.
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring the speech rate grade of the voice of a user in the man-machine interaction process; and determining a target speech rate grade according to the speech rate grade of the voice of the user.
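One way to picture this variant is to estimate the user's speaking rate in words per minute and use the matching level as the target level. In the sketch below, the words-per-minute boundaries, the five-level scale, and the function name `user_rate_level` are assumptions for illustration; the application does not define how the user's speech rate level is measured.

```python
from bisect import bisect_right

# Hypothetical boundaries in words per minute; four boundaries give five levels.
WPM_BOUNDS = (100, 130, 160, 190)


def user_rate_level(word_count: int, duration_seconds: float) -> int:
    """Estimate the speech rate level (1..5) of an utterance."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    words_per_minute = word_count / duration_seconds * 60.0
    # bisect_right counts how many boundaries the measured rate has reached.
    return bisect_right(WPM_BOUNDS, words_per_minute) + 1


if __name__ == "__main__":
    # 30 words in 12 seconds is 150 words per minute.
    print(user_rate_level(30, 12))
```

In this sketch the broadcast simply mirrors the user's own level; an offset or smoothing over several utterances would be an equally plausible design.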
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring multimedia preference information of a user; and determining a target speech rate grade according to the multimedia preference information of the user.
Optionally, the determining a target speech rate level according to the obtained speech rate adjustment reference parameter includes: and acquiring voice information of the user in the man-machine interaction process, judging whether the user is a target user, and if so, determining that a preset speech rate grade corresponding to the target user is the target speech rate grade.
Optionally, the method further comprises: if a speech rate adjustment instruction input by the user through voice is detected during the human-computer interaction, adjusting the speech rate level of the voice broadcast of the human-computer interaction application according to the speech rate adjustment instruction; optionally, the speech rate adjustment instruction is used to instruct a decrease or an increase of the speech rate level.
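A minimal sketch of handling such a voice instruction is given below. It assumes the utterance has already been transcribed to text, that levels range from 1 to 5, and that each instruction moves the level by one step. The trigger phrases and the function name `apply_voice_instruction` are hypothetical.

```python
# Hypothetical level range for illustration.
MIN_LEVEL, MAX_LEVEL = 1, 5

SLOWER_PHRASES = ("slower", "slow down")
FASTER_PHRASES = ("faster", "speed up")


def apply_voice_instruction(current_level: int, utterance: str) -> int:
    """Return the new speech rate level after a transcribed voice instruction."""
    text = utterance.lower()
    if any(phrase in text for phrase in SLOWER_PHRASES):
        step = -1
    elif any(phrase in text for phrase in FASTER_PHRASES):
        step = 1
    else:
        # Not a speech rate adjustment instruction: keep the current level.
        step = 0
    # Clamp so that repeated instructions never leave the valid range.
    return min(max(current_level + step, MIN_LEVEL), MAX_LEVEL)


if __name__ == "__main__":
    print(apply_voice_instruction(3, "Please speak slower"))
```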
The present application further provides a voice adjusting apparatus, which includes an obtaining unit and a control unit, wherein: the obtaining unit is used for obtaining a target speech rate level; and the control unit is used for controlling the human-computer interaction application to perform voice broadcast according to the target speech rate level.
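The division into an obtaining unit and a control unit can be sketched as two small classes, under stated assumptions. The class names, the `speak` callback standing in for a TTS engine, the default level, and the rule that a user-set level takes priority over an automatically determined one are all assumptions of this example, not details given in the application.

```python
from typing import Callable, Optional


class ObtainingUnit:
    """Obtains the target speech rate level (user setting first, then automatic)."""

    def __init__(self, user_level: Optional[int] = None,
                 auto_source: Optional[Callable[[], int]] = None,
                 default_level: int = 3):
        self.user_level = user_level
        self.auto_source = auto_source
        self.default_level = default_level

    def obtain_target_level(self) -> int:
        if self.user_level is not None:
            return self.user_level
        if self.auto_source is not None:
            # Stands in for the intelligent adjustment function being enabled.
            return self.auto_source()
        return self.default_level


class ControlUnit:
    """Drives the voice broadcast through a caller-supplied speak(text, level)."""

    def __init__(self, speak: Callable[[str, int], None]):
        self.speak = speak

    def broadcast(self, text: str, level: int) -> None:
        self.speak(text, level)


class VoiceAdjustingApparatus:
    def __init__(self, obtaining_unit: ObtainingUnit, control_unit: ControlUnit):
        self.obtaining_unit = obtaining_unit
        self.control_unit = control_unit

    def announce(self, text: str) -> int:
        # Step 1: obtain the target level. Step 2: broadcast at that level.
        level = self.obtaining_unit.obtain_target_level()
        self.control_unit.broadcast(text, level)
        return level


if __name__ == "__main__":
    apparatus = VoiceAdjustingApparatus(
        ObtainingUnit(user_level=4),
        ControlUnit(lambda text, level: print(f"[level {level}] {text}")),
    )
    apparatus.announce("Hello")
```

Passing the TTS engine in as a callback keeps the sketch independent of any particular speech library.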
The present application further provides a mobile terminal, including: the device comprises a memory and a processor, wherein the memory stores a processing program, and the processing program realizes the steps of any one of the methods when being executed by the processor.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method as set forth in any of the above.
As described above, in the processing method of the present application, by obtaining the target speech rate level and controlling the human-computer interaction application to perform voice broadcast according to the target speech rate level, automatic and intelligent adjustment of the speech rate of voice broadcast can be realized.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the application. In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly described below; those skilled in the art can obtain other drawings based on these drawings without inventive effort.
Fig. 1 is a schematic hardware structure diagram of a mobile terminal implementing various embodiments of the present application;
Fig. 2 is a communication network system architecture diagram provided in an embodiment of the present application;
FIG. 3 is a flow chart diagram illustrating a method of processing according to a first embodiment;
FIG. 4 is a schematic flow diagram of another processing method according to the second embodiment;
FIG. 5 is a diagram illustrating a speech rate adjustment interface according to a second embodiment;
FIG. 6 is a diagram showing another speech rate adjustment interface according to the second embodiment;
FIG. 7 is a schematic flow chart diagram illustrating yet another method of processing according to a third embodiment;
FIG. 8 is a schematic diagram of a message prompt according to a third embodiment;
Fig. 9 is a schematic structural diagram of a voice adjusting apparatus according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings. With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the recitation of an element by the phrase "comprising an … …" does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element, and further, where similarly-named elements, features, or elements in different embodiments of the disclosure may have the same meaning, or may have different meanings, that particular meaning should be determined by their interpretation in the embodiment or further by context with the embodiment.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope herein. The word "if" as used herein may be interpreted as "when …", "upon …", or "in response to a determination", depending on the context. Also, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used in this specification, specify the presence of stated features, steps, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, items, species, and/or groups thereof. The terms "or," "and/or," "including at least one of the following," and the like, as used herein, are to be construed as inclusive, meaning any one or any combination. For example, "includes at least one of: A, B, C" means "any of the following: A; B; C; A and B; A and C; B and C; A and B and C"; as a further example, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A and B and C". An exception to this definition will occur only when a combination of elements, functions, steps or operations is inherently mutually exclusive in some way.
It should be understood that, although the steps in the flowcharts in the embodiments of the present application are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times and in different orders, and which may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The word "if", as used herein, may be interpreted as "when …" or "upon …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
It should be noted that step numbers such as S301 and S302 are used herein to describe the corresponding contents more clearly and briefly, and do not constitute a substantial limitation on the sequence; in specific implementations, those skilled in the art may perform S302 first and then S301, and such variations fall within the scope of the present application.
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only for the convenience of description of the present application, and have no specific meaning in themselves. Thus, "module", "component" or "unit" may be used mixedly.
The mobile terminal may be implemented in various forms. For example, the mobile terminal described in the present application may include mobile terminals such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, and the like, and fixed terminals such as a Digital TV, a desktop computer, and the like.
The following description will be given taking a mobile terminal as an example, and it will be understood by those skilled in the art that the configuration according to the embodiment of the present application can be applied to a fixed type terminal in addition to elements particularly used for mobile purposes.
Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal for implementing various embodiments of the present application, the mobile terminal 100 may include: RF (Radio Frequency) unit 101, WiFi module 102, audio output unit 103, A/V (audio/video) input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, processor 110, and power supply 111. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals, which may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile terminal in detail with reference to fig. 1:
The radio frequency unit 101 may be configured to receive and transmit signals during information transmission and reception or during a call; optionally, it receives downlink information from a base station and forwards it to the processor 110 for processing, and sends uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing Long Term Evolution), and TDD-LTE (Time Division Duplexing Long Term Evolution).
WiFi belongs to short-distance wireless transmission technology, and the mobile terminal can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 102, and provides wireless broadband internet access for the user. Although fig. 1 shows the WiFi module 102, it is understood that it does not belong to the essential constitution of the mobile terminal, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 103 may include a speaker, a buzzer, and the like.
The A/V input unit 104 is used to receive audio or video signals. The A/V input unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 may receive sounds in a phone call mode, a recording mode, a voice recognition mode, or the like, and may process such sounds into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101 and then output. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting audio signals.
The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Optionally, the light sensor includes an ambient light sensor that may adjust the brightness of the display panel 1061 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The display unit 106 is used to display information input by a user or information provided to the user. The Display unit 106 may include a Display panel 1061, and the Display panel 1061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 107 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Optionally, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect a touch operation performed by a user on or near the touch panel 1071 (e.g., an operation performed by the user on or near the touch panel 1071 using a finger, a stylus, or any other suitable object or accessory), and drive a corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts: a touch detection device and a touch controller. Optionally, the touch detection device detects the touch orientation of the user, detects a signal caused by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may include other input devices 1072. Optionally, the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
Alternatively, the touch panel 1071 may cover the display panel 1061, and when the touch panel 1071 detects a touch operation thereon or nearby, the touch panel 1071 transmits the touch operation to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although the touch panel 1071 and the display panel 1061 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 1071 and the display panel 1061 may be integrated to implement the input and output functions of the mobile terminal, and is not limited herein.
The interface unit 108 serves as an interface through which at least one external device is connected to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and external devices.
The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a program storage area and a data storage area; optionally, the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, and the like), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile terminal, and the like. Further, the memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor 110 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the mobile terminal. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor, optionally, the application processor mainly handles operating systems, user interfaces, application programs, etc., and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
The mobile terminal 100 may further include a power supply 111 (e.g., a battery) for supplying power to various components, and preferably, the power supply 111 may be logically connected to the processor 110 via a power management system, so as to manage charging, discharging, and power consumption management functions via the power management system.
Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described in detail herein.
In order to facilitate understanding of the embodiments of the present application, a communication network system on which the mobile terminal of the present application is based is described below.
Referring to fig. 2, fig. 2 is an architecture diagram of a communication Network system according to an embodiment of the present disclosure, where the communication Network system is an LTE system of a universal mobile telecommunications technology, and the LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and an IP service 204 of an operator, which are in communication connection in sequence.
Optionally, the UE201 may be the mobile terminal 100 described above, and is not described herein again.
The E-UTRAN202 includes eNodeB2021 and other eNodeBs 2022, among others. Optionally, the eNodeB2021 may be connected with the other eNodeBs 2022 through a backhaul (e.g., X2 interface), the eNodeB2021 is connected to the EPC203, and the eNodeB2021 may provide the UE201 with access to the EPC203.
The EPC203 may include an MME (Mobility Management Entity) 2031, an HSS (Home Subscriber Server) 2032, other MMEs 2033, an SGW (Serving gateway) 2034, a PGW (PDN gateway) 2035, and a PCRF (Policy and Charging Rules Function) 2036, and the like. Optionally, the MME2031 is a control node that handles signaling between the UE201 and the EPC203, providing bearer and connection management. HSS2032 is used to provide registers to manage functions such as home location register (not shown) and holds subscriber specific information about service characteristics, data rates, etc. All user data may be sent through SGW2034, PGW2035 may provide IP address assignment for UE201 and other functions, and PCRF2036 is a policy and charging control policy decision point for traffic data flow and IP bearer resources, which selects and provides available policy and charging control decisions for a policy and charging enforcement function (not shown).
The IP services 204 may include the internet, intranets, IMS (IP Multimedia Subsystem), or other IP services, among others.
Although the LTE system is described as an example, it should be understood by those skilled in the art that the present application is not limited to the LTE system, but may also be applied to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and future new network systems.
In order to better understand the embodiments of the present application, the hardware structure of the mobile terminal and the communication network system are introduced, and various embodiments of the present application are now proposed.
The processing method, the mobile terminal and the storage medium provided by the embodiment of the application are further described in detail below. Referring to fig. 3, fig. 3 is a flow chart illustrating a processing method according to a first embodiment. The processing method shown in fig. 3 includes S301 to S302. The method of the embodiment of the present application may be executed by the mobile terminal shown in fig. 1, or may be executed by a chip in the mobile terminal, and the mobile terminal may be applied to the communication network system shown in fig. 2. The method shown in fig. 3 is executed by a mobile terminal as an example. Wherein:
S301, acquiring a target speech rate level.
In the embodiment of the present application, the target speech rate level may be determined in different manners. For example, the target speech rate level may be set by the user, or may be determined according to a speech rate adjustment reference parameter when the intelligent speech rate adjustment function is turned on. Optionally, the speech rate adjustment reference parameter may be an age parameter of the user, text information of the voice to be broadcast, an emotional state of the user, an environmental parameter in the human-computer interaction process, and the like. In this way, the speech rate of the voice broadcast can be adjusted.
The target speech rate level may be set before the human-computer interaction, or may be set during the human-computer interaction, which is not limited herein.
S302, controlling the human-computer interaction application to perform voice broadcast according to the target speech rate level.
In the embodiment of the present application, the mobile terminal controls the human-computer interaction application to generate audio information of the voice to be broadcast according to the target speech rate level, and then broadcasts the audio information. In this way, the speech rate adjustment of the voice broadcast can be automated and made intelligent, which helps improve the user experience.
In the method described in fig. 3, a target speech rate level is obtained, and the human-computer interaction application is controlled to perform voice broadcast according to the target speech rate level. Therefore, based on the method described in fig. 3, automation and intellectualization of the speech rate adjustment of the voice broadcast can be realized.
Referring to fig. 4, fig. 4 is a flow chart illustrating another processing method according to a second embodiment. The processing method shown in fig. 4 includes S401 to S403. The method of the embodiment of the present application may be executed by the mobile terminal shown in fig. 1, or may be executed by a chip in the mobile terminal, and the mobile terminal may be applied to the communication network system shown in fig. 2. The method shown in fig. 4 is executed by a mobile terminal as an example. Wherein:
S401, obtaining the speech rate level set by the user.
Optionally, the obtaining, by the mobile terminal, the speech rate level set by the user includes: displaying a speech rate adjustment interface, wherein the speech rate adjustment interface comprises at least one speech rate level option; and when a selection operation on the at least one speech rate level option is detected, determining the speech rate level corresponding to the selected speech rate level option as the speech rate level set by the user.
For example, fig. 5 shows a speech rate adjustment interface provided in an embodiment of the present application. The speech rate adjustment interface includes three speech rate level options, an intelligent adjustment option, a confirm option, and a cancel option; optionally, the three speech rate level options are faster, normal, and slower, respectively. When the mobile terminal detects that the user clicks one of the three speech rate level options and then clicks the confirm option, the speech rate level corresponding to the clicked speech rate level option is determined as the speech rate level set by the user; when the mobile terminal detects a click on the cancel option, it exits the speech rate adjustment interface. For example, the speech rate levels are divided into 10 levels, the speech rate level corresponding to the faster option is 8, the speech rate level corresponding to the normal option is 5, and the speech rate level corresponding to the slower option is 3. The mobile terminal detects that the user clicks the normal option, whose corresponding speech rate level is 5, so the speech rate level set by the user is determined to be 5.
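The option-based selection described above can be sketched as a simple lookup. This is only an illustrative sketch: the option names and the function `level_from_option` are hypothetical, and the level values are the example values given in the text (8, 5, and 3 on a 10-level scale).

```python
# Example mapping from the text: faster -> 8, normal -> 5, slower -> 3.
# The dictionary keys and the function name are hypothetical.
OPTION_LEVELS = {"faster": 8, "normal": 5, "slower": 3}


def level_from_option(selected_option, confirmed):
    """Return the user-set speech rate level.

    Returns None if the user cancelled (confirmed is False) or if the
    selected option is not one of the known options.
    """
    if not confirmed:
        return None
    return OPTION_LEVELS.get(selected_option)
```

Returning None on cancel is an assumption made here so that the caller can keep the previous level unchanged.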
Optionally, when the mobile terminal detects that the user performs a selection operation on the intelligent adjustment option, the mobile terminal starts the speech rate intelligent adjustment function, and for a specific implementation, reference is made to the method shown in fig. 7, which is not described herein again.
Optionally, the obtaining, by the mobile terminal, the speech rate level set by the user includes: displaying a speech rate adjustment interface, wherein the speech rate adjustment interface comprises a speech rate adjustment slider; and determining the speech rate level set by the user according to a sliding operation on the speech rate adjustment slider.
For example, fig. 6 shows a speech rate adjustment interface provided in an embodiment of the present application. The speech rate adjustment interface includes a speech rate adjustment slider, an intelligent adjustment option, a confirm option, and a cancel option. Optionally, sliding the speech rate adjustment slider from left to right changes the speech rate from slow to fast, different slider positions correspond to different speech rate levels, and the middle position of the slider indicates a normal speech rate. When the mobile terminal detects a sliding operation on the speech rate adjustment slider followed by a click on the confirm option, the speech rate level corresponding to the position to which the slider was slid is determined as the speech rate level set by the user; when the mobile terminal detects a click on the cancel option, it exits the speech rate adjustment interface. For example, the speech rate levels are divided into 10 levels, the mobile terminal detects a sliding operation on the speech rate adjustment slider, and the speech rate level corresponding to the position to which the slider was slid is 7, so the speech rate level set by the user is determined to be 7.
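One possible way to turn a slider position into a speech rate level is sketched below. The text does not specify the mapping, so everything here is an assumption: the position is taken as a fraction between 0.0 (far left) and 1.0 (far right), and the function name `level_from_slider` is hypothetical. The mapping is chosen so that the middle position yields level 5, which the text elsewhere treats as the normal speech rate.

```python
def level_from_slider(position, levels=10):
    """Map a slider position in [0.0, 1.0] to a level in 1..levels.

    Assumed mapping: far left is level 1, far right is the highest
    level, and positions in between are truncated to the level below.
    """
    # Clamp out-of-range positions reported by the UI.
    position = min(1.0, max(0.0, position))
    return 1 + int(position * (levels - 1))
```

With this mapping the highest level is reached only at the far right end; a real interface might prefer evenly sized bands or discrete snap points.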
Optionally, when the mobile terminal detects that the user performs a selection operation on the intelligent adjustment option, the mobile terminal starts the speech rate intelligent adjustment function, and for a specific implementation, reference is made to the method shown in fig. 7, which is not described herein again.
S402, taking the speech rate grade set by the user as a target speech rate grade.
In the embodiment of the present application, the user sets the speech rate level of the voice broadcast himself or herself, which addresses the fact that different users are comfortable with different broadcast speech rates, and also allows the broadcast speech rate to be adjusted more quickly.
S403, controlling the human-computer interaction application to perform voice broadcast according to the target speech rate level.
Optionally, a specific implementation manner of S403 is the same as the specific implementation manner of S302, and is not described herein again.
In the method described in fig. 4, the human-computer interaction application performs voice broadcast according to the obtained target speech rate level, and optionally, the target speech rate level is set by the user. Therefore, based on the method described in fig. 4, automation and intellectualization of the speech rate adjustment of the voice broadcast can be realized.
Referring to fig. 7, fig. 7 is a flowchart illustrating another processing method according to a third embodiment. The processing method shown in fig. 7 includes S701 to S703. The method of the embodiment of the present application may be executed by the mobile terminal shown in fig. 1, or may be executed by a chip in the mobile terminal, and the mobile terminal may be applied to the communication network system shown in fig. 2. The method shown in fig. 7 is executed by a mobile terminal as an example. Wherein:
S701, when it is detected that the intelligent speech rate adjustment function is turned on, determining a target speech rate level according to the obtained speech rate adjustment reference parameter.
In the embodiment of the present application, the user turns on the intelligent speech rate adjustment function, which addresses the fact that different users are comfortable with different broadcast speech rates, and also allows the broadcast speech rate to be adjusted more conveniently.
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring an age parameter of the user, and determining the target speech rate level according to the age parameter. With this possible implementation, the speech rate adjustment of the voice broadcast can be automated and made intelligent, which helps improve the user experience.
For example, the speech rate level is divided into 10 levels, and if the age parameter is 12 years old or below, the speech rate level is set to 3 levels; if the age parameter is 13 to 30 years old, the speech rate grade is set to 6 grades; if the age parameter is 31-50 years old, the speech rate grade is set to 5 grade; if the age parameter is above 50 years old, the speech rate level is set to level 2. At this time, the acquired age parameter of the user is 23 years, and the speech rate level corresponding to 23 years is set to 6 levels, so that the target speech rate level is determined to be 6.
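The age-based example above amounts to a range lookup. A minimal sketch follows, using the age bands and levels from the example; the function name `level_from_age` is hypothetical.

```python
def level_from_age(age):
    """Return the example speech rate level (10-level scale) for an age.

    Age bands follow the example in the text:
    12 and under -> 3, 13 to 30 -> 6, 31 to 50 -> 5, over 50 -> 2.
    """
    if age <= 12:
        return 3
    if age <= 30:
        return 6
    if age <= 50:
        return 5
    return 2
```

For the 23-year-old user in the example, `level_from_age(23)` gives 6.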
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring text information of voice to be broadcasted, wherein the text information comprises any one or more of the following conditions: text length, text keywords, text content; determining a target speech rate grade according to the text information; optionally, the target speech rate level is one or more speech rate levels. Based on the possible implementation mode, the automation and the intellectualization of the speed regulation of the voice broadcast can be realized, and the improvement of the user experience is facilitated.
For example, the obtained text information of the voice to be broadcast is the text length, and when the text length exceeds a preset threshold, the mobile terminal may increase the speech rate level. Assume that the speech rate levels are divided into 10 levels, the current speech rate level is 5, and the preset threshold of the text length is 500 words. The text length of the voice to be broadcast is 600 words, which exceeds the preset threshold, so the speech rate level is increased to 7, and the target speech rate level is determined to be 7.
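The text-length rule can be sketched as a threshold check. The threshold of 500 words and the increase from level 5 to level 7 come from the example; treating the increase as a fixed step of 2 and capping the result at the top level are assumptions, and the function name is hypothetical.

```python
def level_for_length(text_length, current_level=5, threshold=500,
                     step=2, max_level=10):
    """Raise the speech rate level when the text exceeds the threshold.

    The step of 2 reproduces the example (5 -> 7); the cap at
    max_level is an assumption not stated in the text.
    """
    if text_length > threshold:
        return min(max_level, current_level + step)
    return current_level
```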
For another example, the obtained text information of the voice to be broadcast is a text keyword, and after the text keyword appears, the mobile terminal may decrease or increase the speech rate level. Assume that the speech rate levels are divided into 10 levels and the text keyword is "however"; the speech rate level before the text keyword appears is 6, and the speech rate level after the text keyword appears is reduced to 4. In this case, the target speech rate level consists of two speech rate levels.
For another example, the obtained text information of the voice to be broadcast is the text content. When the text content is special information, the mobile terminal may decrease the speech rate level; when the text content is general information, the mobile terminal may increase the speech rate level; when the text content is popular science knowledge or a returned query result, the mobile terminal may decrease the speech rate level. Assume that the speech rate levels are divided into 10 levels and the current speech rate level is 5. If the text content is "rainstorm today" (special information), the speech rate level is reduced to 4; if the text content is "clear today" (general information), the speech rate level is increased to 6; if the text content is popular science knowledge, the speech rate level is reduced to 4.
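The keyword case, where one broadcast uses two speech rate levels, can be sketched by splitting the text at the keyword. The levels 6 and 4 and the keyword "however" come from the example; the function name, the choice to start the second segment at the keyword itself, and the handling of a missing keyword are assumptions.

```python
def plan_segments(text, keyword="however", before_level=6, after_level=4):
    """Split text at the first keyword occurrence into (segment, level) pairs.

    Text before the keyword keeps before_level; text from the keyword
    onward uses after_level. If the keyword is absent, the whole text
    keeps before_level.
    """
    index = text.find(keyword)
    if index == -1:
        return [(text, before_level)]
    segments = [(text[:index], before_level), (text[index:], after_level)]
    # Drop an empty leading segment when the text starts with the keyword.
    return [(part, level) for part, level in segments if part]
```

Each returned pair could then be synthesized at its own level, which is one way to read "the target speech rate level is one or more speech rate levels".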
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring the emotional state of a user; and determining a target speech rate grade according to the emotional state of the user. Based on the possible implementation mode, the automation and the intellectualization of the speed regulation of the voice broadcast can be realized, and the improvement of the user experience is facilitated.
For example, when the acquired emotional state of the user is sad, the mobile terminal may decrease the speech rate level; when the acquired emotional state of the user is happy, the mobile terminal may increase the speech rate level. Assume that the speech rate levels are divided into 10 levels and the current speech rate level is 5. When the acquired emotional state of the user is sad, the speech rate level is reduced to 4; when the acquired emotional state of the user is happy, the speech rate level is increased to 6.
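The emotion-based adjustment can be sketched as a per-emotion offset applied to the current level. The one-level decrease for sad and one-level increase for happy follow the example; the emotion labels as strings, the clamping to the 1 to 10 range, and the function name are assumptions. How the emotional state itself is recognized is outside this sketch.

```python
# Offsets taken from the example: sad lowers the level by one,
# happy raises it by one. Other emotions leave it unchanged (assumed).
EMOTION_DELTA = {"sad": -1, "happy": 1}


def level_for_emotion(emotion, current_level=5, min_level=1, max_level=10):
    """Return the adjusted speech rate level for a recognized emotion."""
    delta = EMOTION_DELTA.get(emotion, 0)
    return max(min_level, min(max_level, current_level + delta))
```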
Optionally, the obtaining, by the mobile terminal, the emotional state of the user includes: acquiring voice information of a user in a man-machine interaction process; and determining the emotional state of the user according to the voice information of the user. Based on the possible implementation mode, the automation and the intellectualization of the speed regulation of the voice broadcast can be realized, and the improvement of the user experience is facilitated.
For example, the voice information of the user in the human-computer interaction process is "today's task is not completed yet", and the emotional state of the user can be determined to be sad according to the voice information of the user.
Optionally, the obtaining, by the mobile terminal, the emotional state of the user includes: acquiring face image data of a user in a human-computer interaction process; an emotional state of the user is determined from the facial image data of the user.
For example, the facial image of the user is a happy expression in the human-computer interaction process, and the emotional state of the user can be determined to be happy according to the facial image data of the user.
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring environmental parameters in a human-computer interaction process; and determining the target speech rate grade according to the environment parameter. Based on the possible implementation mode, the automation and the intellectualization of the speed regulation of the voice broadcast can be realized, and the improvement of the user experience is facilitated.
For example, when the environmental parameter in the human-computer interaction process indicates a noisy environment, the mobile terminal may decrease the speech rate level; when the environmental parameter indicates a quiet environment, the mobile terminal may increase the speech rate level. Assume that the speech rate levels are divided into 10 levels and the current speech rate level is 5. When the environmental parameter in the human-computer interaction process indicates a noisy environment, the speech rate level is reduced to 4; when the environmental parameter indicates a quiet environment, the speech rate level is increased to 6.
Optionally, the mobile terminal may increase or decrease the volume according to an environmental parameter during the human-computer interaction. For example, when the environmental parameter in the human-computer interaction process is represented as a noisy environment, the volume can be increased; when the environment parameter during the man-machine interaction is represented as a quiet environment, the volume may be reduced.
Optionally, the environmental parameter may also be a time and/or location parameter, e.g. the volume may be decreased if at home and/or at night, and increased if outdoors and/or during the day.
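The environment-based adjustments of speech rate and volume described above can be combined in one small sketch. The directions of change (noisy: slower and louder; quiet: faster and softer) follow the text; the one-step size, the 1 to 10 volume scale, the environment labels, and the function name are all assumptions. Classifying the environment as noisy or quiet is not covered here.

```python
# (speech rate level change, volume change) per environment label.
# Directions follow the text; magnitudes are assumed to be one step.
ENVIRONMENT_ADJUSTMENTS = {
    "noisy": (-1, 1),
    "quiet": (1, -1),
}


def adjust_for_environment(environment, current_level=5, current_volume=5,
                           min_value=1, max_value=10):
    """Return (speech_rate_level, volume) adjusted for the environment."""
    level_delta, volume_delta = ENVIRONMENT_ADJUSTMENTS.get(environment, (0, 0))

    def clamp(value):
        # Keep both quantities inside the assumed 1..10 range.
        return max(min_value, min(max_value, value))

    return clamp(current_level + level_delta), clamp(current_volume + volume_delta)
```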
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter further includes: processing the content to be broadcast according to the speech rate adjustment reference parameter. With this possible implementation, the speech rate adjustment of the voice broadcast can be automated and made intelligent, which helps improve the user experience.
For example, when it is determined that the user is in an impatient state, the mobile terminal may increase the speech rate level, and/or further determine the important content and broadcast only the important content. Assume that the speech rate levels are divided into 10 levels and the current speech rate level is 5. When it is determined that the user is in an impatient state, the speech rate level is increased to 6; the important content in the content to be broadcast may further be determined, and only the important content is broadcast.
For another example, when it is determined that the user is in a relaxed state, the mobile terminal may decrease the speech rate level, and/or further determine the important content and broadcast only the important content. Assume that the speech rate levels are divided into 10 levels and the current speech rate level is 5. When it is determined that the user is in a relaxed state, the speech rate level is reduced to 4; the important content in the content to be broadcast may further be determined, and only the important content is broadcast.
Optionally, the mobile terminal may display the secondary content of the content to be broadcasted without broadcasting.
Optionally, the mobile terminal may increase the volume when broadcasting the important content of the content to be broadcast.
Optionally, the manner of determining the important content may be according to user habits or big data analysis, or may be user setting (for example, according to a set keyword, determining a line or a paragraph where the keyword is located as the important content) or user selection (for example, through touch operation).
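The keyword-based way of determining important content mentioned above (the line containing a set keyword is treated as important) can be sketched as follows. The function name and the line-by-line split are assumptions; the text also allows paragraph-level matching, user selection, and analysis of user habits, none of which are shown.

```python
def split_by_keyword(content, keywords):
    """Split content into (important_lines, secondary_lines).

    A line is treated as important if it contains any of the
    user-set keywords; all other lines are secondary.
    """
    important, secondary = [], []
    for line in content.splitlines():
        if any(keyword in line for keyword in keywords):
            important.append(line)
        else:
            secondary.append(line)
    return important, secondary
```

The important lines would be broadcast, while the secondary lines could be displayed without being broadcast, as described above.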
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring the speech rate grade of the voice of a user in the man-machine interaction process; and determining a target speech rate grade according to the speech rate grade of the voice of the user. Based on the possible implementation mode, the automation and the intellectualization of the speed regulation of the voice broadcast can be realized, and the improvement of the user experience is facilitated.
For example, when the speech rate level of the user's voice in the human-computer interaction process corresponds to a faster speech rate, the mobile terminal may increase the speech rate level; when it corresponds to a slower speech rate, the mobile terminal may decrease the speech rate level. Assume that the speech rate levels are divided into 10 levels: levels 1 to 4 represent a slower speech rate, level 5 represents a normal speech rate, and levels 6 to 10 represent a faster speech rate. The current speech rate level is 5, and the obtained speech rate level of the user's voice in the human-computer interaction process is 7, so the speech rate level may be increased to 6, and the target speech rate level is determined to be 6.
Optionally, the mobile terminal determines the speech rate level of the user's speech as the target speech rate level in the human-computer interaction process. For example, if the speech rate level of the voice of the user in the acquired human-computer interaction process is 6, the target speech rate level is set to 6.
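Both variants above, stepping toward the user's speech rate or adopting it directly, can be sketched in one function. The bands (1 to 4 slower, 5 normal, 6 to 10 faster) and the example result (current 5, user 7, target 6) come from the text; the one-level step, the `mirror` flag, and the function name are assumptions.

```python
def follow_user_rate(user_level, current_level=5, mirror=False,
                     normal_level=5, min_level=1, max_level=10):
    """Determine the target level from the user's own speech rate level.

    mirror=True adopts the user's level directly; otherwise the
    current level moves one step in the direction of the user's rate.
    """
    if mirror:
        return user_level
    if user_level > normal_level:      # user speaks faster than normal
        return min(max_level, current_level + 1)
    if user_level < normal_level:      # user speaks slower than normal
        return max(min_level, current_level - 1)
    return current_level
```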
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring multimedia preference information of a user; and determining a target speech rate grade according to the multimedia preference information of the user. Based on the possible implementation mode, the automation and the intellectualization of the speed regulation of the voice broadcast can be realized, and the improvement of the user experience is facilitated.
For example, when the multimedia preference information of the user is electronic rock music, action-packed videos, or funny videos, the mobile terminal may increase the speech rate level; when the multimedia preference information of the user is light music or slow-paced videos, the mobile terminal may decrease the speech rate level. Assume that the speech rate levels are divided into 10 levels: levels 1 to 4 represent a slower speech rate, level 5 represents a normal speech rate, and levels 6 to 10 represent a faster speech rate. When the acquired multimedia preference information of the user is electronic rock music, the target speech rate level is determined to be 7; when the acquired multimedia preference information of the user is light music, the target speech rate level is determined to be 4.
Optionally, the multimedia preference information of the user may be set by the user, or obtained according to the user's usage habits and/or big data analysis.
Optionally, the determining, by the mobile terminal, the target speech rate level according to the obtained speech rate adjustment reference parameter includes: acquiring voice information of the user in the human-computer interaction process, determining whether the user is a target user, and if so, determining the preset speech rate level corresponding to the target user as the target speech rate level. With this possible implementation, the speech rate adjustment of the voice broadcast can be automated and made intelligent, which helps improve the user experience.
For example, assuming that the preset speech rate level for the target user a is 6, in the process of man-machine interaction, the voice information of the user is acquired, and the mobile terminal determines whether the user is the target user a according to the voice information. If the user is the target user a, the preset speech rate level corresponding to the target user is determined as the target speech rate level, that is, the target speech rate level is 6.
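Once the speaker has been identified, the preset lookup in the example reduces to a table keyed by user. The sketch below assumes that identification has already produced a user identifier; how the voice information is matched to a target user is not specified in the text and is not shown. The names `PRESET_LEVELS` and `level_for_identified_user`, and the fallback behaviour for non-target users, are hypothetical.

```python
# Preset from the example: target user A has speech rate level 6.
PRESET_LEVELS = {"A": 6}


def level_for_identified_user(user_id, fallback_level=None):
    """Return the preset level for a target user.

    For a user without a preset, return fallback_level (an assumed
    behaviour; the text only covers the target-user case).
    """
    return PRESET_LEVELS.get(user_id, fallback_level)
```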
S702, controlling the human-computer interaction application to perform voice broadcast according to the target speech rate level.
Optionally, a specific implementation manner of S702 is the same as the specific implementation manner of S302 described above, and is not described herein again.
S703, if a speech rate adjustment instruction input by the user's voice is detected in the human-computer interaction process, adjusting the speech rate level at which the human-computer interaction application performs voice broadcast according to the speech rate adjustment instruction.
In the embodiment of the present application, the speech rate adjustment instruction is used to instruct a decrease or an increase of the speech rate level. This possible implementation helps improve the user experience. For example, in the human-computer interaction process, it is detected that the speech rate adjustment instruction input by the user's voice instructs a decrease of the speech rate level, so the mobile terminal decreases the speech rate level at which the human-computer interaction application performs voice broadcast.
Optionally, if the same broadcast content is broadcast by voice at the target speech rate level multiple times within a preset time, the mobile terminal prompts the user to choose whether the speech rate level needs to be decreased or increased. Broadcasting the same content multiple times at the target speech rate level within the preset time may indicate that the user did not hear the content clearly, or that the user likes the content, so the mobile terminal may prompt the user to choose whether the speech rate level needs to be decreased or increased. The preset time may be set to any duration, which is not limited herein. The prompt may be a voice prompt, or prompt information output in an interface.
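Applying a recognized voice instruction can be sketched as a bounded step up or down. The text only states that the instruction indicates a decrease or an increase, so the instruction labels, the one-level step, the clamping, and the function name are assumptions. Recognizing the instruction from the user's speech is not shown.

```python
def apply_rate_instruction(instruction, current_level,
                           min_level=1, max_level=10):
    """Adjust the broadcast speech rate level for a voice instruction.

    instruction is assumed to be "increase" or "decrease"; any other
    value leaves the level unchanged.
    """
    if instruction == "increase":
        return min(max_level, current_level + 1)
    if instruction == "decrease":
        return max(min_level, current_level - 1)
    return current_level
```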
As shown in fig. 8, the mobile terminal performs the prompt by outputting the prompt information in the interface. For example, within 10 minutes, when the mobile terminal detects that the same broadcast content is subjected to voice broadcast by using the target speech rate level for multiple times, the mobile terminal displays a message prompt box such as that shown in fig. 8 on the display screen, where the message prompt box includes a "speech rate increase option" and a "speech rate decrease option"; and determining whether to increase the speed of voice broadcast or decrease the speed of voice broadcast according to the selection operation of the user aiming at the two options.
Optionally, if the same broadcast content is broadcast by voice at the target speech rate level multiple times within the preset time, the mobile terminal decreases the speech rate level and increases the broadcast volume. Broadcasting the same content multiple times at the target speech rate level within the preset time probably indicates that the user did not hear the content clearly, so the broadcast speech rate level may be decreased and the broadcast volume increased. In this way, the speech rate adjustment of the voice broadcast can be automated and made intelligent, which helps improve the user experience.
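Detecting repeated broadcasts of the same content within a preset time can be sketched with a sliding time window. The 10-minute window matches the example given for fig. 8; interpreting "multiple times" as three broadcasts, the class name `RepeatMonitor`, and its method are assumptions. What happens on detection (prompting the user, or lowering the speech rate level and raising the volume) is left to the caller.

```python
from collections import defaultdict, deque


class RepeatMonitor:
    """Track broadcasts and report repeats inside a time window."""

    def __init__(self, window_seconds=600, repeat_threshold=3):
        self.window_seconds = window_seconds      # 10 minutes, as in the example
        self.repeat_threshold = repeat_threshold  # assumed meaning of "multiple times"
        self._history = defaultdict(deque)        # content -> broadcast timestamps

    def record(self, content, timestamp):
        """Record one broadcast; return True if the repeat threshold is met."""
        times = self._history[content]
        times.append(timestamp)
        # Discard broadcasts that fall outside the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.repeat_threshold
```

Timestamps are passed in explicitly (in seconds) so the logic can be exercised without a real clock.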
In the method described in fig. 7, the human-computer interaction application performs voice broadcast according to the obtained target speech rate level, and optionally, the target speech rate level is determined according to the speech rate adjustment reference parameter when the speech rate intelligent adjustment function is turned on. Therefore, based on the method described in fig. 7, automation and intellectualization of the speech rate adjustment of the voice broadcast can be realized.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a voice adjusting apparatus according to an embodiment of the present application. The apparatus 90 comprises an acquisition unit 901 and a control unit 902, wherein:
an obtaining unit 901 is configured to obtain a target speech rate level. Alternatively, the operation performed by the acquiring unit 901 may refer to the description in S301 in the method shown in fig. 3.
And the control unit 902 is configured to control the human-computer interaction application to perform voice broadcast according to the target speech speed level. Alternatively, the operation performed by the control unit 902 may refer to the description in S302 in the method shown in fig. 3.
In some embodiments, the obtaining unit 901, when obtaining the target speech rate level, is specifically configured to: acquire the speech rate level set by the user, and take the speech rate level set by the user as the target speech rate level.
In some embodiments, the obtaining unit 901, when obtaining the speech rate level set by the user, is specifically configured to: display a speech rate adjustment interface, wherein the speech rate adjustment interface comprises at least one speech rate level option; and, when a selection operation on the at least one speech rate level option is detected, determine the speech rate level corresponding to the selected option as the speech rate level set by the user.
In some embodiments, the obtaining unit 901, when obtaining the speech rate level set by the user, is specifically configured to: display a speech rate adjustment interface, wherein the speech rate adjustment interface comprises a speech rate adjustment slider; and determine the speech rate level set by the user according to a sliding operation on the speech rate adjustment slider.
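As a non-limiting illustration, a slider position can be converted to a discrete speech rate level as sketched below. The normalized position range of 0.0 to 1.0 and the default of five levels are assumptions for this example.

```python
def level_from_slider(position: float, num_levels: int = 5) -> int:
    """Map a normalized slider position in [0.0, 1.0] to a level in 1..num_levels."""
    if num_levels < 1:
        raise ValueError("num_levels must be at least 1")
    # Clamp positions reported slightly outside the track.
    position = min(1.0, max(0.0, position))
    # Split the track into num_levels equal segments; the right end of the
    # track maps to the highest level.
    return min(num_levels, int(position * num_levels) + 1)
```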
In some embodiments, the obtaining unit 901, when obtaining the target speech rate level, is specifically configured to: when it is detected that the intelligent speech rate adjustment function is turned on, determine the target speech rate level according to the obtained speech rate adjustment reference parameter.
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire an age parameter of the user; and determine the target speech rate level according to the age parameter.
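As a non-limiting illustration, an age parameter can be mapped to a target speech rate level as sketched below. The age brackets, the level values, and the policy of using a slower level for children and older users are assumptions for this example; this passage does not fix a particular mapping.

```python
from typing import Optional


def level_from_age(age: Optional[int], default_level: int = 3) -> int:
    """Return a target speech rate level for the given age parameter."""
    if age is None:
        # No age parameter available: fall back to the default level.
        return default_level
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 12 or age >= 60:
        # Hypothetical brackets: slower broadcast for children and older users.
        return 2
    return default_level
```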
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire text information of the voice to be broadcast, wherein optionally the text information includes any one or more of the following: text length, text keywords, and text content; and determine the target speech rate level according to the text information. Optionally, the target speech rate level is one or more speech rate levels.
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire the emotional state of the user; and determine the target speech rate level according to the emotional state of the user.
In some embodiments, the obtaining unit 901, when obtaining the emotional state of the user, is specifically configured to: acquire voice information of the user during the human-computer interaction; and determine the emotional state of the user according to the voice information of the user.
In some embodiments, the obtaining unit 901, when obtaining the emotional state of the user, is specifically configured to: acquire facial image data of the user during the human-computer interaction; and determine the emotional state of the user according to the facial image data of the user.
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire an environmental parameter during the human-computer interaction; and determine the target speech rate level according to the environmental parameter.
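As a non-limiting illustration, taking ambient noise as one possible environmental parameter, the target speech rate level can be derived as sketched below. The choice of noise as the parameter, the decibel thresholds, and the policy of slowing down in noisier surroundings are assumptions for this example.

```python
# Hypothetical ambient-noise thresholds in dB; each threshold reached
# lowers the speech rate by one level.
NOISE_THRESHOLDS_DB = (60.0, 75.0)


def level_from_noise(noise_db: float, base_level: int = 3, min_level: int = 1) -> int:
    """Lower the base speech rate level by one step per noise threshold reached."""
    steps_down = sum(1 for threshold in NOISE_THRESHOLDS_DB if noise_db >= threshold)
    return max(min_level, base_level - steps_down)
```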
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is further configured to: process the content to be broadcast according to the speech rate adjustment reference parameter.
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire the speech rate level of the user's voice during the human-computer interaction; and determine the target speech rate level according to the speech rate level of the user's voice.
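As a non-limiting illustration, the speech rate level of the user's voice can be estimated from the number of recognized speech units (for example, characters or syllables) and the utterance duration, and then used as the target level. The level boundaries and the policy of simply mirroring the user's level are assumptions for this example.

```python
import bisect

# Hypothetical boundaries, in speech units per second, separating levels 1 to 5.
LEVEL_BOUNDS = (2.0, 3.0, 4.0, 5.0)


def user_speech_rate_level(unit_count: int, duration_s: float) -> int:
    """Quantize the user's speaking rate into a level from 1 to 5."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    rate = unit_count / duration_s
    # bisect_right counts how many boundaries the rate has reached or passed.
    return bisect.bisect_right(LEVEL_BOUNDS, rate) + 1


def target_level_from_user(unit_count: int, duration_s: float) -> int:
    """Simplest policy: broadcast at the same level as the user speaks."""
    return user_speech_rate_level(unit_count, duration_s)
```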
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire multimedia preference information of the user; and determine the target speech rate level according to the multimedia preference information of the user.
In some embodiments, the obtaining unit 901, when determining the target speech rate level according to the obtained speech rate adjustment reference parameter, is specifically configured to: acquire voice information of the user during the human-computer interaction, determine whether the user is a target user, and if so, determine the preset speech rate level corresponding to the target user as the target speech rate level.
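As a non-limiting illustration, the target-user lookup can be sketched as follows. The function name, the `identify` callable (standing in for any speaker-recognition step), and the fallback level are hypothetical; how the speaker is actually recognized is outside the scope of this sketch.

```python
from typing import Callable, Mapping, Optional


def level_for_speaker(voice_sample: bytes,
                      identify: Callable[[bytes], Optional[str]],
                      presets: Mapping[str, int],
                      fallback_level: int) -> int:
    """Return the preset level of a recognized target user, else the fallback."""
    user_id = identify(voice_sample)
    if user_id is not None and user_id in presets:
        # The speaker is a target user with a preset speech rate level.
        return presets[user_id]
    return fallback_level
```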
In some embodiments, the apparatus further comprises an adjustment unit configured to: if a speech rate adjustment instruction input by the user through voice is detected during the human-computer interaction, adjust the speech rate level of the voice broadcast of the human-computer interaction application according to the speech rate adjustment instruction. Optionally, the speech rate adjustment instruction is used to instruct decreasing or increasing the speech rate level.
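As a non-limiting illustration, once the user's voice has been converted to text, a speech rate adjustment instruction can be applied as sketched below. The English keywords, the step of one level, and the level range of 1 to 5 are assumptions for this example.

```python
import re

SLOWER_WORDS = {"slow", "slower"}  # hypothetical keywords for decreasing the level
FASTER_WORDS = {"fast", "faster"}  # hypothetical keywords for increasing the level


def apply_rate_instruction(recognized_text: str, level: int,
                           min_level: int = 1, max_level: int = 5) -> int:
    """Return the new speech rate level after applying a spoken instruction."""
    words = set(re.findall(r"[a-z]+", recognized_text.lower()))
    if words & SLOWER_WORDS:
        delta = -1
    elif words & FASTER_WORDS:
        delta = 1
    else:
        delta = 0  # no adjustment instruction detected
    return min(max_level, max(min_level, level + delta))
```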
It should be noted that, for the operations performed by the units of the apparatus shown in fig. 9, reference may be made to the method embodiments described above; details are not repeated here. The above units may be implemented in hardware, software, or a combination of hardware and software.
The application further provides a mobile terminal, which includes a memory, a user interface, and a processor. The memory stores a processing program that, when executed by the processor, implements the steps of the processing method in any of the above embodiments in combination with the user interface. The user interface includes input devices, such as a sound pickup device and a touch screen, and output devices, such as a speaker and a display screen.
The present application further provides a computer-readable storage medium, on which a processing program is stored, and when the processing program is executed by a processor, the processing program implements the steps of the processing method in any of the above embodiments.
The embodiments of the mobile terminal and the computer-readable storage medium provided in the present application include all technical features of the embodiments of the processing method. Their extended and explanatory content is substantially the same as that of the method embodiments and is not repeated here.
Embodiments of the present application also provide a computer program product, which includes computer program code that, when run on a computer, causes the computer to execute the method in the above various possible embodiments.
Embodiments of the present application further provide a chip, which includes a memory and a processor, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device in which the chip is installed executes the method in the above various possible embodiments.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
The steps in the methods of the embodiments of the application may be reordered, combined, and deleted according to actual needs.
The units in the device in the embodiment of the application can be merged, divided and deleted according to actual needs.
In the present application, the same or similar terms, concepts, technical solutions, and/or application scenarios are generally described in detail only at their first occurrence. For brevity, the detailed description is generally not repeated afterwards; where such terms, concepts, technical solutions, and/or application scenarios are not described in detail later, reference may be made to the earlier detailed description.
In the present application, each embodiment is described with its own emphasis; for parts that are not described or illustrated in a given embodiment, reference may be made to the description of the other embodiments.
The technical features of the technical solutions of the present application may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope described in the present application.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a mobile terminal (e.g., a mobile phone, a computer, a server, a controlled terminal, or a network device) to execute the method of each embodiment of the present application.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, wireless, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, storage disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
The above description is only a preferred embodiment of the present application and is not intended to limit the scope of the present application. All equivalent structural or equivalent process modifications made using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of the present application.

Claims (10)

1. A processing method, the method comprising:
obtaining a target speech rate level;
and controlling a human-computer interaction application to perform voice broadcast according to the target speech rate level.
2. The method according to claim 1, wherein said obtaining a target speech rate level comprises at least one of:
when it is detected that an intelligent speech rate adjustment function is turned on, determining the target speech rate level according to an obtained speech rate adjustment reference parameter;
acquiring a speech rate level set by a user, and taking the speech rate level set by the user as the target speech rate level.
3. The method according to claim 2, wherein said determining the target speech rate level according to the obtained speech rate adjustment reference parameter comprises at least one of:
acquiring an age parameter of a user, and determining the target speech rate level according to the age parameter;
acquiring text information of voice to be broadcast, and determining the target speech rate level according to the text information.
4. The method according to claim 2, wherein said determining the target speech rate level according to the obtained speech rate adjustment reference parameter comprises:
acquiring an emotional state of a user;
and determining the target speech rate level according to the emotional state of the user.
5. The method of claim 4, wherein the acquiring of the emotional state of the user comprises at least one of:
acquiring voice information of the user during human-computer interaction, and determining the emotional state of the user according to the voice information of the user;
acquiring facial image data of the user during human-computer interaction, and determining the emotional state of the user according to the facial image data of the user.
6. The method according to claim 2, wherein said determining the target speech rate level according to the obtained speech rate adjustment reference parameter comprises:
acquiring an environmental parameter during human-computer interaction;
and determining the target speech rate level according to the environmental parameter.
7. The method according to claim 2, wherein said determining the target speech rate level according to the obtained speech rate adjustment reference parameter comprises at least one of:
acquiring a speech rate level of a user's voice during human-computer interaction, and determining the target speech rate level according to the speech rate level of the user's voice;
acquiring multimedia preference information of a user, and determining the target speech rate level according to the multimedia preference information of the user;
acquiring voice information of a user during human-computer interaction, determining whether the user is a target user, and if so, determining a preset speech rate level corresponding to the target user as the target speech rate level.
8. The method according to claim 2, wherein said acquiring the speech rate level set by the user comprises at least one of:
displaying a speech rate adjustment interface, wherein the speech rate adjustment interface comprises at least one speech rate level option; and, when a selection operation on the at least one speech rate level option is detected, determining the speech rate level corresponding to the speech rate level option selected by the selection operation as the speech rate level set by the user;
displaying a speech rate adjustment interface, wherein the speech rate adjustment interface comprises a speech rate adjustment slider; and determining the speech rate level set by the user according to a sliding operation on the speech rate adjustment slider.
9. A mobile terminal, characterized in that the mobile terminal comprises: a memory and a processor, wherein the memory stores a processing program which, when executed by the processor, implements the steps of the processing method of any one of claims 1 to 8.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the processing method according to any one of claims 1 to 8.
CN202110496312.6A 2021-05-07 2021-05-07 Processing method, mobile terminal and storage medium Pending CN113314095A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110496312.6A CN113314095A (en) 2021-05-07 2021-05-07 Processing method, mobile terminal and storage medium

Publications (1)

Publication Number Publication Date
CN113314095A true CN113314095A (en) 2021-08-27

Family

ID=77371558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110496312.6A Pending CN113314095A (en) 2021-05-07 2021-05-07 Processing method, mobile terminal and storage medium

Country Status (1)

Country Link
CN (1) CN113314095A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination