CN114999457A - Voice system testing method and device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN114999457A
Authority
CN
China
Prior art keywords
voice
test
tested
response
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210625488.1A
Other languages
Chinese (zh)
Inventor
蒲敏超
孙玉杰
王彦琴
邓朝明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/01 Assessment or evaluation of speech recognition systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The embodiments of the present application disclose a method and an apparatus for testing a voice system, a storage medium, and an electronic device. A test case is acquired, and a test text, voice attributes, and an evaluation dimension are determined according to the test case. Voice synthesis is performed on the test text according to the voice attributes to obtain a test voice, and the test voice is played to the voice device to be tested. The response voice, the response interface, and the process execution information of the voice device to be tested for the test voice are then acquired, and a test result of the voice device to be tested in the evaluation dimension is obtained according to the response voice, the response interface, and the process execution information.

Description

Voice system testing method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a method and an apparatus for testing a speech system, a storage medium, and an electronic device.
Background
With the development of intelligent terminals such as smart phones and smart home appliances, more and more intelligent terminal devices are equipped with a voice system that can respond intelligently to a user's voice instructions. The degree of intelligence and the reliability of the voice system need to be tested during the testing stage of the intelligent terminal. Most conventional testing schemes rely on manual testing by developers; because an automated testing scheme for voice systems is lacking, testing efficiency is low.
Disclosure of Invention
The embodiment of the application provides a testing method and device of a voice system, a storage medium and electronic equipment, which can improve testing efficiency.
In a first aspect, an embodiment of the present application provides a method for testing a speech system, including:
obtaining a test case, and determining a test text, voice attributes and evaluation dimensions according to the test case;
performing voice synthesis processing on the test text according to the voice attribute to obtain test voice;
playing the test voice for the voice equipment to be tested, and acquiring response voice, response interface and process execution information of the voice equipment to be tested to the test voice;
and obtaining a test result of the to-be-tested voice equipment on the evaluation dimension according to the response voice, the response interface and the process execution information.
In a second aspect, an embodiment of the present application further provides a testing apparatus for a speech system, including:
the parameter determining module is used for acquiring a test case, and determining a test text, a voice attribute and an evaluation dimension according to the test case;
the voice synthesis module is used for carrying out voice synthesis processing on the test text according to the voice attribute to obtain test voice;
the test interaction module is used for playing the test voice to the voice equipment to be tested and acquiring response voice, response interface and process execution information of the voice equipment to be tested to the test voice;
and the test evaluation module is used for obtaining a test result of the voice equipment to be tested on the evaluation dimension according to the response voice, the response interface and the process execution information.
In a third aspect, an embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method for testing a speech system according to any embodiment of the present application.
In a fourth aspect, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory has a computer program, and the processor is configured to execute the method for testing a speech system according to any embodiment of the present application by calling the computer program.
According to the technical solution provided by the embodiments of the present application, when the voice device to be tested is tested, a test case is acquired, and a test text, voice attributes, and an evaluation dimension are determined according to the test case. Voice synthesis is performed on the test text according to the voice attributes to obtain a test voice, and the test voice is played to the voice device to be tested. The response voice, the response interface, and the process execution information of the voice device to be tested for the test voice are then acquired, and a test result of the voice device to be tested in the evaluation dimension is obtained from them. With this solution, once the test case is determined, the voice system can be tested automatically in multiple evaluation dimensions, which improves testing efficiency.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first flowchart illustrating a testing method of a speech system according to an embodiment of the present application.
Fig. 2 is a schematic application scenario diagram of a testing method of a speech system according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a testing apparatus of a speech system according to an embodiment of the present application.
Fig. 4 is a first structural schematic diagram of an electronic device according to an embodiment of the present application.
Fig. 5 is a second structural schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without inventive step, are within the scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
An execution main body of the testing method of the voice system may be the testing apparatus of the voice system provided in the embodiment of the present application, or an electronic device integrated with the testing apparatus of the voice system, where the testing apparatus of the voice system may be implemented in a hardware or software manner. The electronic device may be a smart phone, a tablet computer, a palm computer, a notebook computer, or a desktop computer.
Referring to fig. 1, fig. 1 is a first flowchart illustrating a testing method of a speech system according to an embodiment of the present application. The specific process of the testing method of the voice system provided by the embodiment of the application can be as follows:
101. Acquire a test case, and determine a test text, voice attributes, and an evaluation dimension according to the test case.
The electronic device in the embodiments of the present application serves as a test terminal, and the test terminal is connected to the voice device to be tested. The connection may be wired, for example through a USB data cable, or wireless, for example a short-range wireless connection such as Wi-Fi. The voice device to be tested has a voice system, such as a voice assistant or a voice response system, so testing the voice device to be tested is essentially testing its voice system. The voice device to be tested may be a smart phone, a smart home appliance, a vehicle-mounted smart terminal, or the like. The present application does not limit this: any device with a voice response function may serve as the voice device to be tested in the embodiments of the present application.
In addition, an automated testing system is deployed on the test terminal in the embodiments of the present application. Fig. 2 shows an application scenario of the testing method of the voice system provided in the embodiments of the present application. The automated testing system includes, but is not limited to, the following modules: a human voice module, a recording module, a test module, and an evaluation and verification module (not shown in the figure). The human voice module simulates human speech by speaking the test text in the test case aloud as the test voice. The test voice may be of various types, including but not limited to wake-up word voice, inquiry voice, dialogue voice, and instruction voice; the wake-up word voice is used to wake up the voice system. Once awakened, the voice system can respond to the user's inquiry voice or dialogue voice and can execute the operation corresponding to an instruction voice. Generally, after the voice device to be tested detects a voice signal in the awake state, it either answers the test voice or announces the execution result by voice; the voices produced in these cases are collectively called response voices, and the recording module records them. The evaluation and verification module evaluates the voice device to be tested in each evaluation dimension, such as wake-up accuracy and response accuracy. The test module drives the cooperation of the other modules.
In addition, the test terminal needs to be configured with a plurality of test cases. It also provides functions for adding, deleting, modifying, and querying test cases, and supports writing test case scripts in Java or Python. Different test cases serve different test purposes. Some of the automated test cases are shown below:
Table 1. Examples of some test cases
Case ID | Purpose | Description
TC_000_Wakeup | Verify the wake-up function | Test the wake-up success rate
TC_001_Wakeup | Verify the wake-up function | Test support for multiple languages
TC_002_Wakeup | Verify the wake-up function | Test wake-up at different speech rates
……
TC_100_Answer | Verify the answer function | Verify whether the TTS broadcast voice stutters
TC_101_Answer | Verify the answer function | Verify whether the TTS answer is correct
……
TC_200_Smart | Verify the degree of intelligence | Verify recognition rates for different sentences
TC_201_Smart | Verify the degree of intelligence | Verify recognition rates at different speech rates
……
TC_300_Rely | Verify reliability | Verify the execution success rate of different instructions and skills
TC_301_Rely | Verify reliability | Verify stability during continuous session testing
……
The table above describes some exemplary test cases. Take "TC_001_Wakeup" as an example: this test case tests how well the voice device to be tested supports multiple languages, so its evaluation dimension is multi-language support. For this test case, the test texts in the different languages and the voice attributes it requires must be predefined. The test text and the voice attributes are explained in detail below.
For a tester, when the voice device to be tested needs to be tested, the tester only needs to connect the test terminal with the voice device to be tested, select a test case to be run, and trigger a test instruction. And the test terminal responds to the test instruction, obtains a test case corresponding to the test instruction, and determines a test text, voice attributes and evaluation dimensions required by the test according to the test case.
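As an illustration of how a test case might carry the test text, voice attributes, and evaluation dimension, here is a minimal Python sketch. The class names, field names, and attribute values (`VoiceTestCase`, `VoiceAttributes`, `resolve`, `"multi_language_support"`, the sample utterances) are hypothetical and do not come from the application; they only show one possible layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class VoiceAttributes:
    # Attributes that select the synthesis algorithm (hypothetical values).
    timbre: str = "adult_female"
    language: str = "zh"
    dialect: Optional[str] = None
    # Attributes applied to the synthesized audio afterwards.
    speech_rate: float = 1.0
    volume: float = 1.0


@dataclass
class VoiceTestCase:
    case_id: str
    evaluation_dimension: str
    # Each utterance pairs a test text with the attributes used to speak it.
    utterances: List[Tuple[str, VoiceAttributes]] = field(default_factory=list)


def resolve(case: VoiceTestCase) -> List[Tuple[str, VoiceAttributes, str]]:
    """Expand a test case into (test text, voice attributes, dimension) runs."""
    return [(text, attrs, case.evaluation_dimension)
            for text, attrs in case.utterances]


# Hypothetical multi-language wake-up case, loosely modeled on TC_001_Wakeup.
case = VoiceTestCase(
    case_id="TC_001_Wakeup",
    evaluation_dimension="multi_language_support",
    utterances=[
        ("你好，助手", VoiceAttributes(language="zh")),
        ("Hello, assistant", VoiceAttributes(language="en")),
    ],
)
runs = resolve(case)
```

Keeping the evaluation dimension on the test case rather than on each utterance mirrors the description above, where one test case serves one test purpose.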
102. Perform voice synthesis on the test text according to the voice attributes to obtain the test voice.
After the test text and the voice attributes are determined, voice synthesis is performed on the test text according to the voice attributes to obtain the test voice. In other words, the test terminal simulates a human voice according to the preset voice attributes, and the voice device to be tested analyzes and responds to the test voice when it receives it. For example, the test voice is "Hello, Xiaobu" (Xiaobu is the name of the voice assistant); this is a wake-up sentence that contains the wake-up word "Xiaobu". When the voice device to be tested receives the test voice in the standby state, it first analyzes the voice. If it detects that the voice contains the wake-up word, it starts the main process of the voice assistant; that is, the voice assistant switches from the standby state to the working state and can respond to other voices.
In some embodiments, performing speech synthesis processing on the test text according to the speech attribute to obtain the test speech may include: acquiring a voice synthesis algorithm matched with the first voice attribute; converting the test text according to a speech synthesis algorithm to obtain intermediate test speech matched with the first speech attribute; and adjusting the intermediate test voice according to the second voice attribute to obtain the test voice.
In this embodiment, the voice attributes include a first voice attribute and a second voice attribute. The first voice attributes include, but are not limited to, timbre, dialect type (e.g., Cantonese, Shanghainese), and language type (e.g., Chinese, English). The second voice attributes include, but are not limited to, speech rate and volume. The parameters of the voice synthesis algorithm may differ for different first voice attributes. Therefore, when the voice synthesis algorithms are configured, different parameters are determined for the different first voice attributes, which yields one voice synthesis algorithm per first voice attribute. During testing, after the first voice attribute is obtained from the test case, the matching voice synthesis algorithm is selected from the plurality of preset voice synthesis algorithms, and the test text is converted with that algorithm to obtain an intermediate test voice matching the first voice attribute. The intermediate test voice is then adjusted according to the second voice attribute, for example by adjusting its speech rate and volume, to obtain the final test voice.
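The two-stage synthesis described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `_placeholder_tts` stands in for a real text-to-speech engine, the `SYNTHESIZERS` registry keys are hypothetical, and the speech-rate change uses naive linear-interpolation resampling (which also shifts pitch), not whatever adjustment the application actually uses.

```python
from typing import Callable, Dict, List, Tuple

Samples = List[float]  # mono audio, each value in [-1.0, 1.0]


def _placeholder_tts(text: str) -> Samples:
    # Hypothetical stand-in for a real TTS engine: 100 constant samples per character.
    return [0.5] * (100 * len(text))


# Hypothetical registry: first voice attribute (language, dialect) -> synthesis function.
SYNTHESIZERS: Dict[Tuple[str, str], Callable[[str], Samples]] = {
    ("zh", "mandarin"): _placeholder_tts,
    ("zh", "cantonese"): _placeholder_tts,
    ("en", "general"): _placeholder_tts,
}


def change_rate(samples: Samples, rate: float) -> Samples:
    """Speed audio up (rate > 1) or slow it down (rate < 1) by resampling."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    n_out = max(1, int(len(samples) / rate))
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def set_volume(samples: Samples, gain: float) -> Samples:
    """Scale the amplitude and clip to the valid range."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def synthesize_test_voice(text: str, language: str, dialect: str,
                          speech_rate: float = 1.0, volume: float = 1.0) -> Samples:
    # Stage 1: pick the algorithm matching the first voice attribute.
    synth = SYNTHESIZERS.get((language, dialect))
    if synth is None:
        raise KeyError(f"no synthesizer configured for {(language, dialect)}")
    intermediate = synth(text)
    # Stage 2: adjust the intermediate voice by the second voice attribute.
    return set_volume(change_rate(intermediate, speech_rate), volume)


voice = synthesize_test_voice("hi", "en", "general", speech_rate=2.0, volume=0.5)
```

Separating the stages this way means a new speech rate or volume does not require a new synthesis algorithm, only a different post-processing setting.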
103. Play the test voice to the voice device to be tested, and acquire the response voice, the response interface, and the process execution information of the voice device to be tested for the test voice.
After the test voice is obtained, it is played through the human voice module. After receiving the voice signal, the voice device to be tested analyzes the test voice and gives feedback that depends on the test voice. The feedback mainly includes: responding to the test voice by voice broadcast, that is, playing a response voice and displaying a corresponding user interaction interface on the display of the voice device to be tested; and starting a corresponding process to execute the operation corresponding to the test voice. Taking the wake-up sentence as the test voice again, if the wake-up sentence is "Hello, Xiaobu", the answer sentence may be "I'm here" or "What can I do for you?". Note that the answer sentence is generally pre-configured. When the voice device to be tested recognizes the wake-up sentence accurately, it can answer it accurately, so the accuracy of the answer sentence can serve as one evaluation index of the wake-up capability. In addition, after the voice device to be tested is awakened, the voice system switches from the standby state to the working state, which means the main process of the voice system must be started. The test terminal therefore monitors the startup of the main process and, if the startup succeeds, records information such as the startup duration. The resulting process execution information serves as another evaluation index of the wake-up capability.
After the voice system is awakened, the voice device to be tested displays the post-wake-up UI on its display. For example, if the test voice is a wake-up voice, the response interface is the main interface of the voice assistant. As another example, if the test voice is "Xiaobu, what is the weather like today?", which is a question-and-answer voice, the response voice may be "It is cloudy today", and the response interface is the display interface of the weather query result. As a further example, if the test voice is "Help me open NetEase Cloud Music", the voice system needs to start the NetEase Cloud Music process and display the NetEase Cloud Music interface.
In some embodiments, the step of playing the test voice to the voice device to be tested, and acquiring the response voice, the response interface, and the process execution information of the voice device to be tested to the test voice may include: playing a test voice for the voice equipment to be tested, and acquiring a response voice of the voice equipment to be tested to the test voice through a recording module; and acquiring a response interface displayed by responding to the test voice and a started process from the voice equipment to be tested through the test case, and monitoring a starting result of the process to obtain process execution information.
In this embodiment, the test terminal is connected to the voice device to be tested, and can read the response interface and the process execution information from the voice device to be tested through the running test case. For example, in an embodiment, if the test terminal executes a script of a test case in a Python environment, the test terminal can directly read a response interface and process execution information from the voice device to be tested.
Or, in another embodiment, acquiring a response interface displayed by responding to the test voice and a started process from the voice device to be tested through the test case, and monitoring a start result of the process to obtain process execution information, including: compiling the test case to obtain a test file which can run on the voice equipment to be tested; and sending the test file to the voice equipment to be tested for operation so as to acquire a response interface of the voice equipment to be tested responding to the test voice display, and monitoring the process started by the voice equipment to be tested to obtain process execution information.
In this embodiment, the test file may be a jar file. That is, the test terminal compiles the test case into a jar file that can run on the voice device to be tested and pushes the jar file to the voice device to be tested. The voice device to be tested executes the jar file from the command line by means of the ClassLoader mechanism, and thereby obtains the response interface displayed in response to the test voice and the process that was started. The test case can also monitor the startup result of the process to obtain the process execution information.
In an embodiment, the step of playing the test voice to the voice device to be tested, and obtaining the response voice of the voice device to be tested to the test voice through the recording module may include: adding preset environmental noise to the test voice to obtain mixed test voice, and playing the mixed test voice to the voice equipment to be tested; acquiring the voice to be processed of the voice equipment to be tested through a recording module; carrying out format conversion processing on the voice to be processed to obtain the voice to be processed with a preset audio format; and carrying out audio preprocessing on the voice to be processed with the preset audio format to obtain response voice.
In this embodiment, to make the test environment more realistic, noise signals from real environments are collected and stored in advance. After the test voice is obtained, preset environmental noise is added to it in order to test how well the voice system responds to voice signals in a noisy scene; this response capability serves as an index of the reliability of the voice system. The preset environmental noise is added to the test voice to obtain a mixed test voice, and the mixed test voice is played to the voice device to be tested. When the voice system is in the listening state, it can pick up the mixed test voice played by the test terminal. The test module of the test terminal turns on the recording module and records until the voice assistant has finished answering. After the recording ends, the voice to be processed, that is, the voice broadcast by the voice device to be tested, is obtained. Format conversion is performed on the voice to be processed to obtain it in a preset audio format, such as PCM, which simplifies subsequent operations. Audio preprocessing, such as noise reduction and trimming, is then performed on the voice to be processed in the preset audio format to obtain the response voice.
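The noise mixing, PCM conversion, and trimming steps might be sketched as follows. This is an assumption-laden illustration, not the application's implementation: audio is modeled as lists of floats in [-1.0, 1.0], the noise is looped to cover the test voice, "PCM" is taken to mean 16-bit little-endian samples, and trimming is approximated by stripping leading and trailing near-silence. The function names and the default `noise_gain` and `threshold` values are hypothetical.

```python
import struct
from typing import List


def mix_with_noise(voice: List[float], noise: List[float],
                   noise_gain: float = 0.3) -> List[float]:
    """Add looped environmental noise to the test voice and clip the result."""
    if not noise:
        return list(voice)
    mixed = []
    for i, sample in enumerate(voice):
        value = sample + noise_gain * noise[i % len(noise)]
        mixed.append(max(-1.0, min(1.0, value)))
    return mixed


def to_pcm16(samples: List[float]) -> bytes:
    """Convert float samples to 16-bit little-endian PCM bytes."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def trim_silence(samples: List[float], threshold: float = 0.01) -> List[float]:
    """Drop leading and trailing samples whose magnitude is below the threshold."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


mixed = mix_with_noise([0.5, 0.5, 0.5], [0.1, -0.1], noise_gain=1.0)
pcm = to_pcm16([0.0, 1.0, -1.0])
trimmed = trim_silence([0.0, 0.0, 0.5, 0.2, 0.0])
```

A real pipeline would also apply noise reduction to the recording, which is omitted here because the application does not describe a specific method.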
104. Obtain a test result of the voice device to be tested in the evaluation dimension according to the response voice, the response interface, and the process execution information.
After the three kinds of feedback information are obtained, they are combined to calculate the test result of the voice system in the evaluation dimension corresponding to the current test case.
In an embodiment, obtaining a test result of the voice device to be tested in the evaluation dimension according to the response voice, the response interface, and the process execution information includes: acquiring preset text information, a preset response interface and preset process execution information corresponding to the test voice; extracting text information in the response voice, and calculating a first matching degree according to the text information and preset text information; calculating a second matching degree of the response interface and the preset response interface and a third matching degree between the process execution information and the preset process execution information; and calculating to obtain a test result of the to-be-tested voice equipment on the evaluation dimension according to the first matching degree, the second matching degree and the third matching degree.
In the test case, three expected feedbacks are preset for each test voice: the preset text information, the preset response interface, and the preset process execution information. After the three kinds of actual feedback information are acquired, the matching degree between the actual feedback and the expected feedback is calculated for each kind, in order to judge whether the actual result meets the expected result. Specifically, a first matching degree is calculated from the text information and the preset text information; it reflects whether the voice broadcast of the voice system meets the expected result. A second matching degree between the response interface and the preset response interface and a third matching degree between the process execution information and the preset process execution information are also calculated, and the three matching degrees are combined to obtain the test result of the voice device to be tested in the evaluation dimension. The test result may be the matching degrees of the individual feedbacks, or it may be an indication of whether the test succeeded, derived from the three matching degrees together. The same test case can be executed repeatedly, as many times as the tester specifies, to obtain multiple test results, from which the success rate in the current evaluation dimension is determined.
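One way to compute and combine the three matching degrees is sketched below. The application does not name specific matching algorithms, so every choice here is an assumption: text similarity uses `difflib.SequenceMatcher`, the interface is modeled as a set of UI element names, process information is a small dictionary with hypothetical keys, and a run passes when the lowest matching degree reaches a hypothetical threshold of 0.8.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, List


def text_match(actual: str, expected: str) -> float:
    """First matching degree: similarity of recognized text to preset text."""
    return SequenceMatcher(None, actual, expected).ratio()


def interface_match(actual_elements: Iterable[str],
                    expected_elements: Iterable[str]) -> float:
    """Second matching degree: fraction of expected UI elements that appear."""
    expected = set(expected_elements)
    if not expected:
        return 1.0
    return len(expected & set(actual_elements)) / len(expected)


def process_match(actual: Dict, expected: Dict) -> float:
    """Third matching degree: 1.0 if the expected process started in time."""
    if actual.get("process") != expected.get("process"):
        return 0.0
    if not actual.get("started", False):
        return 0.0
    limit = expected.get("max_start_duration_s", float("inf"))
    return 1.0 if actual.get("start_duration_s", float("inf")) <= limit else 0.0


def run_passed(m1: float, m2: float, m3: float, threshold: float = 0.8) -> bool:
    """Combine the three degrees: the weakest one must reach the threshold."""
    return min(m1, m2, m3) >= threshold


def success_rate(results: List[bool]) -> float:
    """Share of passed runs over repeated executions of one test case."""
    return sum(results) / len(results) if results else 0.0


m1 = text_match("I'm here", "I'm here")
m2 = interface_match(["mic_icon", "title", "hint"], ["mic_icon", "title"])
m3 = process_match(
    {"process": "assistant_main", "started": True, "start_duration_s": 0.4},
    {"process": "assistant_main", "max_start_duration_s": 1.0},
)
passed = run_passed(m1, m2, m3)
rate = success_rate([True, True, False, True])
```

Taking the minimum makes a run fail if any single feedback channel misses its expectation; a weighted average would be an equally plausible combination rule.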
In this way, once the tester triggers the test instruction, the test terminal automatically tests the voice system based on the test case the tester selected and outputs the test result. The tester does not need to perform any other manual operation, which improves testing efficiency.
In particular implementation, the present application is not limited by the execution sequence of the described steps, and some steps may be performed in other sequences or simultaneously without conflict.
As can be seen from the above, in the testing method of the voice system provided in the embodiments of the present application, when the voice device to be tested is tested, a test case is acquired, and a test text, voice attributes, and an evaluation dimension are determined according to the test case. Voice synthesis is performed on the test text according to the voice attributes to obtain a test voice, and the test voice is played to the voice device to be tested. The response voice, the response interface, and the process execution information of the voice device to be tested for the test voice are then acquired, and the test result of the voice device to be tested in the evaluation dimension is obtained according to the response voice, the response interface, and the process execution information.
In one embodiment, a testing device for a voice system is also provided. Referring to fig. 3, fig. 3 is a schematic structural diagram of a testing apparatus 300 for a speech system according to an embodiment of the present application. The testing apparatus 300 of the speech system is applied to an electronic device, and the testing apparatus 300 of the speech system includes a parameter determining module 301, a speech synthesizing module 302, a test interaction module 303, and a test evaluation module 304, as follows:
the parameter determining module 301 is configured to obtain a test case, and determine a test text, a voice attribute, and an evaluation dimension according to the test case;
a speech synthesis module 302, configured to perform speech synthesis processing on the test text according to the speech attribute to obtain a test speech;
the test interaction module 303 is configured to play the test voice for the voice device to be tested, and acquire response voice, a response interface, and process execution information of the voice device to be tested for the test voice;
and the test evaluation module 304 is configured to obtain a test result of the to-be-tested voice device in the evaluation dimension according to the response voice, the response interface, and the process execution information.
In some embodiments, the speech synthesis module 302 is configured to obtain a speech synthesis algorithm matching the first speech attribute; converting the test text according to the voice synthesis algorithm to obtain an intermediate test voice matched with the first voice attribute; and adjusting the intermediate test voice according to the second voice attribute to obtain the test voice.
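The two-stage synthesis described above (select an algorithm by the first voice attribute, then adjust the intermediate voice by the second) can be sketched as follows. Everything concrete here is an assumption made for illustration: the first attribute is taken to be a pitch label, the second attribute a gain factor, and the "synthesizers" are placeholder sine-tone generators standing in for a real text-to-speech engine.

```python
import math
from typing import Callable, Dict, List

Samples = List[float]  # mono audio samples in the range [-1.0, 1.0]


def _tone(freq_hz: float, text: str, rate: int = 8000) -> Samples:
    """Placeholder synthesizer: 10 ms of a sine tone per character of text."""
    n = len(text) * rate // 100
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


# Hypothetical registry: first voice attribute -> synthesis algorithm.
SYNTHESIZERS: Dict[str, Callable[[str], Samples]] = {
    "low_pitch": lambda text: _tone(120.0, text),
    "high_pitch": lambda text: _tone(220.0, text),
}


def synthesize(text: str, first_attribute: str, gain: float) -> Samples:
    """Two-stage synthesis: pick an algorithm, then adjust its output."""
    algorithm = SYNTHESIZERS[first_attribute]  # stage 1: match the algorithm
    intermediate = algorithm(text)             # intermediate test voice
    # Stage 2: adjust by the second attribute (here a gain), clipped to [-1, 1].
    return [max(-1.0, min(1.0, s * gain)) for s in intermediate]
```

For example, `synthesize("hello", "low_pitch", 0.5)` yields the low-pitch placeholder signal at half amplitude.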
In some embodiments, the test interaction module 303 is configured to play the test voice for the voice device to be tested according to the state identifier, and obtain a response voice of the voice device to be tested to the test voice through the recording module; and acquiring a response interface displayed by responding to the test voice and a started process from the voice equipment to be tested through the test case, and monitoring a starting result of the process to obtain process execution information.
In some embodiments, the test interaction module 303 is configured to add a preset environmental noise to the test voice to obtain a mixed test voice, and play the mixed test voice to the voice device to be tested; acquiring the voice to be processed of the voice equipment to be tested through a recording module; carrying out format conversion processing on the voice to be processed to obtain the voice to be processed with a preset audio format; and carrying out audio preprocessing on the voice to be processed in the preset audio format to obtain response voice.
In some embodiments, the test interaction module 303 is configured to compile the test case to obtain a test file that can be run on the voice device to be tested; and send the test file to the voice device to be tested for execution, so as to acquire the response interface displayed by the voice device to be tested in response to the test voice, and monitor the process started by the voice device to be tested to obtain process execution information.
In some embodiments, the test evaluation module 304 is configured to obtain preset text information, a preset response interface, and preset process execution information corresponding to the test voice; extracting text information in the response voice, and calculating a first matching degree according to the text information and the preset text information; calculating a second matching degree of the response interface and the preset response interface and a third matching degree between the process execution information and the preset process execution information; and calculating to obtain a test result of the to-be-tested voice equipment on the evaluation dimension according to the first matching degree, the second matching degree and the third matching degree.
In some embodiments, the evaluation dimension includes at least one of a wake-up function, a degree of reliability, and a degree of intelligence.
It should be noted that the testing apparatus of the speech system provided in the embodiment of the present application and the testing method of the speech system in the foregoing embodiment belong to the same concept, and any method provided in the testing method embodiment of the speech system can be implemented by the testing apparatus of the speech system, and a specific implementation process thereof is described in the testing method embodiment of the speech system, and is not described herein again.
As can be seen from the above, when testing a voice device to be tested, the testing apparatus for a voice system provided in this embodiment of the present application obtains a test case and determines a test text, a voice attribute, and an evaluation dimension according to the test case. It performs voice synthesis processing on the test text according to the voice attribute to obtain a test voice, and plays the test voice to the voice device to be tested. It then acquires the response voice, the response interface, and the process execution information of the voice device to be tested for the test voice, and obtains a test result of the voice device to be tested in the evaluation dimension according to the response voice, the response interface, and the process execution information.
The embodiment of the application also provides the electronic equipment. The electronic device can be a smart phone, a tablet computer and the like. Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 400 comprises a processor 401 and a memory 402. The processor 401 is electrically connected to the memory 402.
The processor 401 is a control center of the electronic device 400, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or calling a computer program stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the electronic device.
Memory 402 may be used to store computer programs and data. The memory 402 stores computer programs containing instructions executable in the processor. The computer program may constitute various functional modules. The processor 401 executes various functional applications and data processing by calling a computer program stored in the memory 402.
In this embodiment, the processor 401 in the electronic device 400 loads instructions corresponding to one or more processes of the computer program into the memory 402 according to the following steps, and the processor 401 runs the computer program stored in the memory 402, so as to implement various functions:
obtaining a test case, and determining a test text, voice attributes and evaluation dimensions according to the test case;
performing voice synthesis processing on the test text according to the voice attribute to obtain test voice;
playing the test voice for the voice equipment to be tested, and acquiring response voice, response interface and process execution information of the voice equipment to be tested to the test voice;
and obtaining a test result of the voice equipment to be tested on the evaluation dimension according to the response voice, the response interface and the process execution information.
In some embodiments, referring to fig. 5, fig. 5 is a second schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 400 further comprises: a radio frequency circuit 403, a display screen 404, a control circuit 405, an input unit 406, an audio circuit 407, a sensor 408, and a power supply 409. The processor 401 is electrically connected to the radio frequency circuit 403, the display screen 404, the control circuit 405, the input unit 406, the audio circuit 407, the sensor 408, and the power supply 409.
The radio frequency circuit 403 is used for transceiving radio frequency signals to communicate with a network device or other electronic devices through wireless communication.
The display screen 404 may be used to display information entered by or provided to the user as well as various graphical user interfaces of the electronic device, which may be comprised of images, text, icons, video, and any combination thereof.
The control circuit 405 is electrically connected to the display screen 404, and is configured to control the display screen 404 to display information.
The input unit 406 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. The input unit 406 may include a fingerprint recognition module.
The audio circuit 407 may provide an audio interface between the user and the electronic device through a speaker and a microphone. The audio circuit 407 comprises a microphone, which is electrically connected to the processor 401 and is used for receiving voice information input by a user.
The sensor 408 is used to collect external environmental information. The sensors 408 may include one or more of ambient light sensors, acceleration sensors, gyroscopes, etc.
The power supply 409 is used to power the various components of the electronic device 400. In some embodiments, the power supply 409 may be logically connected to the processor 401 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown in the drawings, the electronic device 400 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
In this embodiment, the processor 401 in the electronic device 400 loads instructions corresponding to one or more processes of the computer program into the memory 402 according to the following steps, and the processor 401 runs the computer program stored in the memory 402, so as to implement various functions:
obtaining a test case, and determining a test text, voice attributes and evaluation dimensions according to the test case;
performing voice synthesis processing on the test text according to the voice attribute to obtain test voice;
playing the test voice for the voice equipment to be tested, and acquiring response voice, a response interface and process execution information of the voice equipment to be tested for the test voice;
and obtaining a test result of the voice equipment to be tested on the evaluation dimension according to the response voice, the response interface and the process execution information.
Therefore, the embodiment of the present application provides an electronic device. When the electronic device tests a voice device to be tested, it obtains a test case and determines a test text, a voice attribute, and an evaluation dimension according to the test case. It performs voice synthesis processing on the test text according to the voice attribute to obtain a test voice, and plays the test voice to the voice device to be tested. It then acquires the response voice, the response interface, and the process execution information of the voice device to be tested for the test voice, and obtains a test result of the voice device to be tested in the evaluation dimension according to the response voice, the response interface, and the process execution information.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program runs on a computer, the computer executes the method for testing a speech system according to any of the above embodiments.
It should be noted that all or part of the steps in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, which may include, but is not limited to: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.
Furthermore, the terms "first", "second", and "third", etc. in this application are used to distinguish different objects, and are not used to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules recited, but rather, some embodiments include additional steps or modules not recited, or inherent to such process, method, article, or apparatus.
The method, the apparatus, the storage medium, and the electronic device for testing the voice system provided in the embodiments of the present application are described in detail above. The principle and the implementation of the present application are explained herein by applying specific examples, and the above description of the embodiments is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for testing a speech system, comprising:
obtaining a test case, and determining a test text, voice attributes and evaluation dimensions according to the test case;
performing voice synthesis processing on the test text according to the voice attribute to obtain test voice;
playing the test voice for the voice equipment to be tested, and acquiring response voice, response interface and process execution information of the voice equipment to be tested to the test voice;
and obtaining a test result of the voice equipment to be tested on the evaluation dimension according to the response voice, the response interface and the process execution information.
2. The method of claim 1, wherein the voice attributes comprise a first voice attribute and a second voice attribute; the performing speech synthesis processing on the test text according to the speech attribute to obtain a test speech includes:
acquiring a voice synthesis algorithm matched with the first voice attribute;
converting the test text according to the voice synthesis algorithm to obtain an intermediate test voice matched with the first voice attribute;
and adjusting the intermediate test voice according to the second voice attribute to obtain the test voice.
3. The method of claim 1, wherein the playing the test voice to the voice device to be tested and acquiring the response voice, the response interface and the process execution information of the voice device to be tested to the test voice comprises:
playing the test voice for the voice equipment to be tested, and acquiring the response voice of the voice equipment to be tested to the test voice through a recording module;
and acquiring a response interface displayed by responding to the test voice and a started process from the voice equipment to be tested through the test case, and monitoring a starting result of the process to obtain process execution information.
4. The method of claim 3, wherein the playing the test voice to the voice device to be tested and obtaining the response voice of the voice device to be tested to the test voice through a recording module comprises:
adding preset environmental noise to the test voice to obtain mixed test voice, and playing the mixed test voice to the voice equipment to be tested;
acquiring the voice to be processed of the voice equipment to be tested through a recording module;
carrying out format conversion processing on the voice to be processed to obtain the voice to be processed with a preset audio format;
and carrying out audio preprocessing on the voice to be processed with the preset audio format to obtain response voice.
5. The method of claim 3, wherein the obtaining, from the voice device to be tested, a response interface displayed by responding to the test voice and a started process through the test case, and monitoring a start result of the process to obtain process execution information includes:
compiling the test case to obtain a test file which can run on the voice equipment to be tested;
and sending the test file to the voice equipment to be tested for execution, so as to acquire the response interface displayed by the voice equipment to be tested in response to the test voice, and monitoring a process started by the voice equipment to be tested to obtain process execution information.
6. The method of claim 3, wherein obtaining the test result of the voice device under test in the evaluation dimension according to the response voice, the response interface and the process execution information comprises:
acquiring preset text information, a preset response interface and preset process execution information corresponding to the test voice;
extracting text information in the response voice, and calculating a first matching degree according to the text information and the preset text information;
calculating a second matching degree of the response interface and the preset response interface and a third matching degree between the process execution information and the preset process execution information;
and calculating to obtain a test result of the to-be-tested voice equipment on the evaluation dimension according to the first matching degree, the second matching degree and the third matching degree.
7. The method of any of claims 1 to 6, wherein the evaluation dimension comprises at least one of a wake-up function, a degree of reliability, and a degree of intelligence.
8. An apparatus for testing a speech system, comprising:
the parameter determining module is used for acquiring a test case, and determining a test text, a voice attribute and an evaluation dimension according to the test case;
the voice synthesis module is used for carrying out voice synthesis processing on the test text according to the voice attribute to obtain test voice;
the test interaction module is used for playing the test voice to the voice equipment to be tested and acquiring response voice, response interface and process execution information of the voice equipment to be tested to the test voice;
and the test evaluation module is used for obtaining a test result of the voice equipment to be tested on the evaluation dimension according to the response voice, the response interface and the process execution information.
9. A computer-readable storage medium, on which a computer program is stored which, when run on a computer, causes the computer to carry out a method of testing a speech system according to any one of claims 1 to 7.
10. An electronic device comprising a processor and a memory, said memory storing a computer program, wherein said processor is adapted to execute the method of testing a speech system according to any one of claims 1 to 7 by invoking said computer program.
CN202210625488.1A 2022-06-02 2022-06-02 Voice system testing method and device, storage medium and electronic equipment Pending CN114999457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210625488.1A CN114999457A (en) 2022-06-02 2022-06-02 Voice system testing method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210625488.1A CN114999457A (en) 2022-06-02 2022-06-02 Voice system testing method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114999457A true CN114999457A (en) 2022-09-02

Family

ID=83032112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210625488.1A Pending CN114999457A (en) 2022-06-02 2022-06-02 Voice system testing method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114999457A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116893864A (en) * 2023-07-17 2023-10-17 无锡车联天下信息技术有限公司 Method and device for realizing voice assistant of intelligent cabin and electronic equipment
CN116893864B (en) * 2023-07-17 2024-02-13 无锡车联天下信息技术有限公司 Method and device for realizing voice assistant of intelligent cabin and electronic equipment

Similar Documents

Publication Publication Date Title
US11670302B2 (en) Voice processing method and electronic device supporting the same
CN112863547B (en) Virtual resource transfer processing method, device, storage medium and computer equipment
CN107147618A (en) A kind of user registering method, device and electronic equipment
CN108694944B (en) Method and apparatus for generating natural language expressions by using a framework
CN108694947B (en) Voice control method, device, storage medium and electronic equipment
US10997965B2 (en) Automated voice processing testing system and method
CN109903773B (en) Audio processing method, device and storage medium
CN108469966A (en) Voice broadcast control method and device, intelligent device and medium
CN108108142A (en) Voice information processing method, device, terminal device and storage medium
CN110175012B (en) Skill recommendation method, skill recommendation device, skill recommendation equipment and computer readable storage medium
CN112840396A (en) Electronic device for processing user words and control method thereof
US20200265843A1 (en) Speech broadcast method, device and terminal
CN106328176B (en) A kind of method and apparatus generating song audio
CN109637536B (en) Method and device for automatically identifying semantic accuracy
CN110808029A (en) Vehicle-mounted machine voice test system and method
CN109119071A (en) A kind of training method and device of speech recognition modeling
CN110225386A (en) A kind of display control method, display equipment
US20220172722A1 (en) Electronic device for processing user utterance and method for operating same
CN111724781B (en) Audio data storage method, device, terminal and storage medium
CN112417107A (en) Information processing method and device
CN114999457A (en) Voice system testing method and device, storage medium and electronic equipment
CN111179907A (en) Voice recognition test method, device, equipment and computer readable storage medium
CN108600559B (en) Control method and device of mute mode, storage medium and electronic equipment
US10976997B2 (en) Electronic device outputting hints in an offline state for providing service according to user context
KR20210001082A (en) Electornic device for processing user utterance and method for operating thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination