WO2020135160A1 - Terminal, method for determining a voice server, and computer-readable storage medium - Google Patents

Terminal, method for determining a voice server, and computer-readable storage medium

Info

Publication number
WO2020135160A1
WO2020135160A1 PCT/CN2019/126018 CN2019126018W WO2020135160A1 WO 2020135160 A1 WO2020135160 A1 WO 2020135160A1 CN 2019126018 W CN2019126018 W CN 2019126018W WO 2020135160 A1 WO2020135160 A1 WO 2020135160A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
voice server
determining
character string
server
Prior art date
Application number
PCT/CN2019/126018
Other languages
English (en)
French (fr)
Inventor
周文杰
罗清刚
Original Assignee
深圳Tcl新技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳Tcl新技术有限公司
Publication of WO2020135160A1 (zh)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Definitions

  • The present application relates to the field of voice recognition technology, and in particular to a terminal, a method for determining a voice server, and a computer-readable storage medium.
  • With the development of artificial intelligence technology, the voice dialogue system has gradually become a popular way of human-computer interaction.
  • Compared with traditional GUI (graphical user interface) interaction, the greatest advantage of voice interaction is its convenience when long text has to be entered.
  • Voice recognition is the first link of voice interaction and has a great impact on the user experience. However, current mainstream voice recognition services have the following problems. First, availability is not high: some service providers occasionally stop responding altogether, so the terminal cannot obtain the voice recognition result returned by the voice server. Second, there are regional differences: in different provinces of the country, the response speed of each service provider differs. As a result, the terminal may end up using a voice recognition service with poor server quality.
  • The main purpose of the present application is to provide a terminal, a method for determining a voice server, and a computer-readable storage medium, aiming to solve the problem that the terminal uses a voice recognition service with poor server quality.
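  • To achieve the above purpose, the present application provides a method for determining a voice server. The method is applied to a terminal that includes a voice receiving module, and includes the following steps:
  • after receiving voice information, sending the voice information to each voice server;
  • receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
  • determining the voice recognition quality score corresponding to each voice server according to each text information, and determining the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server;
  • using the voice server with the highest service quality score as the target voice server.
  • In an embodiment, the step of determining the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server includes:
  • taking each voice server in turn as the current voice server;
  • determining a target duration according to the text return duration corresponding to the current voice server;
  • performing a weighted calculation on the target duration and the voice recognition quality score corresponding to the current voice server to obtain the service quality score corresponding to the current voice server.
  • In an embodiment, the step of determining the target duration according to the text return duration corresponding to the current voice server includes:
  • judging whether the text return duration corresponding to the current voice server is less than a preset duration;
  • when the text return duration is greater than or equal to the preset duration, using the text return duration as the target duration corresponding to the current voice server;
  • when the text return duration is less than the preset duration, using the preset duration as the target duration corresponding to the current voice server.
  • In an embodiment, the step of determining the voice recognition quality score corresponding to each voice server according to each text information includes:
  • determining the score of each character string in the text information;
  • determining the voice recognition quality score of the voice server corresponding to the text information according to the score corresponding to each character string and the number of character strings in the text information.
  • In an embodiment, the step of determining the score of each character string in the text information includes:
  • determining each character string in the text information as the target character string in turn;
  • determining the true value corresponding to the target character string, to judge whether the target character string matches the true value;
  • when the target character string matches the true value, using a first preset score as the score of the target character string;
  • when the target character string does not match the true value, using a second preset score as the score of the target character string, where the second preset score is less than the first preset score.
  • In an embodiment, the step of determining the score of each character string in the text information includes:
  • determining the number of character strings in the text information;
  • when the number is less than a set number, determining the score of each character string after the set serial number character string in the text information as the second preset score;
  • using each character string before the set serial number character string as the target character string, and performing the step of determining the true value corresponding to the target character string.
  • In an embodiment, after the step of determining the service quality score of each voice server, the method further includes:
  • determining the service priority of each voice server according to its service quality score, where the higher the service quality score of a voice server, the higher its service priority;
  • saving the service priority corresponding to each voice server.
  • In an embodiment, the method for determining the voice server further includes:
  • after receiving voice information, judging whether the terminal stores the service priority of each voice server;
  • when the terminal does not store the service priority of each voice server, performing the step of sending the voice information to each voice server;
  • when the terminal stores the service priority of each voice server, sending the voice information to the voice server with the highest service priority.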
  • To achieve the above purpose, the present application also provides a terminal. The terminal includes a voice receiving module, a processor, a memory, and a voice server determination program that is stored on the memory and can run on the processor. When the voice server determination program is executed by the processor, each step of the method for determining the voice server described above is implemented.
  • To achieve the above purpose, the present application also provides a computer-readable storage medium that stores a voice server determination program. When the voice server determination program is executed by a processor, each step of the method for determining the voice server described above is implemented.
  • With the terminal, the method for determining the voice server, and the computer-readable storage medium provided by the present application, the terminal, after receiving voice information, sends the voice information to each voice server, receives the text information fed back by each voice server, and determines the text return duration with which each voice server fed back the text information. It then determines the voice recognition quality score of each voice server according to each text information, and determines the service quality score of each voice server according to its voice recognition quality score and text return duration. The voice server with the highest service quality score is used as the target voice server; that is, subsequent voice information of the terminal is recognized and fed back by the target voice server, so that the terminal can obtain a voice recognition service with better service quality.
  • FIG. 1 is a schematic diagram of a hardware structure of a terminal involved in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a first embodiment of a method for determining a voice server of this application
  • FIG. 3 is a detailed flowchart of step S300 in FIG. 2;
  • FIG. 4 is a detailed flowchart of step S310 in FIG. 3;
  • FIG. 5 is a schematic flowchart of a second embodiment of a method for determining a voice server of this application
  • FIG. 6 is a schematic flowchart of a third embodiment of a method for determining a voice server of this application.
  • The main solution of the embodiments of the present application is: after receiving voice information, sending the voice information to each voice server; receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information; determining the voice recognition quality score corresponding to each voice server according to each text information, and determining the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server; and using the voice server with the highest service quality score as the target voice server.
  • Because the terminal determines the service quality score of each voice server according to the text return duration and the voice recognition quality score of each voice server, and selects the voice server with the highest service quality score as the target voice server, the terminal can obtain a voice recognition service with better service quality.
  • the terminal may be as shown in FIG. 1.
  • the solution of the embodiment of the present application relates to a terminal.
  • the terminal includes: a processor 101, such as a CPU, a memory 102, a communication bus 103, and a voice receiving module 104.
  • the communication bus 103 is configured to implement connection communication between these components.
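  • The memory 102 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. As shown in FIG. 1, the memory 102, which is a computer storage medium, may include a voice server determination program; and the processor 101 may be configured to call the voice server determination program stored in the memory 102 and perform the following operations:
  • after receiving voice information, sending the voice information to each voice server;
  • receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
  • determining the voice recognition quality score corresponding to each voice server according to each text information, and determining the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server;
  • using the voice server with the highest service quality score as the target voice server.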
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the preset duration is used as the target duration corresponding to the current server.
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the speech recognition quality score of the voice server corresponding to the text information is determined according to the score corresponding to each of the character strings and the number of character strings in the text information.
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the first preset score is used as the score of the target character string;
  • a second preset score is used as the score of the target character string, where the second preset score is less than the first preset score.
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the score of each character string after the serial number character string set in the text information is determined as the second preset score
  • Each character string before the set serial number character string is used as the target character string, and the step of determining the truth value corresponding to the target character string is performed.
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the service priority corresponding to each voice server is saved.
  • the processor 101 may be configured to call the determination program of the voice server stored in the memory 102 and perform the following operations:
  • the voice information is sent to the voice server with the highest service priority.
  • In the technical solution of this embodiment, the terminal, after receiving voice information, sends the voice information to each voice server, receives the text information fed back by each voice server, and determines the text return duration with which each voice server fed back the text information. It then determines the voice recognition quality score of each voice server according to each text information, and determines the service quality score of each voice server according to its voice recognition quality score and text return duration. The voice server with the highest service quality score is used as the target voice server; that is, subsequent voice information of the terminal is recognized and fed back by the target voice server, so that the terminal can obtain a voice recognition service with better service quality.
  • FIG. 2 shows a first embodiment of the method for determining a voice server of the present application.
  • The method for determining the voice server includes the following steps:
  • Step S100: after receiving voice information, send the voice information to each voice server;
  • The execution subject is a terminal, and the terminal is provided with a voice receiving module.
  • The terminal collects the voice information uttered by the user through the voice receiving module.
  • The terminal may be a household appliance such as a television, a mobile phone, or an air conditioner.
  • The terminal is communicatively connected to multiple voice servers.
  • The terminal can send voice information to each voice server. After receiving the voice information, each voice server recognizes it to convert the voice information into text information.
  • Step S200: receive the text information fed back by each voice server, and determine the text return duration with which each voice server fed back the text information;
  • After converting the voice information into text information, the voice server feeds the text information back to the terminal.
  • When the terminal receives text information, it records the text return duration of the voice server that returned it. Specifically, the terminal starts timing when it sends the voice information to the voice servers; on receiving the text information returned by a voice server, it calculates the interval between the moment the voice information was sent and the moment the text information was received. This interval is the text return duration of that voice server.
  • Further, the terminal is provided with a preset interval duration. Once the preset interval duration has elapsed, the terminal stops receiving text information fed back by the voice servers; that is, a voice server that feeds back its text information only after the preset interval duration can be deemed to have poor voice service quality. The preset interval duration can be any suitable value, such as 10 s.
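  • Steps S100 and S200 can be sketched as a concurrent fan-out with a shared start time. This is a minimal sketch, not the applicant's implementation: the `servers` mapping of names to callables, the function name `query_servers`, and the use of a thread pool are all assumptions made for illustration; only the 10 s example interval comes from the text.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

PRESET_INTERVAL_S = 10.0  # example preset interval duration from the text


def query_servers(voice, servers, timeout_s=PRESET_INTERVAL_S):
    """Send `voice` to every server at once and time each answer.

    `servers` maps a server name to a callable(voice) -> text; these
    callables are hypothetical stand-ins for real recognition clients.
    Returns {name: (text, text_return_duration_in_seconds)} for the
    servers that answered within `timeout_s`.
    """
    start = time.monotonic()  # timing starts when the voice information is sent

    def call(recognize):
        text = recognize(voice)
        return text, time.monotonic() - start

    pool = ThreadPoolExecutor(max_workers=max(1, len(servers)))
    futures = {pool.submit(call, fn): name for name, fn in servers.items()}
    done, _late = wait(futures, timeout=timeout_s)
    # Stop waiting for late servers (cancel_futures needs Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)

    results = {}
    for future in done:
        if future.exception() is None:  # a failed call counts as no answer
            results[futures[future]] = future.result()
    return results
```

  • Servers that miss the interval are simply absent from the result, which matches the idea that a late server is treated as having poor service quality.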
  • Step S300: determine the voice recognition quality score corresponding to each voice server according to each text information, and determine the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server;
  • The terminal determines the service quality score of a voice server from that server's text return duration and voice recognition quality score.
  • The voice recognition quality score represents the quality of the text fed back by the voice server.
  • The voice recognition quality score can be determined from the text information. Specifically, referring to FIG. 3, determining the voice recognition quality score corresponding to each voice server according to each text information in step S300 includes:
  • Step S310: determine the score of each character string in the text information;
  • Each character in the text information is represented by a corresponding character string, and one character corresponds to a unique character string.
  • For example, the character string u4eca corresponds to the character "今" ("present"). The terminal scores each character string in the text information to obtain the score corresponding to each character string.
  • Specifically, step S310 includes:
  • Step S311: determine each character string in the text information as the target character string in turn;
  • Step S312: determine the true value corresponding to the target character string, to judge whether the target character string matches the true value;
  • Step S313: when the target character string matches the true value, use the first preset score as the score of the target character string;
  • Step S314: when the target character string does not match the true value, use a second preset score as the score of the target character string, where the second preset score is less than the first preset score;
  • The terminal takes each character string in the text information as the target character string in turn, and then determines the true value corresponding to the target character string.
  • Because the terminal receives text information from multiple voice servers, it first determines the serial number of the target character string within its text information. For example, the target character string is the fifth character string of the text information (the character strings of each text information are ordered from left to right and from top to bottom). The terminal then obtains the fifth character string of every text information and counts how many of them are identical.
  • The character string that occurs most often is taken as the true value corresponding to the target character string. For example, with five pieces of text information, if the five character strings fall into two groups of identical strings, one group with 3 members and the other with 2, the character string of the group with 3 members is the true value corresponding to the target character string;
  • After determining the true value corresponding to the target character string, the terminal judges whether the target character string matches the true value, that is, whether the two are consistent. If they are consistent, the score corresponding to the target character string is the first preset score; if not, the score is the second preset score, which is less than the first preset score. The first preset score and the second preset score may be any suitable values; for example, the first preset score is 1 and the second preset score is 0.
  • In this way, the score corresponding to each character string in each text information is obtained.
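  • Steps S311 to S314 amount to a per-position majority vote. The sketch below assumes each piece of text information is a Python string and treats each character as one "character string"; the function names are hypothetical, and the tie-breaking rule (the first candidate seen wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter

FIRST_PRESET_SCORE = 1   # example value from the text
SECOND_PRESET_SCORE = 0  # example value from the text


def truth_at(texts, position):
    """Most frequent character at `position` across all returned texts."""
    candidates = [t[position] for t in texts if position < len(t)]
    # most_common keeps first-seen order on ties (an assumed tie-break).
    return Counter(candidates).most_common(1)[0][0]


def score_characters(text, texts):
    """Score every character of `text`; `texts` must include `text` itself."""
    return [
        FIRST_PRESET_SCORE if ch == truth_at(texts, i) else SECOND_PRESET_SCORE
        for i, ch in enumerate(text)
    ]
```

  • With five returned texts where three agree on a position, the agreeing character is the true value for that position, as in the 3-versus-2 example above.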
  • Step S320: determine the voice recognition quality score of the voice server corresponding to the text information according to the score corresponding to each character string and the number of character strings in the text information;
  • Once the score of each character string is known, the scores are used to obtain the voice recognition quality score of the voice server corresponding to the text information.
  • The voice recognition quality score can be calculated with the following formula:
  • $\mathrm{Score}_{text} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{Score}_{i}$
  • where $\mathrm{Score}_{i}$ is the score corresponding to the i-th character string, $\mathrm{Score}_{text}$ is the voice recognition quality score, and n is the number of character strings in the text information.
  • That is, the sum of the scores corresponding to the character strings is divided by the number of character strings in the text information to obtain the voice recognition quality score of the voice server corresponding to the text information.
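  • As a worked example of step S320, the sketch below averages the per-character scores. The function name is hypothetical, and returning 0.0 for an empty text is an assumption; the text does not cover that case.

```python
def recognition_quality(scores):
    """Sum of the character scores divided by the number of characters."""
    if not scores:
        return 0.0  # assumed handling of an empty text
    return sum(scores) / len(scores)
```

  • For a four-character text in which three characters match their true values, the scores are 1, 1, 1 and 0, so the voice recognition quality score is 3 / 4 = 0.75.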
  • After the voice recognition quality score corresponding to a voice server is obtained, the terminal can calculate the service quality score of that voice server from its voice recognition quality score and text return duration. Specifically, corresponding weights are assigned to the voice recognition quality score and the text return duration, and the two are combined in a weighted calculation to obtain the service quality score corresponding to the voice server.
  • The service quality score is calculated with a weighted formula in which:
  • Score tts is the service quality score of the voice server
  • Score text is the voice recognition quality score
  • A is the weight corresponding to the text return duration
  • T is the text return duration
  • B is the weight corresponding to the voice recognition quality.
  • The sum of the weight A and the weight B is 1, and A and B can be any suitable values.
  • For example, A is 0.88 and B is 0.12.
  • In this way, the service quality score corresponding to each voice server can be calculated.
  • After the service quality scores are obtained, the terminal can determine the service priority of each voice server according to a setting rule: the higher the service quality score, the higher the service priority of the voice server. The service priority corresponding to each voice server is then saved.
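  • The weighting and ranking can be sketched as follows. The exact formula is not reproduced in this text, so the speed term `reference_ms / duration_ms` and the `REFERENCE_MS` normalizer are assumptions chosen only so that a shorter text return duration yields a higher score; the weights 0.88 and 0.12 are the example values given above, and the function names are hypothetical.

```python
A = 0.88              # example weight of the text return duration
B = 0.12              # example weight of the voice recognition quality
REFERENCE_MS = 180.0  # hypothetical normalizer, not taken from the formula


def service_quality(duration_ms, score_text, a=A, b=B, reference_ms=REFERENCE_MS):
    """Assumed form: a * (reference / T) + b * Score_text."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    return a * (reference_ms / duration_ms) + b * score_text


def rank_servers(measurements):
    """measurements: {name: (duration_ms, score_text)}.

    Returns server names ordered from highest to lowest service priority.
    """
    scores = {name: service_quality(t, s) for name, (t, s) in measurements.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

  • The ordered list returned by `rank_servers` is one possible representation of the saved service priority: the first entry is the target voice server.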
  • Step S400: use the voice server with the highest service quality score as the target voice server;
  • The voice server with the highest service quality score is used as the target voice server, and the terminal sends subsequent voice information to the target voice server, so that the terminal enjoys a voice recognition service with better service quality.
  • After the terminal receives voice information, it first judges whether it stores the service priority of each voice server. If it does not, it performs steps S100 to S400. If it does, it sends the voice information to the voice server with the highest service priority.
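  • The dispatch decision described above can be sketched in a few lines. The representation of the saved priority as a list ordered from highest to lowest priority, and the function name, are assumptions for illustration.

```python
def choose_servers(all_servers, saved_priority):
    """Return the servers that should receive the voice information.

    `saved_priority` is an assumed representation: a list of server names
    ordered from highest to lowest service priority, or None if nothing
    has been saved yet.
    """
    if saved_priority:
        return [saved_priority[0]]  # only the highest-priority server
    return list(all_servers)        # no saved priority: run steps S100-S400
```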
  • In the technical solution provided by this embodiment, the terminal, after receiving voice information, sends the voice information to each voice server, receives the text information fed back by each voice server, and determines the text return duration with which each voice server fed back the text information. It then determines the voice recognition quality score of each voice server according to each text information, and determines the service quality score of each voice server according to its voice recognition quality score and text return duration. The voice server with the highest service quality score is used as the target voice server; that is, subsequent voice information of the terminal is recognized and fed back by the target voice server, so that the terminal can obtain a voice recognition service with better service quality.
  • FIG. 5 shows a second embodiment of the method for determining a voice server of the present application. Based on the first embodiment, determining, in step S300, the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server includes:
  • Step S330: take each voice server in turn as the current voice server;
  • Step S340: determine the target duration according to the text return duration corresponding to the current voice server;
  • Step S350: perform a weighted calculation on the target duration and the voice recognition quality score corresponding to the current voice server to obtain the service quality score corresponding to the current voice server;
  • The terminal may calculate the service quality score of a voice server directly from the weighted text return duration and voice recognition quality score. However, when the text return durations of several voice servers are all below a certain duration, the text return rate of those voice servers can be considered fast; that is, every voice server whose text return duration is less than the preset duration is regarded as having a good text return rate.
  • The terminal therefore determines the target duration from the text return duration of the voice server: if the text return duration is less than the preset duration, the preset duration is taken as the target duration corresponding to the voice server; if the text return duration is greater than or equal to the preset duration, the text return duration is taken as the target duration.
  • Then, taking each voice server in turn as the current voice server, the terminal performs a weighted calculation on the target duration and the voice recognition quality score corresponding to the current voice server to obtain the service quality score corresponding to the current voice server, and so on until the service quality score of every voice server is obtained.
  • The preset duration can be any suitable value, for example 180 ms.
  • In the technical solution provided by this embodiment, the terminal determines the target duration of a voice server by comparing its text return duration with the preset duration, so that the terminal can reasonably calculate the service quality score of each voice server, which gives the terminal a high degree of intelligence.
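  • Step S340 reduces to taking the larger of the text return duration and the preset duration. A minimal sketch, with a hypothetical function name and the 180 ms example value from the text:

```python
PRESET_DURATION_MS = 180.0  # example preset duration from the text


def target_duration(text_return_ms, preset_ms=PRESET_DURATION_MS):
    """Durations below the preset are raised to the preset; others are kept."""
    return max(text_return_ms, preset_ms)
```

  • Two servers that answer in 120 ms and 150 ms both receive a target duration of 180 ms, so neither gains an advantage from the speed term and the voice recognition quality score decides between them.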
  • FIG. 6 shows a third embodiment of the method for determining a voice server of the present application. Based on the first or second embodiment, step S310 further includes:
  • determining the number of character strings in the text information;
  • Step S316: when the number is less than the set number, determine the score of each character string after the set serial number character string in the text information as the second preset score;
  • Step S317: take each character string before the set serial number character string as the target character string, and perform the step of determining the true value corresponding to the target character string.
  • The quality with which the voice servers convert voice information into text information varies, and text information of poor quality contains fewer character strings than the text information that other voice servers produce from the same voice information.
  • The terminal counts the number of character strings in each text information and determines the set number from these counts. For example, suppose there are 5 pieces of text information: three contain 50 character strings, one contains 35, and one contains 45. The gap between 35 and 50 is large, which indicates that the voice server that provided the text information with 35 character strings has poor voice recognition quality.
  • In this case the set number can be set to 36, meaning that a voice server whose text information contains fewer than 36 character strings is regarded as having poor voice recognition quality.
  • When the number is less than the set number, each character string before the set serial number character string is taken as the target character string so that its score can be determined, that is, steps S312 to S314 are executed.
  • When the number is greater than or equal to the set number, steps S311 to S314 are executed.
  • Each character string in the text information has a corresponding serial number, and the ordering is from left to right and from top to bottom.
  • In this way, the terminal consumes fewer computing resources while still accurately determining the voice recognition quality of the voice server.
  • In the technical solution provided by this embodiment, the terminal determines the number of character strings in the text information. If the number is less than the set number, it determines the score of each character string after the set serial number character string as the second preset score, and takes each character string before the set serial number character string as the target character string to determine its score. The terminal thereby consumes fewer computing resources while still accurately determining the voice recognition quality of the voice server.
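  • The shortcut of this embodiment can be sketched as below. The text does not say how the set serial number is chosen or how the character string at that serial number itself is scored, so `set_index` is a hypothetical parameter and the sketch assumes that positions below `set_index` are voted on while all positions from `set_index` onward receive the second preset score.

```python
from collections import Counter


def score_with_cutoff(text, texts, set_number, set_index, first=1, second=0):
    """Score `text` against all returned `texts` (which include `text`)."""

    def truth_at(position):
        # Majority character at this position across all returned texts.
        candidates = [t[position] for t in texts if position < len(t)]
        return Counter(candidates).most_common(1)[0][0]

    def vote(i):
        return first if text[i] == truth_at(i) else second

    if len(text) >= set_number:
        # Long enough: every character is voted on (steps S311-S314).
        return [vote(i) for i in range(len(text))]

    # Too short: vote only on the head, give the rest the second preset score.
    head = min(set_index, len(text))
    return [vote(i) for i in range(head)] + [second] * (len(text) - head)
```

  • Only the head of a short text triggers the majority vote, which is where the saving in computing resources comes from.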
  • The present application also provides a terminal. The terminal includes a voice receiving module, a processor, a memory, and a voice server determination program that is stored on the memory and executable on the processor. When the voice server determination program is executed by the processor, each step of the method for determining the voice server described in the above embodiments is implemented.
  • The present application also provides a computer-readable storage medium that stores a voice server determination program. When the voice server determination program is executed by a processor, each step of the method for determining the voice server described in the above embodiments is implemented.
  • The methods in the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present application can essentially be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions that enable a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to perform the method described in each embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Telephonic Communication Services (AREA)

Abstract

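A method for determining a voice server, comprising the following steps: after receiving voice information, sending the voice information to each voice server (S100); receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information (S200); determining the voice recognition quality score corresponding to each voice server according to each text information, so as to determine the service quality score of each voice server according to the text return duration and the voice recognition quality score corresponding to each voice server (S300); and using the server with the highest service quality score as the target voice server (S400). A terminal and a computer-readable storage medium are also provided.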

Description

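Terminal, method for determining a voice server, and computer-readable storage medium
Related Application
This application claims priority to Chinese patent application No. 201811588241.7, filed on December 24, 2018 and entitled "Terminal, method for determining a voice server, and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of voice recognition technology, and in particular to a terminal, a method for determining a voice server, and a computer-readable storage medium.
Background
With the development of artificial intelligence technology, the voice dialogue system has gradually become a popular way of human-computer interaction. Compared with traditional GUI (graphical user interface) interaction, the greatest advantage of voice interaction is its convenience when long text has to be entered.
Voice recognition is the first link of voice interaction and has a great impact on the user experience. However, current mainstream voice recognition services have the following problems. First, availability is not high: some service providers occasionally stop responding altogether, so the terminal cannot obtain the voice recognition result returned by the voice server. Second, there are regional differences: in different provinces of the country, the response speed of each service provider differs. As a result, the terminal may end up using a voice recognition service with poor server quality.
Summary
The main purpose of the present application is to provide a terminal, a method for determining a voice server, and a computer-readable storage medium, aiming to solve the problem that the terminal uses a voice recognition service with poor server quality.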
为实现上述目的,本申请提供一种语音服务器的确定方法,所述语音服务器的确定方法应用于终端,所述终端包括语音接收模块,所述语音书别服务器的确定方包括以下步骤:
在接收到语音信息后,向各个语音服务器发送所述语音信息;
接收各个所述语音服务器反馈的文本信息,并确定各个所述语音服务器反馈所述文本信息的文本返回时长;
根据各个所述文本信息确定各个所述语音服务器对应的语音识别质量分值,以根据各个所述语音服务器对应的文本返回时长以及语音识别质量分值,确定各个所述语音服务器的服务质量评分;
将服务质量评分最高的所述服务器作为目标语音服务器。
在一实施例中,所述根据各个所述语音服务器对应的文本返回时长以及语音识别质量分值,确定各个所述语音服务器的服务质量评分的步骤包括:
依次将各个所述语音服务器作为当前语音服务器;
根据所述当前语音服务器对应的文本返回时长确定目标时长;
对所述当前语音服务器对应的目标时长以及语音识别质量分值进行加权计算,以得到所述当前语音服务器对应的服务质量评分。
在一实施例中,所述根据所述当前语音服务器对应的文本返回时长确定目标时长的步骤包括:
判断所述当前服务器对应的文本返回时长是否小于预设时长;
在所述当前服务器对应的文本返回时长大于或等于预设时长时,将所述当前服务器对应的文本返回时长,作为所述当前服务器对应的目标时长;
在所述时长小于预设时长时,将所述预设时长作为所述当前服务器对应的目标时长。
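In an embodiment, the step of determining, from each piece of text information, the speech recognition quality score corresponding to each voice server comprises:
determining the score of each character string in the text information;
determining the speech recognition quality score of the voice server corresponding to the text information from the scores of the character strings and the number of character strings in the text information.
In an embodiment, the step of determining the score of each character string in the text information comprises:
taking each character string in the text information in turn as the target character string;
determining the true value corresponding to the target character string, so as to judge whether the target character string matches the true value;
when the target character string matches the true value, taking a first preset score as the score of the target character string;
when the target character string does not match the true value, taking a second preset score as the score of the target character string, the second preset score being less than the first preset score.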
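In an embodiment, the step of determining the score of each character string in the text information comprises:
determining the number of character strings in the text information;
when the number is less than a set number, setting the score of every character string after the character string with a set ordinal in the text information to the second preset score;
taking every character string before the character string with the set ordinal as a target character string, and performing the step of determining the true value corresponding to the target character string.
In an embodiment, after the step of determining the service quality score of each voice server from the text return duration and the speech recognition quality score corresponding to that voice server, the method further comprises:
determining the service priority of each voice server from its service quality score, where the higher the service quality score of a voice server, the higher its service priority;
saving the service priority corresponding to each voice server.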
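In an embodiment, the method for determining a voice server further comprises:
after voice information is received, judging whether the terminal stores the service priority of each voice server;
when the terminal does not store the service priority of each voice server, performing the step of sending the voice information to each voice server;
when the terminal stores the service priority of each voice server, sending the voice information to the voice server with the highest service priority.
To achieve the above purpose, the present application further provides a terminal. The terminal includes a voice receiving module, a processor, a memory, and a voice server determining program that is stored in the memory and executable on the processor; when executed by the processor, the program implements the steps of the method for determining a voice server described above.
To achieve the above purpose, the present application further provides a computer-readable storage medium storing a voice server determining program; when executed by a processor, the program implements the steps of the method for determining a voice server described above.
With the terminal, the method for determining a voice server, and the computer-readable storage medium provided by the present application, the terminal, after receiving voice information, sends it to each voice server, receives the text information fed back by each voice server, and determines each server's text return duration. It then determines each voice server's speech recognition quality score from the text information, and determines each server's service quality score from its speech recognition quality score and text return duration. The voice server with the highest service quality score is taken as the target voice server, so that the terminal's subsequent voice information is recognized and answered by the target voice server and the terminal obtains a speech recognition service of better quality.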
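Brief Description of the Drawings
FIG. 1 is a schematic diagram of the hardware structure of the terminal involved in the embodiments of the present application;
FIG. 2 is a schematic flowchart of a first embodiment of the method for determining a voice server of the present application;
FIG. 3 is a detailed flowchart of step S300 in FIG. 2;
FIG. 4 is a detailed flowchart of step S310 in FIG. 3;
FIG. 5 is a schematic flowchart of a second embodiment of the method for determining a voice server of the present application;
FIG. 6 is a schematic flowchart of a third embodiment of the method for determining a voice server of the present application.
The realization of the purpose, the functional features, and the advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.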
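Detailed Description
It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
The main solution of the embodiments of the present application is: after voice information is received, sending the voice information to each voice server; receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information; determining, from each piece of text information, the speech recognition quality score corresponding to each voice server, so as to determine the service quality score of each voice server from the text return duration and the speech recognition quality score corresponding to that voice server; and taking the voice server with the highest service quality score as the target voice server.
Because the terminal determines the service quality score of each voice server from its text return duration and speech recognition quality score, and selects the server with the highest service quality score as the target server, the terminal can obtain a speech recognition service of better quality.
As one implementation, the terminal may be as shown in FIG. 1.
The embodiments of the present application involve a terminal that includes a processor 101 (for example a CPU), a memory 102, a communication bus 103, and a voice receiving module 104. The communication bus 103 is configured to provide connection and communication between these components.
The memory 102 may be a high-speed RAM or a non-volatile memory, such as a disk memory. As shown in FIG. 1, the memory 102, as a computer storage medium, may contain a voice server determining program, and the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
after voice information is received, sending the voice information to each voice server;
receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
determining, from each piece of text information, the speech recognition quality score corresponding to each voice server, so as to determine the service quality score of each voice server from the text return duration and the speech recognition quality score corresponding to that voice server;
taking the voice server with the highest service quality score as the target voice server.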
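In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
taking each voice server in turn as the current voice server;
determining a target duration from the text return duration corresponding to the current voice server;
performing a weighted calculation on the target duration and the speech recognition quality score corresponding to the current voice server to obtain the service quality score corresponding to the current voice server.
In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
judging whether the text return duration corresponding to the current voice server is less than a preset duration;
when the text return duration corresponding to the current voice server is greater than or equal to the preset duration, taking that text return duration as the target duration corresponding to the current voice server;
when the text return duration is less than the preset duration, taking the preset duration as the target duration corresponding to the current voice server.
In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
determining the score of each character string in the text information;
determining the speech recognition quality score of the voice server corresponding to the text information from the scores of the character strings and the number of character strings in the text information.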
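In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
taking each character string in the text information in turn as the target character string;
determining the true value corresponding to the target character string, so as to judge whether the target character string matches the true value;
when the target character string matches the true value, taking a first preset score as the score of the target character string;
when the target character string does not match the true value, taking a second preset score as the score of the target character string, the second preset score being less than the first preset score.
In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
determining the number of character strings in the text information;
when the number is less than a set number, setting the score of every character string after the character string with a set ordinal in the text information to the second preset score;
taking every character string before the character string with the set ordinal as a target character string, and performing the step of determining the true value corresponding to the target character string.
In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
determining the service priority of each voice server from its service quality score, where the higher the service quality score of a voice server, the higher its service priority;
saving the service priority corresponding to each voice server.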
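In an embodiment, the processor 101 may be configured to call the voice server determining program stored in the memory 102 and perform the following operations:
after voice information is received, judging whether the terminal stores the service priority of each voice server;
when the terminal does not store the service priority of each voice server, performing the step of sending the voice information to each voice server;
when the terminal stores the service priority of each voice server, sending the voice information to the voice server with the highest service priority.
According to the above solution, in this embodiment the terminal, after receiving voice information, sends it to each voice server, receives the text information fed back by each voice server, and determines each server's text return duration. It then determines each voice server's speech recognition quality score from the text information, and determines each server's service quality score from its speech recognition quality score and text return duration. The voice server with the highest service quality score is taken as the target voice server, so that the terminal's subsequent voice information is recognized and answered by the target voice server and the terminal obtains a speech recognition service of better quality.
Based on the hardware architecture of the terminal described above, embodiments of the method for determining a voice server of the present application are proposed.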
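Referring to FIG. 2, FIG. 2 shows a first embodiment of the method for determining a voice server of the present application. The method comprises the following steps:
Step S100, after voice information is received, sending the voice information to each voice server;
In the present application, the executing entity is a terminal. The terminal is provided with a voice receiving module, through which it collects the voice information uttered by the user. The terminal may be a television, a mobile phone, an air conditioner, or another household appliance. The terminal is communicatively connected to multiple voice servers and can send voice information to each of them; after receiving the voice information, each voice server recognizes it and converts it into text information.
Step S200, receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
After converting the voice information into text information, the voice server feeds the text information back to the terminal. When the terminal receives text information, it records the text return duration of the voice server that returned it. Specifically, the terminal starts timing when it sends the voice information to the voice servers; when it receives the text information fed back by a voice server, it calculates the interval between the time the voice information was sent and the time the text information was received, and this interval is that voice server's text return duration. Further, the terminal has a preset interval duration: once the timer reaches the preset interval duration, the terminal stops receiving text information from the voice servers. In other words, a voice server that would only feed back its text information after the preset interval duration is deemed to provide poor voice service quality. The preset interval duration may be any suitable value, for example 10 s.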
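The timing logic of steps S100 and S200 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `collect_texts`, the `recognizers` mapping (server name to a callable that returns text), and the use of a thread pool are hypothetical and do not come from the application; the 10 s default mirrors the example value of the preset interval duration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def collect_texts(voice, recognizers, max_wait_s=10.0):
    """Send `voice` to every recognizer and time each reply.

    `recognizers` is a hypothetical mapping {server_name: callable(voice) -> text}.
    Returns {server_name: (text, return_duration_s)}. Servers that have not
    replied within `max_wait_s`, or that raise, are left out of the result.
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=max(1, len(recognizers)))
    start = time.monotonic()  # timing starts when the voice information is sent

    def timed_call(fn):
        text = fn(voice)
        # Interval between the send time and the moment the text came back.
        return text, time.monotonic() - start

    futures = {pool.submit(timed_call, fn): name for name, fn in recognizers.items()}
    done, _late = wait(futures, timeout=max_wait_s)  # stop listening at the deadline
    for fut in done:
        if fut.exception() is None:
            results[futures[fut]] = fut.result()
    # Do not block on late servers (cancel_futures needs Python 3.9 or newer).
    pool.shutdown(wait=False, cancel_futures=True)
    return results
```

A server missing from the returned mapping corresponds to one whose reply the terminal stopped waiting for; how such a server should be scored afterwards is not specified in the text, so the sketch simply omits it.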
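Step S300, determining, from each piece of text information, the speech recognition quality score corresponding to each voice server, so as to determine the service quality score of each voice server from the text return duration and the speech recognition quality score corresponding to that voice server;
In the present application, the terminal determines a voice server's service quality score from its text return duration and speech recognition quality score. The speech recognition quality score characterizes the quality of the text fed back by the voice server and can be determined from the text information. Specifically, referring to FIG. 3, determining the speech recognition quality score corresponding to each voice server from each piece of text information in step S300 comprises:
Step S310, determining the score of each character string in the text information;
In the present application, each character in the text information is represented by a corresponding character string, and each character corresponds to a unique character string; for example, the character string u4eca corresponds to the character "今". The terminal scores each character string in the text information to obtain the score corresponding to each character string. Specifically, referring to FIG. 4, step S310 comprises:
Step S311, taking each character string in the text information in turn as the target character string;
Step S312, determining the true value corresponding to the target character string, so as to judge whether the target character string matches the true value;
Step S313, when the target character string matches the true value, taking a first preset score as the score of the target character string;
Step S314, when the target character string does not match the true value, taking a second preset score as the score of the target character string, the second preset score being less than the first preset score;
The terminal takes each character string in the text information in turn as the target character string and then determines the true value corresponding to it. Specifically, the terminal receives text information from multiple voice servers and determines the ordinal of the target character string within its text information; for example, the target character string is the fifth character string in the text information (the character strings in each piece of text information are ordered from left to right and from top to bottom). The terminal then obtains the fifth character string of every piece of text information, counts identical character strings, and takes the most frequent character string as the true value corresponding to the target character string. For example, with five pieces of text information, if the five character strings form two groups of identical strings, one group containing three strings and the other two, then the string that occurs three times is the true value corresponding to the target character string;
After the true value corresponding to the target character string is determined, the terminal judges whether the target character string matches the true value, that is, whether it is identical to the true value. If it is, the score corresponding to the target character string is the first preset score; if it is not, the score is the second preset score. The second preset score is less than the first preset score, and both may be any suitable values, for example a first preset score of 1 and a second preset score of 0.
Proceeding in the same way yields the score of every character string in the text information, and thus the scores of all character strings in every piece of text information.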
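As a minimal sketch of steps S311 to S314, the per-position majority vote might look like the code below. The function name `score_strings` is hypothetical, each text is modelled as a Python sequence whose elements play the role of the "character strings", and the tie-breaking rule (the value seen first wins) is an assumption, since the text does not say what happens when two values are equally frequent.

```python
from collections import Counter


def score_strings(texts, index, match_score=1, mismatch_score=0):
    """Score every character string of texts[index] by majority vote.

    `texts` holds the text returned by each voice server. For each position,
    the most frequent string across all texts is treated as the true value.
    """
    target = texts[index]
    scores = []
    for pos, current in enumerate(target):
        # Strings at the same position in every text that is long enough.
        candidates = [t[pos] for t in texts if pos < len(t)]
        # Assumption: on a tie, Counter keeps the value encountered first.
        truth, _count = Counter(candidates).most_common(1)[0]
        scores.append(match_score if current == truth else mismatch_score)
    return scores


texts = ["今天天气", "今天天汽", "今天天气", "金天天气", "今天天气"]
print(score_strings(texts, 1))  # [1, 1, 1, 0]
```

In the usage example, the second server misrecognized only the last character, so only that position receives the second preset score.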
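Step S320, determining the speech recognition quality score of the voice server corresponding to the text information from the scores of the character strings and the number of character strings in the text information;
After the score of each character string is obtained, the number of character strings in the text information is counted, and the speech recognition quality score of the corresponding voice server is obtained from the individual scores. The speech recognition quality score can be obtained with the following formula:
$$\mathrm{Score}_{text} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{Score}_{i}$$
where $\mathrm{Score}_{i}$ is the score corresponding to the i-th character string, $\mathrm{Score}_{text}$ is the speech recognition quality score, and n is the number of character strings in the text information.
As the formula shows, dividing the sum of the scores of the character strings by the number of character strings in the text information gives the speech recognition quality score of the voice server corresponding to the text information.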
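The averaging in step S320 is small enough to state directly in code. In this sketch the function name `recognition_quality` is hypothetical, and returning 0.0 for an empty text is an assumption; the application does not discuss that case.

```python
def recognition_quality(scores):
    """Speech recognition quality score: sum of string scores over their count."""
    if not scores:
        return 0.0  # assumption: an empty text gets the lowest quality
    return sum(scores) / len(scores)


print(recognition_quality([1, 1, 1, 0]))  # 0.75
```

With the example preset scores of 1 and 0, the result is simply the fraction of character strings that agree with the majority.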
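After the speech recognition quality score corresponding to a voice server is obtained, the terminal can calculate that server's service quality score from its speech recognition quality score and text return duration. Specifically, weights are assigned to the speech recognition quality score and the text return duration, and a weighted calculation over the two yields the service quality score of the voice server, which can be calculated with the following formula:
$$\mathrm{Score}_{tts} = \frac{A}{T} + B \cdot \mathrm{Score}_{text}$$
where $\mathrm{Score}_{tts}$ is the service quality score of the voice server, $\mathrm{Score}_{text}$ is the speech recognition quality score, A is the weight for the text return duration, T is the text return duration, and B is the weight for the speech recognition quality.
Weights A and B sum to 1 and may be any suitable values, for example A = 0.88 and B = 0.12.
With the above formula, the service quality score of each voice server can be calculated; the higher the service quality score, the better the speech recognition service provided by that voice server.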
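The weighted formula can be tried out with a short sketch. The function name `service_quality` is hypothetical, and measuring T in seconds is an assumption: the text gives example durations in both seconds and milliseconds but does not fix the unit used in the formula. The default weights are the example values A = 0.88 and B = 0.12.

```python
def service_quality(return_s, text_score, a=0.88, b=0.12):
    """Service quality score: A / T + B * Score_text.

    `return_s` is the text return duration T, assumed here to be in seconds.
    """
    if return_s <= 0:
        raise ValueError("text return duration must be positive")
    return a / return_s + b * text_score


fast = service_quality(0.5, 0.75)   # quick server, some recognition errors
slow = service_quality(2.0, 1.0)    # slow server, perfect agreement
print(fast > slow)  # True
```

Because the duration enters as A/T, a short return time can outweigh a lower recognition score under these example weights; other weights or units would shift that balance.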
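After the service quality scores of the voice servers are obtained, a service priority is set for each voice server according to its score, the rule being that a voice server with a higher service quality score has a higher service priority; the service priority corresponding to each voice server is then saved.
Step S400, taking the voice server with the highest service quality score as the target voice server;
After obtaining the service quality scores of the voice servers, the terminal takes the voice server with the highest service quality score as the target voice server and sends subsequent voice information to it, so that the terminal enjoys a speech recognition service of better quality.
It should be noted that when the terminal receives voice information, it first judges whether it has stored the service priorities of the voice servers. If it has not, steps S100 to S400 are performed; if it has, the voice information is sent to the voice server with the highest service priority.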
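The priority bookkeeping described above (rank once, save the ranking, reuse it for later voice information) could be organized along these lines. `rank_servers`, `ServerSelector`, and the `probe_all` callback are hypothetical names; `probe_all` stands in for steps S100 to S300 and is assumed to return a mapping from server name to service quality score. Keeping the ranking in memory is also an assumption, since the text only says the priorities are saved.

```python
def rank_servers(quality_scores):
    """Order server names from highest to lowest service quality score."""
    return sorted(quality_scores, key=quality_scores.get, reverse=True)


class ServerSelector:
    """Probes all servers once, then reuses the saved priority order."""

    def __init__(self, probe_all):
        self._probe_all = probe_all  # hypothetical: voice -> {name: score}
        self._priorities = None      # no priorities stored yet

    def choose(self, voice):
        if self._priorities is None:
            # No stored priorities: run the full comparison (S100 to S400).
            scores = self._probe_all(voice)
            if not scores:
                raise RuntimeError("no voice server returned a result")
            self._priorities = rank_servers(scores)
        # Stored priorities exist: use the highest-priority server.
        return self._priorities[0]
```

The text does not say when, if ever, the saved priorities are refreshed, so this sketch never re-probes after the first call.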
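In the technical solution provided by this embodiment, the terminal, after receiving voice information, sends it to each voice server, receives the text information fed back by each voice server, and determines each server's text return duration. It then determines each voice server's speech recognition quality score from the text information, and determines each server's service quality score from its speech recognition quality score and text return duration. The voice server with the highest service quality score is taken as the target voice server, so that the terminal's subsequent voice information is recognized and answered by the target voice server and the terminal obtains a speech recognition service of better quality.
Referring to FIG. 5, FIG. 5 shows a second embodiment of the method for determining a voice server of the present application. Based on the first embodiment, determining the service quality score of each voice server from the text return duration and the speech recognition quality score corresponding to that voice server in step S300 comprises:
Step S330, taking each voice server in turn as the current voice server;
Step S340, determining a target duration from the text return duration corresponding to the current voice server;
Step S350, performing a weighted calculation on the target duration and the speech recognition quality score corresponding to the current voice server to obtain the service quality score corresponding to the current voice server;
In an embodiment, the terminal calculates the service quality score of a voice server directly as a weighted combination of the text return duration and the speech recognition quality score. However, once the text return duration of a voice server is below a certain duration, its text return speed can be regarded as fast; that is, all voice servers whose text return duration is less than the preset duration are treated as equally excellent in text return speed. Accordingly, the terminal determines a target duration from the voice server's text return duration: if the text return duration is less than the preset duration, the preset duration is taken as the target duration corresponding to the voice server; if the text return duration is greater than or equal to the preset duration, the text return duration is taken as the target duration. Then, taking each voice server in turn as the current voice server, the terminal performs a weighted calculation on the target duration and the speech recognition quality score corresponding to the current voice server to obtain its service quality score, and so on until the service quality score of every voice server is obtained.
The preset duration may be any suitable value, for example 180 ms.
In the technical solution provided by this embodiment, the terminal determines the target duration for a voice server's returned text by comparing the text return duration with the preset duration, so that the terminal can calculate the service quality scores of the voice servers reasonably; the terminal is thus highly intelligent.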
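The target duration of the second embodiment amounts to clamping the return duration from below. The following sketch assumes durations in seconds (so the 180 ms example becomes 0.18) and reuses the example weights; `target_duration` and `service_quality_clamped` are hypothetical names.

```python
def target_duration(return_s, preset_s=0.18):
    """Return the preset duration for fast servers, else the measured duration."""
    return max(return_s, preset_s)


def service_quality_clamped(return_s, text_score, a=0.88, b=0.12, preset_s=0.18):
    """Weighted score that uses the target duration in place of the raw duration."""
    return a / target_duration(return_s, preset_s) + b * text_score


# Two servers that both reply faster than the preset get the same time term,
# so only their recognition quality separates them.
print(service_quality_clamped(0.05, 0.9) < service_quality_clamped(0.12, 1.0))  # True
```

Without the clamp, the 50 ms server would win on speed alone despite its lower recognition score; with it, the comparison between two "fast enough" servers is decided by recognition quality.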
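Referring to FIG. 6, FIG. 6 shows a third embodiment of the method for determining a voice server of the present application. Based on the first or second embodiment, step S310 further comprises:
Step S315, determining the number of character strings in the text information;
Step S316, when the number is less than a set number, setting the score of every character string after the character string with a set ordinal in the text information to the second preset score;
Step S317, taking every character string before the character string with the set ordinal as a target character string, and performing the step of determining the true value corresponding to the target character string.
Voice servers differ in how well they convert voice information into text information; for the same voice information, poorer text information contains fewer character strings than the text information produced by other voice servers. The terminal therefore counts the character strings in each piece of text information after receiving them and determines the set number from these counts. For example, suppose there are five pieces of text information: three contain 50 character strings, one contains 35, and one contains 45. Since 35 differs considerably from 50, the voice server that provided the 35-string text information has poor speech recognition quality; the set number can then be set to 36, indicating that a voice server providing text information with fewer than 36 character strings has poor speech recognition quality.
Different methods of determining character string scores are used for voice servers with poorer and better speech recognition quality. Specifically, when the number of character strings in one piece of text information differs considerably from that in the others, the corresponding voice server can be deemed to have poor speech recognition quality. In that case, the score of every character string after the character string with the set ordinal is set to the second preset score, while every character string before the character string with the set ordinal is taken as a target character string and scored, that is, steps S312 to S314 are performed. When the number of character strings in the text information is greater than or equal to the set number, steps S311 to S314 are performed. Note that each character string in the text information has a corresponding ordinal, assigned from left to right and from top to bottom.
In this way, the terminal determines the speech recognition quality of the voice servers accurately while reducing its consumption of computing resources.
In the technical solution provided by this embodiment, the terminal determines the number of character strings in the text information; if the number is less than the set number, it sets the score of every character string after the character string with the set ordinal to the second preset score and takes every character string before the character string with the set ordinal as a target character string to be scored, so that the terminal determines the speech recognition quality of the voice servers accurately while reducing its consumption of computing resources.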
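One possible reading of steps S315 to S317 is sketched below. Several points are assumptions rather than statements from the application: the name `score_with_cutoff`, passing the set number and the set ordinal in as parameters (the text does not say how the set ordinal is chosen), treating `cutoff` as a 0-based index where positions before it are voted on and positions from it onward get the second preset score, and the first-seen tie-break in the vote.

```python
from collections import Counter


def score_with_cutoff(texts, index, set_quantity, cutoff,
                      match_score=1, mismatch_score=0):
    """Score texts[index]; short texts skip the vote from `cutoff` onward."""
    target = texts[index]
    is_short = len(target) < set_quantity  # S315/S316: too few strings
    scores = []
    for pos, current in enumerate(target):
        if is_short and pos >= cutoff:
            # S316: no vote is run here, the string gets the second preset score.
            scores.append(mismatch_score)
            continue
        # S317 (or the normal path): majority vote at this position.
        candidates = [t[pos] for t in texts if pos < len(t)]
        truth = Counter(candidates).most_common(1)[0][0]
        scores.append(match_score if current == truth else mismatch_score)
    return scores


texts = ["abcdef", "abcdef", "abxd"]
print(score_with_cutoff(texts, 2, set_quantity=5, cutoff=2))  # [1, 1, 0, 0]
```

The saving in computation comes from skipping the per-position vote for the tail of a text that is already judged to be poor.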
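The present application further provides a terminal. The terminal includes a voice receiving module, a processor, a memory, and a voice server determining program that is stored in the memory and executable on the processor; when executed by the processor, the program implements the steps of the method for determining a voice server described in the above embodiments.
The present application further provides a computer-readable storage medium storing a voice server determining program; when executed by a processor, the program implements the steps of the method for determining a voice server described in the above embodiments.
The numbering of the above embodiments of the present application is for description only and does not indicate the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the exemplary technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to perform the methods described in the embodiments of the present application.
The above are only optional embodiments of the present application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.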

Claims (15)

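  1. A method for determining a voice server, wherein the method is applied to a terminal, the terminal includes a voice receiving module, and the method comprises the following steps:
    after voice information is received, sending the voice information to each voice server;
    receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
    determining, according to each piece of the text information, the speech recognition quality score corresponding to each voice server, so as to determine the service quality score of each voice server according to the text return duration and the speech recognition quality score corresponding to each voice server; and
    taking the voice server with the highest service quality score as the target voice server.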
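  2. The method for determining a voice server according to claim 1, wherein the step of receiving the text information fed back by each voice server comprises:
    receiving the text information fed back by each voice server within a preset interval duration.
  3. The method for determining a voice server according to claim 1, wherein the step of determining the service quality score of each voice server according to the text return duration and the speech recognition quality score corresponding to each voice server comprises:
    taking each voice server in turn as the current voice server;
    determining a target duration according to the text return duration corresponding to the current voice server;
    performing a weighted calculation on the target duration and the speech recognition quality score corresponding to the current voice server to obtain the service quality score corresponding to the current voice server.
  4. The method for determining a voice server according to claim 3, wherein the step of determining a target duration according to the text return duration corresponding to the current voice server comprises:
    judging whether the text return duration corresponding to the current voice server is less than a preset duration;
    when the text return duration corresponding to the current voice server is greater than or equal to the preset duration, taking the text return duration corresponding to the current voice server as the target duration corresponding to the current voice server;
    when the text return duration is less than the preset duration, taking the preset duration as the target duration corresponding to the current voice server.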
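  5. The method for determining a voice server according to claim 1, wherein the step of determining, according to each piece of the text information, the speech recognition quality score corresponding to each voice server comprises:
    determining the score of each character string in the text information;
    determining the speech recognition quality score of the voice server corresponding to the text information according to the score corresponding to each character string and the number of character strings in the text information.
  6. The method for determining a voice server according to claim 5, wherein the step of determining the score of each character string in the text information comprises:
    taking each character string in the text information in turn as the target character string;
    determining the true value corresponding to the target character string, so as to judge whether the target character string matches the true value;
    when the target character string matches the true value, taking a first preset score as the score of the target character string;
    when the target character string does not match the true value, taking a second preset score as the score of the target character string, wherein the second preset score is less than the first preset score.
  7. The method for determining a voice server according to claim 6, wherein the step of determining the score of each character string in the text information comprises:
    determining the number of character strings in the text information;
    when the number is less than a set number, determining the score of each character string after the character string with a set ordinal in the text information to be the second preset score;
    taking each character string before the character string with the set ordinal as a target character string, and performing the step of determining the true value corresponding to the target character string.
  8. The method for determining a voice server according to claim 7, wherein, after the step of determining the number of character strings in the text information, the method further comprises:
    when the number is greater than or equal to the set number, performing the step of taking each character string in the text information in turn as the target character string.
  9. The method for determining a voice server according to claim 7, wherein the character strings in each piece of the text information are ordered in the same way.
  10. The method for determining a voice server according to claim 7, wherein the set number is determined according to the number of character strings in each piece of the text information.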
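  11. The method for determining a voice server according to claim 6, wherein the step of determining the true value corresponding to the target character string comprises:
    determining the position of the target character string in the text information;
    determining a candidate character string in each piece of the text information according to the position, wherein each candidate character string occupies the same position in its corresponding text information, and the candidate character strings include the target character string;
    determining the number of identical candidate character strings, and determining the true value corresponding to the most numerous candidate character string as the true value corresponding to the target character string.
  12. The method for determining a voice server according to claim 1, wherein, after the step of determining the service quality score of each voice server according to the text return duration and the speech recognition quality score corresponding to each voice server, the method further comprises:
    determining the service priority of each voice server according to the service quality score of each voice server, wherein the higher the service quality score of a voice server, the higher the service priority of that voice server;
    saving the service priority corresponding to each voice server.
  13. The method for determining a voice server according to claim 1, wherein the method further comprises:
    after voice information is received, judging whether the terminal stores the service priority of each voice server;
    when the terminal does not store the service priority of each voice server, performing the step of sending the voice information to each voice server;
    when the terminal stores the service priority of each voice server, sending the voice information to the voice server with the highest service priority.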
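  14. A terminal, wherein the terminal includes a voice receiving module, a processor, a memory, and a voice server determining program stored in the memory and executable on the processor, and the voice server determining program, when executed by the processor, implements the following steps:
    after voice information is received, sending the voice information to each voice server;
    receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
    determining, according to each piece of the text information, the speech recognition quality score corresponding to each voice server, so as to determine the service quality score of each voice server according to the text return duration and the speech recognition quality score corresponding to each voice server; and
    taking the voice server with the highest service quality score as the target voice server.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores a voice server determining program, and the voice server determining program, when executed by a processor, implements the following steps:
    after voice information is received, sending the voice information to each voice server;
    receiving the text information fed back by each voice server, and determining the text return duration with which each voice server fed back the text information;
    determining, according to each piece of the text information, the speech recognition quality score corresponding to each voice server, so as to determine the service quality score of each voice server according to the text return duration and the speech recognition quality score corresponding to each voice server; and
    taking the voice server with the highest service quality score as the target voice server.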
PCT/CN2019/126018 2018-12-24 2019-12-17 Terminal, method for determining a voice server, and computer-readable storage medium WO2020135160A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811588241.7 2018-12-24
CN201811588241.7A CN109493862B (zh) 2018-12-24 2018-12-24 Terminal, method for determining a voice server, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2020135160A1 true WO2020135160A1 (zh) 2020-07-02

Family

ID=65711869

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/126018 WO2020135160A1 (zh) 2018-12-24 2019-12-17 Terminal, method for determining a voice server, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN109493862B (zh)
WO (1) WO2020135160A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437333A (zh) * 2020-11-10 2021-03-02 深圳Tcl新技术有限公司 节目播放方法、装置、终端设备以及存储介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493862B (zh) * 2018-12-24 2021-11-09 深圳Tcl新技术有限公司 Terminal, method for determining a voice server, and computer-readable storage medium
CN113327571B (zh) * 2021-06-18 2023-08-04 京东科技控股股份有限公司 语音合成代理方法、装置、电子设备和可读存储介质

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006208644A (ja) * 2005-01-27 2006-08-10 Toppan Printing Co Ltd 語学会話力測定サーバシステム及び語学会話力測定方法
EP1705562A1 (en) * 2005-03-18 2006-09-27 Orange SA Applications server and method of providing services
CN103247291A (zh) * 2013-05-07 2013-08-14 华为终端有限公司 一种语音识别设备的更新方法、装置及系统
CN103440867A (zh) * 2013-08-02 2013-12-11 安徽科大讯飞信息科技股份有限公司 语音识别方法及系统
CN103956168A (zh) * 2014-03-29 2014-07-30 深圳创维数字技术股份有限公司 一种语音识别方法、装置及终端
US9247059B1 (en) * 2014-11-03 2016-01-26 Verizon Patent And Licensing Inc. Priority token-based interactive voice response server
CN107564525A (zh) * 2017-10-23 2018-01-09 深圳北鱼信息科技有限公司 语音识别方法及装置
CN109493862A (zh) * 2018-12-24 2019-03-19 深圳Tcl新技术有限公司 Terminal, method for determining a voice server, and computer-readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103117058B (zh) * 2012-12-20 2015-12-09 四川长虹电器股份有限公司 基于智能电视平台的多语音引擎切换系统及方法
CN103077718B (zh) * 2013-01-09 2015-11-25 华为终端有限公司 语音处理方法、系统和终端
CN103677729B (zh) * 2013-12-18 2017-02-08 北京搜狗科技发展有限公司 一种语音输入方法和系统
JP6440513B2 (ja) * 2014-05-13 2018-12-19 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America 音声認識機能を用いた情報提供方法および機器の制御方法
CN107545887A (zh) * 2016-06-24 2018-01-05 中兴通讯股份有限公司 语音指令处理方法及装置
CN107170450B (zh) * 2017-06-14 2021-03-12 上海智蕙林医疗科技有限公司 语音识别方法及装置
CN107979856B (zh) * 2017-11-22 2020-10-27 深圳市沃特沃德股份有限公司 连接引擎的方法与装置

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437333A (zh) * 2020-11-10 2021-03-02 深圳Tcl新技术有限公司 节目播放方法、装置、终端设备以及存储介质
CN112437333B (zh) * 2020-11-10 2024-02-06 深圳Tcl新技术有限公司 节目播放方法、装置、终端设备以及存储介质

Also Published As

Publication number Publication date
CN109493862A (zh) 2019-03-19
CN109493862B (zh) 2021-11-09

Similar Documents

Publication Publication Date Title
WO2020135160A1 (zh) Terminal, method for determining a voice server, and computer-readable storage medium
CN109961780B (zh) 一种人机交互方法、装置、服务器和存储介质
TWI690919B (zh) 語音關鍵字識別方法、裝置、終端、伺服器、電腦可讀儲存介質及電腦程式產品
WO2018036555A1 (zh) 会话处理方法及装置
JP6309539B2 (ja) 音声入力を実現する方法および装置
WO2017166650A1 (zh) 语音识别方法及装置
US10270736B2 (en) Account adding method, terminal, server, and computer storage medium
CN107688398B (zh) 确定候选输入的方法和装置及输入提示方法和装置
WO2015090137A1 (en) A voice message search method, device, and system
WO2020087655A1 (zh) 一种翻译方法、装置、设备及可读存储介质
WO2022134421A1 (zh) 基于多知识图谱的智能答复方法、装置、计算机设备及存储介质
CN105045919B (zh) 一种信息输出方法及装置
CN104462051B (zh) 分词方法及装置
US10621516B2 (en) Content delivery method, apparatus, and storage medium
US10453477B2 (en) Method and computer system for performing audio search on a social networking platform
WO2020257991A1 (zh) 用户识别方法及相关产品
WO2015196987A1 (zh) 支持自然语言的数据查询方法、开放平台及用户终端
US10897368B2 (en) Integrating an interactive virtual assistant into a meeting environment
WO2019196238A1 (zh) 一种语音识别方法、终端设备及计算机可读存储介质
CN112002311A (zh) 文本纠错方法、装置、计算机可读存储介质及终端设备
CN108536680B (zh) 一种房产信息的获取方法和装置
WO2023273776A1 (zh) 语音数据的处理方法及装置、存储介质、电子装置
WO2018176705A1 (zh) 语音业务应答的方法及装置
CN110838284B (zh) 一种语音识别结果的处理方法、装置和计算机设备
WO2019041284A1 (zh) 资源搜索方法及相关产品

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19903403

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19903403

Country of ref document: EP

Kind code of ref document: A1