CN107025393B - Resource calling method and device - Google Patents

Resource calling method and device

Info

Publication number
CN107025393B
CN107025393B
Authority
CN
China
Prior art keywords
character
chinese
verification
recognition result
audio
Prior art date
Legal status: Active
Application number
CN201611123587.0A
Other languages
Chinese (zh)
Other versions
CN107025393A (en)
Inventor
项臻
方莉娜
周苏强
刘阳
陆殷
Current Assignee
Advanced Nova Technology Singapore Holdings Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201611123587.0A
Publication of CN107025393A
Application granted
Publication of CN107025393B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2133 Verifying human interaction, e.g., Captcha

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application discloses a resource calling method and device. First, a verification character is displayed according to a received resource calling request; then audio to be recognized, input by a user according to the verification character, is collected; speech recognition is performed on that audio; and finally, whether the resource is allowed to be called is determined according to the speech recognition result and a reference character preset for the verification character. When a resource is called, the user therefore no longer needs to enter a character password; the user only needs to speak according to the displayed verification character, and the recognition result is compared with the reference character to decide whether the call is allowed. This avoids the tedious manual entry of a character password and improves the efficiency of resource calling.

Description

Resource calling method and device
Technical Field
The present application relates to the field of information technologies, and in particular, to a method and an apparatus for resource invocation.
Background
With the development of information technology, more and more services are executed over networks. A common service is calling resources over a network: because many resources can be called remotely, a terminal does not need to store them locally and can instead call them over the network when needed.
Generally, according to the sharing mode, resources shared on a network can be divided into unconditionally shared resources and conditionally shared resources: any user can call the former, while only users who meet certain conditions can call the latter. A conditionally shared resource is usually protected by setting an extraction code, a password, or the like (hereinafter collectively referred to as a character password), and only qualified users are granted the right to call the shared resource. For example, user A shares a photograph but wants it to be visible only to users A allows (e.g., user A's friends); user A can therefore set a character password so that only users who know the password can call the photograph, and a user who enters the password is regarded as a qualified user with permission to call it.
However, in the prior art, the character password entered when calling a conditionally shared resource generally consists of digits, letters, special symbols, and the like. On the one hand, manual entry makes the operation tedious for the user; on the other hand, it increases the probability of input errors, resulting in low resource-calling efficiency.
Disclosure of Invention
The embodiment of the application provides a resource calling method, which is used for solving the prior-art problem that calling a resource through a character password requires a cumbersome input operation and therefore results in low resource-calling efficiency.
The embodiment of the application also provides a resource calling device, which is used for solving the same prior-art problem of low resource-calling efficiency caused by the cumbersome input operation when a resource is called through a character password.
The embodiment of the application adopts the following technical scheme:
a method of resource invocation, comprising:
receiving a resource calling request;
displaying a verification character according to the resource calling request;
collecting audio to be recognized input by a user according to the verification characters;
performing voice recognition on the audio to be recognized;
and determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a reference character preset for the verification character.
A method of resource invocation, comprising:
receiving a resource calling request;
playing a standard pronunciation according to the resource calling request;
collecting the audio to be recognized input by the user according to the played standard pronunciation;
performing voice recognition on the audio to be recognized;
and determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a preset reference character aiming at the standard pronunciation.
A method of resource invocation, comprising:
receiving a red packet acquisition request;
displaying verification characters according to the red packet acquisition request;
collecting audio to be recognized input by a user according to the verification characters;
performing voice recognition on the audio to be recognized;
and determining whether the red packet is allowed to be acquired or not according to the recognition result of the audio to be recognized and a reference character preset for the verification character.
An apparatus of resource invocation, comprising:
the receiving module receives a resource calling request;
the display module displays verification characters according to the resource calling request;
the acquisition module is used for acquiring the audio to be recognized input by the user according to the verification character;
the recognition module is used for carrying out voice recognition on the audio to be recognized;
and the comparison calling module is used for determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character.
An apparatus of resource invocation, comprising:
the receiving module receives a resource calling request;
the playing module plays the standard pronunciation according to the resource calling request;
the acquisition module is used for acquiring the audio to be recognized input by the user according to the played standard pronunciation;
the recognition module is used for carrying out voice recognition on the audio to be recognized;
and the comparison calling module is used for determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the preset reference character aiming at the standard pronunciation.
An apparatus of resource invocation, comprising:
the receiving module receives a red packet acquisition request;
the display module displays verification characters according to the red packet acquisition request;
the acquisition module is used for acquiring the audio to be recognized input by the user according to the verification character;
the recognition module is used for carrying out voice recognition on the audio to be recognized;
and the comparison calling module is used for determining whether the red packet is allowed to be acquired or not according to the recognition result of the audio to be recognized and the reference character preset for the verification character.
The embodiment of the application adopts at least one technical scheme which can achieve the following beneficial effects:
First, a verification character is displayed according to a received resource calling request; then the audio to be recognized, input by the user according to the verification character, is collected; speech recognition is performed on that audio; and finally, whether the resource is allowed to be called is determined according to the speech recognition result and a reference character preset for the verification character. Thus, in this application, when a resource is called the user no longer needs to enter a character password but only needs to input the corresponding audio according to the displayed verification character, and whether the resource may be called is determined by comparing the recognition result with the reference character. This avoids the tedious manual entry of a character password, avoids entry errors, and improves the efficiency of resource calling.
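To make the flow concrete, the following TypeScript sketch strings the steps together on the terminal side. All helper names, the page fields, and the 0.8 threshold are assumptions introduced only for illustration; the sketches in the detailed description below flesh out the individual steps.

```typescript
// Illustrative end-to-end flow on the terminal side; every declared helper,
// the page fields and the threshold are assumed, not prescribed by the patent.
interface VerificationPage {
  verificationCharacter: string; // e.g. "Bonjour"
  referenceCharacters: string[]; // e.g. ["ben", "zhu"]
  callAddress: string;           // preset address to send the calling instruction to
}

declare function fetchVerificationPage(requestId: string): Promise<VerificationPage>;
declare function showVerificationCharacter(ch: string): void;
declare function recordAudio(): Promise<Blob>;
declare function recognizeSpeech(audio: Blob): Promise<string>; // returns Chinese characters
declare function compareWithReference(recognized: string, reference: string[]): number;
declare function invokeResource(callAddress: string): Promise<void>;

async function handleResourceCall(requestId: string): Promise<void> {
  const page = await fetchVerificationPage(requestId);    // verification page from the service server
  showVerificationCharacter(page.verificationCharacter);  // display the verification character
  const audio = await recordAudio();                       // collect the audio to be recognized
  const recognized = await recognizeSpeech(audio);         // speech recognition (Chinese output)
  const accuracy = compareWithReference(recognized, page.referenceCharacters);
  if (accuracy > 0.8) {                                    // assumed threshold
    await invokeResource(page.callAddress);                // allow and perform the resource call
  } else {
    console.warn('Verification failed; resource calling not allowed.');
  }
}
```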
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a process of resource invocation provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of an interface of a verification page provided in an embodiment of the present application;
Fig. 3 is a schematic diagram of an interface of another verification page provided in an embodiment of the present application;
Fig. 4 is a detailed process of resource invocation provided by an embodiment of the present application;
Fig. 5 is a process of another resource invocation provided by an embodiment of the present application;
Fig. 6 is a process of another resource invocation provided by an embodiment of the present application;
Fig. 7 is a schematic diagram of an interface of another verification page provided in an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a device for resource invocation according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of another device for resource invocation according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of another device for resource invocation according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a process of resource invocation provided in an embodiment of the present application, which specifically includes the following steps:
S101: Receive a resource calling request.
S102: Display verification characters according to the resource calling request.
In the prior art, resource calling between servers can generally be performed automatically through a preset calling protocol. When a terminal needs to call a resource from a server and the resource has a corresponding calling condition, the terminal must first obtain a verification page, and it can call the resource only after passing the verification on that page. The server providing the verification page and the server providing the resource may be the same server or different servers.
Similarly, in the embodiment of the present application, the terminal may monitor the user's operation and receive the resource calling request generated by that operation, and then send the request to the server for subsequent processing. Of course, the executing entity of step S101 may also be the server, i.e., the server may receive the resource calling request sent by the terminal; this application does not limit whether step S101 is executed by the terminal or the server. For convenience of description, the following embodiments are described as the terminal executing the resource calling process.
Specifically, since the resource call may be regarded as a service, after the terminal receives the resource calling request through the user's operation, it may forward the request to a service server; that is, the server mentioned above may be a service server.
The terminal may then receive a verification page returned by the service server. The verification page may be preset on the service server, which returns it after receiving the resource calling request, and its content may be set by staff as required. Of course, this application does not require the terminal to send the resource calling request to the service server; which server receives the request may be set by staff according to the needs of the actual application.
In addition, since the verification page prompts the user to perform a verification operation, it may display a verification character so that the user can perform the verification operation based on that character; the operation may include a click gesture, a long-press gesture, and the like. The verification character is a character in at least one language, i.e., it may include at least one of Chinese, English, Japanese, German, Korean, French, Vietnamese, Thai, Spanish, Latin, Russian, Bengali, Portuguese, Italian, Hindi, Arabic characters, and the like. For example, the verification page may display "Hello" when the verification character is English, "Guten Tag" when it is German, "Bonjour" when it is French, and so on.
It should be noted that the verification page may be a web page opened by any application, such as browser software or instant messaging software; since opening a web page in an application is well established in the prior art, it is not described further here. The framework of the verification page may use HyperText Markup Language version 5 (HTML5), and since the page involves user operations, it may also carry code in a scripting language such as JavaScript (JS), VBScript, or Practical Extraction and Report Language (Perl), which is not specifically limited in this application.
In the embodiment of the present application, the service server may be a single device, or may be a system composed of multiple devices, that is, a distributed server, and the terminal may be a device such as a mobile phone, a tablet computer, or a personal computer.
S103: and acquiring the audio to be identified input by the user according to the verification character.
In the embodiment of the application, after the terminal receives the verification page, the user can perform the corresponding verification operation according to the requirements of the page; the terminal can therefore monitor the user's operation and send the audio to be recognized collected through that operation to the speech recognition server, so as to facilitate the subsequent steps.
Specifically, the verification page may carry JavaScript (JS) code, which the terminal executes when it receives and runs the page. The verification page may also carry prompt information so that the user of the terminal knows how to perform the verification after the page is displayed; for example, the prompt information may be "Please say 'hello' in Japanese with me" or "Press the button below and say 'hello' in Japanese". The JS code of the verification page may further include code for calling the terminal's sensor interface and code for a record button, so that the audio to be recognized input by the user according to the verification character can be collected subsequently.
As shown in Fig. 2, the verification character on the verification page is the French word "Bonjour", and the page shows the prompts "Please say 'hello' in French with me" and "Press the button below and say 'hello' in French", together with a record button.
In addition, since the verification character displayed in the verification page may be in a language other than Chinese, the audio to be recognized collected by the terminal may be speech in any of several languages.
Further, since the user may not know how to pronounce a non-Chinese verification character, the user may be unable to input the corresponding audio based on the displayed character alone. Therefore, in this application the verification page may also carry the standard pronunciation corresponding to the verification character and the JS code of a corresponding play button, as shown in Fig. 3. In the interface of Fig. 3, the verification character is again the French word "Bonjour", and the page shows the same prompts together with a record button and a play button. The terminal plays the standard pronunciation when it detects the user's gesture operation on the play button, and then collects the audio to be recognized input by the user following the played standard pronunciation.
Furthermore, since the verification page can carry the JS code of the record button, the JS code of the play button, and the JS code for calling the terminal's sensor interface, the terminal can monitor the user's operation on these buttons; when such an operation is detected, the JS code of the page calls the sensor interface to collect the audio to be recognized input by the user following the played standard pronunciation. The user's operation may be a click gesture, a long-press gesture, and the like; this application does not limit it, as long as the terminal decides to start collecting the audio by monitoring the operation. The sensor may be the terminal's microphone.
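As a concrete illustration of the collection step, the following sketch shows how the verification page's script could record the user's utterance through the browser's standard microphone APIs (getUserMedia and MediaRecorder), which play the role of the sensor interface described above. The button id, the fixed recording window, and the function name are assumptions for illustration only.

```typescript
// Minimal sketch: record a short clip when the (assumed) record button is clicked.
async function recordUtterance(maxMs = 5000): Promise<Blob> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  const done = new Promise<Blob>((resolve) => {
    recorder.onstop = () => {
      stream.getTracks().forEach((t) => t.stop()); // release the microphone
      resolve(new Blob(chunks, { type: recorder.mimeType }));
    };
  });

  recorder.start();
  setTimeout(() => recorder.stop(), maxMs); // stop after a fixed recording window
  return done;
}

// Wire the record button: a click collects the audio to be recognized.
document.getElementById('record-btn')?.addEventListener('click', async () => {
  const audio = await recordUtterance();
  // ...send `audio` to the speech recognition server (see the next sketch)
});
```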
S104: and carrying out voice recognition on the audio to be recognized.
In the embodiment of the application, after the terminal collects the audio to be recognized input by the user, the terminal can also perform voice recognition on the audio to be recognized, so as to perform subsequent operations according to the result of the voice recognition.
Specifically, speech recognition consumes considerable resources, and recognition on the terminal is slower than recognition on a server; therefore, when speech recognition is required, the terminal can send the audio to be recognized to a speech recognition server and receive the recognition result it returns. Likewise, in this application the terminal may send the audio to be recognized to the speech recognition server, so that the server performs speech recognition on the audio, and then receive the recognition result returned by the server.
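A minimal sketch of handing the collected audio to the speech recognition server is given below. The endpoint URL, the form field name, and the JSON response shape are purely hypothetical; the patent only states that the terminal sends the audio to the speech recognition server and receives the recognition result.

```typescript
// Hypothetical client call to the speech recognition server; the URL, the form
// field name and the response shape are assumptions, not a real API.
interface RecognitionResult {
  text: string; // Chinese characters recognized from the audio
}

async function recognizeSpeech(audio: Blob): Promise<RecognitionResult> {
  const form = new FormData();
  form.append('audio', audio, 'utterance.webm'); // collected audio to be recognized
  const resp = await fetch('https://asr.example.com/recognize', {
    method: 'POST',
    body: form,
  });
  if (!resp.ok) throw new Error(`recognition failed: ${resp.status}`);
  return resp.json() as Promise<RecognitionResult>;
}
```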
In addition, a conventional speech recognition server can usually recognize only one specific language. For example, a server that performs speech recognition on English outputs only English recognition results, regardless of which language the received audio actually corresponds to; similarly, a server that performs speech recognition on Chinese outputs only Chinese recognition results.
Since the verification character displayed in the verification page in step S102 may be a character of any of several languages, the audio to be recognized that the terminal sends to the speech recognition server is also likely to correspond to one of several languages, while the speech recognition server can recognize the audio only as characters of a single language.
Thus, in this application, when the speech recognition server is a server that recognizes Chinese, the recognition result it returns to the terminal is the recognition of the audio according to Chinese pronunciation, i.e., Chinese characters. For example, assuming the audio to be recognized is the user imitating the French "Bonjour" and its pronunciation sounds like "bu-ru-he" to a Chinese recognizer, the speech recognition server does not recognize the French word "Bonjour" but returns the Chinese characters whose pinyin is "bu", "ru", and "he", and the terminal receives those Chinese characters as the speech recognition result.
That is, in this application the speech recognition server does not return characters of the various other languages; it only returns the Chinese characters corresponding to the pronunciation of audio in those languages.
It should be noted that the speech recognition server may be a single device, such as a server dedicated to speech recognition, or a system composed of multiple servers, i.e., a distributed server. This application also does not require the speech recognition server to perform only speech recognition: it may be the same server as the service server in step S101 or the same server as the one providing the resource, or the speech recognition server, the service server, and the server providing the resource may all be different servers. This is not specifically limited in this application.
S105: Determine whether to allow the resource to be called according to the recognition result of the audio to be recognized and a reference character preset for the verification character.
In this embodiment of the application, after the terminal receives the recognition result returned by the speech recognition server, it may determine whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character. Because Chinese has many homophones, similar pronunciations may yield different Chinese characters in the recognition result; for example, for the pronunciation "tong", the speech recognition server may return either 同 ("same") or 铜 ("copper"), which are two different characters. To make the recognition result easier to compare with the reference character, it can therefore be converted into English characters for comparison.
Specifically, since the pronunciation of a Chinese word is formed by combining the pronunciations of its individual characters, and each individual character has its own Chinese pinyin, the terminal can determine, character by character, the English characters (pinyin) corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between Chinese characters and English characters, and compare them with the English characters corresponding to each reference character carried in the verification page to determine the accuracy of the recognition result.
First, to decide whether the recognition result is correct and thus whether to allow the resource to be called, the terminal needs to compare the recognition result with the correct "answer" corresponding to the verification character. Therefore, in this application the JS code of the verification page returned by the service server may carry reference characters corresponding to the pronunciation of the verification character.
The reference characters corresponding to the pronunciation of the verification character may be English characters. Because they will later be compared with the English characters derived from the Chinese recognition result, each reference character is the Chinese pinyin of a Chinese character whose pronunciation approximates part of the verification character's pronunciation. For example, when the verification page displays the English word "Hello", the reference characters may be "ha" and "lou"; when the page displays the German "Guten Tag", the reference characters may be "gu", "teng", "ta", and "ge"; when the page displays the French "Bonjour", the reference characters may be "ben" and "zhu"; and so on.
Next, the terminal can determine the Chinese pinyin corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between Chinese characters and Chinese pinyin, and use that pinyin as the English characters corresponding to each Chinese character in the recognition result. That is, by running the JS code of the verification page, the terminal calls the pre-stored correspondence between Chinese characters and pinyin, determines the pinyin corresponding to the recognition result, and uses it as the English characters corresponding to the recognition result.
Then, the terminal can compare the English characters corresponding to the recognition result with the English characters corresponding to the reference characters, determine the accuracy of the recognition result, and determine whether to allow the resource to be called according to that accuracy.
Further, since Chinese pinyin also carries tones, in this application a digit may be appended to the pinyin of each character in the recognition result, and to the English characters corresponding to the reference characters, to represent the tone of each character; for example, the first to fourth tones of the syllable "a" may be written as "a1", "a2", "a3", and "a4" respectively.
In addition, since the verification page carries the reference characters corresponding to the pronunciation of the verification character, and the reference characters are English characters, the terminal may compare the English characters of each Chinese character in the recognition result with each English reference character respectively. For example, if the English characters corresponding to the recognition result are "bu", "ru", and "he", the verification character is "Bonjour", and the reference characters are "ben" and "zhu", then the terminal compares "bu", "ru", and "he" with "ben" and "zhu" respectively to determine the accuracy of the recognition result.
Further, the accuracy of the recognition result may be determined by the same methods used in existing speech recognition technology, such as an edit-distance algorithm, which is not described again here.
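The comparison described above can be illustrated with the following sketch, which maps each Chinese character of the recognition result to tone-annotated pinyin using a pre-stored table and scores it against the reference characters with an edit distance. The tiny lookup table, the Levenshtein scoring, and the exact accuracy formula are assumptions used only for illustration; the patent leaves the precise accuracy calculation open.

```typescript
// Sketch of the comparison step, assuming a pre-stored hanzi-to-pinyin table and
// reference characters carried in the verification page. The table entries and
// the accuracy formula are illustrative assumptions.
const HANZI_TO_PINYIN: Record<string, string> = {
  '不': 'bu4', '如': 'ru2', '河': 'he2', // a possible recognition result for "Bonjour"
  '本': 'ben3', '猪': 'zhu1',            // characters matching the reference pinyin
};

function toPinyin(recognized: string): string[] {
  return Array.from(recognized).map((ch) => HANZI_TO_PINYIN[ch] ?? '?');
}

// Standard Levenshtein edit distance between two strings.
function editDistance(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0)),
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,
        dp[i][j - 1] + 1,
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1),
      );
    }
  }
  return dp[a.length][b.length];
}

// Accuracy in [0, 1]; 1 means the pinyin of the recognition result matches the reference exactly.
function accuracy(recognized: string, reference: string[]): number {
  const hyp = toPinyin(recognized).join(' ');
  const ref = reference.join(' ');
  return Math.max(0, 1 - editDistance(hyp, ref) / Math.max(hyp.length, ref.length));
}

// accuracy('不如河', ['ben3', 'zhu1'])  -> low score, verification fails
// accuracy('本猪',   ['ben3', 'zhu1'])  -> 1, verification passes
```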
Finally, by running the JS code carried in the verification page, the terminal can judge whether the accuracy of the recognition result is greater than a preset threshold. When the accuracy is greater than the threshold, the terminal determines that the resource is allowed to be called and sends the instruction for calling the resource; when the accuracy is not greater than the threshold, the terminal determines that the resource is not allowed to be called, does not send the instruction, and displays an error message.
In addition, since the resource is usually stored in a separate device, such as a database or a server providing the resource, when the terminal decides to send the instruction for resource calling, it sends the instruction to a preset call address, which may be the address of that database or resource-providing server.
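The final decision step might look like the following sketch; the threshold value 0.8 and the shape of the request sent to the call address are assumptions, since the patent only specifies that an instruction for resource calling is sent to the preset call address once the accuracy exceeds a preset threshold.

```typescript
// Sketch of the final decision step; the threshold and the request body are assumed.
const ACCURACY_THRESHOLD = 0.8; // assumed value, set by the operator

async function decideAndInvoke(accuracy: number, callAddress: string): Promise<boolean> {
  if (accuracy <= ACCURACY_THRESHOLD) {
    console.warn('Verification failed; resource calling not allowed.');
    return false;
  }
  // Send the instruction for resource calling to the preset call address,
  // e.g. the database or the server providing the resource.
  await fetch(callAddress, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ action: 'invoke-resource' }),
  });
  return true;
}
```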
It should be noted that the server providing the resource, the service server and the voice recognition server may be the same server or different servers, and this application is not limited to this specifically.
With the resource calling method shown in Fig. 1, after the terminal or the server receives a resource calling request, a verification page displaying a verification character can be determined according to the request, and the verification character may be a character of any language. Since it is usually the user who must be verified before a resource call is allowed, the verification page can be sent to the terminal used by the user, which displays the page and the verification character on it. Because the verification character may be in any language, the page can also carry an audio file of the character's standard pronunciation; by playing that pronunciation, the terminal collects the audio to be recognized that the user inputs following the standard pronunciation. Finally, speech recognition is performed on the audio, the accuracy of the recognition result compared with the reference character preset for the verification character is determined, and whether the resource is allowed to be called is decided accordingly. Thus, when determining whether a resource call is allowed, the user does not need to type a character password manually and only needs to repeat the standard pronunciation, which saves the time of entering characters, simplifies the user's operation, and improves the efficiency of resource calling.
In addition, in this application the speech recognition server may be a server that performs speech recognition for any language; since a server that recognizes a given language generally returns characters of that language, the speech recognition server may return characters corresponding to any language. Correspondingly, the reference characters preset for the verification character and carried in the verification page returned by the service server may also be characters of the language corresponding to that pronunciation.
Of course, since the pronunciations of most languages can be represented with English characters, the reference characters may generally be English characters; this is not specifically limited in this application.
In addition, since a user usually needs to log in to an account before executing a service, the terminal may send the resource calling request to the service server through the logged-in account. After receiving the request, the service server can determine the account information corresponding to that account and then determine, from the nationality information in the account information, the language corresponding to the user's nationality. For example, if the user's nationality information is the United States of America, the service server may determine that the corresponding language is English; if it is the People's Republic of China, the corresponding language is Chinese.
Therefore, when determining the verification character carried in the verification page, the service server may choose a language other than the one corresponding to the user's nationality as the language of the verification character. For example, when the language corresponding to the user's nationality information is English, the verification character is a character of a language other than English, such as a French, German, or Chinese character.
Further, users of the same nationality but from different regions may use different languages. For example, for a user whose nationality is Canadian, if the user lives in Quebec the user is more likely to use French in daily life, whereas if the user lives in an English-speaking region the user is more likely to use English. Therefore, in this application the service server may further determine account information such as the user's birth address and region of residence, determine the language the user commonly uses, and choose a language other than that common language, so that the verification page displays characters of a language the user does not commonly use.
Furthermore, the terminal may also determine the user's nationality information from the account information of the user who sent the resource calling request and determine the displayed verification character in the same manner as above. The user information may be stored in the terminal or obtained by the terminal from the server, and the language corresponding to the verification character may also be determined by the terminal. That is, this application does not limit whether the terminal or the server determines that the verification character corresponds to a language other than the user's own language; this may be set by staff according to the needs of the actual application.
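As an illustration of this language selection, the sketch below picks a verification language different from the one associated with the user's nationality or usual region. The language codes, the account fields, and the random choice are assumptions; the patent only requires that the verification character correspond to a language the user does not commonly use.

```typescript
// Sketch of choosing the verification language; the supported set and account
// fields are illustrative assumptions.
const SUPPORTED = ['zh', 'en', 'fr', 'de', 'ja', 'ko'] as const;
type Lang = (typeof SUPPORTED)[number];

interface AccountInfo {
  nationalityLang: Lang; // language inferred from nationality information
  regionLang?: Lang;     // language inferred from birth address / region of residence, if known
}

function pickVerificationLanguage(account: AccountInfo): Lang {
  const usual = account.regionLang ?? account.nationalityLang;
  const candidates = SUPPORTED.filter((l) => l !== usual);
  // Any language other than the user's common language may be used.
  return candidates[Math.floor(Math.random() * candidates.length)];
}

// A user whose account indicates English would get e.g. a French, German or
// Chinese verification character, never an English one.
```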
In addition, based on the resource calling process shown in fig. 1, the present application provides a detailed flow of resource calling, as shown in fig. 4.
Fig. 4 is a detailed process of resource invocation provided in the embodiment of the present application, including:
s201: the terminal receives a resource calling request.
S202: the terminal forwards the resource calling request to the service server.
S203: the service server returns a verification page to the terminal.
S204: the terminal displays the verification characters in the verification page.
S205: and the terminal monitors the playing operation of the user and plays the standard pronunciation corresponding to the verification character.
S206: and the terminal monitors the recording operation of the user and collects the audio to be identified input by the user according to the verification character.
S207: and the terminal sends the audio to be recognized to a voice recognition server.
S208: the voice recognition server returns the recognition result to the terminal.
S209: and the terminal determines whether to allow the resource to be called or not according to the identification result and a reference character preset aiming at the verification character.
The service server and the voice server may be the same device, the playing operation and the recording operation may be gesture operations of a user, and the playing operation and the recording operation may be the same (for example, both the clicking gesture operations) or the playing operation and the recording operation may be different (for example, the playing operation is the clicking gesture operation and the recording operation is the long-pressing gesture operation).
In addition, since steps S101, S104, and S105 may be executed by the terminal or by a preset server, in this application the server may also receive the resource calling request sent by the terminal and return the verification page to the terminal, so that the terminal displays the verification character by running the page; the server may likewise convert the recognition result into English characters and compare them with the reference characters corresponding to the verification character, so as to determine whether to allow the resource to be called according to the accuracy of the recognition result. That is, these steps need not be executed by the terminal running the JS code of the verification page; the server may also execute them. This is not described in detail again here.
In another embodiment of the present application, the resource calling request may be a red packet acquisition request, the verification page may be a red-packet-snatching page, and the resource may be the balance in the red packet referred to by the acquisition request, as shown in Fig. 5.
Fig. 5 is a process of resource invocation provided in an embodiment of the present application, which specifically includes the following steps:
s301: the terminal can first receive the red packet acquisition request and send the red packet acquisition request to the service server.
S302: and receiving the page of the red packet returned by the service server, and displaying the verification character by operating the page of the red packet, wherein the verification character can be a character corresponding to any language.
S303: the terminal can play the standard pronunciation corresponding to the verification character by monitoring the operation of the user and collect the audio to be recognized input by the user according to the identification pronunciation of the verification character.
S304: and sending the audio to be recognized to the voice recognition server for voice recognition.
S305: and determining the correct rate of the recognition result by receiving the recognition result returned by the voice recognition server and the reference character preset for the verification character, and determining whether to allow the balance in the red envelope to be called according to the correct rate.
Further, normally the terminal determines that the balance in the red packet is allowed to be called when the accuracy of the recognition result is greater than the preset threshold. However, the balance in a red packet may only be called a limited number of times; for example, if the number of times the balance can be picked up is set to 5, only 5 terminals can call the balance, and other terminals cannot and instead receive a returned error message such as "The red packet has been snatched up!". The number of times the red packet can be picked up may be set by staff according to the needs of the actual application and is not specifically limited here.
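A server-side sketch of this limited pick-up count is shown below. The in-memory counter, the messages, and the function name are assumptions for illustration; a production service would use an atomic counter, but the behavior of allowing only the first N verified callers to obtain the balance matches the description above.

```typescript
// Sketch of the limited pick-up count for a red packet; data shapes are assumed.
interface RedPacket {
  remaining: number; // how many times the balance may still be picked up, e.g. initialized to 5
  balance: number;
}

function tryPickUp(packet: RedPacket, accuracyOk: boolean): string {
  if (!accuracyOk) return 'Verification failed.';
  if (packet.remaining <= 0) return 'The red packet has been snatched up!';
  packet.remaining -= 1;
  return `Red packet obtained, balance available: ${packet.balance}`;
}
```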
In addition, for the resource calling processes shown in Figs. 1, 4, and 5, this application does not require the terminal to execute the above steps by running the JS code carried in the verification page; the terminal may also execute them by running an application program into which the JS code has been integrated in advance, or an application program that includes the SDK package corresponding to the JS code.
In another embodiment provided by this application, the verification page may not display the verification character at all and may instead directly play the standard pronunciation corresponding to it; the resource calling process is then as shown in Fig. 6.
Fig. 6 is a process of resource invocation provided in an embodiment of the present application, which specifically includes the following steps:
s601: a resource invocation request is received.
S602: and playing the standard pronunciation according to the resource calling request.
S603: and collecting the audio to be recognized input by the user according to the played standard pronunciation.
S604: and carrying out voice recognition on the audio to be recognized.
S605: and determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a preset reference character aiming at the standard pronunciation.
In step S602, the terminal may receive a verification page returned by the service server; the page may carry the standard pronunciation, the JS code of the play button, and the JS code of the record button, so that the user can play the standard pronunciation with a gesture operation on the play button. The terminal interface may then be as shown in Fig. 7. As can be seen in Fig. 7, the verification character shown in Fig. 2 or Fig. 3 is not displayed; only the prompt information, the play button, and the record button are shown, so that the user can play the standard pronunciation according to the prompts, which include "Press the button below and say 'hello' in French" and "Click to play 'hello' in French".
Then, in step S603, the terminal collects the audio to be recognized that the user inputs following the played standard pronunciation.
Further, the verification page may also carry reference characters preset for the standard pronunciation, so that in step S605 the accuracy of the recognition result is determined against those reference characters and whether the resource is allowed to be called is decided according to that accuracy. For example, if the standard pronunciation is that of the French word "Bonjour", the reference characters may be "ben" and "zhu", and the terminal compares the English characters of each Chinese character in the recognition result with the English reference characters respectively.
Further, after receiving the verification page in step S602, the terminal may directly play the standard pronunciation by running the page's code, i.e., without any user operation on the page. The number of times the standard pronunciation is played automatically and the interval between plays may be set by staff according to the needs of the actual application and are not specifically limited here. Of course, since the user may need to replay the standard pronunciation, the terminal may also play it when it detects the user's gesture operation on the play button.
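The automatic playback behavior could be implemented as in the following sketch; the number of plays, the interval, and the element id of the play button are assumptions that staff would set in practice, since the patent leaves these parameters open.

```typescript
// Sketch: auto-play the standard pronunciation a fixed number of times, and also
// replay it when the user operates the play button. Parameters are assumed.
function autoPlayPronunciation(url: string, times = 2, intervalMs = 3000): void {
  const audio = new Audio(url); // standard pronunciation carried by the verification page
  let played = 0;
  const playOnce = () => {
    if (played++ < times) void audio.play();
  };
  playOnce(); // play immediately once the page has been received and run
  const timer = setInterval(() => {
    if (played >= times) clearInterval(timer);
    else playOnce();
  }, intervalMs);

  // The user can also replay the standard pronunciation via the play button.
  document.getElementById('play-btn')?.addEventListener('click', () => void audio.play());
}
```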
It should be noted that the steps of the method provided in the embodiments of this application may all be executed by the same device, or different devices may execute different steps. For example, steps S101 and S102 may be executed by device 1 and step S103 by device 2; or step S101 may be executed by device 1 and steps S102 and S103 by device 2; and so on. In other words, the server may be a distributed server composed of multiple devices. Likewise, the executing entity of each step is not limited to a server; it may also be a terminal such as a mobile phone, a personal computer, or a tablet computer.
Taking the resource calling process provided in Fig. 1 as an example, step S101 may be that the server receives the resource calling request, step S102 may be that the terminal displays the verification character according to the resource calling request, step S103 may be that the terminal collects the audio to be recognized input by the user according to the verification character, step S104 may be that the server performs speech recognition on the audio to be recognized, and step S105 may be that the server determines whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character; or
step S101 may be that the terminal receives a resource call request, step S102 may be that the terminal displays a verification character according to the resource call request, step S103 may be that the terminal collects an audio to be recognized input by a user according to the verification character, step S104 may be that the server performs voice recognition on the audio to be recognized, and step S105 may be that the server determines whether to allow the resource to be called or not according to a recognition result of the audio to be recognized and a reference character preset for the verification character;
step S101 may be that the terminal receives a resource call request, step S102 may be that the terminal displays a verification character according to the resource call request, step S103 may be that the terminal collects an audio to be recognized input by a user according to the verification character, step S104 may be that the server performs voice recognition on the audio to be recognized, and step S105 may be that the terminal determines whether to allow the resource to be called or not according to a recognition result of the audio to be recognized and a reference character preset for the verification character;
step S101 may be that the server receives a resource call request, step S102 may be that a terminal displays a verification character according to the resource call request, step S103 may be that the terminal collects an audio to be recognized that is input by a user according to the verification character, step S104 may be that the terminal performs voice recognition on the audio to be recognized, and step S105 may be that the server determines whether to allow the resource to be called or not according to a recognition result of the audio to be recognized and a reference character preset for the verification character;
step S101 may be receiving a resource call request by the terminal, step S102 may be displaying a verification character by the terminal according to the resource call request, step S103 may be acquiring, by the terminal, an audio to be recognized input by a user according to the verification character, step S104 may be performing, by the terminal, voice recognition on the audio to be recognized, step S105 may be determining, by the server, whether to allow the resource to be called according to a recognition result of the audio to be recognized and a reference character preset for the verification character, and so on.
It should be noted that the display of the verification character in step S102 is performed by the terminal, but the process of determining the verification character according to the resource calling request may be executed by either the terminal or the server, which is not limited in this application.
It can be seen that, in the resource calling process provided by this application, the executing entity of each step may be set according to the needs of the actual application and may be a terminal or a server. As described above, the server may be one server that executes multiple operations, or different servers that each execute different operations; furthermore, the server may be a single device or a distributed server. Whether the server is a single device or a distributed server is independent of whether it executes one operation or several: a single device may execute multiple operations, and a distributed server may execute a single operation, and so on.
Based on the process of resource invocation shown in fig. 1, the embodiment of the present application further provides a device for resource invocation, as shown in fig. 8.
Fig. 8 is a schematic structural diagram of a device for resource invocation according to an embodiment of the present application, including:
a receiving module 401, which receives a resource calling request;
a display module 402, configured to display a verification character according to the resource calling request;
the acquisition module 403 is used for acquiring the audio to be recognized input by the user according to the verification character;
the recognition module 404 performs voice recognition on the audio to be recognized;
and a comparison calling module 405 for determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character.
The display module 402 sends the resource calling request to a server, receives a verification page returned by the server, and displays verification characters carried in the verification page.
The verification character is a character corresponding to at least one language.
The acquisition module 403 receives the verification page returned by the server and the standard pronunciation corresponding to the verification character carried in the verification page, plays the standard pronunciation, and acquires the audio to be recognized input by the user according to the played standard pronunciation.
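A terminal-side sketch of that acquisition flow is given below; the playback and recording functions are placeholder stubs, since the patent does not tie the module to any particular audio API, and the field name of the verification page is an assumption:

```python
# Minimal, hypothetical sketch of acquisition module 403: play the standard
# pronunciation carried by the verification page, then capture the user's reading.

def play_audio(standard_pronunciation: bytes) -> None:
    """Placeholder for platform audio playback."""
    print(f"playing {len(standard_pronunciation)} bytes of standard pronunciation")


def record_audio(seconds: float = 3.0) -> bytes:
    """Placeholder for microphone capture of the user's reading."""
    return b""  # a real implementation would return the captured waveform


def acquire_audio_to_be_recognized(verification_page: dict) -> bytes:
    play_audio(verification_page["standard_pronunciation"])  # assumed page field
    return record_audio()
```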
The recognition module 404 sends the audio to be recognized to a server, so that the server performs speech recognition on the audio to be recognized, and receives a recognition result returned by the server.
The recognition result is a Chinese character and the reference character is an English character. The comparison calling module 405 determines, according to a pre-stored correspondence between Chinese characters and English characters, the English character corresponding to each Chinese character in the recognition result, compares the Chinese pinyin corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result, and determines whether to allow the resource to be called according to the accuracy.
The comparison calling module 405 determines the Chinese pinyin corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between Chinese characters and Chinese pinyin, and uses the determined Chinese pinyin as the English character corresponding to each Chinese character in the recognition result.
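A concrete illustration of this pinyin-based comparison is sketched below; the character-to-pinyin table stands in for the pre-stored correspondence, and the function names and threshold handling are hypothetical additions made only to keep the example runnable:

```python
# Sketch of comparison calling module 405: map each recognized Chinese character
# to Chinese pinyin (English characters) and compare against the reference characters.

HANZI_TO_PINYIN = {"你": "ni", "好": "hao", "红": "hong", "包": "bao"}  # stand-in table


def to_pinyin(recognition_result: str) -> list:
    """English character (pinyin) corresponding to each Chinese character."""
    return [HANZI_TO_PINYIN.get(ch, "") for ch in recognition_result]


def allow_resource_call(recognition_result: str, reference_characters: list,
                        required_accuracy: float = 1.0) -> bool:
    """Compare the pinyin of the recognition result with the reference characters
    and decide based on the resulting accuracy (all must match by default)."""
    if not reference_characters:
        return False
    pinyin = to_pinyin(recognition_result)
    matches = sum(1 for a, b in zip(pinyin, reference_characters) if a == b)
    accuracy = matches / len(reference_characters)
    return accuracy >= required_accuracy


# Example: verification character "你好" with preset reference ["ni", "hao"].
print(allow_resource_call("你好", ["ni", "hao"]))  # True
print(allow_resource_call("你包", ["ni", "hao"]))  # False
```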
Specifically, the device for resource invocation shown in fig. 8 may be located in a terminal, where the terminal may specifically be a mobile phone, a tablet computer, a personal computer, and other devices.
Based on the process of resource invocation shown in fig. 6, the embodiment of the present application further provides a device for resource invocation, as shown in fig. 9.
Fig. 9 is a schematic structural diagram of a device for resource invocation according to an embodiment of the present application, including:
a receiving module 701, configured to receive a red packet obtaining request;
a playing module 702, configured to play the standard pronunciation according to the red packet obtaining request;
an acquisition module 703, configured to acquire the audio to be recognized input by the user according to the played standard pronunciation;
a recognition module 704, configured to perform voice recognition on the audio to be recognized;
and a comparison calling module 705, configured to determine whether to allow the red packet to be acquired according to the recognition result of the audio to be recognized and the reference character preset for the standard pronunciation.
The playing module 702 sends the resource calling request to a service server, receives a verification page returned by the service server, and plays a standard pronunciation carried in the verification page.
The standard pronunciation is audio corresponding to at least one language.
The recognition module 704 sends the audio to be recognized to a voice recognition server, so that the voice recognition server performs voice recognition on the audio to be recognized and receives a recognition result returned by the voice recognition server.
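One way the recognition module might hand the audio to a separate voice recognition server is over a simple HTTP call, as in the hedged sketch below; the endpoint URL, field names, and response shape are assumptions made for illustration and are not specified by the patent:

```python
import requests  # assumes the third-party `requests` package is available

# Hypothetical endpoint of the voice recognition server.
RECOGNITION_URL = "https://speech.example.com/recognize"


def recognize_remote(audio_to_be_recognized: bytes, timeout: float = 5.0) -> str:
    """Send the collected audio to the voice recognition server and return the
    recognition result (a string of Chinese characters) from its JSON reply."""
    response = requests.post(
        RECOGNITION_URL,
        files={"audio": ("utterance.wav", audio_to_be_recognized, "audio/wav")},
        timeout=timeout,
    )
    response.raise_for_status()
    # Assumed response shape: {"result": "<recognized text>"}
    return response.json().get("result", "")
```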
The recognition result is a Chinese character, the reference character is an English character, the comparison calling module 705 determines the English character corresponding to each Chinese character in the recognition result according to the corresponding relationship between the pre-stored Chinese character and English character, compares the Chinese pinyin corresponding to the recognition result with the English character corresponding to the reference character, determines the accuracy of the recognition result, and determines whether to allow the resource to be called according to the accuracy.
The comparison calling module 705 determines the Chinese pinyin corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between Chinese characters and Chinese pinyin, and uses the determined Chinese pinyin as the English character corresponding to each Chinese character in the recognition result.
Specifically, the device for resource invocation shown in fig. 9 may be located in a terminal, where the terminal may specifically be a mobile phone, a tablet computer, a personal computer, and other devices.
Based on the process of resource invocation shown in fig. 5, the embodiment of the present application further provides a device for resource invocation, as shown in fig. 10.
Fig. 10 is a schematic structural diagram of an apparatus for resource invocation according to an embodiment of the present application, including:
a receiving module 501, configured to receive a red packet obtaining request;
a display module 502, configured to display a verification character according to the red packet obtaining request;
an acquisition module 503, configured to acquire the audio to be recognized input by the user according to the verification character;
a recognition module 504, configured to perform voice recognition on the audio to be recognized;
and a comparison calling module 505, configured to determine whether to allow the red packet to be acquired according to the recognition result of the audio to be recognized and the reference character preset for the verification character.
The display module 502 sends the resource calling request to a service server, receives a verification page returned by the service server, and displays verification characters carried in the verification page.
The verification character is a character corresponding to at least one language.
The acquisition module 503 receives the verification page returned by the service server and the standard pronunciation corresponding to the verification character carried by the verification page, plays the standard pronunciation, and acquires the audio to be recognized input by the user according to the played standard pronunciation.
The recognition module 504 sends the audio to be recognized to a voice recognition server, so that the voice recognition server performs voice recognition on the audio to be recognized and receives a recognition result returned by the voice recognition server.
The recognition result is a Chinese character and the reference character is an English character. The comparison calling module 505 determines, according to a pre-stored correspondence between Chinese characters and English characters, the English character corresponding to each Chinese character in the recognition result, compares the Chinese pinyin corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result, and determines whether to allow the resource to be called according to the accuracy.
The comparison calling module 505 determines the Chinese pinyin corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between Chinese characters and Chinese pinyin, and uses the determined Chinese pinyin as the English character corresponding to each Chinese character in the recognition result.
Specifically, the device for resource invocation shown in fig. 10 may be located in a terminal, where the terminal may specifically be a mobile phone, a tablet computer, a personal computer, and the like.
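For completeness, the modules of the device in fig. 10 could be wired together along the following lines; this is a hypothetical sketch in which the class name, method names, and stand-in module implementations are illustrative only:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RedPacketDevice:
    """Illustrative wiring of modules 501-505; names are hypothetical."""
    display: Callable[[str], None]              # display module 502
    acquire: Callable[[str], bytes]             # acquisition module 503
    recognize: Callable[[bytes], str]           # recognition module 504
    compare: Callable[[str, List[str]], bool]   # comparison calling module 505

    def handle_request(self, verification_character: str,
                       reference_characters: List[str]) -> bool:
        # The receiving module 501 would deliver the red packet obtaining
        # request that triggers this call.
        self.display(verification_character)
        audio = self.acquire(verification_character)
        recognition_result = self.recognize(audio)
        return self.compare(recognition_result, reference_characters)


# Usage with trivial stand-ins for each module:
device = RedPacketDevice(
    display=lambda c: print(f"Please read aloud: {c}"),
    acquire=lambda c: b"",                       # microphone capture stand-in
    recognize=lambda audio: "你好",               # speech recognition stand-in
    compare=lambda result, ref: result == "你好",
)
print(device.handle_request("你好", ["ni", "hao"]))  # True
```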
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures: designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, such programming is now mostly carried out with "logic compiler" software rather than by manually fabricating integrated circuit chips; this software is similar to the compiler used for ordinary program development, and the source code to be compiled must be written in a particular programming language known as a hardware description language (HDL), of which VHDL and Verilog are among the most commonly used. A person skilled in the art will appreciate that a hardware circuit implementing the logical method flow can readily be obtained by lightly programming the method flow in a hardware description language and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320 microcontrollers; a memory controller may also be implemented as part of the control logic of a memory.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory, random access memory (RAM), and/or non-volatile memory in a computer-readable medium, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (26)

1. A method for resource invocation, the method comprising:
receiving a resource calling request;
displaying a verification character according to the resource calling request, wherein the verification character is a character corresponding to at least one language in multiple languages;
collecting audio to be recognized input by a user according to the standard pronunciation corresponding to the verification character;
performing voice recognition on the audio to be recognized;
determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a reference character preset for the verification character; the reference character is a reference character corresponding to the standard pronunciation of the verification character, so that the reference character is a Chinese character corresponding to each Chinese pronunciation of the verification character and corresponds to each Chinese character in a Chinese pinyin mode;
the recognition result is a recognition result of the audio to be recognized according to Chinese phonetic pronunciation, the recognition result is a Chinese character, and the reference character is an English character;
the determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character specifically includes:
determining English characters corresponding to the Chinese characters in the recognition result according to the corresponding relation between pre-stored Chinese characters and English characters;
comparing the English character corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result;
and determining whether to allow the resource to be called or not according to the accuracy.
2. The method of claim 1, wherein displaying the validation character according to the resource invocation request specifically comprises:
sending the resource calling request to a server;
and receiving a verification page returned by the server, and displaying verification characters carried in the verification page.
3. The method of claim 1, wherein displaying the validation character specifically comprises:
determining a language corresponding to a user according to account information of the user sending the resource calling request;
and displaying the verification characters corresponding to at least one other language except the language corresponding to the user.
4. The method according to claim 1, wherein the step of collecting the audio to be recognized input by the user according to the verification character comprises:
receiving a verification page returned by a server and a standard pronunciation corresponding to the verification character carried by the verification page;
playing the standard pronunciation;
and collecting the audio to be recognized input by the user according to the played standard pronunciation.
5. The method of claim 1, wherein performing speech recognition on the audio to be recognized specifically comprises:
sending the audio to be recognized to a server so that the server performs voice recognition on the audio to be recognized;
and receiving the identification result returned by the server.
6. The method according to claim 1, wherein determining the English character corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between the Chinese character and the English character specifically comprises:
and determining the Chinese pinyin corresponding to each Chinese character in the recognition result according to the corresponding relation between the pre-stored Chinese character and the Chinese pinyin, and taking the Chinese pinyin as the English character corresponding to each Chinese character in the recognition result.
7. A method for resource invocation, the method comprising:
receiving a resource calling request;
playing a standard pronunciation corresponding to the verification character according to the resource calling request;
collecting the audio to be recognized input by the user according to the played standard pronunciation;
performing voice recognition on the audio to be recognized;
determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a reference character preset for the standard pronunciation; the reference character is a Chinese character corresponding to each Chinese pronunciation of the verification character and corresponds to each Chinese character in a Chinese pinyin mode;
the recognition result is a recognition result of the audio to be recognized according to Chinese phonetic pronunciation, the recognition result is a Chinese character, and the reference character is an English character;
the determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character specifically includes:
determining English characters corresponding to the Chinese characters in the recognition result according to the corresponding relation between pre-stored Chinese characters and English characters;
comparing the English character corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result;
and determining whether to allow the resource to be called or not according to the accuracy.
8. The method according to claim 7, wherein determining the English character corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between the Chinese character and the English character specifically comprises:
and determining the Chinese pinyin corresponding to each Chinese character in the recognition result according to the corresponding relation between the pre-stored Chinese character and the Chinese pinyin, and taking the Chinese pinyin as the English character corresponding to each Chinese character in the recognition result.
9. A method for resource invocation, the method comprising:
receiving a red packet acquisition request;
displaying a verification character according to the red packet obtaining request, wherein the verification character is a character corresponding to at least one language in multiple languages;
collecting audio to be recognized input by a user according to the standard pronunciation corresponding to the verification character;
performing voice recognition on the audio to be recognized;
determining whether the red packet is allowed to be acquired or not according to the recognition result of the audio to be recognized and a reference character preset for the verification character; the reference character is a reference character corresponding to the standard pronunciation of the verification character, so that the reference character is a Chinese character corresponding to each Chinese pronunciation of the verification character and corresponds to each Chinese character in a Chinese pinyin mode;
the recognition result is a recognition result of the audio to be recognized according to Chinese phonetic pronunciation, the recognition result is a Chinese character, and the reference character is an English character;
the determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character specifically includes:
determining English characters corresponding to the Chinese characters in the recognition result according to the corresponding relation between pre-stored Chinese characters and English characters;
comparing the English character corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result;
and determining whether to allow the resource to be called or not according to the accuracy.
10. The method according to claim 9, wherein displaying a validation character according to the get red packet request specifically comprises:
sending the resource calling request to a service server;
and receiving a verification page returned by the service server, and displaying verification characters carried in the verification page.
11. The method according to claim 9, wherein the step of collecting the audio to be recognized input by the user according to the verification character comprises:
receiving a verification page returned by a service server and a standard pronunciation corresponding to the verification character carried by the verification page;
playing the standard pronunciation;
and collecting the audio to be recognized input by the user according to the played standard pronunciation.
12. The method of claim 9, wherein performing speech recognition on the audio to be recognized specifically comprises:
sending the audio to be recognized to a voice recognition server so that the voice recognition server performs voice recognition on the audio to be recognized;
and receiving a recognition result returned by the voice recognition server.
13. The method according to claim 9, wherein determining the English character corresponding to each Chinese character in the recognition result according to a pre-stored correspondence between the Chinese character and the English character specifically comprises:
and determining the Chinese pinyin corresponding to each Chinese character in the recognition result according to the corresponding relation between the pre-stored Chinese character and the Chinese pinyin, and taking the Chinese pinyin as the English character corresponding to each Chinese character in the recognition result.
14. An apparatus for resource invocation, comprising:
the receiving module receives a resource calling request;
the display module displays a verification character according to the resource calling request, wherein the verification character is a character corresponding to at least one language in multiple languages;
the acquisition module is used for acquiring the audio to be recognized input by the user according to the standard pronunciation corresponding to the verification character;
the recognition module is used for carrying out voice recognition on the audio to be recognized;
the comparison calling module is used for determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a reference character preset for the verification character; the reference character is a reference character corresponding to the standard pronunciation of the verification character, so that the reference character is a Chinese character corresponding to each Chinese pronunciation of the verification character and corresponds to each Chinese character in a Chinese pinyin mode;
the recognition result is a recognition result of the audio to be recognized according to Chinese phonetic pronunciation, the recognition result is a Chinese character, and the reference character is an English character;
the determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character specifically includes:
determining English characters corresponding to the Chinese characters in the recognition result according to the corresponding relation between pre-stored Chinese characters and English characters;
comparing the English character corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result;
and determining whether to allow the resource to be called or not according to the accuracy.
15. The apparatus according to claim 14, wherein the display module sends the resource calling request to a server, receives a verification page returned by the server, and displays verification characters carried in the verification page.
16. The apparatus according to claim 14, wherein the display module determines a language corresponding to the user according to account information of the user sending the resource invocation request, and displays the verification character corresponding to at least one other language except the language corresponding to the user.
17. The device according to claim 14, wherein the collection module receives a verification page returned by the server and a standard pronunciation corresponding to the verification character carried by the verification page, plays the standard pronunciation, and collects the audio to be recognized input by the user according to the played standard pronunciation.
18. The apparatus according to claim 14, wherein the recognition module sends the audio to be recognized to a server, so that the server performs speech recognition on the audio to be recognized, and receives a recognition result returned by the server.
19. The apparatus of claim 14, wherein the comparison calling module determines the corresponding Chinese pinyin for each Chinese character in the recognition result according to a pre-stored correspondence between the Chinese character and the Chinese pinyin, as the corresponding English character for each Chinese character in the recognition result.
20. An apparatus for resource invocation, comprising:
the receiving module receives a resource calling request;
the playing module plays the standard pronunciation corresponding to the verification character according to the resource calling request;
the acquisition module is used for acquiring the audio to be recognized input by the user according to the played standard pronunciation;
the recognition module is used for carrying out voice recognition on the audio to be recognized;
the comparison calling module is used for determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and a reference character preset for the standard pronunciation; the reference character is a Chinese character corresponding to each Chinese pronunciation of the verification character and corresponds to each Chinese character in a Chinese pinyin mode;
the recognition result is a recognition result of the audio to be recognized according to Chinese phonetic pronunciation, the recognition result is a Chinese character, and the reference character is an English character;
the determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character specifically includes:
determining English characters corresponding to the Chinese characters in the recognition result according to the corresponding relation between pre-stored Chinese characters and English characters;
comparing the English character corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result;
and determining whether to allow the resource to be called or not according to the accuracy.
21. The apparatus of claim 20, wherein the comparison calling module determines the corresponding Chinese pinyin for each Chinese character in the recognition result according to a pre-stored correspondence between the Chinese character and the Chinese pinyin, as the English character corresponding to each Chinese character in the recognition result.
22. An apparatus for resource invocation, comprising:
the receiving module receives a red packet acquisition request;
the display module displays a verification character according to the red packet acquisition request, wherein the verification character is a character corresponding to at least one language in multiple languages;
the acquisition module is used for acquiring the audio to be recognized input by the user according to the standard pronunciation corresponding to the verification character;
the recognition module is used for carrying out voice recognition on the audio to be recognized;
the comparison calling module is used for determining whether the red packet is allowed to be acquired or not according to the recognition result of the audio to be recognized and a reference character preset for the verification character; the reference character is a reference character corresponding to the standard pronunciation of the verification character, so that the reference character is a Chinese character corresponding to each Chinese pronunciation of the verification character and corresponds to each Chinese character in a Chinese pinyin mode;
the recognition result is a recognition result of the audio to be recognized according to Chinese phonetic pronunciation, the recognition result is a Chinese character, and the reference character is an English character;
the determining whether to allow the resource to be called according to the recognition result of the audio to be recognized and the reference character preset for the verification character specifically includes:
determining English characters corresponding to the Chinese characters in the recognition result according to the corresponding relation between pre-stored Chinese characters and English characters;
comparing the English character corresponding to the recognition result with the English character corresponding to the reference character to determine the accuracy of the recognition result;
and determining whether to allow the resource to be called or not according to the accuracy.
23. The apparatus of claim 22, wherein the display module sends the resource invocation request to a service server, receives a verification page returned by the service server, and displays verification characters carried in the verification page.
24. The apparatus according to claim 22, wherein the collection module receives a verification page returned by the service server and a standard pronunciation corresponding to the verification character carried in the verification page, plays the standard pronunciation, and collects the audio to be recognized input by the user according to the played standard pronunciation.
25. The apparatus of claim 22, wherein the recognition module sends the audio to be recognized to a speech recognition server, so that the speech recognition server performs speech recognition on the audio to be recognized and receives a recognition result returned by the speech recognition server.
26. The apparatus of claim 22, wherein the comparison calling module determines the corresponding Chinese pinyin for each Chinese character in the recognition result according to a pre-stored correspondence between the Chinese character and the Chinese pinyin, as the corresponding English character for each Chinese character in the recognition result.
CN201611123587.0A 2016-12-08 2016-12-08 Resource calling method and device Active CN107025393B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611123587.0A CN107025393B (en) 2016-12-08 2016-12-08 Resource calling method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611123587.0A CN107025393B (en) 2016-12-08 2016-12-08 Resource calling method and device

Publications (2)

Publication Number Publication Date
CN107025393A CN107025393A (en) 2017-08-08
CN107025393B true CN107025393B (en) 2020-07-10

Family

ID=59526067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611123587.0A Active CN107025393B (en) 2016-12-08 2016-12-08 Resource calling method and device

Country Status (1)

Country Link
CN (1) CN107025393B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107508885B (en) * 2017-08-24 2021-02-26 维沃移动通信有限公司 Resource transfer method, related equipment and system
CN107563734B (en) * 2017-08-24 2020-06-12 维沃移动通信有限公司 Resource transfer method, related equipment and system
CN107566250B (en) * 2017-08-24 2021-01-29 维沃移动通信有限公司 Resource transfer method, related equipment and system
CN110152307B (en) * 2018-07-17 2022-05-06 腾讯科技(深圳)有限公司 Virtual article issuing method, device and storage medium
CN108899035B (en) * 2018-08-02 2021-08-17 科大讯飞股份有限公司 Message processing method and device
CN109189365A (en) * 2018-08-17 2019-01-11 平安普惠企业管理有限公司 A kind of audio recognition method, storage medium and terminal device
CN109818737B (en) * 2018-12-24 2021-10-08 科大讯飞股份有限公司 Personalized password generation method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6876987B2 (en) * 2001-01-30 2005-04-05 Itt Defense, Inc. Automatic confirmation of personal notifications
CN102255913A (en) * 2011-07-14 2011-11-23 北京百度网讯科技有限公司 Verification-security-level-based audio verification code provision method and equipment
CN102833753A (en) * 2012-08-07 2012-12-19 杭州米普科技有限公司 Speech input authentication method and device
CN102903054A (en) * 2012-09-27 2013-01-30 腾讯科技(深圳)有限公司 Method, device and system for verifying online transaction
CN103326989A (en) * 2012-03-19 2013-09-25 上海博路信息技术有限公司 Identifying code based on voice recognition
CN103973442A (en) * 2013-02-01 2014-08-06 国民技术股份有限公司 Verification code transmitting and acquiring methods, mobile phone and electronic equipment
CN104104664A (en) * 2013-04-11 2014-10-15 腾讯科技(深圳)有限公司 Method, server, client and system for verifying verification code
CN105469788A (en) * 2015-12-09 2016-04-06 百度在线网络技术(北京)有限公司 Voice information verification method and apparatus

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6876987B2 (en) * 2001-01-30 2005-04-05 Itt Defense, Inc. Automatic confirmation of personal notifications
CN102255913A (en) * 2011-07-14 2011-11-23 北京百度网讯科技有限公司 Verification-security-level-based audio verification code provision method and equipment
CN103326989A (en) * 2012-03-19 2013-09-25 上海博路信息技术有限公司 Identifying code based on voice recognition
CN102833753A (en) * 2012-08-07 2012-12-19 杭州米普科技有限公司 Speech input authentication method and device
CN102903054A (en) * 2012-09-27 2013-01-30 腾讯科技(深圳)有限公司 Method, device and system for verifying online transaction
CN103973442A (en) * 2013-02-01 2014-08-06 国民技术股份有限公司 Verification code transmitting and acquiring methods, mobile phone and electronic equipment
CN104104664A (en) * 2013-04-11 2014-10-15 腾讯科技(深圳)有限公司 Method, server, client and system for verifying verification code
CN105469788A (en) * 2015-12-09 2016-04-06 百度在线网络技术(北京)有限公司 Voice information verification method and apparatus

Also Published As

Publication number Publication date
CN107025393A (en) 2017-08-08

Similar Documents

Publication Publication Date Title
CN107025393B (en) Resource calling method and device
US10489112B1 (en) Method for user training of information dialogue system
CN107437416B (en) Consultation service processing method and device based on voice recognition
US10586541B2 (en) Communicating metadata that identifies a current speaker
CN110069608B (en) Voice interaction method, device, equipment and computer storage medium
CN109428719B (en) Identity verification method, device and equipment
CN109002510B (en) Dialogue processing method, device, equipment and medium
US9959129B2 (en) Headless task completion within digital personal assistants
EP3444811B1 (en) Speech recognition method and device
KR20180074210A (en) Electronic device and voice recognition method of the electronic device
CN106406867B (en) Screen reading method and device based on android system
KR102356623B1 (en) Virtual assistant electronic device and control method thereof
KR20180058476A (en) A method for processing various input, an electronic device and a server thereof
CN109086590B (en) Interface display method of electronic equipment and electronic equipment
WO2015043442A1 (en) Method, device and mobile terminal for text-to-speech processing
US20180122369A1 (en) Information processing system, information processing apparatus, and information processing method
US10498900B2 (en) Systems and methods to parse message for providing alert at device
KR102578982B1 (en) A method for providing a translation service and an electronic device therefor
US10269347B2 (en) Method for detecting voice and electronic device using the same
US20210350784A1 (en) Correct pronunciation of names in text-to-speech synthesis
US10943601B2 (en) Provide output associated with a dialect
US20180032500A1 (en) Defluffing and fluffing of phrases during communication between individuals
KR20180094331A (en) Electronic apparatus and method for outputting message data thereof
CN106500685B (en) Multi-terminal assisted navigation method, device and system
US9369566B2 (en) Providing visual cues for a user interacting with an automated telephone system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1241483

Country of ref document: HK

GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201016

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201016

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240228

Address after: 128 Meizhi Road, Guohao Times City # 20-01, Singapore 189773

Patentee after: Advanced Nova Technology (Singapore) Holdings Ltd.

Country or region after: Singapore

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Innovative advanced technology Co.,Ltd.

Country or region before: Cayman Islands

TR01 Transfer of patent right