CN116363787A

CN116363787A - Voice-based traffic method, device, computer equipment and storage medium

Info

Publication number: CN116363787A
Application number: CN202310330825.9A
Authority: CN
Inventors: 苏同胜; 孙连鹏
Original assignee: Hainan Shengzhi Internet Technology Co ltd
Current assignee: Hainan Shengzhi Internet Technology Co ltd
Priority date: 2023-03-30
Filing date: 2023-03-30
Publication date: 2023-06-30

Abstract

The application provides a voice-based traffic method, a voice-based traffic device, computer equipment and a storage medium, and belongs to the technical field of computers. The method comprises the following steps: receiving a first voice signal of a target object sent by passing equipment; decrypting the first voice signal of the target object based on a preset encryption rule; verifying the identity of the target object based on the decrypted first voice signal to obtain a verification result; and returning the verification result to the passing equipment so that the passing equipment can determine whether the target object is allowed to pass based on the verification result. The method offers an additional way of passing: because a user's voice generally does not change, the user is more likely to be able to pass, which makes travel more convenient; and because the user's voice signal can be encrypted, the user's information is kept secure during transmission.

Description

Voice-based traffic method, device, computer equipment and storage medium

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a voice-based traffic method, apparatus, computer device, and storage medium.

Background

A gate is a channel management device used to manage the flow of people and regulate their entry and exit. In daily life, many communities, office buildings, stations, subway stations and the like use gates to perform pass verification and to manage and record the people entering and exiting. Common pass verification methods include reading an identity card, scanning a two-dimensional code, and face recognition. However, existing gates are limited in function: if the user is not carrying an identity card, cannot present the two-dimensional code because of network congestion, or fails face recognition because of makeup or the like, the user's travel is affected.

Disclosure of Invention

The embodiments of the application provide a voice-based passing method, a voice-based passing device, computer equipment and a storage medium, which offer an additional way of passing. Because a user's voice generally does not change, the user is more likely to be able to pass, which makes travel more convenient; and because the user's voice signal can be encrypted, the user's information is kept secure during transmission. The technical scheme is as follows:

In one aspect, a voice-based traffic method is provided, the method comprising:

receiving a first voice signal of a target object sent by passing equipment, wherein the passing equipment is used for managing the passing of objects, and the first voice signal is used for verifying the identity of the target object;

decrypting the first voice signal of the target object based on a preset encryption rule;

verifying the identity of the target object based on the decrypted first voice signal to obtain a verification result, wherein the verification result is used for indicating whether the target object is allowed to pass;

and returning the verification result to the passing equipment so that the passing equipment can determine whether the target object is allowed to pass or not based on the verification result.

In another aspect, a voice-based pass through device is provided, the device comprising:

a receiving module, which is used for receiving a first voice signal of a target object sent by passing equipment, wherein the passing equipment is used for managing the passing of objects, and the first voice signal is used for verifying the identity of the target object;

the decryption module is used for decrypting the first voice signal of the target object based on a preset encryption rule;

the verification module is used for verifying the identity of the target object based on the decrypted first voice signal to obtain a verification result, wherein the verification result is used for indicating whether the target object is allowed to pass;

and the sending module is used for returning the verification result to the passing equipment so that the passing equipment can determine whether the target object is allowed to pass based on the verification result.

In some embodiments, the verification module comprises:

the identification unit is used for identifying the decrypted first voice signal to obtain target information to be verified;

and the verification unit is used for verifying the target information based on the reference information of the target object to obtain the verification result, wherein the reference information comprises various information for verifying the identity of the target object, and the various information comprises information of the type to which the target information belongs.

In some embodiments, the identification unit is configured to: perform voiceprint extraction on the decrypted first voice signal to obtain voiceprint features of the target object; perform semantic recognition on the decrypted first voice signal to obtain content information to be verified of the target object; and determine the target information based on the voiceprint features and the content information.

In some embodiments, the verification unit comprises:

the first verification subunit is used for verifying the voiceprint characteristics based on the reference voiceprints in the reference information to obtain a first intermediate result;

the second verification subunit is used for verifying the content information based on the reference content in the reference information to obtain a second intermediate result;

and the determining subunit is used for determining the verification result based on the first intermediate result and the second intermediate result.

In some embodiments, the first intermediate result and the second intermediate result are both verification scores, the verification scores being used to represent a similarity between the reference information and the information to be verified;

the determining subunit is configured to: determine a weight of the first intermediate result and a weight of the second intermediate result based on the priority of the voiceprint information and the priority of the content information; perform a weighted summation of the first intermediate result and the second intermediate result using those weights to obtain a target score; determine that the verification result is that passing is allowed when the target score reaches a first verification threshold; and determine that the verification result is that passing is not allowed when the target score does not reach the first verification threshold.
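The weighted-summation decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the proportional mapping from priorities to weights, the 0 to 1 score range, the threshold value 0.8 and the names `weights_from_priority` and `weighted_decision` are all hypothetical.

```python
from typing import Tuple


def weights_from_priority(voiceprint_priority: int, content_priority: int) -> Tuple[float, float]:
    """Turn two positive priorities into weights that sum to 1.

    Assumption: weights are proportional to priorities. The text only says
    the weights are determined based on the priorities.
    """
    if voiceprint_priority <= 0 or content_priority <= 0:
        raise ValueError("priorities must be positive")
    total = voiceprint_priority + content_priority
    return voiceprint_priority / total, content_priority / total


def weighted_decision(
    voiceprint_score: float,      # first intermediate result (similarity score)
    content_score: float,         # second intermediate result (similarity score)
    voiceprint_priority: int,
    content_priority: int,
    first_threshold: float = 0.8,  # hypothetical first verification threshold
) -> Tuple[bool, float]:
    """Return (passing allowed?, target score) from the weighted sum."""
    w_voice, w_content = weights_from_priority(voiceprint_priority, content_priority)
    target_score = w_voice * voiceprint_score + w_content * content_score
    # Passing is allowed only when the target score reaches the threshold.
    return target_score >= first_threshold, target_score
```

With these assumed values, a strong voiceprint match combined with a high voiceprint priority can carry a weaker content match over the threshold, while the same scores with the priorities reversed cannot.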

The determining subunit is configured to, for either one of the voiceprint information and the content information whose priority is higher than that of the other: determine the verification result based on the verification score of that information when that score is higher than the score of the other information; determine the verification result based on the verification score of that information when that score is lower than the score of the other information but not lower than a second verification threshold; and, when that score is lower than the score of the other information and also lower than the second verification threshold, perform a weighted summation of the two verification scores to obtain the verification result.
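The three priority-based branches above can be sketched as follows. The text does not say how a single score is turned into an allow or deny decision, how ties are handled, or which weights the fallback summation uses, so the pass threshold, the tie handling (equal scores take the first branch), the weight 0.7 and all names here are assumptions for illustration only.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    branch: str    # which of the three cases applied
    score: float   # the score the decision was based on
    allowed: bool  # whether passing is allowed


def priority_decision(
    primary_score: float,           # score of the higher-priority information
    secondary_score: float,         # score of the other information
    pass_threshold: float = 0.8,    # hypothetical threshold for allowing passage
    second_threshold: float = 0.6,  # hypothetical second verification threshold
    primary_weight: float = 0.7,    # hypothetical weight for the fallback sum
) -> Decision:
    if primary_score >= secondary_score:
        # Case 1: the higher-priority score is also the higher score.
        branch, score = "primary_higher", primary_score
    elif primary_score >= second_threshold:
        # Case 2: lower than the other score, but not below the second threshold.
        branch, score = "primary_sufficient", primary_score
    else:
        # Case 3: lower than the other score and below the second threshold,
        # so fall back to a weighted sum of both scores.
        branch = "weighted"
        score = primary_weight * primary_score + (1 - primary_weight) * secondary_score
    return Decision(branch, score, score >= pass_threshold)
```

The sketch returns the branch name alongside the decision so that a caller can log which rule produced the result.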

In some embodiments, the apparatus further comprises:

a determining module, configured to determine a health condition of the target object based on the first speech signal;

the verification module is used for: verifying the target information based on the reference information of the target object to obtain a third intermediate result; determining, based on the health condition, a fourth intermediate result indicating whether the target object in that health condition is allowed to pass; and determining the verification result based on the third intermediate result and the fourth intermediate result.

In some embodiments, the receiving module is further configured to receive, in the case where passing is allowed, a second voice signal of the target object sent by the passing equipment, where the second voice signal is used to modify the reference information of the target object;

the decryption module is further configured to decrypt the second voice signal based on the preset encryption rule;

the identification unit is used for identifying the decrypted second voice signal to obtain a modification target and modification content of the target object;

the device further comprises a modification module, which is used for replacing the content corresponding to the modification target in the reference information of the target object with the modification content;

and the sending module is further used for returning a modification result to the passing equipment when the modification is completed, so that the passing equipment displays the modification result.
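The modification step described above (replace the content stored under the modification target, then report a result) can be sketched as follows. The dictionary layout of the reference information, the field names, the rejection of unknown targets and the shape of the returned result are assumptions for illustration; the application does not specify them.

```python
from typing import Dict


def apply_modification(
    reference: Dict[str, str],  # stored reference information of one target object
    passage_allowed: bool,      # outcome of the preceding identity verification
    target: str,                # modification target recognized from the second voice signal
    content: str,               # modification content recognized from the second voice signal
) -> Dict[str, object]:
    """Replace one reference field and return a modification result."""
    if not passage_allowed:
        # Only a verified target object may modify its reference information.
        return {"modified": False, "reason": "passage not allowed"}
    if target not in reference:
        # Assumption: unknown targets are rejected rather than created.
        return {"modified": False, "reason": "unknown modification target"}
    reference[target] = content
    return {"modified": True, "target": target}


# Example with made-up reference data.
reference_info = {"favorite_color": "blue", "home_address": "1 Example Road"}
result = apply_modification(reference_info, True, "favorite_color", "green")
```

The returned dictionary stands in for the modification result that the sending module returns to the passing equipment for display.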

In another aspect, a computer device is provided that includes a processor and a memory for storing at least one segment of a computer program that is loaded and executed by the processor to implement a voice-based pass-through method in an embodiment of the present application.

In another aspect, a computer readable storage medium having stored therein at least one segment of a computer program loaded and executed by a processor to implement a voice-based pass through method as in embodiments of the present application is provided.

In another aspect, a computer program product is provided, comprising a computer program stored in a computer readable storage medium, the computer program being read from the computer readable storage medium by a processor of a computer device, the computer program being executed by the processor to cause the computer device to perform the voice-based pass-through method provided in each of the above aspects or various alternative implementations of each of the aspects.

The embodiments of the application provide a voice-based passing method. When a user cannot be verified by identity card, two-dimensional code scanning or face recognition, the user can be verified by voice: when the user is near the passing equipment, the passing equipment acquires the user's voice signal and sends it to a server, and the server identifies the user from the voice signal to determine whether the user is allowed to pass. This offers an additional way of passing. Because a user's voice generally does not change, the user is more likely to be able to pass, which makes travel more convenient; and because the user's voice signal can be encrypted, the user's information is kept secure during transmission.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic illustration of an implementation environment of a voice-based pass-through method provided in accordance with an embodiment of the present application;

FIG. 2 is a flow chart of a voice-based pass method provided in accordance with an embodiment of the present application;

FIG. 3 is a flow chart of another voice-based pass method provided in accordance with an embodiment of the present application;

FIG. 4 is an interactive flow chart of a voice-based pass-through method provided in accordance with an embodiment of the present application;

FIG. 5 is a block diagram of a voice-based pass-through device provided in accordance with an embodiment of the present application;

FIG. 6 is a block diagram of another voice-based pass device provided in accordance with an embodiment of the present application;

FIG. 7 is a block diagram of a terminal provided in accordance with an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a server provided in accordance with an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

The terms "first," "second," and the like in this application are used to distinguish between identical or similar items that have substantially the same role and function. It should be understood that there is no logical or chronological dependency among "first," "second," and "nth," and that these terms do not limit the number of items or the order of execution.

The term "at least one" in this application means one or more, and "a plurality of" means two or more.

It should be noted that, information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals referred to in this application are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions. For example, the speech signals referred to in this application are all acquired with sufficient authorization.

The voice-based traffic method provided by the embodiment of the application can be executed by computer equipment. In some embodiments, the computer device is a terminal or a server. In the following, taking a computer device as a server as an example, an implementation environment of the voice-based traffic method provided in the embodiment of the present application is introduced, and fig. 1 is a schematic diagram of an implementation environment of the voice-based traffic method provided in the embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102. The terminal 101 and the server 102 can be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.

In some embodiments, terminal 101 refers to a pass through device. For example, the terminal 101 may be an access gate in a community, ticket checking equipment at a station, or identity verification equipment at an office, which is not limited in the embodiment of the present application. The terminal 101 has an application running thereon that supports voice acquisition. The application may be a sound recording type application, an information verification type application, a voice assistant, or the like, which is not limited in the embodiments of the present application. The terminal 101 is capable of collecting a voice signal of a user. The terminal 101 then transmits the voice signal to the server 102, and the server 102 verifies the identity of the user based on the voice signal of the user.

In some embodiments, the server 102 is a stand-alone physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. The server 102 is used to provide background services for applications that support voice capture. In some embodiments, the server 102 takes on the primary computing work and the terminal 101 takes on the secondary computing work; alternatively, the server 102 takes on the secondary computing work and the terminal 101 takes on the primary computing work; alternatively, the server 102 and the terminal 101 compute collaboratively using a distributed computing architecture.

Fig. 2 is a flowchart of a voice-based traffic method according to an embodiment of the present application, and referring to fig. 2, an example of the voice-based traffic method is described in the embodiment of the present application. The voice-based traffic method comprises the following steps:

201. The server receives a first voice signal of the target object sent by the passing equipment, wherein the passing equipment is used for managing the passing of objects, and the first voice signal is used for verifying the identity of the target object.

In the embodiment of the present application, an object refers to a pedestrian. The passing equipment is used for managing the flow of people and regulating the entry and exit of pedestrians, and can also be called a channel management device. The target object refers to the user who is currently waiting for identity verification by the passing equipment and is ready to pass once verified. When the target object is near the passing equipment, the passing equipment can collect a first voice signal of the target object. The passing equipment then transmits the first voice signal to the server. After receiving the first voice signal, the server can verify the identity of the target object based on the first voice signal.

202. The server decrypts the first voice signal of the target object based on a preset encryption rule.

In the embodiment of the application, the information transmitted between the passing equipment and the server is encrypted based on a preset encryption rule. The preset encryption rule may be a symmetric encryption algorithm or an asymmetric encryption algorithm, which is not limited in the embodiment of the present application. After receiving the first voice signal of the target object, the server decrypts the first voice signal according to a preset encryption rule.

203. The server verifies the identity of the target object based on the decrypted first voice signal to obtain a verification result, wherein the verification result is used for indicating whether the target object is allowed to pass.

In the embodiment of the application, after decrypting the first voice signal, the server can identify the decrypted first voice signal. And then, the server verifies the identity of the target object based on the identification result of the first voice signal to obtain a verification result of the target object. The verification result may be that the target object is allowed to pass or not allowed to pass, which is not limited in the embodiment of the present application.

204. The server returns a verification result to the passing device so that the passing device determines whether to allow the target object to pass or not based on the verification result.

In the embodiment of the application, the server sends the verification result to the passing equipment. When the verification result is that passing is allowed, the passing equipment lets the target object pass. When the verification result is that passing is not allowed, the passing equipment blocks the target object. In the latter case, the passing equipment can output prompt information indicating that the target object cannot pass, so that the target object does not keep repeating the voice input in the belief that the passing equipment failed to capture the complete first voice signal. The prompt information can be text displayed by the passing equipment, voice played by the passing equipment, or the like, which is not limited in the embodiment of the application.
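Steps 201 to 204 can be sketched end to end as follows. This is only an outline under stated assumptions: the decryption and verification routines are passed in as callables because the application leaves both open, and the single-byte XOR used in the demo is a stand-in for the preset encryption rule, not real cryptography. All function names and the sample byte string are hypothetical.

```python
from typing import Callable, Dict


def handle_pass_request(
    encrypted_signal: bytes,           # step 201: first voice signal as received
    decrypt: Callable[[bytes], bytes],
    verify: Callable[[bytes], bool],
) -> Dict[str, bool]:
    """Server side: decrypt, verify, and build the verification result."""
    signal = decrypt(encrypted_signal)  # step 202
    allowed = verify(signal)            # step 203
    return {"allowed": allowed}         # step 204: returned to the passing equipment


def gate_action(verification_result: Dict[str, bool]) -> str:
    """Passing-equipment side: act on the returned verification result."""
    if verification_result["allowed"]:
        return "open"
    # Blocked: also show or play a prompt so the user does not keep retrying.
    return "blocked, prompt shown"


# --- Demo stand-ins (assumptions, for illustration only) ---
TOY_KEY = 0x5A


def toy_xor(data: bytes) -> bytes:
    """Placeholder for the preset encryption rule. NOT secure."""
    return bytes(b ^ TOY_KEY for b in data)


def toy_verify(signal: bytes) -> bool:
    """Placeholder for voice-based identity verification."""
    return signal == b"enrolled-user-audio"


result = handle_pass_request(toy_xor(b"enrolled-user-audio"), toy_xor, toy_verify)
action = gate_action(result)
```

Passing the decryption and verification steps in as callables keeps the flow independent of whether the preset encryption rule is symmetric or asymmetric.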

Fig. 3 is a flowchart of another voice-based traffic method provided according to an embodiment of the present application, and referring to fig. 3, an example of the voice-based traffic method is described in the embodiment of the present application. The voice-based traffic method comprises the following steps:

301. The server receives a first voice signal of the target object sent by the passing equipment, wherein the passing equipment is used for managing the passing of objects, and the first voice signal is used for verifying the identity of the target object.

In the embodiment of the application, the passing equipment may be an access gate in a community, ticket checking equipment at a station, identity verification equipment at an office, or the like, which is not limited in the embodiment of the application. The passing equipment has a voice input function. The passing equipment can automatically collect voice signals from the surrounding environment, so that the voice signal of the target object is obtained automatically when the target object is near the passing equipment. Alternatively, the passing equipment is provided with a voice input control and acquires the voice signal of the target object in response to a triggering operation on that control. The voice input control can be a control displayed on a screen of the passing equipment or a physical control mounted on the passing equipment, which is not limited in the embodiment of the application.

When the passing equipment collects voice signals from the surrounding environment automatically, it can do so in real time. Alternatively, the passing equipment can start its voice recording function only when it detects that the target object has appeared within a preset range, and only then collect voice signals from the surrounding environment to obtain the voice signal of the target object. This prevents the passing equipment from collecting a large number of useless voice signals, thereby reducing its operating consumption.
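The range-triggered collection mode can be sketched as a small state holder on the passing equipment. The distance reading, the audio-frame callbacks, the range value and the class name are assumptions; the application does not describe how presence within the preset range is detected.

```python
from typing import List


class ProximityRecorder:
    """Keeps audio frames only while someone is within the preset range."""

    def __init__(self, preset_range_m: float) -> None:
        self.preset_range_m = preset_range_m
        self.recording = False
        self.frames: List[bytes] = []

    def on_distance(self, distance_m: float) -> None:
        # Start recording when the target object enters the preset range,
        # and stop again when it leaves.
        self.recording = distance_m <= self.preset_range_m

    def on_audio_frame(self, frame: bytes) -> None:
        # Frames arriving while nobody is in range are discarded, which
        # avoids collecting useless voice signals.
        if self.recording:
            self.frames.append(frame)


recorder = ProximityRecorder(preset_range_m=1.5)
recorder.on_audio_frame(b"background noise")  # dropped: nobody in range
recorder.on_distance(1.0)                     # target object approaches
recorder.on_audio_frame(b"spoken phrase")     # kept
recorder.on_distance(3.0)                     # target object leaves
recorder.on_audio_frame(b"more noise")        # dropped
```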

A network connection exists between the passing equipment and the server, and the connection mode is not limited in the embodiment of the application. Illustratively, there is an HTTP/2 (HyperText Transfer Protocol version 2) long connection between the passing equipment and the server. The passing equipment may send the acquired voice signal to the server, and the server recognizes and verifies the voice signal. That is, the passing equipment may send the first voice signal of the target object to the server, and the server verifies the identity of the target object based on the first voice signal.

302. The server decrypts the first voice signal of the target object based on a preset encryption rule.

In the embodiment of the present application, the information transmitted between the server and the passing equipment may be encrypted according to the connection mode between them. That is, the preset encryption rule is determined by the connection mode between the server and the passing equipment. Illustratively, the preset encryption rule is determined by the HTTP/2 long connection. The passing equipment may encrypt the first voice signal of the target object based on the HTTP/2 long connection established with the server, and then transmit the encrypted first voice signal to the server. On receiving the first voice signal, the server can decrypt it based on the same HTTP/2 long connection to obtain the decrypted first voice signal. Because the voice signal of the target object usually contains personal information of the target object, encrypting the voice signal keeps it secure during transmission and thereby protects the personal information of the target object.

303. The server recognizes the decrypted first voice signal to obtain target information to be verified.

In the embodiment of the application, once the decrypted first voice signal is obtained, the server can recognize it and thereby obtain from it the target information to be verified. The target information may be content information in the first voice signal, a voiceprint feature of the target object, or a health condition of the target object, which is not limited in the embodiment of the present application. The content information may be a certificate number of the target object, such as an identification number, a social security card number, or a student number, or may be information with personal characteristics, such as a home address, a favorite color, or a favorite song, which is not limited in the embodiment of the present application.

In some embodiments, the target information includes content information in the first voice signal and voiceprint features of the target object. Correspondingly, the server obtains the target information as follows. The server performs voiceprint extraction on the decrypted first voice signal to obtain the voiceprint features of the target object. The server performs semantic recognition on the decrypted first voice signal to obtain the content information to be verified of the target object. The server then determines the target information based on the voiceprint features and the content information. The voiceprint extraction and the semantic recognition may be performed by the same neural network model or by different neural network models, which is not limited in the embodiment of the present application. The embodiment of the application also does not limit the order in which the voiceprint features and the content information are acquired. Because the target information to be verified is determined from both the voiceprint features and the content information of the target object, the identity of the target object can be verified against both. A person's voice remains relatively stable over a long time, so voiceprint features are relatively stable and specific, and the content information in the first voice signal is also information with personal characteristics. Together these improve the accuracy of the subsequent verification, make it more feasible to control a user's passing based on the user's voice signal, and enrich the available ways of passing.

In some embodiments, the target information includes a health condition of the target object. Correspondingly, the server obtains the target information as follows: the server determines the health condition of the target object based on the first voice signal, and then takes that health condition as the target information. Because symptoms of certain diseases are reflected in the voice (for example, certain respiratory diseases both change a person's voice and may be infectious), determining the health condition of the target object from its voice signal allows the target object to be verified based on that health condition, so that its passing can be controlled and public safety maintained.

In some embodiments, the target information includes content information in the first voice signal and voiceprint features of the target object, and the health condition of the target object is used as auxiliary information that is combined with the target information to control the passing of the target object. That is, in the subsequent verification process, the server verifies the target information and the health condition of the target object separately, and determines whether to allow the target object to pass based on both verification results. The server can first determine, based on the target information, whether the identity of the target object is correct and, if it is, then determine based on the health condition whether the target object in that health condition can pass. Verifying the target object through its voiceprint features, content information and health condition improves the accuracy of identity verification and helps maintain public safety, which makes it more feasible to control a user's passing based on the user's voice signal and enriches the available ways of passing.
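The two-stage check described above (identity first, then health condition) can be sketched as follows. Representing the health condition as a single numeric index compared with a minimum allowed value is an assumption made for the example, as are the function name and the returned fields.

```python
from typing import Dict


def verify_identity_then_health(
    identity_verified: bool,          # outcome of verifying the target information
    health_index: float,              # hypothetical index derived from the voice signal
    min_allowed_health_index: float,  # hypothetical reference value allowing passage
) -> Dict[str, object]:
    """Combine the identity result and the health result into one decision."""
    if not identity_verified:
        # The health condition is only consulted once the identity is correct.
        return {"allowed": False, "reason": "identity not verified"}
    if health_index < min_allowed_health_index:
        return {"allowed": False, "reason": "health condition does not allow passage"}
    return {"allowed": True, "reason": "identity and health condition verified"}
```

Returning a reason alongside the decision would let the passing equipment show a more specific prompt when passage is refused.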

304. The server verifies the target information based on the reference information of the target object to obtain a verification result, wherein the reference information comprises various information for verifying the identity of the target object, and the various information comprises information of the type to which the target information belongs.

In the embodiment of the present application, the reference information of the target object is information stored in advance for verifying the identity of the target object. The reference information includes various types of information. By way of example, the reference information may include various types of information such as an identification card number of the target object, a social security card number, voiceprint information, a health index allowing passage, a favorite color, and a home address. Voiceprint information refers to a reference voiceprint of a target object that is pre-entered. The database of the server stores reference information of a plurality of objects. After the server determines the target information, the server may compare the target information of the target object with the stored reference information of the target object, thereby verifying the identity of the target object.

In some embodiments, the target information includes content information in the first speech signal and voiceprint characteristics of the target object. Correspondingly, the process of verifying the target information by the server comprises the following steps: and the server verifies the voiceprint characteristics based on the reference voiceprints in the reference information to obtain a first intermediate result. And the server verifies the content information based on the reference content in the reference information to obtain a second intermediate result. Then, the server determines a verification result based on the first intermediate result and the second intermediate result. The embodiment of the application does not limit the sequence of verifying the voiceprint features and verifying the content information. According to the scheme provided by the embodiment of the application, the voice of the person can be kept relatively stable for a long time, so that the voice print characteristics have relative stability and specificity, the content information in the first voice signal is also information with personal characteristics, the identity of the target object is verified through the voice print characteristics and the content information of the target object, the accuracy of identity verification can be improved, the feasibility of controlling the user to pass based on the voice signal of the user is further improved, and the passing mode is enriched.
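By way of illustration only, the two verification steps described above might be sketched as follows. The sketch assumes that the voiceprint feature is a fixed-length numeric vector compared by cosine similarity, and that the content information is compared by a normalized exact match; neither choice is specified by the embodiment, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed comparison metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_voiceprint(feature, reference_voiceprint):
    """First intermediate result, expressed here as a verification score."""
    if len(feature) != len(reference_voiceprint):
        raise ValueError("voiceprint vectors must have the same length")
    return cosine_similarity(feature, reference_voiceprint)


def verify_content(content, reference_content):
    """Second intermediate result: 1.0 on a normalized exact match, else 0.0."""
    def normalize(text):
        # Remove all whitespace and ignore letter case before comparing.
        return "".join(text.split()).lower()

    return 1.0 if normalize(content) == normalize(reference_content) else 0.0
```

The two functions are independent of each other, which is consistent with the statement that the order of verifying the voiceprint features and the content information is not limited.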

The first intermediate result and the second intermediate result may be both the result of whether to allow the traffic or not, or may be both the verification score, which is not limited in the embodiment of the present application. Wherein the verification score is used to represent the similarity between the reference information and the information to be verified. The above-described process of determining the verification result based on the first intermediate result and the second intermediate result may be processed in any one of the following three ways.

In the first way, the first intermediate result and the second intermediate result are both results indicating whether passage is allowed. Accordingly, the process of determining the verification result based on the first intermediate result and the second intermediate result is as follows: in the case where the first intermediate result and the second intermediate result both indicate that passage is allowed, the server determines that the verification result is that passage is allowed. In the case where either the first intermediate result or the second intermediate result indicates that passage is not allowed, the server determines that the verification result is that passage is not allowed. According to the scheme provided by the embodiment of the application, the identity of the target object is confirmed as verified only when both the first intermediate result and the second intermediate result allow passage, which improves the accuracy of identity verification; the target object is allowed to pass on this basis, which improves the feasibility of controlling the passage of the user based on the voice signal of the user and enriches the passing modes.

In a second way, the first intermediate result and the second intermediate result are both verification scores, and the server can determine the verification results by weighting the first intermediate result and the second intermediate result. Accordingly, the process of determining the verification result based on the first intermediate result and the second intermediate result is as follows: the server determines a weight of the first intermediate result and a weight of the second intermediate result based on the priority of the voiceprint information and the priority of the content information. Then, the server performs weighted summation on the first intermediate result and the second intermediate result based on the weight of the first intermediate result and the weight of the second intermediate result, and a target score is obtained. Then, in the case where the target score reaches the first verification threshold, the server determines that the verification result is permitted traffic. And under the condition that the target score does not reach the first verification threshold value, the server determines that the verification result is not allowed to pass. According to the scheme provided by the embodiment of the application, through the priority of the voiceprint information and the priority of the content information, the first intermediate result and the second intermediate result are weighted and summed, so that the obtained target score can more accurately represent the verification result of the target object, the verification accuracy can be improved, the passing of the target object can be controlled more accurately, the feasibility of controlling the passing of the user based on the voice signal of the user is improved, and the passing mode is enriched.
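By way of illustration only, the second way might be sketched as follows. The embodiment does not specify how priorities are converted into weights; the sketch assumes that each weight is the priority divided by the sum of the two priorities. The function names and the numeric values in the usage note are hypothetical.

```python
def weights_from_priorities(voiceprint_priority, content_priority):
    """Assumed mapping: normalize the two priorities so the weights sum to 1."""
    total = voiceprint_priority + content_priority
    if total <= 0:
        raise ValueError("priorities must sum to a positive value")
    return voiceprint_priority / total, content_priority / total


def weighted_verification(first_score, second_score,
                          voiceprint_priority, content_priority,
                          first_threshold):
    """Return (allowed, target_score) for the weighted-sum approach."""
    w1, w2 = weights_from_priorities(voiceprint_priority, content_priority)
    # Weighted summation of the first and second intermediate results.
    target_score = w1 * first_score + w2 * second_score
    # Passage is allowed when the target score reaches the first threshold.
    return target_score >= first_threshold, target_score
```

For example, with a voiceprint score of 0.9, a content score of 0.6, priorities of 2 and 1, and a first verification threshold of 0.75, the target score is approximately 0.8 and passage is allowed.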

In a third manner, the first intermediate result and the second intermediate result are both verification scores, and the server may select any one of the first intermediate result and the second intermediate result to determine a final verification result based on the priority of the information corresponding to the intermediate result. The priority of the information is used to indicate the accuracy of verifying the information. According to the scheme provided by the embodiment of the application, the final verification result is determined by selecting the intermediate result corresponding to a certain item of information according to the priority of the verified information, a calculation mode of the verification result is provided, and the final verification result is determined by selecting the intermediate result of the verified information with high accuracy, so that the verification accuracy can be improved, the passing of a target object can be controlled more accurately, the feasibility of controlling the passing of the user based on the voice signal of the user is improved, and the passing mode is enriched.

Illustratively, the above-described process of determining the verification result based on the first intermediate result and the second intermediate result may be divided into the following three cases:

In the first case, the server selects the intermediate result whose corresponding information has the higher priority and the higher verification score to determine the final verification result. Accordingly, for any one of the voiceprint information and the content information, in the case where the priority of the information is higher than that of the other information and the verification score corresponding to the information is higher than that of the other information, the server determines the verification result based on the verification score corresponding to the information. According to the method, the final verification result is determined by selecting the intermediate result of the information with high verification accuracy, so that the verification accuracy can be improved, the passing of the target object can be controlled more accurately, the feasibility of controlling the passing of the user based on the voice signal of the user is improved, and the passing modes are enriched.

In the second case, when the priority of the corresponding information is high, but the verification score is low, the server selects the intermediate result corresponding to the information to determine the final verification result as long as the verification score of the information is not lower than a certain threshold. That is, in the case where the priority of the information is higher than that of the other information, the authentication score corresponding to the information is lower than that corresponding to the other information, and the authentication score corresponding to the information is not lower than the second authentication threshold, the server determines the authentication result based on the authentication score corresponding to the information. According to the method, under the condition that the intermediate result of the information with high verification accuracy is not lower than the second verification threshold value, the final verification result is still determined based on the intermediate result of the information with high verification accuracy, so that the verification accuracy can be improved, the passing of the target object can be controlled more accurately, the feasibility of controlling the passing of the user based on the voice signal of the user is improved, and the passing modes are enriched.

In the third case, in the case where the priority of the corresponding information is high and the verification score is lower than a certain threshold, the server may select an intermediate result corresponding to the information with the low priority to determine the final verification result. Or, the server may further combine intermediate results corresponding to the information to be verified in the target information to determine a final verification result. Correspondingly, when the priority of the information is higher than that of the other information, the verification score corresponding to the information is lower than that of the other information, and the verification score corresponding to the information is lower than the second verification threshold, the server performs weighted summation on the verification score corresponding to the information and the verification score corresponding to the other information to obtain a verification result. According to the method, under the condition that the intermediate result of the information with high verification accuracy is lower than the second verification threshold value, the final verification result is determined by combining the intermediate results corresponding to the information to be verified in the target information, so that the verification accuracy can be improved, the passing of the target object can be controlled more accurately, the feasibility of controlling the passing of the user based on the voice signal of the user is improved, and the passing modes are enriched.
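By way of illustration only, the three cases above might be sketched as a single selection function. The sketch assumes that the caller has already identified which score belongs to the higher-priority information, and it uses a hypothetical fixed weight for the weighted summation in the third case; the embodiment does not specify this weight. Equal scores are not addressed by the three cases; in this sketch they fall through to the second or third case, which yields the same value either way.

```python
def select_score(high_priority_score, low_priority_score,
                 second_threshold, high_weight=0.7):
    """Return the score used to determine the final verification result.

    high_priority_score: verification score of the higher-priority information.
    low_priority_score:  verification score of the other information.
    high_weight:         assumed weight of the higher-priority score in case 3.
    """
    # Case 1: higher priority and higher score.
    if high_priority_score > low_priority_score:
        return high_priority_score
    # Case 2: lower score, but not below the second verification threshold.
    if high_priority_score >= second_threshold:
        return high_priority_score
    # Case 3: lower score and below the threshold, so combine both scores.
    return (high_weight * high_priority_score
            + (1 - high_weight) * low_priority_score)
```

The returned score would then be compared against a verification threshold to decide whether passage is allowed.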

In some embodiments, the server may also determine the health of the target object by the voice signal of the target object. Then, the server uses the health condition as auxiliary information and combines the target information to control the passing of the target object. Accordingly, the process of determining the verification result by the server is as follows: and verifying the target information by the server based on the reference information of the target object to obtain a third intermediate result. The server determines a fourth intermediate result based on the health condition. Then, the server determines a verification result based on the third intermediate result and the fourth intermediate result. The fourth intermediate result is used for indicating whether the target object under the health condition is allowed to pass through. According to the scheme provided by the embodiment of the application, whether the identity of the target object is correct can be determined based on the target information, and whether the target object can pass under the health condition is determined based on the health condition under the condition that the identity is correct. The method not only can improve the accuracy of identity verification, but also can be beneficial to maintaining public safety, thereby improving the feasibility of controlling the passage of the user based on the voice signal of the user and enriching the passage way.
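By way of illustration only, combining the third and fourth intermediate results might be sketched as follows. The sketch assumes that both intermediate results are already expressed as allow or deny decisions; the function name and the reason strings are hypothetical.

```python
def combine_identity_and_health(identity_allowed, health_allowed):
    """Return (allowed, reason) from the third and fourth intermediate results."""
    # The identity is checked first; the health condition is only consulted
    # once the identity has been verified as correct.
    if not identity_allowed:
        return False, "identity verification failed"
    if not health_allowed:
        return False, "health condition does not permit passage"
    return True, ""
```

Returning a reason alongside the decision allows the passing equipment to display why passage was not allowed.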

305. The server returns a verification result to the passing device so that the passing device determines whether to allow the target object to pass or not based on the verification result.

In the embodiment of the application, the server sends the verification result to the passing equipment once the verification result is obtained. The passing equipment determines whether to allow the target object to pass based on the verification result. In the case that the verification result is that passage is allowed, the passing equipment can display a prompt text or a mark indicating that passage is allowed; alternatively, the passing equipment may play a prompt voice indicating that passage is allowed. In the case that the verification result is that passage is not allowed, the passing equipment can display a prompt text or a mark indicating that passage is not allowed and display the reason why passage is not allowed; alternatively, the passing equipment may play a prompt voice indicating that passage is not allowed and play the reason why passage is not allowed.

The server may also send the recognition result of the first voice signal to the traffic device. Accordingly, the traffic device may display the recognition result of the first voice signal. For example, the pass device may display content information in the first voice signal, such as an identification card number of the target object, or the like.

In some embodiments, in the case that the verification result is that passage is allowed, that is, in the case that the verification is passed, the server may further modify the pre-stored parameter information based on a voice signal of the target object. Correspondingly, in the case that the verification result is that passage is allowed, the server receives a second voice signal of the target object sent by the passing equipment. Then, the server decrypts the second voice signal based on the preset encryption rule. Then, the server identifies the decrypted second voice signal to obtain the modification target and the modification content of the target object. Then, the server replaces the content corresponding to the modification target in the reference information of the target object with the modification content. Then, once the modification is completed, the server returns a modification result to the passing equipment, so that the passing equipment displays the modification result. The second voice signal is used to modify the reference information of the target object. The modification target represents the type of information to be modified. The modification content represents the specific content to be written. The modification described above can be regarded as an "intent + slot" modification, where "intent" refers to the modification target and "slot" refers to the modification content. According to the scheme provided by the embodiment of the application, in the case that the target object passes verification, the target object can also interact with the passing equipment by voice, so that the background server of the passing equipment can modify the pre-stored reference information and subsequently verify the identity of the target object based on the modified reference information. This meets the requirement of the user, is simple to operate, requires no manual operation by the user, improves the man-machine interaction efficiency, and further brings convenience to the passing of the user.
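By way of illustration only, the "intent + slot" modification might be sketched as follows. The sketch assumes that the second voice signal has already been decrypted and transcribed to text, and that the utterance follows the hypothetical pattern "change my <target> to <content>". The phrasing, the field names, and the set of modifiable targets are assumptions and are not specified by the embodiment.

```python
import re

# Hypothetical mapping from spoken modification targets to stored field names.
MODIFIABLE_TARGETS = {
    "home address": "home_address",
    "favorite color": "favorite_color",
}


def parse_modification(text):
    """Return (field, content) for a recognized utterance, otherwise None."""
    match = re.fullmatch(r"\s*change my (.+?) to (.+?)\s*", text,
                         flags=re.IGNORECASE)
    if match is None:
        return None
    field = MODIFIABLE_TARGETS.get(match.group(1).strip().lower())
    if field is None:
        return None
    return field, match.group(2).strip()


def apply_modification(reference_info, text):
    """Replace the targeted entry in reference_info; return True on success."""
    parsed = parse_modification(text)
    if parsed is None:
        return False
    field, content = parsed
    reference_info[field] = content
    return True
```

The boolean returned by `apply_modification` corresponds to the modification result that the server returns to the passing equipment for display.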

The passing equipment can also support verification functions such as reading an identity card, scanning a two-dimensional code, or recognizing a face. The target object can control the passing equipment to start any one of these verification functions through a voice signal. Correspondingly, the passing equipment acquires a third voice signal of the target object and sends the third voice signal to the server. The server decrypts the third voice signal based on the preset encryption rule. Then, the server identifies the decrypted third voice signal to obtain the function name of the function to be started. Then, the server sends the function name to the passing equipment, so that the passing equipment starts the corresponding verification function. The passing equipment may display a prompt indicating that the verification function has been started successfully. For any one verification function, the server stores verification parameters corresponding to the verification function. A verification parameter may be regarded as one item of the parameter information of the target object. That is, the target object may adjust its own verification parameters through a voice signal.

For example, the verification parameter is the recognition threshold of the face recognition function. In the case where the similarity between the currently captured face of the target object and the face in the reference information reaches the recognition threshold, the server determines that the verification result of the target object is that passage is allowed. In the case where the similarity between the currently captured face of the target object and the face in the reference information is lower than the recognition threshold, the server determines that the verification result of the target object is that passage is not allowed. The target object may adjust its recognition threshold through a voice signal. For instance, when the target object is wearing makeup, the target object can lower the corresponding recognition threshold through a voice signal.
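By way of illustration only, the threshold comparison and the voice-driven adjustment might be sketched as follows. The step size and the lower bound on the threshold are assumptions added for the sketch; the embodiment does not specify how far the threshold may be lowered.

```python
def face_verification(similarity, recognition_threshold):
    """Passage is allowed when the similarity reaches the recognition threshold."""
    return similarity >= recognition_threshold


def lower_threshold(current_threshold, step=0.05, floor=0.6):
    """Lower the threshold by `step`, but never below `floor`.

    Both `step` and `floor` are hypothetical values chosen for illustration.
    """
    return max(floor, current_threshold - step)
```

A lower bound of this kind is one possible way to keep a voice request from reducing the threshold without limit.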

In order to more clearly describe the voice-based traffic method provided in the embodiments of the present application, the voice-based traffic method is further described below with reference to the accompanying drawings. For example, fig. 4 is an interactive flowchart of a voice-based pass method provided according to an embodiment of the present application. Referring to fig. 4, the voice-based pass method may include steps 401 to 414 described below.
Step 401, the passing equipment collects a first voice signal of a target object.
Step 402, the passing equipment sends the first voice signal to the server.
Step 403, the server decrypts the first voice signal based on a preset encryption rule.
Step 404, the server identifies the decrypted first voice signal to obtain target information to be verified.
Step 405, the server verifies the target information based on the reference information of the target object, and obtains a verification result.
Step 406, the server returns the verification result to the passing equipment.
Step 407, the passing equipment displays the verification result, and determines whether to allow the target object to pass based on the verification result.
Step 408, in the case that the verification result is that passage is allowed, the passing equipment collects a second voice signal of the target object.
Step 409, the passing equipment sends the second voice signal to the server.
Step 410, the server decrypts the second voice signal based on the preset encryption rule.
Step 411, the server identifies the decrypted second voice signal, and obtains a modification target and modification content of the target object.
Step 412, the server replaces the content corresponding to the modification target in the reference information of the target object with the modification content.
Step 413, the server returns the modification result to the passing equipment.
Step 414, the passing equipment displays the modification result.
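By way of illustration only, the server-side portion of the flow for the first voice signal (steps 403 to 405) might be sketched as a pipeline of three pluggable functions. The preset encryption rule, the recognition method, and the verification method are not specified here, so the sketch accepts them as parameters; the `toy_*` functions are placeholders with no security value and exist only to make the sketch runnable.

```python
def handle_first_voice_signal(encrypted_signal, decrypt, recognize, verify):
    """Server-side handling of the first voice signal.

    decrypt, recognize and verify are caller-supplied callables standing in
    for the preset encryption rule, speech recognition, and identity check.
    """
    decrypted = decrypt(encrypted_signal)   # step 403
    target_info = recognize(decrypted)      # step 404
    return verify(target_info)              # step 405, returned in step 406


def toy_decrypt(data):
    """Placeholder "decryption": reverses the bytes. Not a real cipher."""
    return data[::-1]


def toy_recognize(signal):
    """Placeholder recognition: treats the bytes as UTF-8 text content."""
    return {"content": signal.decode("utf-8")}


def make_toy_verify(reference_content):
    """Build a verifier that compares recognized content with a reference."""
    def verify(target_info):
        return target_info["content"] == reference_content
    return verify
```

Steps 410 to 412 for the second voice signal could reuse the same decrypt and recognize callables, with the final stage replaced by the modification of the reference information.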

The embodiment of the application provides a voice-based passing method, when a user cannot verify through modes such as identity card, scanning two-dimensional code or face recognition, the user can verify through voice, namely when the user is near passing equipment, the passing equipment can acquire voice signals of the user and send the voice signals to a server, the server recognizes the identity of the user according to the voice signals of the user so as to determine whether the user is allowed to pass, the passing modes are enriched, and because the voice of the user is generally unchanged, the passing feasibility of the user is improved, and convenience is brought to the user for traveling; in addition, the voice signal of the user can be encrypted, so that the safety of the information of the user in the information transmission process is ensured; in addition, the reference information for verifying the identity of the user can be modified through the voice signal, manual operation of the user is not needed, the man-machine interaction efficiency is improved, and convenience is brought to the passing of the user.

Fig. 5 is a block diagram of a voice-based traffic device provided in accordance with an embodiment of the present application. The voice-based traffic device is configured to perform the steps of the voice-based traffic method described above. Referring to fig. 5, the voice-based traffic device includes: a receiving module 501, a decrypting module 502, a verifying module 503 and a sending module 504.

The receiving module 501 is configured to receive a first voice signal of a target object sent by a passing device, where the passing device is configured to manage passing of the object, and the first voice signal is configured to verify an identity of the target object;

the decryption module 502 is configured to decrypt the first voice signal of the target object based on a preset encryption rule;

a verification module 503, configured to verify the identity of the target object based on the decrypted first voice signal, to obtain a verification result, where the verification result is used to indicate whether the target object is allowed to pass;

and a sending module 504, configured to return a verification result to the passing device, so that the passing device determines whether to allow the target object to pass based on the verification result.

In some embodiments, fig. 6 is a block diagram of another voice-based traffic device provided according to an embodiment of the present application. Referring to fig. 6, the verification module 503 includes:

the identifying unit 5031 is configured to identify the decrypted first voice signal, so as to obtain target information to be verified;

the verification unit 5032 is configured to verify the target information based on the reference information of the target object, to obtain a verification result, where the reference information includes multiple information for verifying the identity of the target object, and the multiple information includes information of a type to which the target information belongs.

In some embodiments, with continued reference to fig. 6, the identifying unit 5031 is configured to perform voiceprint extraction on the decrypted first voice signal to obtain voiceprint features of the target object; semantic recognition is carried out on the decrypted first voice signal, so that content information to be verified of the target object is obtained; based on the voiceprint features and the content information, target information is determined.

In some embodiments, with continued reference to fig. 6, the authentication unit 5032 includes:

a first verification subunit 50321, configured to verify the voiceprint feature based on the reference voiceprint in the reference information, to obtain a first intermediate result;

a second verification subunit 50322, configured to verify the content information based on the reference content in the reference information, to obtain a second intermediate result;

a determination subunit 50323 for determining a verification result based on the first intermediate result and the second intermediate result.

In some embodiments, with continued reference to fig. 6, the determining subunit 50323 is configured to determine the weight of the first intermediate result and the weight of the second intermediate result based on the priority of the voiceprint information and the priority of the content information; perform weighted summation on the first intermediate result and the second intermediate result based on the weight of the first intermediate result and the weight of the second intermediate result to obtain a target score; in the case where the target score reaches a first verification threshold, determine that the verification result is that passage is allowed; and in the case where the target score does not reach the first verification threshold, determine that the verification result is that passage is not allowed.

In some embodiments, with continued reference to fig. 6, the determining subunit 50323 is configured to, for any one of the voiceprint information and the content information, determine the verification result based on the verification score corresponding to the information in the case where the priority of the information is higher than that of the other information and the verification score corresponding to the information is higher than that of the other information; determine the verification result based on the verification score corresponding to the information in the case where the priority of the information is higher than that of the other information, the verification score corresponding to the information is lower than that of the other information, and the verification score corresponding to the information is not lower than a second verification threshold; and in the case where the priority of the information is higher than that of the other information, the verification score corresponding to the information is lower than that of the other information, and the verification score corresponding to the information is lower than the second verification threshold, perform weighted summation on the verification score corresponding to the information and the verification score corresponding to the other information to obtain the verification result.

In some embodiments, with continued reference to fig. 6, the apparatus further comprises:

a determining module 505, configured to determine a health condition of the target object based on the first voice signal;

a verification module 503, configured to verify the target information based on the reference information of the target object, to obtain a third intermediate result; determining a fourth intermediate result based on the health condition, the fourth intermediate result being used to indicate whether the target object under the health condition is allowed to pass; and determining a verification result based on the third intermediate result and the fourth intermediate result.

In some embodiments, with continued reference to fig. 6, the receiving module 501 is further configured to receive, if the verification result is that the passing is allowed, a second voice signal of the target object sent by the passing device, where the second voice signal is used to modify reference information of the target object;

the decryption module 502 is further configured to decrypt the second voice signal based on a preset encryption rule;

the identifying unit 5031 is configured to identify the decrypted second speech signal, so as to obtain a modification target and modification content of the target object;

with continued reference to fig. 6, the apparatus further includes:

the modification module 506 is configured to replace content corresponding to the modification target in the reference information of the target object with the modified content;

And the sending module 504 is further configured to return a modification result to the passing device when the modification is completed, so that the passing device displays the modification result.

The embodiment of the application provides a voice-based passing device, when a user cannot verify through modes such as identity card, scanning two-dimensional code or face recognition, the user can verify through voice, namely when the user is near passing equipment, the passing equipment can acquire voice signals of the user and send the voice signals to a server, the server recognizes the identity of the user according to the voice signals of the user so as to determine whether the user is allowed to pass, a passing mode is provided, and the voice of the user is not changed generally, so that the passing feasibility of the user is improved, and convenience is brought to the user for traveling; in addition, the voice signal of the user can be encrypted, so that the safety of the information of the user in the information transmission process is ensured; in addition, the reference information for verifying the identity of the user can be modified through the voice signal, manual operation of the user is not needed, the man-machine interaction efficiency is improved, and convenience is brought to the passing of the user.

It should be noted that: the voice-based traffic device provided in the above embodiment is only exemplified by the above-mentioned division of each functional module when an application program is running, and in practical application, the above-mentioned functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the device is divided into different functional modules to perform all or part of the functions described above. In addition, the voice-based traffic device and the voice-based traffic method provided in the above embodiments belong to the same concept, and detailed implementation processes of the voice-based traffic device and the voice-based traffic method are detailed in the method embodiments, which are not repeated here.

In the embodiment of the present application, the computer device may be configured as a terminal or a server. When the computer device is configured as a terminal, the technical solution provided in the embodiment of the present application may be implemented with the terminal as the execution body. When the computer device is configured as a server, the technical solution may be implemented with the server as the execution body. The technical solution may also be implemented through interaction between the terminal and the server, which is not limited in the embodiment of the present application.

Fig. 7 is a block diagram of a terminal 700 according to an embodiment of the present application. The terminal 700 refers to a passing device such as an entrance gate in a community, ticket checking device at a station, authentication device at an office, smart phone, notebook computer or desktop computer.

In general, the terminal 700 includes: a processor 701 and a memory 702.

The processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor. The main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen is required to display. In some embodiments, the processor 701 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.

Memory 702 may include one or more computer-readable storage media, which may be non-transitory. The memory 702 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one computer program for execution by processor 701 to implement the voice-based pass method provided by the method embodiments herein.

In some embodiments, the terminal 700 may optionally further include a peripheral interface 703 and at least one peripheral device. The processor 701, the memory 702, and the peripheral interface 703 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral interface 703 via a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, and a power supply 708.

The peripheral interface 703 may be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 701 and the memory 702. In some embodiments, the processor 701, the memory 702, and the peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The radio frequency circuit 704 is configured to receive and transmit RF (Radio Frequency) signals, also referred to as electromagnetic signals. The radio frequency circuit 704 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 704 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. In some embodiments, the radio frequency circuit 704 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the World Wide Web, metropolitan area networks, intranets, the successive generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication)-related circuitry, which is not limited in this application.

The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display, it is also able to collect touch signals at or above its surface. A touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, disposed on different surfaces of the terminal 700 or arranged in a folded design; in still other embodiments, the display screen 705 may be a flexible display disposed on a curved surface or a folded surface of the terminal 700. The display screen 705 may even be given a non-rectangular, irregular shape, that is, a special-shaped screen. The display screen 705 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).

The camera assembly 706 is used to capture images or video. In some embodiments, the camera assembly 706 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth camera can be fused to provide a background blurring function, and the main camera and the wide-angle camera can be fused to provide panoramic shooting and Virtual Reality (VR) shooting functions, or other fusion shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.

The audio circuit 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing, or inputting the electric signals to the radio frequency circuit 704 for voice communication. For the purpose of stereo acquisition or noise reduction, a plurality of microphones may be respectively disposed at different portions of the terminal 700. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 707 may also include a headphone jack.

The power supply 708 is used to power the various components in the terminal 700. The power supply 708 may be an alternating current source, a direct current source, a disposable battery, or a rechargeable battery. When the power supply 708 comprises a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast-charging technology.

In some embodiments, the terminal 700 further includes one or more sensors 709. The one or more sensors 709 include, but are not limited to: acceleration sensor 710, gyro sensor 711, pressure sensor 712, optical sensor 713, and proximity sensor 714.

The acceleration sensor 710 may detect the magnitudes of accelerations on three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 710 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 701 may control the display screen 705 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 710. Acceleration sensor 710 may also be used for the acquisition of motion data of a game or user.
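The choice between a landscape view and a portrait view can be illustrated with a minimal Python sketch. The function name choose_orientation and the axis convention (x across the screen, y along its long edge) are assumptions made for illustration only; they are not specified in the present application.

```python
def choose_orientation(gx: float, gy: float) -> str:
    """Pick a view from the gravity components along two screen axes.

    gx and gy are the gravitational acceleration components (m/s^2) along
    the screen's short (x) and long (y) axes. This axis convention is an
    assumption of the sketch.
    """
    # If gravity lies mostly along the long axis, the terminal is upright.
    return "portrait" if abs(gy) >= abs(gx) else "landscape"
```

For example, a reading of roughly (0.1, 9.8) would select the portrait view, and a reading of roughly (9.8, 0.2) would select the landscape view.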

The gyro sensor 711 may detect a body direction and a rotation angle of the terminal 700, and the gyro sensor 711 may collect a 3D motion of the user on the terminal 700 in cooperation with the acceleration sensor 710. The processor 701 may implement the following functions according to the data collected by the gyro sensor 711: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.

The pressure sensor 712 may be disposed on a side frame of the terminal 700 and/or beneath the display screen 705. When the pressure sensor 712 is disposed on a side frame of the terminal 700, it can detect the user's grip signal on the terminal 700, and the processor 701 performs left/right-hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 712. When the pressure sensor 712 is disposed beneath the display screen 705, the processor 701 controls the operable controls on the UI according to the user's pressure operation on the display screen 705. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.

The optical sensor 713 is used to collect the intensity of ambient light. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 713. Specifically, when the intensity of the ambient light is high, the display brightness of the display screen 705 is turned up; when the ambient light intensity is low, the display brightness of the display screen 705 is turned down. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 713.
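The brightness adjustment described above can be sketched in Python as follows. The linear mapping, the function name display_brightness, and the default values (a 0.1 to 1.0 brightness range and a 1000 lux full-scale point) are illustrative assumptions; the present application only states that brightness is turned up when ambient light is strong and turned down when it is weak.

```python
def display_brightness(
    ambient_lux: float,
    min_level: float = 0.1,         # assumed lowest brightness level
    max_level: float = 1.0,         # assumed highest brightness level
    full_scale_lux: float = 1000.0  # assumed lux at which brightness saturates
) -> float:
    """Map ambient light intensity to a display brightness level."""
    if ambient_lux < 0:
        raise ValueError("ambient_lux must be non-negative")
    # Clamp to the full-scale point so very bright light does not overshoot.
    fraction = min(ambient_lux / full_scale_lux, 1.0)
    # Brighter surroundings give a brighter screen, and vice versa.
    return min_level + (max_level - min_level) * fraction
```

Any monotonically increasing mapping would satisfy the description equally well; the linear form is chosen here only for simplicity.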

The proximity sensor 714, also known as a distance sensor, is typically provided on the front panel of the terminal 700. The proximity sensor 714 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 714 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 714 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the display screen 705 to switch from the screen-off state to the screen-on state.
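The screen-state switching driven by the proximity sensor can be sketched in Python as a small state-transition function. The function name next_screen_state, the state labels "on" and "off", and the use of two consecutive distance readings to detect a decreasing or increasing distance are assumptions made for illustration.

```python
def next_screen_state(previous_cm: float, current_cm: float, state: str) -> str:
    """Return the new screen state ("on" or "off") from two distance readings.

    previous_cm and current_cm are consecutive distances between the user
    and the front of the terminal; comparing two readings is an assumed,
    simplified way of detecting a decreasing or increasing distance.
    """
    if current_cm < previous_cm:
        return "off"  # user is approaching: switch the screen off
    if current_cm > previous_cm:
        return "on"   # user is moving away: switch the screen on
    return state      # distance unchanged: keep the current state
```

A practical implementation would likely smooth the readings or apply a hysteresis threshold to avoid flicker, which is outside the scope of this sketch.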

Those skilled in the art will appreciate that the structure shown in Fig. 7 does not limit the terminal 700, which may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.

Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary considerably depending on its configuration or performance, and may include one or more processors (Central Processing Units, CPU) 801 and one or more memories 802, where the memories 802 store at least one computer program, and the at least one computer program is loaded and executed by the processor 801 to implement the voice-based traffic method provided in the above-described method embodiments. Of course, the server may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.

The present application also provides a computer-readable storage medium storing at least one computer program, which is loaded and executed by a processor of a computer device to implement the operations performed by the computer device in the voice-based traffic method of the above embodiments. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.

Embodiments of the present application also provide a computer program product comprising a computer program stored in a computer readable storage medium. The processor of the computer device reads the computer program from the computer-readable storage medium, and the processor executes the computer program so that the computer device performs the voice-based traffic method provided in the above-described various alternative implementations.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The foregoing description covers only preferred embodiments of the present application and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims

1. A voice-based traffic method, the method comprising:

2. The method according to claim 1, wherein verifying the identity of the target object based on the decrypted first voice signal, to obtain a verification result, comprises:

identifying the decrypted first voice signal to obtain target information to be verified;

and verifying the target information based on the reference information of the target object to obtain the verification result, wherein the reference information comprises various information for verifying the identity of the target object, and the various information comprises information of the type to which the target information belongs.

3. The method according to claim 2, wherein the identifying the decrypted first voice signal to obtain target information to be verified comprises:

voiceprint extraction is carried out on the decrypted first voice signal, so that voiceprint characteristics of the target object are obtained;

performing semantic recognition on the decrypted first voice signal to obtain content information to be verified of the target object;

determining the target information based on the voiceprint feature and the content information.

4. A method according to claim 3, wherein verifying the target information based on the reference information of the target object, to obtain the verification result, comprises:

verifying the voiceprint characteristics based on the reference voiceprints in the reference information to obtain a first intermediate result;

verifying the content information based on the reference content in the reference information to obtain a second intermediate result;

determining the verification result based on the first intermediate result and the second intermediate result.

5. The method of claim 4, wherein the first intermediate result and the second intermediate result are both verification scores, the verification scores representing a similarity between the reference information and information to be verified;

the determining the verification result based on the first intermediate result and the second intermediate result includes:

determining a weight of the first intermediate result and a weight of the second intermediate result based on the priority of the voiceprint information and the priority of the content information;

based on the weight of the first intermediate result and the weight of the second intermediate result, carrying out weighted summation on the first intermediate result and the second intermediate result to obtain a target score;

in a case where the target score reaches a first verification threshold, determining that the verification result is that passing is allowed;

and in a case where the target score does not reach the first verification threshold, determining that the verification result is that passing is not allowed.

6. The method of claim 4, wherein the first intermediate result and the second intermediate result are both verification scores, the verification scores representing a similarity between the reference information and information to be verified;

for any one of the voiceprint information and the content information, determining the verification result based on the verification score corresponding to the information when the priority of the information is higher than that of the other information and the verification score corresponding to the information is higher than that of the other information;

determining the verification result based on the verification score corresponding to the information under the condition that the priority of the information is higher than that of the other information, the verification score corresponding to the information is lower than that of the other information, and the verification score corresponding to the information is not lower than a second verification threshold value;

and under the condition that the priority of the information is higher than that of the other information, the verification score corresponding to the information is lower than that of the other information, and the verification score corresponding to the information is lower than a second verification threshold value, carrying out weighted summation on the verification score corresponding to the information and the verification score corresponding to the other information to obtain the verification result.

7. The method according to claim 2, wherein the method further comprises:

determining a health condition of the target object based on the first speech signal;

the verifying the target information based on the reference information of the target object to obtain the verification result includes:

verifying the target information based on the reference information of the target object to obtain a third intermediate result;

determining, based on the health condition, a fourth intermediate result indicating whether the target object under the health condition is allowed to pass;

and determining the verification result based on the third intermediate result and the fourth intermediate result.

8. The method according to claim 2, wherein the method further comprises:

receiving a second voice signal of the target object sent by the passing equipment under the condition that the verification result is that passing is allowed, wherein the second voice signal is used for modifying the reference information of the target object;

decrypting the second voice signal based on the preset encryption rule;

identifying the decrypted second voice signal to obtain a modification target and modification content of the target object;

replacing the content corresponding to the modification target in the reference information of the target object with the modification content;

and under the condition that the modification is completed, returning a modification result to the passing equipment so as to enable the passing equipment to display the modification result.

9. A voice-based traffic device, the device comprising:

10. A computer device, characterized in that it comprises a processor and a memory, the memory being used for storing at least one computer program, which is loaded and executed by the processor to perform the voice-based traffic method according to any one of claims 1 to 8.

11. A computer-readable storage medium, characterized in that the computer-readable storage medium is used for storing at least one computer program for performing the voice-based traffic method according to any one of claims 1 to 8.

12. A computer program product comprising a computer program which, when executed by a processor, implements the voice-based traffic method according to any one of claims 1 to 8.
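By way of illustration only, and without limiting the claims, the score fusion recited in claims 5 and 6 can be sketched in Python as follows. The Score type, the function names, the use of priority-proportional weights, the [0, 1] score range, and the handling of equal scores are all assumptions of this sketch; the claims do not specify how weights are derived from priorities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    value: float     # verification score (similarity); [0, 1] range is assumed
    priority: float  # larger means higher priority (assumed convention)


def weighted_target_score(a: Score, b: Score) -> float:
    """Weighted sum of two scores, with weights proportional to priority."""
    total = a.priority + b.priority
    if total <= 0:
        raise ValueError("priorities must sum to a positive value")
    return (a.priority / total) * a.value + (b.priority / total) * b.value


def passes_weighted(voiceprint: Score, content: Score, first_threshold: float) -> bool:
    """Claim 5 style: passing is allowed when the target score reaches the threshold."""
    return weighted_target_score(voiceprint, content) >= first_threshold


def priority_deciding_score(high: Score, low: Score, second_threshold: float) -> float:
    """Claim 6 style: return the score on which the verification result is based.

    `high` must be the higher-priority item. Equal scores are treated like
    the "higher score" case, which the claim leaves open (assumption).
    """
    if high.priority <= low.priority:
        raise ValueError("`high` must have the higher priority")
    if high.value >= low.value:
        return high.value  # higher priority and higher score: use it alone
    if high.value >= second_threshold:
        return high.value  # lower score, but not below the second threshold
    # Lower score and below the second threshold: fall back to a weighted sum.
    return weighted_target_score(high, low)
```

With a voiceprint score of 0.9 at priority 3 and a content score of 0.5 at priority 1, the assumed weights are 0.75 and 0.25, giving a target score of 0.8, which would allow passing against a first verification threshold of 0.75.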