WO2021051533A1

WO2021051533A1 - Address information-based blacklist identification method, apparatus, device, and storage medium

Info

Publication number: WO2021051533A1
Application number: PCT/CN2019/117117
Authority: WO
Inventors: 李江; 王健宗; 彭俊清
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-09-19
Filing date: 2019-11-11
Publication date: 2021-03-25
Also published as: CN110767238A; CN110767238B

Abstract

The present application relates to the field of artificial intelligence, and disclosed therein are an address information-based blacklist identification method, an apparatus, a device, and a storage medium, which are used for dividing a blacklist feature library into blacklist feature sub-libraries having smaller dimensions and comparing voiceprint features according to the blacklist feature sub-library corresponding to address information, thereby improving the voiceprint identification efficiency. The method according to the present application comprises: acquiring a voice file of a target user, the voice file comprising audio and address information, and the address information comprising an incoming telephone segment or/and Internet Protocol (IP) address information; carrying out feature extraction on the audio by means of a preset algorithm to obtain a feature file; determining whether the feature file is valid or not; if the feature file is invalid, generating a status code that extraction has failed; and if the feature file is valid, determining according to the incoming telephone segment or the IP information a target address range to which the target user belongs, calling a preconfigured blacklist model and the target address range to perform similarity scoring on the feature file, and performing a corresponding operation according to the scoring result.

Description

Blacklist identification method, device, equipment and storage medium based on address information

This application claims priority to Chinese patent application No. 201910884630.2, filed with the Chinese Patent Office on September 19, 2019 and entitled "Blacklist identification method, device, equipment and storage medium based on address information", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of artificial intelligence, and in particular to a blacklist identification method, device, equipment and storage medium based on address information.

Background technique

Voiceprint recognition achieves the purpose of distinguishing unknown voices by analyzing the characteristics of one or more speech signals. Simply put, it is a technology to distinguish whether a certain sentence is spoken by a certain person. The theoretical basis of voiceprint recognition is that each voice has a unique feature, which can effectively distinguish the voices of different people. The two most important factors that determine the characteristics of the voiceprint are the size of the sound cavity and the way the vocal organs are manipulated.

A voiceprint is a very important feature of the human body. In theory, no two people have exactly the same voiceprint characteristics. Speaker identification based on voiceprint recognition is of great practical significance. For example, a bank's credit card business can maintain a voiceprint feature database of blacklisted users and compare a user's voice against it to identify whether the user is on the blacklist, so as to guide the bank's credit card staff in choosing a corresponding handling strategy.

In the existing scheme, the blacklist feature database is built to cover the whole country and all age groups, so it is very large. The inventor realized that comparison against such a huge blacklist feature database is very slow, making it difficult to obtain a response quickly.

Summary of the invention

This application provides a blacklist identification method, device, equipment and storage medium based on address information, which are used to divide the blacklist feature database into smaller blacklist feature sub-databases and to compare voiceprint features against the blacklist feature sub-database corresponding to the address information, thereby improving the efficiency of voiceprint recognition.

The first aspect of the embodiments of the present application provides a blacklist identification method based on address information, including: acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming phone segment and/or Internet Protocol (IP) address information; performing feature extraction on the audio through a preset algorithm to obtain a feature file; determining whether the feature file is valid; if the feature file is invalid, generating an extraction-failure status code, the status code being used to indicate the reason for the extraction failure; if the feature file is valid, determining the target address range to which the target user belongs according to the incoming phone segment or the IP information, calling the preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result.

A second aspect of the embodiments of the present application provides a blacklist identification device based on address information, including: an acquiring unit for acquiring a voice file of a target user, the voice file including audio and address information; an extracting unit for performing feature extraction on the audio through a preset algorithm to obtain a feature file; a judging unit for judging whether the feature file is valid; a first generating unit for generating, if the feature file is invalid, an extraction-failure status code, the status code being used to indicate the reason for the extraction failure; and a scoring unit for, if the feature file is valid, determining the target address range to which the target user belongs based on the incoming phone segment or the Internet Protocol (IP) address information, calling the preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result.

The third aspect of the embodiments of the present application provides a blacklist identification device based on address information, including a memory, a processor, and a computer program stored in the memory and executable on the processor. The processor implements the above-mentioned blacklist identification method based on address information when executing the computer program.

The fourth aspect of the embodiments of the present application provides a computer-readable storage medium that stores instructions. When the instructions are run on a computer, the computer executes the steps of the above-mentioned blacklist identification method based on address information.

In the technical solution provided by the embodiments of the present application, a voice file of the target user is obtained, the voice file including audio and address information, and the address information including an incoming phone segment and/or Internet Protocol (IP) address information; feature extraction is performed on the audio through a preset algorithm to obtain a feature file; whether the feature file is valid is determined; if the feature file is invalid, an extraction-failure status code is generated, the status code being used to indicate the reason for the extraction failure; if the feature file is valid, the target address range to which the target user belongs is determined according to the incoming phone segment or the IP information, the preset blacklist model and the target address range are called to score the similarity of the feature file, and a corresponding operation is performed according to the scoring result. In this embodiment of the application, the blacklist feature database is divided into blacklist feature sub-databases of smaller dimensions, and voiceprint features are compared against the blacklist feature sub-database corresponding to the address information, which improves the efficiency of voiceprint recognition.

Description of the drawings

FIG. 1 is a schematic diagram of an embodiment of a blacklist identification method based on address information in an embodiment of the application;

FIG. 2 is a schematic diagram of another embodiment of a blacklist identification method based on address information in an embodiment of this application;

FIG. 3 is a schematic diagram of an embodiment of a blacklist identification device based on address information in an embodiment of the application;

FIG. 4 is a schematic diagram of another embodiment of a blacklist identification device based on address information in an embodiment of the application;

Fig. 5 is a schematic diagram of an embodiment of a blacklist identification device based on address information in an embodiment of the application.

Detailed description

This application provides a blacklist identification method, device, equipment, and storage medium based on address information. The blacklist feature database is divided into blacklist feature sub-databases of smaller dimensions, and voiceprint features are compared against the blacklist feature sub-database corresponding to the address information, which improves the efficiency of voiceprint recognition.

In order to enable those skilled in the art to better understand the solution of the present application, the embodiments of the present application will be described below in conjunction with the accompanying drawings in the embodiments of the present application.

The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and the above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way can be interchanged under appropriate circumstances, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "including" or "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.

Please refer to Fig. 1, a flowchart of a method for identifying a blacklist based on address information provided by an embodiment of the present application, which specifically includes:

101. Acquire a voice file of the target user. The voice file includes audio and address information, and the address information includes incoming phone segment or/and Internet Protocol address IP information.

The server obtains the voice file of the target user. The voice file includes audio and address information, and the address information includes an incoming phone segment and/or IP information. Specifically, the server receives the voice file of the target user; the server parses the voice file to obtain the audio and the address identifier of the target user; the server queries a preset table according to the address identifier to obtain the address information corresponding to the address identifier, the address information including the incoming phone segment and/or IP information.

For example, when the server obtains the audio of the target user through the phone or the network, it also determines the specific address of the target user according to the incoming phone segment or the Internet Protocol (IP) address information. For a specific service, the basic information of the target user (excluding sensitive information) is maintained. For example, when the server obtains the target user's voice file through the network, the voice file includes the audio, an address identifier, and an identity identifier that indicates the basic information of the target user, where the basic information includes age, gender, and so on.
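The lookup described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: the table contents, the record fields `audio` and `address_id`, and the helper `resolve_address_info` are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressInfo:
    phone_segment: Optional[str] = None  # incoming phone segment (number prefix)
    ip: Optional[str] = None             # IP address information


# Hypothetical preset table: address identifier -> address information.
PRESET_TABLE = {
    "addr-001": AddressInfo(phone_segment="0755"),
    "addr-002": AddressInfo(ip="203.0.113.7"),
    "addr-003": AddressInfo(phone_segment="021", ip="198.51.100.20"),
}


def resolve_address_info(voice_file: dict) -> AddressInfo:
    """Parse a (hypothetical) voice-file record and look up its address information."""
    address_id = voice_file["address_id"]
    try:
        return PRESET_TABLE[address_id]
    except KeyError:
        raise LookupError(f"unknown address identifier: {address_id}") from None


# Example record carrying raw audio bytes and an address identifier.
voice_file = {"audio": b"\x00\x01", "address_id": "addr-003"}
info = resolve_address_info(voice_file)
```

Keeping the table outside the voice file means the file only has to carry a short identifier, and an unknown identifier fails loudly instead of silently selecting no address range.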

It is understandable that the execution subject of this application may be a blacklist identification device based on address information, or may also be a terminal or a server, which is not specifically limited here. The embodiment of the present application takes the server as the execution subject as an example for description.

102. Perform feature extraction on the audio by using a preset algorithm to obtain a feature file.

The server performs feature extraction on the audio through a preset algorithm to obtain a feature file. Specifically, the server converts the audio from analog signal form to digital signal form; pre-emphasizes the audio in digital signal form; performs windowing on the pre-emphasized audio; performs a discrete Fourier transform on the windowed audio to obtain the target complex number; maps the target complex number onto the Mel spectrum to obtain the logarithmic energy; transforms the logarithmic energy to obtain the cepstral coefficients; and calculates the energy and the differences from the cepstral coefficients to generate the feature file.

It should be noted that, before performing feature extraction, the server needs to sample and quantize the collected audio, that is, convert the continuous audio waveform into discrete data points with a certain sampling rate and sample bit depth. Since the sounds in daily life are generally below 8 kHz, according to the Nyquist sampling theorem a 16 kHz sampling rate is sufficient for the sampled data to contain most of the sound information. 16 kHz means that 16,000 samples are taken per second. These samples are stored as amplitude values, and to store them efficiently they need to be quantized into integers. A 16-bit sample can represent an integer value between -32768 and 32767, so each sampled amplitude is quantized to the nearest integer value.
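A minimal sketch of the sampling and quantization step, in plain Python. The test tone, the normalisation of amplitudes to [-1.0, 1.0], and the function names are assumptions made for illustration; they are not taken from the application.

```python
import math

SAMPLE_RATE = 16_000  # samples per second
INT16_MIN, INT16_MAX = -32768, 32767


def quantize_16bit(amplitudes):
    """Map amplitudes in [-1.0, 1.0] to the nearest 16-bit integer, clipping overflow."""
    out = []
    for a in amplitudes:
        value = round(a * INT16_MAX)
        out.append(max(INT16_MIN, min(INT16_MAX, value)))
    return out


def sample_sine(freq_hz, duration_s):
    """Sample a unit-amplitude sine wave at SAMPLE_RATE."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n)]


# One second of a 440 Hz tone becomes 16,000 integer samples.
samples = quantize_16bit(sample_sine(440.0, 1.0))
```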

For example, in the frequency spectrum of a sound signal, the energy of the low-frequency part is usually higher than that of the high-frequency part: for every tenfold increase in frequency, the spectral energy is attenuated by 20 dB. In addition, the noise of the circuit itself when the microphone collects the sound signal also increases the energy of the low-frequency part. To give the high-frequency and low-frequency parts similar amplitudes, the high-frequency energy of the collected sound needs to be boosted in advance, that is, the audio in digital signal form is pre-emphasized.
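Pre-emphasis is commonly implemented as a first-order high-pass filter, y[n] = x[n] - a·x[n-1]. The sketch below assumes that form and a coefficient of 0.97; the application states neither, so both are illustrative choices.

```python
def pre_emphasize(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]
```

A slowly varying (low-frequency) input such as `[1.0, 1.0, 1.0]` is strongly attenuated after the first sample, while a rapidly alternating (high-frequency) input such as `[1.0, -1.0, 1.0]` is amplified, which is the balancing effect described above.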

Over a relatively short period of time, the pre-emphasized audio can be considered stationary, so it is analysed in short segments; extracting these segments is called windowing. The window is described by three parameters: window length (in milliseconds), offset, and shape. Each windowed audio segment is called a frame, the number of milliseconds in each frame is called the frame length, and the distance between the left borders of two adjacent frames is called the frame shift. The process of extracting a frame from the audio signal s[n] can be expressed as y[n]=w[n]s[n]. If w[n] is a rectangular window, the signal is cut off abruptly at the boundary, and these discontinuities affect the Fourier analysis. Therefore, when computing Mel frequency cepstral coefficients, windowing generally uses a Hamming window, which tapers smoothly toward zero at the edges. The expression is as follows:

L is the frame length.
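The window expression is not reproduced in this text. As a hedged sketch, the code below assumes the conventional Hamming window, w[n] = 0.54 - 0.46·cos(2πn/(L-1)) for 0 ≤ n ≤ L-1, and typical frame settings of 25 ms length and 10 ms shift at 16 kHz. The exact constants and frame sizes used by the application are not given, so these are assumptions.

```python
import math


def hamming(L):
    """Conventional Hamming window of length L (assumed form, see lead-in)."""
    if L == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]


def frames(signal, frame_len, frame_shift):
    """Split the signal into overlapping frames and apply y[n] = w[n] * s[n]."""
    w = hamming(frame_len)
    out = []
    for start in range(0, len(signal) - frame_len + 1, frame_shift):
        out.append([w[n] * signal[start + n] for n in range(frame_len)])
    return out


# 25 ms frames with a 10 ms shift at 16 kHz: 400 and 160 samples (assumed values).
FRAME_LEN, FRAME_SHIFT = 400, 160
windowed = frames([1.0] * 16000, FRAME_LEN, FRAME_SHIFT)
```

With this form the window value at each edge is 0.08 rather than exactly zero, which is still enough to suppress the boundary discontinuity of a rectangular window.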

It is understandable that the process in which the server performs a discrete Fourier transform on the windowed audio to obtain the target complex number specifically includes: the server obtains the windowed audio signal x[n], ..., x[m], where n and m are integers greater than 0; the server calls the first preset formula to generate the target complex number X[k]. The first preset formula is:

N is a power of 2, k is an integer, and X[k] represents the amplitude and phase of a certain frequency component in the windowed audio signal.
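The first preset formula is not reproduced in this text. The sketch below assumes it is the standard discrete Fourier transform, X[k] = Σ x[n]·exp(-j2πkn/N) summed over n = 0..N-1, and shows why N is required to be a power of 2: that restriction allows the radix-2 fast Fourier transform. The function names and the test signal are illustrative.

```python
import cmath
import math


def dft(x):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-2j*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def fft(x):
    """Recursive radix-2 FFT; requires len(x) to be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    if N % 2:
        raise ValueError("length must be a power of 2")
    even, odd = fft(x[0::2]), fft(x[1::2])
    twiddled = [cmath.exp(-2j * cmath.pi * k / N) * odd[k] for k in range(N // 2)]
    first_half = [even[k] + twiddled[k] for k in range(N // 2)]
    second_half = [even[k] - twiddled[k] for k in range(N // 2)]
    return first_half + second_half


# One cycle of a cosine over 8 samples: energy appears only at k = 1 and k = 7.
signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
spectrum = fft(signal)
```

Each X[k] is complex: its magnitude gives the amplitude of the k-th frequency component and its argument gives the phase.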

The server maps the target complex number to the Mel spectrum, and the process of obtaining the logarithmic energy specifically includes:

The server smooths the target complex number through the preset filter bank; the server maps the smoothed complex number to the mel scale of the Mel spectrum, where one mel is a unit of pitch; the server uses the second preset formula to map the smoothed complex number to the mel scale and obtain the target scale. The second preset formula is:

The server calculates the logarithmic energy of the target scale according to the third preset formula. The third preset formula is:

H_m(k) is the frequency response of the filter bank, and M represents the number of filters in the preset filter bank. It should be noted that the human response to sound pressure is logarithmic: people are less sensitive to subtle changes at high sound pressure than at low sound pressure. In addition, using logarithms reduces the sensitivity of the extracted features to changes in the input sound energy, because the sound energy collected by the microphone changes as the distance between the sound source and the microphone changes. Human hearing is also not equally sensitive across frequency bands: the human ear is less sensitive to high frequencies than to low frequencies, with the dividing line at about 1000 Hz. Simulating this property of human hearing when extracting sound features can improve recognition performance. The filter bank is a set of triangular filters spaced on the Mel scale: the 10 filters below 1000 Hz are linearly spaced, and the remaining filters above 1000 Hz are logarithmically spaced.
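The second and third preset formulas are not reproduced in this text. The sketch below assumes the widely used forms mel(f) = 1127·ln(1 + f/700) and s(m) = ln(Σ |X[k]|²·H_m(k)), with triangular filters placed at equal intervals on the mel scale, which approximates the linear-then-logarithmic spacing described above. The filter count, FFT size, and function names are illustrative assumptions.

```python
import math


def hz_to_mel(f):
    """Assumed mel mapping: roughly linear below 1000 Hz, logarithmic above."""
    return 1127.0 * math.log(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (math.exp(m / 1127.0) - 1.0)


def mel_filter_bank(num_filters, n_fft, sample_rate):
    """Triangular filters H_m(k), equally spaced in mel, over bins 0..n_fft//2."""
    top = hz_to_mel(sample_rate / 2)
    # num_filters + 2 edge points: each filter spans three consecutive points.
    edges_hz = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional bin positions
    bank = []
    for m in range(1, num_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if left < k <= centre:
                row.append((k - left) / (centre - left))   # rising edge
            elif centre < k < right:
                row.append((right - k) / (right - centre))  # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_energies(spectrum, bank, floor=1e-10):
    """s(m) = ln(sum_k |X[k]|^2 * H_m(k)); `floor` guards against log(0)."""
    power = [abs(c) ** 2 for c in spectrum]
    return [
        math.log(max(floor, sum(p * h for p, h in zip(power, row)))) for row in bank
    ]


bank = mel_filter_bank(num_filters=26, n_fft=512, sample_rate=16000)
```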

The process in which the server calculates the energy and the differences from the cepstral coefficients and generates the feature file includes:

Specifically, the energy of a frame is defined as the sum of the squares of the sample points in that frame. For a windowed signal x, the energy from sample point t1 to sample point t2 is:

The features extracted above are computed separately for each frame and are static, whereas actual sound is continuous and adjacent frames are related. Features therefore need to be added to represent these dynamic changes between frames. This is usually achieved by calculating the first-order difference, or even the second-order difference, of the 13 features of each frame (12 cepstral features plus 1 energy). A simple way to calculate the difference is to take the difference between the 13 features of the frames immediately before and after the current frame:

If the second-order difference is not considered, the final Mel frequency cepstral coefficient feature of each frame is 26 dimensions: 12-dimensional cepstral coefficient, 12-dimensional cepstral coefficient difference, 1-dimensional energy and 1-dimensional energy difference.
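The energy and difference formulas are not reproduced in this text. The sketch below assumes the usual definitions, energy as the sum of squared samples and the difference as d[t] = (c[t+1] - c[t-1]) / 2, and assembles the 26-dimensional vector. The cepstral coefficients are taken as given inputs, the edge frames are handled by repetition, and the ordering of the 26 values is an arbitrary choice; all three are assumptions of the example.

```python
def frame_energy(frame):
    """Energy of a frame: the sum of squared sample values."""
    return sum(s * s for s in frame)


def delta(features):
    """First-order difference d[t] = (c[t+1] - c[t-1]) / 2, repeating edge frames."""
    T = len(features)
    out = []
    for t in range(T):
        prev, nxt = features[max(t - 1, 0)], features[min(t + 1, T - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def assemble(cepstra, energies):
    """Join 12 cepstral coefficients and 1 energy per frame with their differences."""
    static = [list(c) + [e] for c, e in zip(cepstra, energies)]  # 13 values per frame
    return [s + d for s, d in zip(static, delta(static))]         # 26 values per frame
```

For example, calling `assemble` with five frames of 12 coefficients each and five energies returns five vectors of length 26: positions 0 to 12 hold the static features and positions 13 to 25 hold their differences.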

103. Determine whether the feature file is valid.

The server judges whether the feature file is valid. Specifically, the server determines whether the format of the feature file meets the preset quality requirements; if the format of the feature file does not meet the preset quality requirements, the server determines that the feature file is invalid; if the format of the feature file meets the preset quality requirements, the server determines whether the feature file contains the voices of multiple users; if the feature file does not contain the voices of multiple users, the server determines that the feature file is valid; if the feature file contains the voices of multiple users, the server determines that the feature file is invalid.

104. If the feature file is invalid, a status code of the extraction failure is generated, and the status code is used to indicate the reason for the extraction failure.

If the feature file is invalid, the server generates an extraction-failure status code, which is used to indicate the reason for the extraction failure. That is, the status code briefly reports why extraction failed, such as poor voice quality or multiple people talking. Different reasons for failure correspond to different status codes. For example, if the format of the feature file does not meet the preset quality requirements, the server determines that the feature file is invalid and generates a first extraction-failure status code; if the feature file contains the voices of multiple users, the server determines that the feature file is invalid and generates a second extraction-failure status code, where the first status code and the second status code are different.
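The validity check and the status codes can be sketched together. The numeric code values, the enum name, and the inputs `meets_quality` and `speaker_count` are hypothetical; the application only requires that different failure reasons map to different codes.

```python
from enum import Enum
from typing import Optional


class ExtractionStatus(Enum):
    POOR_QUALITY = 1001       # hypothetical code: quality requirement not met
    MULTIPLE_SPEAKERS = 1002  # hypothetical code: more than one voice detected


def validate(meets_quality: bool, speaker_count: int) -> Optional[ExtractionStatus]:
    """Return None if the feature file is valid, otherwise the failure status code."""
    if not meets_quality:
        return ExtractionStatus.POOR_QUALITY
    if speaker_count > 1:
        return ExtractionStatus.MULTIPLE_SPEAKERS
    return None
```

The quality check runs first, matching the order of the judgment described above, so a poor-quality file with several speakers reports the first status code.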

105. If the feature file is valid, determine the target address range to which the target user belongs according to the incoming phone segment or IP information, call the preset blacklist model and target address range to score the similarity of the feature file, and perform a corresponding operation according to the scoring result.

If the feature file is valid, the server determines the target address range to which the target user belongs according to the incoming phone segment or IP information, calls the preset blacklist model and target address range to score the similarity of the feature file, and performs a corresponding operation according to the scoring result. Specifically, if the feature file is valid, the server determines the target address range to which the target user belongs based on the incoming phone segment or IP information; the server determines the corresponding target blacklist model among the preset blacklist models according to the target address range, where each preset blacklist model corresponds to a different blacklist feature sub-database; the server scores the similarity of the feature file through the target blacklist model to obtain the target score; if the target score is greater than or equal to the first threshold, the server determines that the target user is in the blacklist feature sub-database corresponding to the target blacklist model and returns a first prompt message, which indicates that the target user is prohibited from receiving normal services; if the target score is less than the first threshold, the server determines that the target user is not in the blacklist feature sub-database corresponding to the target blacklist model and returns a second prompt message, which indicates that the target user may receive normal services.

It should be noted that different addresses correspond to different target blacklist models, and the corresponding blacklist feature sub-databases also differ. If the user is not in the blacklist feature sub-database, the user is a normal user and receives normal service; if the service process does not go smoothly, the user can also be registered in the blacklist feature sub-database. If the user is in the blacklist feature sub-database, the user is regarded as a blacklisted user and does not receive normal service (for example, no loan is granted).

The target user's score is obtained by comparing the voiceprint features (the feature file) extracted from the voice with the voiceprint features in the blacklist feature library, in combination with the address information. Scoring here means calculating the similarity of voiceprint features. A threshold is usually set based on model training; when the score is higher than the threshold, the two voiceprint features are close, and the comparison can be considered a match.
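A minimal sketch of address-scoped scoring. The application does not say how similarity is computed, so cosine similarity is assumed here; the address ranges, phone prefixes, sub-library contents, range labels, and the 0.8 threshold are all hypothetical.

```python
import ipaddress
import math

# Hypothetical mappings from phone prefixes / IP networks to an address range label.
PHONE_PREFIXES = {"021": "east", "0755": "south"}
IP_RANGES = {
    ipaddress.ip_network("203.0.113.0/24"): "east",
    ipaddress.ip_network("198.51.100.0/24"): "south",
}

# Hypothetical blacklist feature sub-libraries, one per address range.
SUB_LIBRARIES = {
    "east": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "south": [[0.0, 0.0, 1.0]],
}


def target_range(phone=None, ip=None):
    """Determine the address range from the phone segment or, failing that, the IP."""
    if phone is not None:
        for prefix, label in PHONE_PREFIXES.items():
            if phone.startswith(prefix):
                return label
    if ip is not None:
        addr = ipaddress.ip_address(ip)
        for net, label in IP_RANGES.items():
            if addr in net:
                return label
    raise LookupError("no address range for this caller")


def cosine(a, b):
    """Cosine similarity (an assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_blacklisted(features, label, threshold=0.8):
    """Score against one sub-library only; return (hit, best_score)."""
    score = max(cosine(features, ref) for ref in SUB_LIBRARIES[label])
    return score >= threshold, score
```

Only the sub-library selected by the address range is scanned, so the number of comparisons depends on the size of that sub-library rather than on the size of the whole blacklist.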

It is understandable that this application can be applied mainly to bank loan business: whether to include a user in the blacklist is decided according to the user's credit rating, and the user's voiceprint features are registered in a blacklist feature sub-database according to the user's region, age, and gender. Once a user's voiceprint features are registered in a blacklist feature sub-database, a later incoming call from that person can be judged to belong to the blacklist based on the voiceprint features, so the loan business may be declined.

In this embodiment of the application, the blacklist feature database is divided into blacklist feature sub-bases with smaller dimensions, and voiceprint features are compared according to the blacklist feature sub-bases corresponding to address information, which improves the efficiency of voiceprint recognition.

Please refer to FIG. 2, another flowchart of a method for identifying a blacklist based on address information provided by an embodiment of the present application, which specifically includes:

201. A preset blacklist model is generated, and the preset blacklist model is used for blacklist registration.

The server generates a preset blacklist model, and the preset blacklist model is used for blacklist registration. Specifically, the server performs sub-database registration for the blacklist: it obtains key information such as the user's age, region, and gender from (non-sensitive) customer information, the phone segment, the network IP, and so on, and extracts voiceprint features from the call audio; the voiceprint features are then saved in the database corresponding to the registered user's basic information. This is the sub-database registration of the blacklist. Sub-database registration saves only the blacklist feature sub-databases of the finest dimension, such as (males over 50 in East China), while larger-dimension databases are synthesized from the finest ones; for example, (males in East China) can be synthesized from the two blacklist feature sub-databases (males over 50 in East China) and (males under 50 in East China). With this scheme, a user can be matched directly in the corresponding small database according to region, gender, and age. (If one of the three elements of region, gender, or age cannot be determined, the sub-databases to be synthesized can be found and matching performed in those blacklist feature sub-databases.) This scheme avoids the disadvantage of matching against one huge blacklist database at a time in actual use, and it fits the actual business scenarios of bank credit cards well.
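The registration scheme can be sketched with a dictionary keyed by the finest dimension. The dimension values, function names, and one-element feature vectors are hypothetical; the point illustrated is that only the finest sub-libraries are stored, and a coarser library is synthesized on demand when an attribute is unknown.

```python
from collections import defaultdict
from itertools import product

# Hypothetical dimension values.
REGIONS = ("east", "south")
GENDERS = ("male", "female")
AGE_BANDS = ("under_50", "50_plus")

# Only the finest sub-libraries are stored: (region, gender, age_band) -> features.
sub_libraries = defaultdict(list)


def register(region, gender, age_band, features):
    """Save a blacklisted user's voiceprint features in the finest sub-library."""
    sub_libraries[(region, gender, age_band)].append(features)


def candidates(region=None, gender=None, age_band=None):
    """Synthesize a larger library from the finest ones; None means 'unknown'."""
    keys = product(
        (region,) if region else REGIONS,
        (gender,) if gender else GENDERS,
        (age_band,) if age_band else AGE_BANDS,
    )
    merged = []
    for key in keys:
        merged.extend(sub_libraries.get(key, []))
    return merged


register("east", "male", "50_plus", [1.0])
register("east", "male", "under_50", [2.0])
register("south", "female", "under_50", [3.0])
```

Here `candidates("east", "male", "50_plus")` reads a single sub-library, while `candidates("east", "male")` merges the two age bands for males in the east region, mirroring the East China example above.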

It should be noted that in addition to information on dimensions such as region, age group, gender, etc., information on other dimensions can also be obtained, such as the customer’s occupation or the customer’s ID in the system, etc. However, sensitive customer information should not be used as a dimension.

202. Acquire a voice file of the target user. The voice file includes audio and address information, and the address information includes incoming phone segment or/and Internet Protocol address IP information.

For example, when the server obtains the audio of the target user through the phone or the network, it also determines the specific address of the target user according to the incoming phone segment or the Internet Protocol (IP) address information. For a specific service, the basic information of the target user (excluding sensitive information) is maintained. For example, when the server obtains the target user's voice file through the network, the voice file includes the audio, an address identifier, and an identity identifier that indicates the basic information of the target user, where the basic information includes age, gender, and so on.

203. Perform feature extraction on the audio by using a preset algorithm to obtain a feature file.

It should be noted that, before performing feature extraction, the server needs to sample and quantize the collected audio, that is, convert the continuous audio waveform into discrete data points with a certain sampling rate and sample bit depth. Since the sounds in daily life are generally below 8 kHz, according to the Nyquist sampling theorem a 16 kHz sampling rate is sufficient for the sampled data to contain most of the sound information. 16 kHz means that 16,000 samples are taken per second. These samples are stored as amplitude values, and to store them efficiently they need to be quantized into integers. A 16-bit sample can represent an integer value between -32768 and 32767, so each sampled amplitude is quantized to the nearest integer value.

For example, in the frequency spectrum of a sound signal, the energy of the low-frequency part is usually higher than that of the high-frequency part: for every tenfold increase in frequency, the spectral energy is attenuated by 20 dB. In addition, the noise of the circuit itself when the microphone collects the sound signal also increases the energy of the low-frequency part. To give the high-frequency and low-frequency parts similar amplitudes, the high-frequency energy of the collected sound needs to be boosted in advance, that is, the audio in digital signal form is pre-emphasized.

In the windowing step, L is the frame length of the Hamming window.

204. Determine whether the feature file is valid.

205. If the feature file is invalid, a status code of the extraction failure is generated, and the status code is used to indicate the reason for the extraction failure.

206. If the feature file is valid, determine the target address range to which the target user belongs according to the incoming phone segment or IP information, call the preset blacklist model and target address range to score the similarity of the feature file, and perform a corresponding operation according to the scoring result.

The blacklist identification method based on address information in the embodiments of this application is described above; the blacklist identification device based on address information is described below. Referring to FIG. 3, an embodiment of the blacklist identification device based on address information in the embodiments of this application includes:

The obtaining unit 301 is configured to obtain a voice file of a target user, the voice file includes audio and address information, and the address information includes incoming telephone section or/and Internet Protocol address IP information;

The extraction unit 302 is configured to perform feature extraction on the audio by using a preset algorithm to obtain a feature file;

The judging unit 303 is used to judge whether the feature file is valid;

The first generating unit 304 is used to generate, if the feature file is invalid, an extraction-failure status code, the status code being used to indicate the reason for the extraction failure;

The scoring unit 305 is used to, if the feature file is valid, determine the target address range to which the target user belongs according to the incoming phone segment or the IP information, call a preset blacklist model and the target address range to score the similarity of the feature file, and perform a corresponding operation according to the scoring result.

Referring to FIG. 4, another embodiment of the device for identifying a blacklist based on address information in an embodiment of the present application includes:

The judging unit 303 is used to judge whether the feature file is valid;

Optionally, the scoring unit 305 is specifically used for:

If the feature file is valid, determine the target address range to which the target user belongs based on the incoming telephone segment or the IP information; determine, according to the target address range, the corresponding target blacklist model among the preset blacklist models, where each preset blacklist model corresponds to a different blacklist feature sub-database; score the similarity of the feature file using the target blacklist model to obtain a target score; if the target score is greater than or equal to a first threshold, determine that the target user is in the blacklist feature database corresponding to the target blacklist model and return a first prompt message, which indicates that the target user is prohibited from receiving normal services; if the target score is less than the first threshold, determine that the target user is not in the blacklist feature database corresponding to the target blacklist model and return a second prompt message, which indicates that the target user may receive normal services.
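The selection of a sub-database by address range followed by the threshold decision might look like the sketch below. Everything concrete in it is an assumption made for illustration: the address ranges, the phone prefix, the three-dimensional "voiceprint" vectors, the threshold of 0.8, and the use of cosine similarity as the scoring function, which the application does not specify.

```python
import ipaddress
import math
from typing import List, Optional, Sequence, Tuple

# Hypothetical sub-databases of enrolled blacklist voiceprints, keyed by address range.
IP_SUBDATABASES = {
    ipaddress.ip_network("10.1.0.0/16"): [[1.0, 0.0, 0.0]],
    ipaddress.ip_network("10.2.0.0/16"): [[0.0, 1.0, 0.0]],
}
PHONE_SUBDATABASES = {"0755": [[0.0, 0.0, 1.0]]}


def select_subdatabase(ip: Optional[str] = None,
                       phone: Optional[str] = None) -> List[Sequence[float]]:
    """Return the sub-database whose address range contains the caller."""
    if ip is not None:
        addr = ipaddress.ip_address(ip)
        for network, entries in IP_SUBDATABASES.items():
            if addr in network:
                return entries
    if phone is not None:
        for prefix, entries in PHONE_SUBDATABASES.items():
            if phone.startswith(prefix):
                return entries
    return []  # no matching range: nothing to compare against


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, used here as a stand-in scoring function."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def check(voiceprint: Sequence[float], ip: Optional[str] = None,
          phone: Optional[str] = None,
          first_threshold: float = 0.8) -> Tuple[bool, float]:
    """Score only against the selected sub-database, then apply the threshold."""
    entries = select_subdatabase(ip, phone)
    target_score = max((cosine(voiceprint, e) for e in entries), default=0.0)
    return target_score >= first_threshold, target_score
```

Because only the entries of one sub-database are compared, the number of similarity computations per call is bounded by the size of that sub-database rather than by the size of the whole blacklist.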

Optionally, the obtaining unit 301 is specifically configured to:

Receive the voice file of the target user; parse the voice file to obtain the audio and an address identifier of the target user; query a preset table according to the address identifier to obtain the address information corresponding to the address identifier, where the address information includes an incoming telephone segment and/or IP information.
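A minimal sketch of this parse-then-look-up step follows. The application does not define the layout of the voice file or of the preset table, so the JSON container with base64-encoded audio, the field names `audio` and `address_id`, and the table contents are all hypothetical.

```python
import base64
import json
from typing import Dict, Optional, Tuple

# Hypothetical preset table: address identifier -> address information.
PRESET_TABLE: Dict[str, Dict[str, Optional[str]]] = {
    "A001": {"phone_segment": "0755", "ip": None},
    "A002": {"phone_segment": None, "ip": "10.1.2.3"},
}


def parse_voice_file(raw: bytes) -> Tuple[bytes, str]:
    """Split an (assumed) JSON voice file into audio bytes and an address identifier."""
    record = json.loads(raw.decode("utf-8"))
    return base64.b64decode(record["audio"]), record["address_id"]


def obtain(raw: bytes) -> Tuple[bytes, Dict[str, Optional[str]]]:
    """Return the audio together with the address information from the preset table."""
    audio, address_id = parse_voice_file(raw)
    if address_id not in PRESET_TABLE:
        raise KeyError(f"unknown address identifier: {address_id}")
    return audio, PRESET_TABLE[address_id]
```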

Optionally, the extraction unit 302 includes:

The first conversion module 3021 is used to convert the audio from an analog signal form to a digital signal form;

The pre-emphasis module 3022 is used to pre-emphasize audio in the form of digital signals;

The windowing module 3023 is used for windowing the pre-emphasized audio;

The transform module 3024 is used to perform discrete Fourier transform on the windowed audio to obtain the target complex number;

The corresponding module 3025 is used to map the target complex number to the Mel spectrum to obtain logarithmic energy;

The second conversion module 3026 is configured to convert the logarithmic energy to obtain the cepstral coefficient;

The calculation module 3027 is used to calculate the energy and the difference according to the cepstral coefficients to generate a feature file.
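The chain formed by modules 3021 to 3027 corresponds to a conventional mel-frequency cepstral coefficient (MFCC) front end. The sketch below follows that chain with NumPy, assuming the audio has already been digitized (module 3021) into a one-dimensional array with at least one full frame of samples. The parameter values (16 kHz sampling, 25 ms frames, 10 ms hop, 26 filters, 12 coefficients, pre-emphasis factor 0.97), the Hamming window and the function names are common textbook choices assumed for illustration; none of them is taken from the application.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising edge
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=12, preemph=0.97):
    # Pre-emphasis (module 3022): y[n] = x[n] - a * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - preemph * signal[:-1])
    # Framing and windowing (module 3023)
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Discrete Fourier transform (module 3024), then power spectrum
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    # Mel filter bank and logarithmic energy (module 3025)
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # DCT-II of the log energies gives the cepstral coefficients (module 3026)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1),
                                    np.arange(n_filters) + 0.5) / n_filters)
    ceps = log_energy @ basis.T
    # Frame energy and first-order difference (module 3027)
    energy = np.log(np.sum(power, axis=1) + 1e-10)[:, None]
    static = np.hstack([ceps, energy])
    delta = np.diff(static, axis=0, prepend=static[:1])
    return np.hstack([static, delta])
```

With these defaults each frame yields 12 cepstral coefficients plus one energy term and their 13 differences, so the returned array has 26 columns per frame; the first row of differences is zero because no earlier frame exists.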

Optionally, the transformation module 3024 is specifically used for:

Obtain the windowed audio signal x[n], ..., x[m], where n and m are integers greater than 0; call the first preset formula to generate the target complex number X[k], the first preset formula being:
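The first preset formula is not reproduced in this text. Since the step is stated to be a discrete Fourier transform of the windowed signal x[n] producing X[k], the standard definition would read as follows; this is a reconstruction under that assumption, with N taken as the transform length, and the notation of the original formula may differ:

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j \frac{2\pi}{N} k n}, \qquad 0 \le k \le N-1
```

Each X[k] is a complex number whose modulus and argument give the amplitude and phase of the k-th frequency component.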

Optionally, the corresponding module 3025 is specifically used for:

Smooth the target complex number through the preset filter bank; map the smoothed complex number to the mel scale of the mel spectrum, where one mel is a unit of pitch; and map the smoothed complex number to the mel scale using a second preset formula to obtain the target scale, the second preset formula being:

The logarithmic energy of the target scale is calculated according to a third preset formula, the third preset formula being:

H_m(k) is the frequency response of the filter bank, and M represents the number of filters in the preset filter bank.
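The second and third preset formulas are likewise not reproduced in this text. The forms most commonly used for these two steps are given below as an assumption only: the mel mapping of a frequency f in hertz, and the logarithmic energy s(m) of the m-th filter output computed from the power spectrum |X[k]|^2. The symbols f, s(m) and the index ranges are introduced here for illustration and may not match the original formulas.

```latex
\mathrm{mel}(f) = 2595 \log_{10}\!\left(1 + \frac{f}{700}\right)

s(m) = \ln\!\left( \sum_{k=0}^{N-1} \lvert X[k] \rvert^{2}\, H_m(k) \right), \qquad 1 \le m \le M
```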

Optionally, the blacklist identification device based on address information further includes:

The second generating unit 306 is configured to generate a preset blacklist model, and the preset blacklist model is used for blacklist registration.

FIGS. 3 and 4 above describe in detail the blacklist identification device based on address information in the embodiments of this application from the perspective of modular functional entities. The following describes the blacklist identification device based on address information in the embodiments of this application in detail from the perspective of hardware processing.

FIG. 5 is a schematic structural diagram of a blacklist identification device based on address information provided by an embodiment of the present application. The blacklist identification device 500 based on address information may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 501, a memory 509, and one or more storage media 508 (for example, one or more mass storage devices) storing application programs 507 or data 506. The memory 509 and the storage medium 508 may provide short-term storage or persistent storage. The program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the blacklist identification device based on address information. Furthermore, the processor 501 may be configured to communicate with the storage medium 508 and to execute, on the blacklist identification device 500 based on address information, the series of instruction operations in the storage medium 508.

The blacklist identification device 500 based on address information may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. Those skilled in the art can understand that the structure shown in FIG. 5 does not constitute a limitation on the blacklist identification device based on address information, which may include more or fewer components than shown in the figure, combine some components, or arrange the components differently. The processor 501 can perform the functions of the obtaining unit 301, the extraction unit 302, the judging unit 303, the first generating unit 304, the scoring unit 305, and the second generating unit 306 in the foregoing embodiments.

The following describes each component of the blacklist identification device based on address information in detail with reference to FIG. 5:

The processor 501 is the control center of the blacklist identification device based on address information and can perform processing according to the configured blacklist identification method based on address information. The processor 501 uses various interfaces and lines to connect the parts of the entire device and, by running or executing software programs and/or modules stored in the memory 509 and calling data stored in the memory 509, performs the various functions of the device: it divides the blacklist feature database into smaller blacklist feature sub-databases and compares voiceprint features against the blacklist feature sub-database corresponding to the address information, thereby improving voiceprint recognition efficiency. The storage medium 508 and the memory 509 are both carriers for storing data. In the embodiments of the present application, the storage medium 508 may refer to an internal memory with a small storage capacity but a fast speed, and the memory 509 may be an external memory with a large storage capacity but a slow storage speed.

The memory 509 may be used to store software programs and modules. The processor 501 executes the various functional applications and data processing of the blacklist identification device 500 based on address information by running the software programs and modules stored in the memory 509. The memory 509 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required by a function (for example, performing feature extraction on the audio through a preset algorithm to obtain a feature file); the data storage area may store data created through use of the blacklist identification device based on address information (for example, the status code of extraction failure). In addition, the memory 509 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. The program of the blacklist identification method based on address information and the received data stream provided in the embodiments of the present application are stored in the memory 509, and the processor 501 calls them from the memory 509 when needed.

The present application also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to execute the following steps of the blacklist identification method based on address information:

Acquire a voice file of the target user, where the voice file includes audio and address information, and the address information includes an incoming telephone segment and/or Internet Protocol (IP) address information;

Perform feature extraction on the audio by using a preset algorithm to obtain a feature file;

Determine whether the feature file is valid;

If the feature file is invalid, generate a status code of extraction failure, where the status code indicates the reason for the extraction failure;

If the feature file is valid, determine the target address range to which the target user belongs according to the incoming telephone segment or the IP information, call the preset blacklist model and the target address range to score the similarity of the feature file, and perform a corresponding operation according to the scoring result.

When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or twisted pair) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, a solid state disk (SSD)).

Those skilled in the art can clearly understand that, for the convenience and conciseness of the description, the specific working process of the above-described system, device, and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.

Claims

A blacklist identification method based on address information, comprising:

Acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or Internet Protocol (IP) address information;

Performing feature extraction on the audio by using a preset algorithm to obtain a feature file;

Determining whether the feature file is valid;

If the feature file is invalid, generating a status code of extraction failure, the status code being used to indicate the reason for the extraction failure;

If the feature file is valid, determining the target address range to which the target user belongs according to the incoming telephone segment or the IP information, calling a preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result.
The blacklist identification method based on address information according to claim 1, wherein the step of, if the feature file is valid, determining the target address range to which the target user belongs according to the incoming telephone segment or the IP information, calling the preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result comprises:

If the feature file is valid, determining the target address range to which the target user belongs based on the incoming telephone segment or the IP information;

Determining, according to the target address range, a corresponding target blacklist model among the preset blacklist models, each preset blacklist model corresponding to a different blacklist feature sub-database;

Scoring the similarity of the feature file through the target blacklist model to obtain a target score;

If the target score is greater than or equal to a first threshold, determining that the target user is in the blacklist feature database corresponding to the target blacklist model, and returning a first prompt message, the first prompt message being used to indicate that the target user is prohibited from receiving normal services;

If the target score is less than the first threshold, determining that the target user is not in the blacklist feature database corresponding to the target blacklist model, and returning a second prompt message, the second prompt message being used to indicate that the target user may receive normal services.
The blacklist identification method based on address information according to claim 1, wherein the acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or IP information, comprises:

Receiving the voice file of the target user;

Parsing the voice file to obtain the audio and an address identifier of the target user;

Querying a preset table according to the address identifier to obtain the address information corresponding to the address identifier, the address information including an incoming telephone segment and/or IP information.
The blacklist identification method based on address information according to claim 1, wherein the performing feature extraction on the audio by using a preset algorithm to obtain a feature file comprises:

Converting the audio from an analog signal form to a digital signal form;

Pre-emphasizing the audio in digital signal form;

Windowing the pre-emphasized audio;

Performing discrete Fourier transform on the windowed audio to obtain a target complex number;

Mapping the target complex number to the mel spectrum to obtain logarithmic energy;

Converting the logarithmic energy to obtain cepstral coefficients;

Calculating the energy and the difference according to the cepstral coefficients to generate the feature file.
The blacklist identification method based on address information according to claim 4, wherein the performing discrete Fourier transform on the windowed audio to obtain the target complex number comprises:

Obtaining the windowed audio signal x[n], ..., x[m], where n and m are integers greater than 0;

Calling a first preset formula to generate the target complex number X[k], the first preset formula being:
N is a power of 2, k is an integer, and X[k] represents the amplitude and phase of a certain frequency component in the windowed audio signal.
The blacklist identification method based on address information according to claim 5, wherein the mapping the target complex number to the mel spectrum to obtain logarithmic energy comprises:

Smoothing the target complex number through a preset filter bank;

Mapping the smoothed complex number to the mel scale of the mel spectrum, one mel being a unit of pitch;

Mapping the smoothed complex number to the mel scale by a second preset formula to obtain a target scale, the second preset formula being:

Calculating the logarithmic energy of the target scale according to a third preset formula, the third preset formula being:

H_m(k) is the frequency response of the filter bank, and M represents the number of filters in the preset filter bank.
The blacklist identification method based on address information according to any one of claims 1-6, wherein before the acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or IP information, the method further comprises:

Generating a preset blacklist model, the preset blacklist model being used for blacklist registration.
A blacklist identification device based on address information, comprising:

An obtaining unit, configured to obtain a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or Internet Protocol (IP) address information;

An extraction unit, configured to perform feature extraction on the audio by using a preset algorithm to obtain a feature file;

A judging unit, configured to judge whether the feature file is valid;

A first generating unit, configured to generate, if the feature file is invalid, a status code of extraction failure, the status code being used to indicate the reason for the extraction failure;

A scoring unit, configured to, if the feature file is valid, determine the target address range to which the target user belongs according to the incoming telephone segment or the IP information, call a preset blacklist model and the target address range to score the similarity of the feature file, and perform a corresponding operation according to the scoring result.
The blacklist identification device based on address information according to claim 8, wherein the scoring unit is specifically configured to:

If the feature file is valid, determine the target address range to which the target user belongs based on the incoming telephone segment or the IP information;

Determine, according to the target address range, a corresponding target blacklist model among the preset blacklist models, each preset blacklist model corresponding to a different blacklist feature sub-database;

Score the similarity of the feature file through the target blacklist model to obtain a target score;

If the target score is greater than or equal to a first threshold, determine that the target user is in the blacklist feature database corresponding to the target blacklist model, and return a first prompt message, the first prompt message being used to indicate that the target user is prohibited from receiving normal services;

If the target score is less than the first threshold, determine that the target user is not in the blacklist feature database corresponding to the target blacklist model, and return a second prompt message, the second prompt message being used to indicate that the target user may receive normal services.
The blacklist identification device based on address information according to claim 8, wherein the obtaining unit is specifically configured to:

Receive the voice file of the target user;

Parse the voice file to obtain the audio and an address identifier of the target user;

Query a preset table according to the address identifier to obtain the address information corresponding to the address identifier, the address information including an incoming telephone segment and/or IP information.
The blacklist identification device based on address information according to claim 8, wherein the extraction unit comprises:

A first conversion module, configured to convert the audio from an analog signal form to a digital signal form;

A pre-emphasis module, configured to pre-emphasize the audio in digital signal form;

A windowing module, configured to window the pre-emphasized audio;

A transform module, configured to perform discrete Fourier transform on the windowed audio to obtain a target complex number;

A corresponding module, configured to map the target complex number to the mel spectrum to obtain logarithmic energy;

A second conversion module, configured to convert the logarithmic energy to obtain cepstral coefficients;

A calculation module, configured to calculate the energy and the difference according to the cepstral coefficients to generate the feature file.
The blacklist identification device based on address information according to claim 11, wherein the transform module is specifically configured to:

Obtain the windowed audio signal x[n], ..., x[m], where n and m are integers greater than 0;

Call a first preset formula to generate the target complex number X[k], the first preset formula being:
N is a power of 2, k is an integer, and X[k] represents the amplitude and phase of a certain frequency component in the windowed audio signal.
The blacklist identification device based on address information according to claim 12, wherein the corresponding module is specifically configured to:

Smooth the target complex number through a preset filter bank;

Map the smoothed complex number to the mel scale of the mel spectrum, one mel being a unit of pitch;

Map the smoothed complex number to the mel scale by a second preset formula to obtain a target scale, the second preset formula being:

Calculate the logarithmic energy of the target scale according to a third preset formula, the third preset formula being:

H_m(k) is the frequency response of the filter bank, and M represents the number of filters in the preset filter bank.
The blacklist identification device based on address information according to any one of claims 8-13, wherein the device further comprises:

A second generating unit, configured to generate a preset blacklist model, the preset blacklist model being used for blacklist registration.
A blacklist identification device based on address information, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:

Acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or Internet Protocol (IP) address information;

Performing feature extraction on the audio by using a preset algorithm to obtain a feature file;

Determining whether the feature file is valid;

If the feature file is invalid, generating a status code of extraction failure, the status code being used to indicate the reason for the extraction failure;

If the feature file is valid, determining the target address range to which the target user belongs according to the incoming telephone segment or the IP information, calling a preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result.
The blacklist identification device based on address information according to claim 15, wherein, when executing the computer program to implement the step of, if the feature file is valid, determining the target address range to which the target user belongs according to the incoming telephone segment or the IP information, calling the preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result, the processor performs the following steps:

If the feature file is valid, determining the target address range to which the target user belongs based on the incoming telephone segment or the IP information;

Determining, according to the target address range, a corresponding target blacklist model among the preset blacklist models, each preset blacklist model corresponding to a different blacklist feature sub-database;

Scoring the similarity of the feature file through the target blacklist model to obtain a target score;

If the target score is greater than or equal to a first threshold, determining that the target user is in the blacklist feature database corresponding to the target blacklist model, and returning a first prompt message, the first prompt message being used to indicate that the target user is prohibited from receiving normal services;

If the target score is less than the first threshold, determining that the target user is not in the blacklist feature database corresponding to the target blacklist model, and returning a second prompt message, the second prompt message being used to indicate that the target user may receive normal services.
The blacklist identification device based on address information according to claim 15, wherein, when executing the computer program to implement the acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or IP information, the processor performs the following steps:

Receiving the voice file of the target user;

Parsing the voice file to obtain the audio and an address identifier of the target user;

Querying a preset table according to the address identifier to obtain the address information corresponding to the address identifier, the address information including an incoming telephone segment and/or IP information.
The blacklist identification device based on address information according to claim 15, wherein, when executing the computer program to implement the performing feature extraction on the audio by using a preset algorithm to obtain a feature file, the processor performs the following steps:

Converting the audio from an analog signal form to a digital signal form;

Pre-emphasizing the audio in digital signal form;

Windowing the pre-emphasized audio;

Performing discrete Fourier transform on the windowed audio to obtain a target complex number;

Mapping the target complex number to the mel spectrum to obtain logarithmic energy;

Converting the logarithmic energy to obtain cepstral coefficients;

Calculating the energy and the difference according to the cepstral coefficients to generate the feature file.
The blacklist identification device based on address information according to claim 18, wherein, when executing the computer program to implement the performing discrete Fourier transform on the windowed audio to obtain the target complex number, the processor performs the following steps:

Obtaining the windowed audio signal x[n], ..., x[m], where n and m are integers greater than 0;

Calling a first preset formula to generate the target complex number X[k], the first preset formula being:
N is a power of 2, k is an integer, and X[k] represents the amplitude and phase of a certain frequency component in the windowed audio signal.
A computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the following steps:

Acquiring a voice file of a target user, the voice file including audio and address information, and the address information including an incoming telephone segment and/or Internet Protocol (IP) address information;

Performing feature extraction on the audio by using a preset algorithm to obtain a feature file;

Determining whether the feature file is valid;

If the feature file is invalid, generating a status code of extraction failure, the status code being used to indicate the reason for the extraction failure;

If the feature file is valid, determining the target address range to which the target user belongs according to the incoming telephone segment or the IP information, calling a preset blacklist model and the target address range to score the similarity of the feature file, and performing a corresponding operation according to the scoring result.