CN112581967B - Voiceprint retrieval method, front-end device and back-end server - Google Patents

Voiceprint retrieval method, front-end device and back-end server

Info

Publication number
CN112581967B
CN112581967B (application CN202011228722.4A)
Authority
CN
China
Prior art keywords
voiceprint
speaker
voice
database
end equipment
Prior art date
Legal status
Active
Application number
CN202011228722.4A
Other languages
Chinese (zh)
Other versions
CN112581967A (en)
Inventor
叶林勇
肖龙源
李稀敏
Current Assignee
Xiamen Kuaishangtong Technology Co Ltd
Original Assignee
Xiamen Kuaishangtong Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Xiamen Kuaishangtong Technology Co Ltd filed Critical Xiamen Kuaishangtong Technology Co Ltd
Priority to CN202011228722.4A priority Critical patent/CN112581967B/en
Publication of CN112581967A publication Critical patent/CN112581967A/en
Application granted granted Critical
Publication of CN112581967B publication Critical patent/CN112581967B/en

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification techniques
    • G10L17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voiceprint retrieval method, a front-end device and a back-end server. In the method, collected voice data is labeled with speaker IDs, and voiceprint features are extracted from the voice data at the back-end server; a voiceprint database is constructed from the speaker IDs and the voiceprint features; the voiceprint database is exported and registered on the front-end device; and voiceprint retrieval is performed on current voice data extracted by the front-end device to obtain the current speaker ID. This improves the efficiency of voiceprint retrieval on voice collected through chat tools such as QQ and WeChat on the front-end device, and solves the technical problem that existing voiceprint retrieval methods depend too heavily on a back-end server and cannot realize offline retrieval.

Description

Voiceprint retrieval method, front-end device and back-end server
Technical Field
The invention relates to the technical field of voiceprint recognition, and in particular to a voiceprint retrieval method, and to a front-end device and a back-end server applying the method.
Background
Voiceprint recognition (Voiceprint Recognition, VPR), also known as speaker recognition (Speaker Recognition), is a technique for identifying a speaker from his or her voice: each person's voice carries unique biological characteristics. Like fingerprint recognition, voiceprint recognition offers high safety and reliability and can be applied wherever identity verification is needed, for example in criminal investigation and in financial fields such as banking, securities and insurance. Compared with traditional identification technologies, voiceprint identification has a simple extraction process, low cost, uniqueness, and is difficult to counterfeit or impersonate.
The main tasks of voiceprint recognition include voice signal processing, voiceprint feature extraction, voiceprint modeling, voiceprint matching and decision making. Voiceprint matching currently falls into two application scenarios: 1:1 comparison and 1:N retrieval.
The 1:N voiceprint retrieval scenario is prepared in two steps:
step one: using a pre-trained voiceprint model, extract N-dimensional voiceprint feature vectors from the labeled voices of M speakers;
step two: store the extracted N-dimensional feature vectors of the M speakers in a database.
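The two preparation steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the pre-trained voiceprint model is abstracted as a hypothetical `extract_embedding` stub that returns a deterministic N-dimensional vector, and the database is a plain dictionary keyed by speaker ID.

```python
import random

N_DIM = 8  # N: embedding dimensionality (the patent leaves N model-dependent)

def extract_embedding(voice_samples, n_dim=N_DIM):
    # Stand-in for the pre-trained voiceprint model: maps a speaker's
    # labeled voice data to an N-dimensional feature vector.
    rng = random.Random(hash(tuple(voice_samples)) & 0xFFFFFFFF)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_dim)]

def build_voiceprint_database(labeled_voices):
    # Steps one and two: extract an N-dim vector per labeled speaker and
    # store it keyed by speaker ID, giving an M x N database.
    return {speaker_id: extract_embedding(samples)
            for speaker_id, samples in labeled_voices.items()}

# M = 2 speakers, each with (toy) labeled voice data
db = build_voiceprint_database({"001": [0.1, 0.2], "002": [0.3, 0.4]})
```

In a real system the stub would be replaced by an actual embedding model and the dictionary by persistent storage, but the M-entries-of-N-dimensions shape is the same.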
In traditional 1:N voiceprint retrieval, voice collected by a front-end device such as a mobile phone must be transmitted to the back-end server, where voiceprint features are extracted from the voice and compared in sequence against the N-dimensional feature vectors of the M speakers in the voiceprint database, until all M have been compared.
With this traditional scheme, performing voiceprint retrieval on voice extracted by a front-end device from QQ or WeChat requires sending the voice to the back-end server, performing 1:N voiceprint retrieval there, and returning the result to the front-end device once the retrieval finishes. Because the voices of all front-end devices must be sent to the back-end server, the bandwidth pressure on the server is high and transmission is time-consuming; the front-end device must also wait a long time for the server to analyze and return the retrieval result, which is inefficient. In particular, in the absence of a network, an offline front-end device cannot perform 1:N voiceprint retrieval at all.
Disclosure of Invention
The invention mainly aims to provide a voiceprint retrieval method, a front-end device and a back-end server, so as to solve the technical problem that existing voiceprint retrieval methods depend too heavily on the back-end server and cannot realize offline retrieval.
In order to achieve the above object, the present invention provides a voiceprint retrieval method, which includes the steps of:
labeling the collected voice data by using a speaker ID, and extracting voiceprint features from the voice data at a back-end server;
constructing a voiceprint database according to the speaker ID and the voiceprint characteristics;
exporting and registering the voiceprint database on front-end equipment;
and performing voiceprint retrieval on the current voice data extracted by the front-end equipment to obtain the current speaker ID.
Preferably, the voiceprint database is constructed by inputting the voiceprint features corresponding to M speaker IDs into a pre-trained model and outputting an N-dimensional voiceprint feature vector for each speaker ID; the N-dimensional feature vectors of the M speaker IDs are stored in a database, establishing a voiceprint database with capacity M × N; in the voiceprint database, each speaker's voiceprint feature vector is indexed by the speaker ID.
Preferably, registering the voiceprint database on the front-end device means using a voiceprint database export tool to export each speaker ID and its corresponding voiceprint features and register them on the front-end device; the voiceprint features of the M speakers are exported to M ark files respectively.
Further, the data storage format of the voiceprint features is: model name|model version|[X1, X2, X3, ..., Xn], where X1 to Xn are the components of the N-dimensional voiceprint feature vector extracted for each speaker. When the voiceprint features are imported into a front-end device, the model name and model version of the ark file are further matched against the local model name and local model version of the front-end device; if the model names and/or model versions are inconsistent, the import of the speaker's voiceprint features fails.
Further, the ark file is named with the speaker ID. When the voiceprint features are imported into the front-end device, the speaker ID of the ark file is further matched against the local speaker IDs; if a matching speaker ID already exists in the local voiceprint database, the import of the voiceprint features fails.
Preferably, the current voice data is extracted by using a voice extraction tool to obtain the voice file of the instant messaging software; the voice file is format-converted by a voice conversion tool and stored in the cache of the front-end device.
Furthermore, the voice file extracted by the voice extraction tool is in SILK compression format, and the voice conversion tool converts it from SILK to WAV format.
Further, voiceprint retrieval on the front-end device means extracting voiceprint features from the voice files in the front-end device's cache, comparing the extracted features with those in the locally registered voiceprint database, and judging, according to similarity and/or confidence, that the speaker ID corresponding to a voiceprint feature in the database and the current voice data belong to the same speaker.
Corresponding to the voiceprint retrieval method, the invention provides front-end equipment, which comprises:
the voice acquisition module is used for acquiring current voice data;
the data storage module is used for importing a pre-constructed voiceprint database; the voiceprint database is constructed by labeling pre-collected voice data with speaker IDs, extracting voiceprint features from the voice data at a back-end server, and constructing the voiceprint database according to the speaker IDs and the voiceprint features;
And the voiceprint retrieval module is used for extracting current voiceprint characteristics from the current voice data, and performing voiceprint retrieval on the voiceprint database according to the current voiceprint characteristics to obtain the current speaker ID.
In addition, to achieve the above object, the present invention also provides a back-end server, including:
the data importing module is used for importing pre-acquired voice data;
the data processing module is used for marking the speaker ID of the voice data and extracting voiceprint features of the voice data;
the voiceprint database construction module is used for constructing a voiceprint database according to the speaker ID and the voiceprint characteristics;
and the data export module is used for exporting and registering the voiceprint database to the front-end equipment.
The beneficial effects of the invention are as follows:
(1) The voiceprint database is constructed on the back-end server and then exported and registered on the front-end device, so that during retrieval the voiceprint features are matched directly on the front-end device to identify the current speaker ID; this solves the problem that, because voice data could not previously be registered on the front-end device, offline devices could not carry out voiceprint retrieval;
(2) Voiceprint retrieval is performed directly on the front-end device, without data interaction with the back-end server during the retrieval process, which greatly improves the efficiency of voiceprint retrieval on voice collected through chat tools such as QQ and WeChat on the front-end device;
(3) When the voiceprint feature file is imported, the model name and model version are further matched, effectively protecting the data security of the voiceprint data registered on the back-end server;
(4) The voiceprint feature file is named with the speaker ID, and speaker ID matching is further performed when the file is imported, which avoids repeated registration of voiceprint data occupying the storage space of the front-end device.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the specific embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
First embodiment (retrieval method)
The voiceprint retrieval method of the embodiment comprises the following steps:
a. labeling the collected voice data by using a speaker ID, and extracting voiceprint features from the voice data at a back-end server;
b. constructing a voiceprint database according to the speaker ID and the voiceprint characteristics;
c. exporting and registering the voiceprint database on front-end equipment;
d. and performing voiceprint retrieval on the current voice data extracted by the front-end equipment to obtain the current speaker ID.
In step a, the voice data may be collected by a recording device, collected by an intelligent terminal with a microphone such as a mobile phone, or imported from a third-party voice database. Speaker ID labeling can be performed on external equipment or on the back-end server, and can be done manually or automatically by a model.
In step b, the voiceprint database is constructed by inputting the voiceprint features corresponding to M speaker IDs into a pre-trained model and outputting an N-dimensional voiceprint feature vector for each speaker ID; the N-dimensional feature vectors of the M speaker IDs are stored in a database, establishing a voiceprint database with capacity M × N. In the voiceprint database, each speaker's voiceprint feature vector is indexed by the speaker ID, so a speaker's N-dimensional vector can be looked up by ID. In this embodiment, the voiceprint features are MFCC (Mel-Frequency Cepstral Coefficient) features.
In step c, registering the voiceprint database on the front-end device means using a voiceprint database export tool to export each speaker ID and its corresponding voiceprint features and register them on the front-end device; the voiceprint features of the M speakers are exported to M ark files respectively.
In this embodiment, the data storage format of the voiceprint features is: model name|model version|[X1, X2, X3, ..., Xn], where X1 to Xn are the components of the N-dimensional voiceprint feature vector extracted for each speaker. When the voiceprint features are imported into a front-end device, the model name and model version of the ark file are further matched against the local model name and local model version of the front-end device; if the model names and/or model versions are inconsistent, the import of the speaker's voiceprint features fails.
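A minimal sketch of this storage format and the import-time model check is shown below. The field layout follows the format stated above (model name|model version|[X1, ..., Xn]); the function names and the text-based encoding are illustrative assumptions, not the patent's actual on-disk ark encoding.

```python
def serialize_voiceprint(model_name, model_version, vector):
    # Write one feature record in the stated layout:
    # model name|model version|[X1, X2, ..., Xn]
    body = "[" + ", ".join(f"{x:.6f}" for x in vector) + "]"
    return f"{model_name}|{model_version}|{body}"

def import_voiceprint(record, local_model_name, local_model_version):
    # Import-time check: reject the record when the model name and/or
    # model version does not match the front-end device's local model.
    name, version, body = record.split("|", 2)
    if name != local_model_name or version != local_model_version:
        return None  # import fails
    return [float(x) for x in body.strip("[]").split(",")]

rec = serialize_voiceprint("vpr-model", "1.0", [0.1, 0.2, 0.3])
ok = import_voiceprint(rec, "vpr-model", "1.0")   # versions match
bad = import_voiceprint(rec, "vpr-model", "2.0")  # version mismatch
```

Rejecting mismatched model versions matters because embedding vectors from different model versions live in different feature spaces and cannot be meaningfully compared.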
The ark file is named with the speaker ID. When the voiceprint features are imported into the front-end device, the speaker ID of the ark file is further matched against the local speaker IDs; if a matching speaker ID already exists in the local voiceprint database, the import of the voiceprint features fails. For example, if an ark file indicates a speaker voiceprint registration ID of 001, and a speaker with ID 001 already exists in the local voiceprint database, the speaker's voiceprint features need not be imported again.
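The deduplication rule above can be sketched in a few lines. This is an illustrative stub: the speaker ID is taken from the ark file name, and the local database is a plain dictionary standing in for the device's registered voiceprints.

```python
import os

def register_ark_file(ark_path, local_db):
    # Ark files are named by speaker ID; skip the import when that ID
    # is already registered locally, avoiding duplicate registration
    # that would waste the front-end device's storage.
    speaker_id = os.path.splitext(os.path.basename(ark_path))[0]
    if speaker_id in local_db:
        return False  # import fails: speaker already registered
    local_db[speaker_id] = ark_path  # placeholder for the parsed features
    return True

local_db = {"001": "registered"}                      # 001 already present
first = register_ark_file("/export/001.ark", local_db)   # rejected
second = register_ark_file("/export/002.ark", local_db)  # accepted
```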
In step d, the current voice data is extracted by using a voice extraction tool to obtain the voice file of the instant messaging software; the voice file is format-converted by a voice conversion tool and stored in the cache of the front-end device. The voice file extracted by the voice extraction tool is in SILK compression format, and the voice conversion tool converts it from SILK to WAV format.
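The SILK-to-WAV conversion step might be wrapped as below. The `silk2wav` binary name is a placeholder for whatever decoder the conversion tool actually uses, and the cache layout is an assumption; only the shape of the pipeline (decode into the device cache, then hand the WAV path to feature extraction) comes from the text.

```python
import os

def build_silk_to_wav_cmd(silk_path, cache_dir):
    # Construct the conversion step: SILK-compressed IM voice file ->
    # WAV file placed in the front-end device's cache.
    # "silk2wav" is hypothetical; substitute the actual decoder binary.
    base = os.path.splitext(os.path.basename(silk_path))[0]
    wav_path = os.path.join(cache_dir, base + ".wav")
    return ["silk2wav", silk_path, wav_path], wav_path

cmd, out = build_silk_to_wav_cmd("/chats/msg_0001.silk", "/cache")
# the command would then be executed with subprocess.run(cmd, check=True)
# before extracting MFCC features from the WAV file at `out`
```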
Voiceprint retrieval on the front-end device means extracting voiceprint features from the voice files in the front-end device's cache, comparing the extracted features with those in the locally registered voiceprint database, and judging, according to similarity and/or confidence, that the speaker ID corresponding to a voiceprint feature in the database and the current voice data belong to the same speaker.
Second embodiment (front-end equipment):
the embodiment also correspondingly provides front-end equipment, which comprises:
the voice acquisition module is used for acquiring current voice data;
the data storage module is used for importing a pre-constructed voiceprint database; the voiceprint database is constructed by labeling pre-collected voice data with speaker IDs, extracting voiceprint features from the voice data at a back-end server, and constructing the voiceprint database according to the speaker IDs and the voiceprint features;
And the voiceprint retrieval module is used for extracting current voiceprint characteristics from the current voice data, and performing voiceprint retrieval on the voiceprint database according to the current voiceprint characteristics to obtain the current speaker ID.
To extract the current voiceprint features from the current voice data, the voice files of instant messaging software such as WeChat and QQ, which are stored in SILK compression format, are first obtained; a purpose-built SILK-to-WAV voice conversion tool then converts them to WAV format and stores them in the cache of the front-end device, after which MFCC features are extracted from the cached WAV audio.
In this embodiment, performing voiceprint retrieval on the voiceprint database according to the current voiceprint features means running 1:N voiceprint retrieval between the extracted MFCC features and the locally registered voiceprint database: feature similarity is compared one by one, the 5 voiceprint features with the highest scores are output together with confidence scores, and when the confidence score of one of the 5 results exceeds a threshold x, the corresponding voice data in the voiceprint database and the retrieved current voice data are judged to belong to the same speaker.
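The 1:N matching loop described above can be sketched as follows. The patent does not specify the similarity measure, so cosine similarity is used here as a common choice; the top-5 cutoff and the acceptance threshold x follow the text, while the concrete threshold value is an illustrative assumption.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve_speaker(query_vec, voiceprint_db, top_k=5, threshold=0.8):
    # 1:N retrieval: score the query against every registered vector,
    # keep the top-k scores, and accept a speaker only when the best
    # score clears the confidence threshold x.
    scored = sorted(((cosine(query_vec, vec), sid)
                     for sid, vec in voiceprint_db.items()), reverse=True)
    top = scored[:top_k]
    best_score, best_id = top[0]
    return (best_id if best_score > threshold else None), top

# toy 2-dimensional database with M = 3 registered speakers
db = {"001": [1.0, 0.0], "002": [0.0, 1.0], "003": [0.7, 0.7]}
match, top5 = retrieve_speaker([0.9, 0.1], db)  # closest to speaker 001
```

Because the comparison runs entirely against the locally registered database, no round trip to the back-end server is needed, which is the efficiency point the embodiment makes.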
The front-end device may be a mobile or fixed terminal with voice input, such as a mobile phone, tablet computer or smart speaker, and may comprise a memory, a processor, an input unit, a display unit, a power supply and other components.
The voice acquisition module can be a recording tool or an instant messaging tool such as QQ and WeChat, and the voiceprint retrieval module can be APP software with an identity verification function or APP software special for voiceprint retrieval. The voiceprint database can be imported through a voiceprint importing tool on the front-end equipment or imported through voiceprint database importing equipment of a third party.
The memory may be used to store software programs and modules, and the processor executes the software programs and modules stored in the memory to perform various functional applications and data processing. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, application programs required for at least one function, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory may also include a memory controller to provide access to the memory by the processor and the input unit.
The input unit, in addition to a voice acquisition module such as a microphone, may be used to receive input digital, character or image information, and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
The display unit may be used to display information entered by a user or provided to a user and various graphical user interfaces of the back-end server, which may be composed of graphics, text, icons, video and any combination thereof. The display unit may include a display panel, and alternatively, the display panel may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
Third embodiment (backend server):
the embodiment also provides a backend server, which includes:
the data importing module is used for importing pre-acquired voice data;
the data processing module is used for marking the speaker ID of the voice data and extracting voiceprint features of the voice data;
the voiceprint database construction module is used for constructing a voiceprint database according to the speaker ID and the voiceprint characteristics;
and the data export module is used for exporting and registering the voiceprint database to the front-end equipment.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and identical or similar parts among the embodiments may be referred to one another. The front-end device embodiment and the back-end server embodiment are described relatively briefly because they are substantially similar to the method embodiment; for relevant details, refer to the description of the method embodiment.
Also, herein, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
While the foregoing describes the preferred embodiments of the present invention, the invention is not limited to the forms disclosed herein; it is capable of use in various other combinations, modifications and environments, and of changes within the scope of the inventive concept, whether guided by the teachings above or by the skill or knowledge of the relevant art. Modifications and variations that do not depart from the spirit and scope of the invention are intended to fall within the scope of the appended claims.

Claims (9)

1. A voiceprint retrieval method, characterized by comprising the following steps:
labeling the collected voice data by using a speaker ID, and extracting voiceprint features from the voice data at a back-end server;
constructing a voiceprint database according to the speaker ID and the voiceprint characteristics;
exporting and registering the voiceprint database on front-end equipment;
performing voiceprint retrieval on the current voice data extracted by the front-end equipment to obtain a current speaker ID;
the voiceprint database being registered on the front-end equipment means that each speaker ID and the voiceprint features corresponding to the speaker ID are exported and registered on the front-end equipment by using a voiceprint database export tool; the data storage format of the voiceprint features is: model name|model version|[X1, X2, X3, ..., Xn], where X1 to Xn are the components of the N-dimensional voiceprint feature vector extracted for each speaker; when the voiceprint features are imported into the front-end equipment, the model name and model version are further matched against the local model name and local model version of the front-end equipment; if the model names and/or model versions are inconsistent, the import of the speaker's voiceprint features fails.
2. The voiceprint retrieval method of claim 1, wherein: the voiceprint database is constructed by inputting voiceprint features corresponding to M speaker IDs into a pre-trained model, and outputting N-dimensional voiceprint feature vectors corresponding to each speaker ID; the N-dimensional feature vectors of the M speaker IDs are stored in a database, and a voiceprint database with the capacity of M x N is established; in the voiceprint database, voiceprint feature vectors of each speaker are mapped by using speaker IDs.
3. The voiceprint retrieval method of claim 1, wherein: the voiceprint features of the M speakers are exported to M ark files, respectively.
4. A voiceprint retrieval method according to claim 3, wherein: the ark file is named with a speaker ID; when the voiceprint features are imported into front-end equipment, the speaker ID of the ark file is further matched against the local speaker IDs, and if a matching speaker ID exists in the local voiceprint database, the import of the voiceprint features fails.
5. The voiceprint retrieval method of claim 1, wherein: the current voice data is extracted by extracting a voice file of instant messaging software through a voice extracting tool, and the voice file is subjected to format conversion through a voice conversion tool and is stored in a cache of front-end equipment.
6. The voiceprint retrieval method of claim 5, wherein: and the voice file extracted by the voice extraction tool adopts a SILK compression format, and the voice conversion tool converts the voice file from the SILK compression format to a WAV format.
7. The voiceprint retrieval method of claim 5, wherein: performing voiceprint retrieval on the front-end equipment comprises extracting voiceprint features from voice files in a cache of the front-end equipment, comparing the extracted voiceprint features with the voiceprint features in a locally registered voiceprint database, and judging, according to the similarity and/or confidence, that the speaker ID corresponding to a voiceprint feature in the voiceprint database and the current voice data belong to the same speaker.
8. A front-end device, comprising:
the voice acquisition module is used for acquiring current voice data;
the data storage module is used for importing and registering a pre-constructed voiceprint database on the front-end device; the voiceprint database is formed by labeling pre-collected voice data with speaker IDs, extracting voiceprint features from the voice data at a back-end server, and constructing the voiceprint database according to the speaker IDs and the voiceprint features;
the voiceprint retrieval module is used for extracting current voiceprint characteristics from the current voice data, and performing voiceprint retrieval on the voiceprint database according to the current voiceprint characteristics to obtain a current speaker ID;
the voiceprint database being registered on the front-end equipment means that each speaker ID and the voiceprint features corresponding to the speaker ID are exported and registered on the front-end equipment by using a voiceprint database export tool; the data storage format of the voiceprint features is: model name|model version|[X1, X2, X3, ..., Xn], where X1 to Xn are the components of the N-dimensional voiceprint feature vector extracted for each speaker; when the voiceprint features are imported into the front-end equipment, the model name and model version are further matched against the local model name and local model version of the front-end equipment; if the model names and/or model versions are inconsistent, the import of the speaker's voiceprint features fails.
9. A back-end server, comprising:
the data importing module is used for importing pre-acquired voice data;
the data processing module is used for marking the speaker ID of the voice data and extracting voiceprint features of the voice data;
the voiceprint database construction module is used for constructing a voiceprint database according to the speaker ID and the voiceprint characteristics;
the data export module is used for exporting and registering the voiceprint database to front-end equipment;
the voiceprint database being registered on the front-end equipment means that each speaker ID and the voiceprint features corresponding to the speaker ID are exported and registered on the front-end equipment by using a voiceprint database export tool; the data storage format of the voiceprint features is: model name|model version|[X1, X2, X3, ..., Xn], where X1 to Xn are the components of the N-dimensional voiceprint feature vector extracted for each speaker; when the voiceprint features are imported into the front-end equipment, the model name and model version are further matched against the local model name and local model version of the front-end equipment; if the model names and/or model versions are inconsistent, the import of the speaker's voiceprint features fails.
CN202011228722.4A 2020-11-06 2020-11-06 Voiceprint retrieval method, front-end device and back-end server Active CN112581967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011228722.4A CN112581967B (en) 2020-11-06 2020-11-06 Voiceprint retrieval method, front-end back-end server and back-end server


Publications (2)

Publication Number Publication Date
CN112581967A CN112581967A (en) 2021-03-30
CN112581967B (en) 2023-06-23

Family

ID=75120252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011228722.4A Active CN112581967B (en) 2020-11-06 2020-11-06 Voiceprint retrieval method, front-end back-end server and back-end server

Country Status (1)

Country Link
CN (1) CN112581967B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112992152B (en) * 2021-04-22 2021-09-14 北京远鉴信息技术有限公司 Individual-soldier voiceprint recognition system and method, storage medium and electronic equipment
CN113407768B (en) * 2021-06-24 2024-02-02 深圳市声扬科技有限公司 Voiceprint retrieval method, voiceprint retrieval device, voiceprint retrieval system, voiceprint retrieval server and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN101950564A (en) * 2010-10-13 2011-01-19 镇江华扬信息科技有限公司 Remote digital voice acquisition, analysis and identification system
CN106682090A (en) * 2016-11-29 2017-05-17 上海智臻智能网络科技股份有限公司 Active interaction implementing device, active interaction implementing method and intelligent voice interaction equipment
CN109273008A (en) * 2018-10-15 2019-01-25 腾讯科技(深圳)有限公司 Processing method, device, computer storage medium and the terminal of voice document

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8825482B2 (en) * 2005-09-15 2014-09-02 Sony Computer Entertainment Inc. Audio, video, simulation, and user interface paradigms
CN107104994B (en) * 2016-02-22 2021-07-20 华硕电脑股份有限公司 Voice recognition method, electronic device and voice recognition system


Also Published As

Publication number Publication date
CN112581967A (en) 2021-03-30

Similar Documents

Publication Publication Date Title
CN107038220B (en) Method, intelligent robot and system for generating memorandum
CN107492379B (en) Voiceprint creating and registering method and device
CN109493850B (en) Growing type dialogue device
KR102241972B1 (en) Answering questions using environmental context
CN111243603B (en) Voiceprint recognition method, system, mobile terminal and storage medium
CN112581967B (en) Voiceprint retrieval method, front-end back-end server and back-end server
US7949651B2 (en) Disambiguating residential listing search results
US10810278B2 (en) Contextual deep bookmarking
US11756572B2 (en) Self-supervised speech representations for fake audio detection
CN112669842A (en) Man-machine conversation control method, device, computer equipment and storage medium
CN112836521A (en) Question-answer matching method and device, computer equipment and storage medium
CN104361311A (en) Multi-modal online incremental access recognition system and recognition method thereof
CN112465144A (en) Multi-modal demonstration intention generation method and device based on limited knowledge
CN109637529A (en) Voice-based functional localization method, apparatus, computer equipment and storage medium
CN113051384B (en) User portrait extraction method based on dialogue and related device
CN111400463A (en) Dialog response method, apparatus, device and medium
WO2021159734A1 (en) Data processing method and apparatus, device, and medium
CN116205749A (en) Electronic policy information data management method, device, equipment and readable storage medium
CN112287134B (en) Search model training and recognition method, electronic device and storage medium
CN112328871B (en) Reply generation method, device, equipment and storage medium based on RPA module
CN111625636B (en) Method, device, equipment and medium for rejecting man-machine conversation
CN114528851A (en) Reply statement determination method and device, electronic equipment and storage medium
CN111916086B (en) Voice interaction control method, device, computer equipment and storage medium
CN103928024B (en) A kind of voice inquiry method and electronic equipment
JP5850886B2 (en) Information processing apparatus and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant