CN106782567A - Method and device for establishing voiceprint model - Google Patents

Publication number
CN106782567A
Authority
CN
China
Prior art keywords
audio file
voiceprint model
face video
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611005290.4A
Other languages
Chinese (zh)
Other versions
CN106782567B (en
Inventor
卢道和
陈朝亮
杨军
黄叶飞
杨粟
李晓俊
钟伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN201611005290.4A
Publication of CN106782567A
Application granted
Publication of CN106782567B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/04: Training, enrolment or model building

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Collating Specific Patterns (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for establishing a voiceprint model, wherein the method comprises the following steps: when a face video is obtained and a face image of the face video is successfully identified, extracting an audio file in the face video and recording the audio file as a first audio file; outputting prompt information to prompt an auditor to audit the face video; and when a notification message that the face video is approved is received, establishing a voiceprint model according to the first audio file. The invention further obtains the audio file of the user on the basis of face recognition, establishes the voiceprint model according to the obtained audio file, and confirms that the user is a real user when the face video of the user is received next time and only when the face image in the face video is successfully recognized and the audio file in the face video is matched with the established voiceprint model, thereby improving the accuracy of user recognition.

Description

Method and device for establishing a voiceprint model
Technical field
The present invention relates to the technical field of identity recognition, and in particular to a method and device for establishing a voiceprint model.
Background
With the development of technology, many banking services no longer require a visit to a bank counter: services such as bank card inquiries, account freezing, and account opening can be handled directly by phone or over the Internet. However, handling services this way currently requires entering the bank card account number and password, and if either is entered incorrectly it must be entered again. Moreover, when the user enters a wrong password three times, the bank card is locked and the user cannot handle the corresponding service until the card is unlocked at a bank counter. Therefore, existing solutions can only confirm the user's identity through face recognition.
The above content is only intended to aid understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.
Summary of the invention
The main object of the present invention is to provide a method and device for establishing a voiceprint model, aiming to solve the technical problem of how to improve the accuracy of user identification on the basis of face recognition.
To achieve the above object, the present invention provides a method for establishing a voiceprint model, the method comprising:
when a face video is obtained and the face image in the face video is successfully recognized, extracting the audio file from the face video and recording it as a first audio file;
outputting prompt information to prompt an auditor to audit the face video;
when a notification message that the face video has passed the audit is received, establishing a voiceprint model according to the first audio file.
Preferably, the step of establishing a voiceprint model according to the first audio file when the notification message that the face video has passed the audit is received comprises:
when the notification message that the face video has passed the audit is received, judging whether a voiceprint model already exists;
if no voiceprint model exists, establishing a voiceprint model according to the first audio file;
if a voiceprint model already exists, deleting the existing voiceprint model and extracting the stored second audio file, wherein the second audio file is an audio file that has been successfully registered;
establishing a voiceprint model according to the first audio file and the second audio file.
Preferably, the step of extracting the stored second audio file comprises:
judging whether a preset number of the second audio files are stored;
if the preset number of the second audio files are stored, the step of establishing a voiceprint model according to the first audio file and the second audio file comprises:
establishing a voiceprint model according to the preset number of most recently stored second audio files and the first audio file.
Preferably, after the step of judging whether a preset number of the second audio files are stored, the method further comprises:
if the preset number of the second audio files are not stored, obtaining all of the stored second audio files;
the step of establishing a voiceprint model according to the first audio file and the second audio file comprises:
establishing a voiceprint model according to all of the obtained second audio files and the first audio file.
Preferably, after the step of extracting the audio file from the face video and recording it as the first audio file when the face video is obtained and the face image in the face video is successfully recognized, the method further comprises:
judging whether a voiceprint model already exists;
if no voiceprint model exists, performing the step of outputting prompt information to prompt an auditor to audit the face video;
if a voiceprint model already exists, extracting the audio file corresponding to the voiceprint model and recording it as a third audio file;
comparing the first audio file with the third audio file to obtain the similarity between the first audio file and the third audio file;
sending the similarity between the first audio file and the third audio file to an asynchronous audit system, and performing the step of outputting prompt information to prompt an auditor to audit the face video.
In addition, to achieve the above object, the present invention further provides a device for establishing a voiceprint model, the device comprising:
an extraction module, configured to, when a face video is obtained and the face image in the face video is successfully recognized, extract the audio file from the face video and record it as a first audio file;
an output module, configured to output prompt information to prompt an auditor to audit the face video;
an establishing module, configured to establish a voiceprint model according to the first audio file when a notification message that the face video has passed the audit is received.
Preferably, the establishing module comprises:
a judging unit, configured to judge whether a voiceprint model already exists when the notification message that the face video has passed the audit is received;
an establishing unit, configured to establish a voiceprint model according to the first audio file if no voiceprint model exists;
an extraction unit, configured to, if a voiceprint model already exists, delete the existing voiceprint model and extract the stored second audio file, wherein the second audio file is an audio file that has been successfully registered;
the establishing unit is further configured to establish a voiceprint model according to the first audio file and the second audio file.
Preferably, the judging unit is further configured to judge whether a preset number of the second audio files are stored;
the establishing unit is further configured to, if the preset number of the second audio files are stored, establish a voiceprint model according to the preset number of most recently stored second audio files and the first audio file.
Preferably, the establishing module further comprises:
an acquiring unit, configured to obtain all of the stored second audio files if the preset number of the second audio files are not stored;
the establishing unit is further configured to establish a voiceprint model according to all of the obtained second audio files and the first audio file.
Preferably, the device for establishing a voiceprint model further comprises:
a judging module, configured to judge whether a voiceprint model already exists;
the output module is further configured to, if no voiceprint model exists, output prompt information to prompt an auditor to audit the face video;
the extraction module is further configured to, if a voiceprint model already exists, extract the audio file corresponding to the voiceprint model and record it as a third audio file;
the device for establishing a voiceprint model further comprises:
a comparison module, configured to compare the first audio file with the third audio file to obtain the similarity between the first audio file and the third audio file;
a sending module, configured to send the similarity between the first audio file and the third audio file to an asynchronous audit system.
In the present invention, when a face video is obtained and the face image in the face video is successfully recognized, the audio file in the face video is extracted and recorded as a first audio file; prompt information is output to prompt an auditor to audit the face video; and when a notification message that the face video has passed the audit is received, a voiceprint model is established according to the first audio file. In this way, the user's audio file is further obtained on the basis of face recognition, and a voiceprint model is established according to the obtained audio file. The next time a face video of the user is received, the user is confirmed to be the real user only when the face image in the face video is successfully recognized and the audio file in the face video matches the established voiceprint model, thereby improving the accuracy of user identification.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the first embodiment of the method for establishing a voiceprint model according to the present invention;
Fig. 2 is a schematic flowchart of the second embodiment of the method for establishing a voiceprint model according to the present invention;
Fig. 3 is a functional block diagram of the first embodiment of the device for establishing a voiceprint model according to the present invention;
Fig. 4 is a functional block diagram of the second embodiment of the device for establishing a voiceprint model according to the present invention.
The realization of the object, the functional characteristics, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are merely intended to explain the present invention and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of the first embodiment of the method for establishing a voiceprint model according to the present invention.
In this embodiment, the method for establishing a voiceprint model comprises:
Step S10, when a face video is obtained and the face image in the face video is successfully recognized, extracting the audio file from the face video and recording it as a first audio file;
When a user needs to handle banking services by phone or over the Internet, the bank's server prompts the mobile terminal held by the user to invoke its camera to obtain a face video of the user, where the face video includes the user's face image and an audio file. It should be noted that the server may obtain the face video as follows: while the user's face image is being captured, corresponding digits or text are displayed on the screen of the mobile terminal, and the user reads the displayed digits or text aloud within a certain time; or, while the user's face image is being captured, prompt information is output on the screen of the mobile terminal, prompting the user to read a preset number of utterances aloud within a certain time. The mobile terminal includes, but is not limited to, a smartphone and a tablet computer.
When the face video is obtained, the server extracts the face image from the face video and compares the extracted face image with the pre-stored face image of the user. When the similarity between the face image and the pre-stored face image is greater than or equal to a preset similarity, the server confirms that the face image is successfully recognized; when the similarity is less than the preset similarity, the server confirms that recognition of the face image has failed. The preset similarity can be set according to specific needs, for example to 60%, 70%, or 80%.
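The recognition decision described above reduces to a threshold test. The following is a minimal sketch in Python; the function name `face_recognized` and the default value of 0.7 are illustrative assumptions, and how the similarity score itself is computed is not specified in the text.

```python
def face_recognized(similarity: float, preset_similarity: float = 0.7) -> bool:
    """Decide whether the face image counts as successfully recognized.

    `similarity` is the score between the extracted face image and the
    pre-stored face image (how it is computed is outside this sketch).
    `preset_similarity` is the configurable threshold, e.g. 0.6, 0.7 or 0.8.
    """
    # Recognition succeeds when the similarity is greater than or equal to
    # the preset similarity, and fails when it is strictly below it.
    return similarity >= preset_similarity
```

Note that the boundary case (similarity exactly equal to the threshold) counts as success, matching the "greater than or equal to" wording.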
When the face image is successfully recognized, the server extracts the audio file from the face video and records the extracted audio file as the first audio file.
Step S20, outputting prompt information to prompt an auditor to audit the face video;
When the first audio file is obtained, the server outputs prompt information to the asynchronous audit system to prompt the auditor to audit the authenticity of the face video. It should be noted that, when auditing the authenticity of the face video, the auditor may compare the face image in the face video with the pre-stored face image, of which there may be one or several. When the auditor confirms that the face image in the face video is genuine, that is, that it shows the user himself or herself, the auditor returns a notification message that the audit has passed to the server through the asynchronous audit system; when the auditor confirms that the face image in the face video is not that of the user, the auditor returns a notification message that the audit has failed to the server through the asynchronous audit system.
When the server receives the notification message sent by the asynchronous audit system and determines from it that the face video has failed the audit, the server terminates the process of establishing the voiceprint model.
In this embodiment, the server first extracts the audio file from the face video and then outputs the prompt information. In other embodiments, the server may instead output the prompt information first and, after the face video passes the audit, extract the face image from the face video.
Step S30, when a notification message that the face video has passed the audit is received, establishing a voiceprint model according to the first audio file.
When the server receives the notification message, sent by the asynchronous audit system, that the face video has passed the audit, the server establishes a voiceprint model according to the first audio file extracted from the face video.
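Steps S10 to S30 can be read as a short gated pipeline. The sketch below is one possible reading, not the patented implementation: `FaceVideo`, `run_flow`, `audit`, and `build_model` are hypothetical names, and the asynchronous audit is modeled as a blocking callback for simplicity.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FaceVideo:
    face_similarity: float  # hypothetical: score against the pre-stored face image
    audio: bytes            # hypothetical: audio track carried in the video


def run_flow(
    video: FaceVideo,
    preset_similarity: float,
    audit: Callable[[FaceVideo], bool],
    build_model: Callable[[bytes], Any],
) -> Optional[Any]:
    """Return the voiceprint model, or None if the flow stops early."""
    # S10: only continue when the face image is successfully recognized.
    if video.face_similarity < preset_similarity:
        return None
    first_audio = video.audio  # recorded as the first audio file

    # S20: prompt the auditor; here the audit result is returned directly,
    # whereas the text describes an asynchronous notification message.
    if not audit(video):
        return None  # audit failed: terminate the establishment process

    # S30: audit passed, establish the model from the first audio file.
    return build_model(first_audio)
```

A real deployment would more likely persist the first audio file and resume in a handler when the notification message arrives; the blocking callback only keeps the sketch short.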
Further, step S30 comprises:
Step a, when the notification message that the face video has passed the audit is received, judging whether a voiceprint model already exists;
Step b, if no voiceprint model exists, establishing a voiceprint model according to the first audio file;
Step c, if a voiceprint model already exists, deleting the existing voiceprint model and extracting the stored second audio file, wherein the second audio file is an audio file that has been successfully registered;
Step d, establishing a voiceprint model according to the first audio file and the second audio file.
Further, when the server receives the notification message that the face video has passed the audit, the server judges whether a voiceprint model already exists in the database. When no voiceprint model exists in the database, the server establishes a voiceprint model according to the first audio file. When a voiceprint model already exists in the database, the server deletes it and then extracts the stored second audio file from the database, where the second audio file is an audio file that has been successfully registered in the database. It should be noted that a successfully registered audio file is an audio file from which a voiceprint model has already been established, that is, the audio file corresponding to the deleted historical voiceprint model. When the server obtains the second audio file, the server superimposes the first audio file and the second audio file to obtain the voiceprint model. Superimposing the first audio file and the second audio file optimizes the voiceprint model on the server, so that the established voiceprint model better matches the user's voice characteristics.
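Steps a to d can be sketched as follows. The text does not say how audio files are "superimposed", so this sketch assumes each audio file has already been reduced to a fixed-length feature vector and uses an element-wise mean as a stand-in. `UserRecord`, `superimpose`, and `on_audit_passed` are hypothetical names, and appending the first audio file to the registered list afterwards is an assumption drawn from the statement that a registered audio file is one from which a model has been established.

```python
from dataclasses import dataclass, field
from typing import List, Optional

Features = List[float]  # hypothetical per-file feature vector


def superimpose(samples: List[Features]) -> Features:
    """Stand-in for superimposing audio files: element-wise mean."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


@dataclass
class UserRecord:
    model: Optional[Features] = None                          # current voiceprint model
    registered: List[Features] = field(default_factory=list)  # second audio files


def on_audit_passed(record: UserRecord, first_audio: Features) -> Features:
    if record.model is None:
        # Step b: no model yet, build it from the first audio file alone.
        model = superimpose([first_audio])
    else:
        # Step c: delete the existing model, fetch the registered audio files.
        record.model = None
        # Step d: rebuild from the first and second audio files together.
        model = superimpose(record.registered + [first_audio])
    record.model = model
    # Assumption: the first audio file now counts as successfully registered.
    record.registered.append(first_audio)
    return model
```

Rebuilding from stored audio rather than patching the old model keeps the model a pure function of the registered files, which is consistent with the delete-then-rebuild order in step c.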
Further, the step of extracting the stored second audio file comprises:
Step e, judging whether a preset number of the second audio files are stored;
if the preset number of the second audio files are stored, step d comprises:
Step f, establishing a voiceprint model according to the preset number of most recently stored second audio files and the first audio file.
Further, in the process of extracting the stored second audio file, the server judges whether the preset number of second audio files are stored in its database. The preset number can be set according to specific needs, for example to 3, 5, or 6. When the preset number of second audio files are stored in the database, the server superimposes the most recently stored preset number of second audio files with the first audio file to establish the voiceprint model. For example, when the preset number is set to 5 and at least 5 second audio files are stored in the database, the server extracts the 5 second audio files stored most recently, counting back from the current time, and superimposes them with the first audio file to establish the voiceprint model.
Further, the method for establishing a voiceprint model further comprises:
Step g, if the preset number of the second audio files are not stored, obtaining all of the stored second audio files;
step d comprises:
Step h, establishing a voiceprint model according to all of the obtained second audio files and the first audio file.
When the preset number of second audio files are not stored in the database, the server obtains all of the second audio files stored in the database and superimposes them with the first audio file to establish the voiceprint model. For example, when only three second audio files are stored in the database, the server superimposes those three second audio files with the first audio file to establish the voiceprint model.
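The choice between steps f and h amounts to taking the most recent N stored files when at least N exist, and all of them otherwise. A minimal sketch, assuming the stored second audio files are kept in a sequence ordered from oldest to newest (the function name and the ordering are illustrative assumptions):

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_second_audio(stored: Sequence[T], preset_number: int) -> List[T]:
    """Pick the second audio files to superimpose with the first audio file.

    `stored` is assumed to be ordered from oldest to newest.
    """
    if preset_number <= 0:
        raise ValueError("preset_number must be positive")
    if len(stored) >= preset_number:
        # Step f: the preset number of most recently stored files.
        return list(stored[-preset_number:])
    # Steps g and h: fewer than the preset number are stored, use them all.
    return list(stored)
```

With a preset number of 5, six stored files yield the newest five, while three stored files are all used, as in the examples in the text.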
In this embodiment, when a face video is obtained and the face image in the face video is successfully recognized, the audio file in the face video is extracted and recorded as a first audio file; prompt information is output to prompt an auditor to audit the face video; and when a notification message that the face video has passed the audit is received, a voiceprint model is established according to the first audio file. In this way, the user's audio file is further obtained on the basis of face recognition, and a voiceprint model is established according to the obtained audio file. The next time a face video of the user is received, the user is confirmed to be the real user only when the face image in the face video is successfully recognized and the audio file in the face video matches the established voiceprint model, thereby improving the accuracy of user identification.
Further, referring to Fig. 2, Fig. 2 is a schematic flowchart of the second embodiment of the method for establishing a voiceprint model according to the present invention. The second embodiment is proposed on the basis of the first embodiment.
In this embodiment, the method for establishing a voiceprint model further comprises:
Step S40, judging whether a voiceprint model already exists;
if no voiceprint model exists, performing step S20;
Step S50, if a voiceprint model already exists, extracting the audio file corresponding to the voiceprint model and recording it as a third audio file;
Step S60, comparing the first audio file with the third audio file to obtain the similarity between the first audio file and the third audio file;
Step S70, sending the similarity between the first audio file and the third audio file to the asynchronous audit system.
In this embodiment, step S20 is performed after step S70 has been performed.
When the server has extracted the face image from the face video, the server judges whether a voiceprint model already exists in the database. When no voiceprint model exists in the database, the server outputs prompt information to the asynchronous audit system, so that the asynchronous audit system prompts the auditor to audit the face video. It can be understood that the absence of a voiceprint model in the database indicates that the server is obtaining the user's face video for the first time. It should be noted that the server and the asynchronous audit system may reside on the same computer or on two separate computers.
When a voiceprint model already exists in the database, the server extracts the audio file corresponding to the voiceprint model, that is, the audio file from which the voiceprint model was established, and records it as the third audio file. When the third audio file is obtained, the server compares the first audio file with the third audio file to obtain the similarity between them. The server sends this similarity to the asynchronous audit system and outputs prompt information to the asynchronous audit system, so that the asynchronous audit system prompts the auditor to audit the face video. When the asynchronous audit passes, the server establishes the voiceprint model; when the asynchronous audit does not pass, the server terminates the process of establishing the voiceprint model. The preset threshold can be set according to specific needs, for example to 60%, 70%, or 85%.
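Steps S40 to S70 followed by S20 can be sketched as below. The text does not specify how two audio files are compared, so cosine similarity over hypothetical feature vectors is used here purely as an illustration. A `queue.Queue` stands in for the asynchronous audit system, and the message dictionaries are invented for the example.

```python
import math
import queue
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Illustrative similarity measure; the source does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def before_audit(
    first_audio: Sequence[float],
    third_audio: Optional[Sequence[float]],
    audit_queue: "queue.Queue[dict]",
) -> Optional[float]:
    """Run S40 to S70, then S20; return the similarity if one was computed."""
    similarity = None
    if third_audio is not None:
        # S50, S60: a model exists, compare against its source audio.
        similarity = cosine_similarity(first_audio, third_audio)
        # S70: forward the similarity to the asynchronous audit system.
        audit_queue.put({"type": "similarity", "value": similarity})
    # S20: in both branches the auditor is prompted to audit the face video.
    audit_queue.put({"type": "prompt", "text": "please audit the face video"})
    return similarity
```

The similarity is advisory in this sketch: it is handed to the auditor alongside the prompt, and the decision to establish the model still depends on the audit result.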
In this embodiment, the first audio file is extracted from the face video and, when a voiceprint model already exists in the server's database, the third audio file corresponding to the voiceprint model is extracted and compared with the first audio file, and subsequent operations are carried out according to the comparison result. This improves the accuracy of the established voiceprint model, so that it better matches the user's real voice characteristics.
The present invention further provides a device for establishing a voiceprint model.
Referring to Fig. 3, Fig. 3 is a functional block diagram of the first embodiment of the device for establishing a voiceprint model according to the present invention.
In this embodiment, the device for establishing a voiceprint model comprises:
an extraction module 10, configured to, when a face video is obtained and the face image in the face video is successfully recognized, extract the audio file from the face video and record it as a first audio file;
When a user needs to handle banking services by phone or over the Internet, the bank's server prompts the mobile terminal held by the user to invoke its camera to obtain a face video of the user, where the face video includes the user's face image and an audio file. It should be noted that the server may obtain the face video as follows: while the user's face image is being captured, corresponding digits or text are displayed on the screen of the mobile terminal, and the user reads the displayed digits or text aloud within a certain time; or, while the user's face image is being captured, prompt information is output on the screen of the mobile terminal, prompting the user to read a preset number of utterances aloud within a certain time. The mobile terminal includes, but is not limited to, a smartphone and a tablet computer.
When the face video is obtained, the server extracts the face image from the face video and compares the extracted face image with the pre-stored face image of the user. When the similarity between the face image and the pre-stored face image is greater than or equal to a preset similarity, the server confirms that the face image is successfully recognized; when the similarity is less than the preset similarity, the server confirms that recognition of the face image has failed. The preset similarity can be set according to specific needs, for example to 60%, 70%, or 80%.
When the face image is successfully recognized, the server extracts the audio file from the face video and records the extracted audio file as the first audio file.
an output module 20, configured to output prompt information to prompt an auditor to audit the face video;
When the first audio file is obtained, the server outputs prompt information to the asynchronous audit system to prompt the auditor to audit the authenticity of the face video. It should be noted that, when auditing the authenticity of the face video, the auditor may compare the face image in the face video with the pre-stored face image, of which there may be one or several. When the auditor confirms that the face image in the face video is genuine, that is, that it shows the user himself or herself, the auditor returns a notification message that the audit has passed to the server through the asynchronous audit system; when the auditor confirms that the face image in the face video is not that of the user, the auditor returns a notification message that the audit has failed to the server through the asynchronous audit system.
When the server receives the notification message sent by the asynchronous audit system and determines from it that the face video has failed the audit, the server terminates the process of establishing the voiceprint model.
In this embodiment, the server first extracts the audio file from the face video and then outputs the prompt information. In other embodiments, the server may instead output the prompt information first and, after the face video passes the audit, extract the face image from the face video.
an establishing module 30, configured to establish a voiceprint model according to the first audio file when a notification message that the face video has passed the audit is received.
When the server receives the notification message, sent by the asynchronous audit system, that the face video has passed the audit, the server establishes a voiceprint model according to the first audio file extracted from the face video.
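The three modules of the device can be pictured as small cooperating objects. The sketch below is only one way to lay them out: the class and method names are hypothetical, the face similarity score is passed in rather than computed, and `build_model` is a placeholder for whatever procedure actually establishes the voiceprint model.

```python
from typing import Any, Callable, List, Optional


class ExtractionModule:
    """Module 10: yields the first audio file once the face is recognized."""

    def __init__(self, preset_similarity: float) -> None:
        self.preset_similarity = preset_similarity

    def extract(self, face_similarity: float, audio: bytes) -> Optional[bytes]:
        # The audio is only extracted when face recognition succeeds.
        return audio if face_similarity >= self.preset_similarity else None


class OutputModule:
    """Module 20: outputs prompt information for the auditor."""

    def __init__(self) -> None:
        self.prompts: List[str] = []  # stands in for the audit system's inbox

    def prompt_auditor(self, video_id: str) -> None:
        self.prompts.append(f"audit face video {video_id}")


class EstablishingModule:
    """Module 30: builds the model when the audit-passed message arrives."""

    def __init__(self, build_model: Callable[[bytes], Any]) -> None:
        self.build_model = build_model

    def on_notification(self, passed: bool, first_audio: bytes) -> Optional[Any]:
        # A failed audit ends the establishment process without a model.
        return self.build_model(first_audio) if passed else None
```

For example, an `ExtractionModule(0.7)` returns the audio for a score of 0.8 and `None` for 0.5, after which the output module records a prompt and the establishing module waits for the notification.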
Further, the establishing module 30 comprises:
a judging unit, configured to judge whether a voiceprint model already exists when the notification message that the face video has passed the audit is received;
an establishing unit, configured to establish a voiceprint model according to the first audio file if no voiceprint model exists;
an extraction unit, configured to, if a voiceprint model already exists, delete the existing voiceprint model and extract the stored second audio file, wherein the second audio file is an audio file that has been successfully registered;
the establishing unit is further configured to establish a voiceprint model according to the first audio file and the second audio file.
Further, when the server receives the notification message that the face video has passed the audit, the server judges whether a voiceprint model already exists in the database. When no voiceprint model exists in the database, the server establishes a voiceprint model according to the first audio file. When a voiceprint model already exists in the database, the server deletes it and then extracts the stored second audio file from the database, where the second audio file is an audio file that has been successfully registered in the database. It should be noted that a successfully registered audio file is an audio file from which a voiceprint model has already been established, that is, the audio file corresponding to the deleted historical voiceprint model. When the server obtains the second audio file, the server superimposes the first audio file and the second audio file to obtain the voiceprint model. Superimposing the first audio file and the second audio file optimizes the voiceprint model on the server, so that the established voiceprint model better matches the user's voice characteristics.
Further, the judging unit is additionally operable to judge whether to be stored with second audio file of preset number;
If the unit of setting up is additionally operable to be stored with second audio file of the preset number, according to nearest institute Second audio file and first audio file for storing preset number set up sound-groove model.
Further, in the process of extracting the stored second audio file, the server judges whether the preset number of the second audio files are stored in its database. The preset number can be set according to specific needs, for example to 3, 5 or 6. When the preset number of the second audio files are stored in the database, the server superimposes the most recently stored preset number of the second audio files with the first audio file to establish the voiceprint model. For example, when the preset number is set to 5 and at least 5 second audio files are stored in the database, the server extracts the 5 second audio files stored most recently, counting back from the current time, and superimposes them with the first audio file to establish the voiceprint model.
Further, the establishing module 30 also includes:
An acquiring unit, configured to acquire all of the stored second audio files if the preset number of the second audio files are not stored;
The establishing unit is further configured to establish a voiceprint model according to all of the acquired second audio files and the first audio file.
When the preset number of the second audio files are not stored in the database, the server acquires all of the second audio files stored in the database and superimposes all of the acquired second audio files with the first audio file to establish the voiceprint model. For example, when only three second audio files are stored in the database and this is fewer than the preset number, the server superimposes the three second audio files with the first audio file to establish the voiceprint model.
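The selection rule for the second audio files (the most recent preset number if enough are stored, otherwise all of them) can be illustrated with a short sketch. The function names `select_second_audio` and `files_for_model`, and the assumption that the stored files are ordered oldest first, are hypothetical and chosen only for this example.

```python
from typing import List, Sequence


def select_second_audio(stored: Sequence[str], preset_number: int) -> List[str]:
    """Pick the second audio files used to rebuild the voiceprint model.

    `stored` is assumed to be ordered from oldest to most recently stored.
    """
    if preset_number <= 0:
        raise ValueError("preset_number must be positive")
    if len(stored) >= preset_number:
        # Enough files are stored: keep only the most recent preset_number.
        return list(stored[-preset_number:])
    # Fewer than the preset number are stored: use all of them.
    return list(stored)


def files_for_model(stored: Sequence[str], preset_number: int,
                    first_audio: str) -> List[str]:
    """Return the files to superimpose: selected second audio plus the first."""
    return select_second_audio(stored, preset_number) + [first_audio]
```

With a preset number of 5, six stored files yield the last five plus the first audio file, while three stored files yield all three plus the first audio file.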
In this embodiment, when a face video is acquired and the face image in the face video is successfully recognized, the audio file in the face video is extracted and recorded as the first audio file; prompt information is output to prompt an auditor to audit the face video; and when the notification message that the face video has passed the audit is received, a voiceprint model is established according to the first audio file. In this way, the audio file of the user is further acquired on the basis of face recognition, and a voiceprint model is established according to the acquired audio file. The next time a face video of the user is received, the user is confirmed to be the real user only when the face image in the face video is successfully recognized and the audio file in the face video matches the established voiceprint model, which improves the accuracy of user identification.
Referring to Fig. 4, Fig. 4 is a functional block diagram of the second embodiment of the device for establishing a voiceprint model according to the present invention. The second embodiment of the device is proposed on the basis of the first embodiment.
In this embodiment, the device for establishing a voiceprint model also includes:
A judging module 40, configured to judge whether a voiceprint model already exists;
The output module 20 is further configured to, if no voiceprint model exists, output prompt information to prompt an auditor to audit the face video;
The extraction module 10 is further configured to, if a voiceprint model already exists, extract the audio file corresponding to the voiceprint model and record it as the third audio file;
The device for establishing a voiceprint model also includes:
A comparison module 50, configured to compare the first audio file with the third audio file to obtain the similarity between the first audio file and the third audio file;
A sending module 60, configured to send the similarity between the first audio file and the third audio file to the asynchronous auditing system.
When the server extracts the face image from the face video, the server judges whether a voiceprint model already exists in the database. When no voiceprint model exists in the database, the server outputs prompt information to the asynchronous auditing system, so that the asynchronous auditing system prompts an auditor to audit the face video. It can be understood that when no voiceprint model exists in the database, the server is acquiring the face video of the user for the first time. It should be noted that the server and the asynchronous auditing system may run on the same computer or on two different computers.
When a voiceprint model already exists in the database, the server extracts the audio file corresponding to the voiceprint model, that is, the audio file from which the voiceprint model was established, and records it as the third audio file. After obtaining the third audio file, the server compares the first audio file with the third audio file to obtain the similarity between them. The server sends this similarity to the asynchronous auditing system and outputs prompt information to the asynchronous auditing system, so that the asynchronous auditing system prompts an auditor to audit the face video. When the asynchronous audit passes, the server establishes the voiceprint model; when the asynchronous audit does not pass, the server ends the process of establishing the voiceprint model. The preset threshold can be set according to specific needs, for example to 60%, 70% or 85%.
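The flow of this embodiment can be sketched as below. The text does not say how the similarity between two audio files is computed, so this example assumes, purely for illustration, that each audio file has already been reduced to a numeric feature vector and that cosine similarity is used. The names `cosine_similarity` and `request_audit`, the message layout, and the list standing in for the asynchronous auditing system are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def request_audit(first_features: Sequence[float],
                  third_features: Optional[Sequence[float]],
                  audit_queue: List[Dict[str, object]]) -> Dict[str, object]:
    """Send the audit prompt, plus the similarity when a model already exists.

    `third_features` is None when no voiceprint model exists yet.
    `audit_queue` is a stand-in for the asynchronous auditing system.
    """
    message: Dict[str, object] = {"prompt": "audit face video"}
    if third_features is not None:
        # A model exists: compare the first and third audio files.
        message["similarity"] = cosine_similarity(first_features, third_features)
    audit_queue.append(message)
    return message
```

The auditor's decision itself is outside this sketch; the server would establish the voiceprint model or end the process depending on the audit result it receives back.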
In this embodiment, the first audio file in the face video is extracted and, when a voiceprint model exists in the database of the server, the third audio file corresponding to the voiceprint model is extracted and compared with the first audio file, and subsequent operations are carried out according to the comparison result. This improves the accuracy of the established voiceprint model, so that it better matches the real voice characteristics of the user.
The above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk or optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, network device or the like) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and accompanying drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A method for establishing a voiceprint model, characterized in that the method for establishing a voiceprint model comprises:
when a face video is acquired and the face image in the face video is successfully recognized, extracting the audio file in the face video and recording it as a first audio file;
outputting prompt information to prompt an auditor to audit the face video;
when a notification message that the face video has passed the audit is received, establishing a voiceprint model according to the first audio file.
2. The method for establishing a voiceprint model according to claim 1, characterized in that the step of establishing a voiceprint model according to the first audio file when the notification message that the face video has passed the audit is received comprises:
when the notification message that the face video has passed the audit is received, judging whether a voiceprint model already exists;
if no voiceprint model exists, establishing a voiceprint model according to the first audio file;
if a voiceprint model already exists, deleting the existing voiceprint model and extracting a stored second audio file, wherein the second audio file is an audio file that has been successfully registered;
establishing a voiceprint model according to the first audio file and the second audio file.
3. The method for establishing a voiceprint model according to claim 2, characterized in that the step of extracting the stored second audio file comprises:
judging whether a preset number of the second audio files are stored;
if the preset number of the second audio files are stored, the step of establishing a voiceprint model according to the first audio file and the second audio file comprises:
establishing a voiceprint model according to the most recently stored preset number of the second audio files and the first audio file.
4. The method for establishing a voiceprint model according to claim 3, characterized in that, after the step of judging whether a preset number of the second audio files are stored, the method further comprises:
if the preset number of the second audio files are not stored, acquiring all of the stored second audio files;
the step of establishing a voiceprint model according to the first audio file and the second audio file comprises:
establishing a voiceprint model according to all of the acquired second audio files and the first audio file.
5. The method for establishing a voiceprint model according to any one of claims 1 to 4, characterized in that, after the step of extracting the audio file in the face video and recording it as a first audio file when a face video is acquired and the face image in the face video is successfully recognized, the method further comprises:
judging whether a voiceprint model already exists;
if no voiceprint model exists, performing the step of outputting prompt information to prompt an auditor to audit the face video;
if a voiceprint model already exists, extracting the audio file corresponding to the voiceprint model and recording it as a third audio file;
comparing the first audio file with the third audio file to obtain the similarity between the first audio file and the third audio file;
sending the similarity between the first audio file and the third audio file to an asynchronous auditing system, and performing the step of outputting prompt information to prompt an auditor to audit the face video.
6. A device for establishing a voiceprint model, characterized in that the device for establishing a voiceprint model comprises:
an extraction module, configured to, when a face video is acquired and the face image in the face video is successfully recognized, extract the audio file in the face video and record it as a first audio file;
an output module, configured to output prompt information to prompt an auditor to audit the face video;
an establishing module, configured to establish a voiceprint model according to the first audio file when a notification message that the face video has passed the audit is received.
7. The device for establishing a voiceprint model according to claim 6, characterized in that the establishing module comprises:
a judging unit, configured to judge whether a voiceprint model already exists when the notification message that the face video has passed the audit is received;
an establishing unit, configured to establish a voiceprint model according to the first audio file if no voiceprint model exists;
an extraction unit, configured to, if a voiceprint model already exists, delete the existing voiceprint model and extract a stored second audio file, wherein the second audio file is an audio file that has been successfully registered;
the establishing unit is further configured to establish a voiceprint model according to the first audio file and the second audio file.
8. The device for establishing a voiceprint model according to claim 7, characterized in that the judging unit is further configured to judge whether a preset number of the second audio files are stored;
the establishing unit is further configured to, if the preset number of the second audio files are stored, establish a voiceprint model according to the most recently stored preset number of the second audio files and the first audio file.
9. The device for establishing a voiceprint model according to claim 8, characterized in that the establishing module further comprises:
an acquiring unit, configured to acquire all of the stored second audio files if the preset number of the second audio files are not stored;
the establishing unit is further configured to establish a voiceprint model according to all of the acquired second audio files and the first audio file.
10. The device for establishing a voiceprint model according to any one of claims 6 to 9, characterized in that the device for establishing a voiceprint model further comprises:
a judging module, configured to judge whether a voiceprint model already exists;
the output module is further configured to, if no voiceprint model exists, output prompt information to prompt an auditor to audit the face video;
the extraction module is further configured to, if a voiceprint model already exists, extract the audio file corresponding to the voiceprint model and record it as a third audio file;
the device for establishing a voiceprint model further comprises:
a comparison module, configured to compare the first audio file with the third audio file to obtain the similarity between the first audio file and the third audio file;
a sending module, configured to send the similarity between the first audio file and the third audio file to an asynchronous auditing system.
CN201611005290.4A 2016-11-11 2016-11-11 Method and device for establishing voiceprint model Active CN106782567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611005290.4A CN106782567B (en) 2016-11-11 2016-11-11 Method and device for establishing voiceprint model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611005290.4A CN106782567B (en) 2016-11-11 2016-11-11 Method and device for establishing voiceprint model

Publications (2)

Publication Number Publication Date
CN106782567A true CN106782567A (en) 2017-05-31
CN106782567B CN106782567B (en) 2020-04-03

Family

ID=58969608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611005290.4A Active CN106782567B (en) 2016-11-11 2016-11-11 Method and device for establishing voiceprint model

Country Status (1)

Country Link
CN (1) CN106782567B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107274906A (en) * 2017-06-28 2017-10-20 百度在线网络技术(北京)有限公司 Voice information processing method, device, terminal and storage medium
CN109325742A (en) * 2018-09-26 2019-02-12 平安普惠企业管理有限公司 Business approval method, apparatus, computer equipment and storage medium
CN111611437A (en) * 2020-05-20 2020-09-01 浩云科技股份有限公司 Method and device for preventing face voiceprint verification and replacement attack
CN114245204A (en) * 2021-12-15 2022-03-25 平安银行股份有限公司 Video surface signing method and device based on artificial intelligence, electronic equipment and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1646018A1 (en) * 2004-10-08 2006-04-12 Fujitsu Limited Biometric authentication device, biometric information authentication method, and program
CN201820245U (en) * 2010-12-01 2011-05-04 福州海景科技开发有限公司 Portrait biometric identification device in financial transaction based on portrait biometric identification technology
CN104834849A (en) * 2015-04-14 2015-08-12 时代亿宝(北京)科技有限公司 Dual-factor identity authentication method and system based on voiceprint recognition and face recognition
CN204576520U (en) * 2015-04-14 2015-08-19 时代亿宝(北京)科技有限公司 Based on the Dual-factor identity authentication device of Application on Voiceprint Recognition and recognition of face
CN105119872A (en) * 2015-02-13 2015-12-02 腾讯科技(深圳)有限公司 Identity verification method, client, and service platform
CN105550928A (en) * 2015-12-03 2016-05-04 城市商业银行资金清算中心 System and method of network remote account opening for commercial bank
CN105577664A (en) * 2015-12-22 2016-05-11 深圳前海微众银行股份有限公司 Cipher reset method and system, client and server

Also Published As

Publication number Publication date
CN106782567B (en) 2020-04-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant