CN110489659A - Data matching method and device - Google Patents

Publication number
CN110489659A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910651114.5A
Other languages
Chinese (zh)
Inventor
苏玉峰
吴欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/435 Filtering based on additional data, e.g. user or group profiles
    • G06F16/436 Filtering based on additional data, e.g. user or group profiles using biological or physiological data of a human being, e.g. blood pressure, facial expression, gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

An embodiment of the invention provides a data matching method and device, relating to the field of artificial intelligence. The method includes: receiving a request issued by a target user, the request carrying voice data; extracting a voiceprint feature from the voice data to obtain a target voiceprint feature; screening the users stored in a preset database according to the target voiceprint feature to obtain a recommendation list, where the recommendation list includes at least one user, the similarity between the voiceprint feature of every user in the recommendation list and the target voiceprint feature is greater than or equal to a first preset similarity threshold, and the preset database stores the voiceprint features of multiple users; and displaying the recommendation list to the target user. The technical solution provided by the embodiment addresses the problem that a platform cannot protect the personal privacy of its users.

Description

Data matching method and device
[Technical field]
The present invention relates to the field of artificial intelligence, and in particular to a data matching method and device.
[Background]
With the spread of the Internet, users have become increasingly accustomed to online social networking. On a social platform (a social application), a user can filter other users by conditions such as age, geographic location, income, assets and education to find people they would like to befriend. This approach has a problem: before two users become friends, a user's personal information can already be viewed by other people on the platform, so the platform cannot protect the personal privacy of its users.
[Summary of the invention]
In view of this, embodiments of the present invention provide a data matching method and device to solve the problem that a platform cannot protect the personal privacy of its users.
An embodiment of the invention provides a data matching method, the method comprising: receiving a request issued by a target user, the request carrying voice data; extracting a voiceprint feature of the voice data to obtain a target voiceprint feature; screening the users stored in a preset database according to the target voiceprint feature to obtain a recommendation list, wherein the recommendation list includes at least one user, the similarity between the voiceprint feature of each user in the recommendation list and the target voiceprint feature is greater than or equal to a first preset similarity threshold, and the preset database stores the voiceprint features of multiple users; and displaying the recommendation list to the target user.
Further, extracting the voiceprint feature of the voice data to obtain the target voiceprint feature comprises: extracting N kinds of voiceprint feature vectors from the voice data, where N ≥ 2; calculating the average KL distance between every two kinds of voiceprint feature vectors among the N kinds; and taking the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
Further, calculating the average KL distance between two kinds of voiceprint feature vectors comprises: obtaining a first voiceprint feature vector and a second voiceprint feature vector; calculating the mean and covariance of the distribution of the first voiceprint feature vector and of the distribution of the second voiceprint feature vector; constructing, from these means and covariances, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space; and calculating, from these probability distributions, the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector.
Further, the preset database also stores face images of multiple users, and the request also carries a first face image. Before the recommendation list is displayed to the target user, the method further comprises: calculating the similarity between the first face image and the face image of each user in the recommendation list; and deleting from the recommendation list every user whose face image has a similarity less than a second preset similarity threshold.
Further, calculating the similarity between the first face image and the face image of each user in the recommendation list comprises: performing illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, which filters out the low-frequency information of the first face image and retains its high-frequency information, to obtain a Gaussian image; performing histogram equalization on the Gaussian image to obtain an image with uniformly distributed gray values; calculating the feature vector of that image and taking it as the feature vector of the first face image; and calculating the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
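The illumination preprocessing described above can be sketched as follows. This is a minimal illustration using NumPy, not the patented implementation: the function names, the two Gaussian widths (`sigma_small`, `sigma_large`) and the 256 gray levels are assumptions chosen for the example. The difference of two Gaussian blurs acts as a band-pass filter that suppresses slowly varying illumination, and the equalization step spreads the resulting gray values over the full range.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D normalized Gaussian kernel covering about three standard deviations."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding; output has the input's shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def difference_of_gaussians(image, sigma_small=1.0, sigma_large=2.0):
    """Subtract a wide blur from a narrow blur (sigma values are assumptions)."""
    img = np.asarray(image, dtype=float)
    return gaussian_blur(img, sigma_small) - gaussian_blur(img, sigma_large)

def equalize_histogram(image, levels=256):
    """Map gray values through their cumulative distribution."""
    img = np.asarray(image, dtype=float)
    span = img.max() - img.min()
    if span == 0:
        return np.zeros(img.shape, dtype=np.uint8)
    quantized = np.round((img - img.min()) / span * (levels - 1)).astype(int)
    hist = np.bincount(quantized.ravel(), minlength=levels)
    cdf = hist.cumsum() / quantized.size
    return np.round(cdf[quantized] * (levels - 1)).astype(np.uint8)

def preprocess_face(image):
    """Illumination preprocessing followed by histogram equalization."""
    return equalize_histogram(difference_of_gaussians(image))
```

The output of `preprocess_face` would then be passed to whatever feature extractor the platform uses; the text does not name one.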
An embodiment of the invention provides a data matching device, the device comprising: a receiving unit, configured to receive a request issued by a target user, the request carrying voice data; an extraction unit, configured to extract a voiceprint feature of the voice data to obtain a target voiceprint feature; a screening unit, configured to screen the users stored in a preset database according to the target voiceprint feature to obtain a recommendation list, wherein the recommendation list includes at least one user, the similarity between the voiceprint feature of each user in the recommendation list and the target voiceprint feature is greater than or equal to a first preset similarity threshold, and the preset database stores the voiceprint features of multiple users; and a display unit, configured to display the recommendation list to the target user.
Further, the extraction unit comprises: an extraction subunit, configured to extract N kinds of voiceprint feature vectors from the voice data, where N ≥ 2; a first calculation subunit, configured to calculate the average KL distance between every two kinds of voiceprint feature vectors among the N kinds; and a determination subunit, configured to take the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
Further, the first calculation subunit comprises: an acquisition module, configured to obtain a first voiceprint feature vector and a second voiceprint feature vector; a first calculation module, configured to calculate the mean and covariance of the distribution of the first voiceprint feature vector and of the distribution of the second voiceprint feature vector; a construction module, configured to construct, from these means and covariances, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space; and a second calculation module, configured to calculate, from these probability distributions, the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector.
Further, the preset database also stores face images of multiple users, and the request also carries a first face image. The device further comprises: a calculation unit, configured to calculate, before the display unit displays the recommendation list to the target user, the similarity between the first face image and the face image of each user in the recommendation list; and a deletion unit, configured to delete from the recommendation list every user whose face image has a similarity less than a second preset similarity threshold.
Further, the calculation unit comprises: a filtering subunit, configured to perform illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, filtering out the low-frequency information of the first face image and retaining its high-frequency information, to obtain a Gaussian image; a histogram equalization subunit, configured to perform histogram equalization on the Gaussian image to obtain an image with uniformly distributed gray values; a second calculation subunit, configured to calculate the feature vector of that image and take it as the feature vector of the first face image; and a third calculation subunit, configured to calculate the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
In the embodiments of the invention, a request issued by a target user is received, the request carrying voice data; a voiceprint feature of the voice data is extracted to obtain a target voiceprint feature; and the users stored in the preset database are screened according to the target voiceprint feature. Friends are thus recommended to the user on the basis of voiceprint features. Before two users become friends, neither can view the other's personal information on the platform, which effectively protects the personal privacy of users.
[Description of the drawings]
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an optional data matching method according to an embodiment of the invention;
Fig. 2 is a schematic diagram of an optional data matching device according to an embodiment of the invention;
Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the invention.
[Detailed description]
For a better understanding of the technical solutions of the invention, the embodiments of the invention are described in detail below with reference to the drawings.
It should be understood that the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.
The terms used in the embodiments of the invention are for the purpose of describing particular embodiments only and are not intended to limit the invention. The singular forms "a", "said" and "the" used in the embodiments and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" used herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
An embodiment of the invention provides a data matching method that can be applied to a social platform. A social platform using this method can meet a user's wish to make friends on the basis of voice.
As an optional embodiment, the social platform stores the voice data of multiple users. When a target user wants to make friends, the target user sends a request carrying voice data to the social platform. The platform extracts the voiceprint feature of the voice data to obtain a target voiceprint feature, finds the stored voiceprint features that are close to the target voiceprint feature, and recommends the users corresponding to those voiceprint features to the target user.
For example, user Xiao Ming wants to find a friend on the social platform whose voice resembles that of a certain celebrity, referred to here as AA for convenience. Xiao Ming is the target user. Xiao Ming sends a request to the social platform, and the request carries voice data of AA (the voice data can be clipped from a TV series, film or variety show featuring AA). The social platform extracts the voiceprint feature of AA's voice data to obtain the target voiceprint feature. The social platform has a database, which may be called the preset database, that stores the data of multiple users (including information such as age, geographic location, income, assets and education, as well as the user's voiceprint features and face image). Suppose the database stores the data of 5000 users. Several users whose voiceprint features are very close to the target voiceprint feature are selected from these 5000 users to form a recommendation list, which is shown to Xiao Ming. Xiao Ming can pick one or more users from the recommendation list and send them friend requests; once the other party accepts, the two become friends. Before they become friends, Xiao Ming cannot view the other user's personal information on the platform, which effectively protects the personal privacy of users.
As another optional embodiment, a social platform using the data matching method provided by the embodiments of the invention can meet a user's wish to make friends on the basis of voice plus photo. The social platform stores the voice data and face images of multiple users. When a target user wants to make friends, the target user sends a request carrying voice data and a face image to the social platform (for convenience, the face image carried in the request is called the first face image). The platform extracts the voiceprint feature of the voice data to obtain a target voiceprint feature, finds the stored voiceprint features that are close to the target voiceprint feature, then finds, among the users corresponding to those voiceprint features, the users whose face images are close to the first face image, and recommends them to the target user.
For example, user Xiao Ming wants to find a friend on the social platform whose voice resembles that of celebrity AA and whose face resembles that of celebrity BB. Xiao Ming is the target user. Xiao Ming sends a request to the social platform, and the request carries voice data of AA (which can be clipped from a TV series, film or variety show featuring AA) and a face image of BB (the first face image mentioned above). The social platform extracts the voiceprint feature of AA's voice data to obtain the target voiceprint feature. The social platform has a database, which may be called the preset database, that stores the data of multiple users (including information such as age, geographic location, income, assets and education, as well as the user's voiceprint features and face image). Suppose the database stores the data of 5000 users. Several users whose voiceprint features are very close to the target voiceprint feature are selected from these 5000 users to form a recommendation list. Suppose the recommendation list contains 50 users; the users whose faces resemble BB are then selected from these 50. Suppose these are user No. 0009, user No. 0078 and user No. 4560; these three users are then shown to Xiao Ming. Xiao Ming can pick one or more users from the recommendation list and send them friend requests; once the other party accepts, the two become friends. Before they become friends, Xiao Ming cannot view the other user's personal information on the platform, which effectively protects the personal privacy of users. Suppose Xiao Ming sends a friend request to user No. 0078. After receiving the request, user No. 0078 can view Xiao Ming's data to decide whether to add Xiao Ming as a friend. If user No. 0078 accepts, Xiao Ming can from then on view the data of user No. 0078.
In the embodiments of the invention, the voice data and face images of the multiple users stored on the social platform are uploaded by the users of their own accord. The platform does not collect a user's voice data or face image without the user's knowledge, so the user's privacy is not infringed.
Fig. 1 is a flowchart of an optional data matching method according to an embodiment of the invention. As shown in Fig. 1, the method includes:
Step S102: receive a request issued by a target user, the request carrying voice data.
Step S104: extract the voiceprint feature of the voice data to obtain a target voiceprint feature.
Step S106: screen the users stored in the preset database according to the target voiceprint feature to obtain a recommendation list. The recommendation list includes at least one user, the similarity between the voiceprint feature of each user in the recommendation list and the target voiceprint feature is greater than or equal to a first preset similarity threshold, and the preset database stores the voiceprint features of multiple users.
Step S108: display the recommendation list to the target user.
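Steps S102 to S108 can be sketched as a small pipeline. This is an illustrative outline only: `handle_request`, `screen_users`, the `extract_feature` and `similarity` callables, and the dictionary used as the preset database are all hypothetical names introduced for the example, since the text does not prescribe an API.

```python
def screen_users(target_feature, database, similarity, first_threshold):
    """S106: keep each stored user whose similarity to the target
    feature is greater than or equal to the first preset threshold."""
    return [
        user_id
        for user_id, stored_feature in database.items()
        if similarity(target_feature, stored_feature) >= first_threshold
    ]

def handle_request(voice_data, extract_feature, database, similarity, first_threshold):
    """S102 to S106: the request's voice data comes in, a recommendation list goes out.
    S108 (displaying the list to the target user) is left to the caller."""
    target_feature = extract_feature(voice_data)  # S104
    return screen_users(target_feature, database, similarity, first_threshold)
```

Passing the feature extractor and the similarity measure as parameters keeps the sketch independent of any particular voiceprint algorithm.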
In the embodiments of the invention, a request issued by a target user is received, the request carrying voice data; a voiceprint feature of the voice data is extracted to obtain a target voiceprint feature; and the users stored in the preset database are screened according to the target voiceprint feature. Friends are thus recommended to the user on the basis of voiceprint features. Before two users become friends, neither can view the other's personal information on the platform, which effectively protects the personal privacy of users.
Inventors have found that original life cannot be represented completely using the feature vector that single vocal print feature extraction algorithm extracts The characteristics of object sample, therefore in embodiments of the present invention, inventors herein propose using at least two vocal print feature extraction algorithms come The method for representing primitive organism sample.Specifically, after receiving the request that target user issues, from the voice data of request carrying Middle extraction N kind vocal print feature vector, wherein N >=2;Calculate separately any two kinds of vocal print feature vectors in N kind vocal print feature vector Between average KL distance;By average KL apart from maximum two kinds of vocal print features as target vocal print feature.
Wherein, the average KL distance between two kinds of vocal print feature vectors is calculated, detailed process can be with are as follows: obtains the first vocal print Feature vector and the second vocal print feature vector;Calculate separately the equal of first vocal print feature vector sum the second vocal print feature vector distribution Value and covariance;The first sound is constructed according to the mean value of first vocal print feature vector sum the second vocal print feature vector distribution and covariance Line characteristic vector space and the corresponding probability distribution of the second vocal print characteristic vector space;It is empty according to the first vocal print feature vector Between probability distribution corresponding with the second vocal print characteristic vector space, calculate first the second vocal print feature of vocal print feature vector sum Average KL distance between vector.
Different types of vocal print feature vector can be extracted using different vocal print feature extraction algorithms, for example, can adopt Vocal print feature extraction algorithm has: MFCC mel cepstrum coefficients, the phase angle residual phase residual error, LPCC linear prediction Spectral function, the linear spectral function of MLSF Meier.
Specifically, the mean μ_A of the distribution of the first voiceprint feature vector is calculated according to formula (1); the mean μ_B of the distribution of the second voiceprint feature vector is calculated according to formula (2); the covariance cov_A of the distribution of the first voiceprint feature vector is calculated according to formula (3); the covariance cov_B of the distribution of the second voiceprint feature vector is calculated according to formula (4); the probability distribution P_A(x) corresponding to the first voiceprint feature vector space is calculated according to formula (5); the probability distribution P_B(x) corresponding to the second voiceprint feature vector space is calculated according to formula (6); and the average KL distance D between the first voiceprint feature vector and the second voiceprint feature vector is calculated according to formula (7).
In formulas (1) to (7), μ_A denotes the mean of the distribution of the first voiceprint feature vector; μ_B denotes the mean of the distribution of the second voiceprint feature vector; cov_A denotes the covariance of the distribution of the first voiceprint feature vector; cov_B denotes the covariance of the distribution of the second voiceprint feature vector; P_A(x) denotes the probability distribution corresponding to the first voiceprint feature vector space; P_B(x) denotes the probability distribution corresponding to the second voiceprint feature vector space; D denotes the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector; n_A denotes the number of feature vectors contained in the first voiceprint feature vector, the i-th of which is x_Ai, where i is a natural number from 1 to n_A; and n_B denotes the number of feature vectors contained in the second voiceprint feature vector, the i-th of which is x_Bi, where i is a natural number from 1 to n_B.
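The formulas themselves are not reproduced in this text. As a hedged reconstruction only, assuming each feature set is modelled by a multivariate Gaussian and that "average KL distance" means the mean of the two directed KL divergences, formulas (1) to (7) would take the following standard form. The dimension d of the feature vectors and the 1/n normalization of the covariance are assumptions, not confirmed by the source.

```latex
\begin{align}
\mu_A &= \frac{1}{n_A}\sum_{i=1}^{n_A} x_{Ai} \tag{1}\\
\mu_B &= \frac{1}{n_B}\sum_{i=1}^{n_B} x_{Bi} \tag{2}\\
\mathrm{cov}_A &= \frac{1}{n_A}\sum_{i=1}^{n_A} (x_{Ai}-\mu_A)(x_{Ai}-\mu_A)^{\mathsf T} \tag{3}\\
\mathrm{cov}_B &= \frac{1}{n_B}\sum_{i=1}^{n_B} (x_{Bi}-\mu_B)(x_{Bi}-\mu_B)^{\mathsf T} \tag{4}\\
P_A(x) &= \frac{1}{(2\pi)^{d/2}\,\lvert\mathrm{cov}_A\rvert^{1/2}}
  \exp\!\Big(-\tfrac12 (x-\mu_A)^{\mathsf T}\,\mathrm{cov}_A^{-1}\,(x-\mu_A)\Big) \tag{5}\\
P_B(x) &= \frac{1}{(2\pi)^{d/2}\,\lvert\mathrm{cov}_B\rvert^{1/2}}
  \exp\!\Big(-\tfrac12 (x-\mu_B)^{\mathsf T}\,\mathrm{cov}_B^{-1}\,(x-\mu_B)\Big) \tag{6}\\
D &= \frac12\left[\int P_A(x)\ln\frac{P_A(x)}{P_B(x)}\,dx
  + \int P_B(x)\ln\frac{P_B(x)}{P_A(x)}\,dx\right] \tag{7}
\end{align}
```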
The preset database stores multiple kinds of voiceprint features for each user, each kind corresponding to one algorithm. For example, the preset database stores the voiceprint features of m users, namely user U1, user U2, user U3, ..., user Um, and N kinds of voiceprint features are stored for each of these m users: the 1st kind of voiceprint feature of user Uj is extracted with voiceprint feature extraction algorithm S1, the 2nd kind with algorithm S2, the 3rd kind with algorithm S3, ..., and the N-th kind with algorithm SN, where j is any natural number from 1 to m. Algorithms S1, S2, ..., SN are the algorithms mentioned above, such as MFCC (Mel-frequency cepstral coefficients), residual phase, LPCC (linear prediction cepstral coefficients) and MLSF (Mel line spectral frequencies).
The average KL distance between every two kinds of voiceprint feature vectors among the N kinds is calculated according to formulas (1) to (7) above, and the two kinds of voiceprint features with the largest average KL distance are taken as the target voiceprint feature. Suppose the two kinds with the largest average KL distance are voiceprint feature C1, extracted with algorithm S1, and voiceprint feature C2, extracted with algorithm S2; then both C1 and C2 are used as the target voiceprint feature.
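The selection of the two most complementary feature kinds can be sketched as follows. This is a minimal illustration, not the patent's confirmed formulas: it assumes each feature kind is an (n, d) NumPy array of frame vectors, that all kinds share the same dimension d, that each is modelled as a Gaussian, and that "average KL distance" is the mean of the two directed KL divergences. The function names and the small ridge term are hypothetical.

```python
from itertools import combinations

import numpy as np

def fit_gaussian(features):
    """Mean vector and covariance matrix of an (n, d) array of feature vectors."""
    x = np.asarray(features, dtype=float)
    mean = x.mean(axis=0)
    # bias=True divides by n; the small ridge (an assumption) keeps cov invertible.
    cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(x.shape[1])
    return mean, cov

def kl_gaussian(mean_p, cov_p, mean_q, cov_q):
    """Closed-form KL(P || Q) for two Gaussians of equal dimension."""
    d = mean_p.shape[0]
    inv_q = np.linalg.inv(cov_q)
    diff = mean_q - mean_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(inv_q @ cov_p) + diff @ inv_q @ diff - d + logdet_q - logdet_p)

def average_kl(features_a, features_b):
    """Mean of the two directed KL divergences between the fitted Gaussians."""
    mean_a, cov_a = fit_gaussian(features_a)
    mean_b, cov_b = fit_gaussian(features_b)
    forward = kl_gaussian(mean_a, cov_a, mean_b, cov_b)
    backward = kl_gaussian(mean_b, cov_b, mean_a, cov_a)
    return 0.5 * (forward + backward)

def pick_target_pair(feature_sets):
    """Return the names of the two feature kinds with the largest average KL distance.

    feature_sets maps an algorithm name (e.g. "S1") to its (n, d) feature array.
    """
    return max(
        combinations(sorted(feature_sets), 2),
        key=lambda pair: average_kl(feature_sets[pair[0]], feature_sets[pair[1]]),
    )
```

In practice different extraction algorithms may produce vectors of different dimensions, in which case this closed form does not apply directly; the text does not say how that case is handled.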
In step S106, when the users stored in the preset database are screened according to the target voiceprint feature, the similarity between the target voiceprint feature and the voiceprint feature of each user stored in the preset database is calculated first, and the users are then screened by similarity, for example by selecting the users whose similarity is greater than or equal to the first preset similarity threshold.
The similarity calculation is illustrated below using user A as an example.
The similarity simi between the target voiceprint feature and the voiceprint feature of user A stored in the preset database is calculated as simi = (simi1 + simi2) / 2, where simi1 is the similarity between voiceprint feature C1 and the voiceprint feature of user A extracted with algorithm S1, and simi2 is the similarity between voiceprint feature C2 and the voiceprint feature of user A extracted with algorithm S2.
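The averaged similarity can be sketched as below. The text does not say how simi1 and simi2 themselves are measured; cosine similarity is used here purely as an assumption, and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def combined_similarity(target_c1, target_c2, stored_s1, stored_s2):
    """simi = (simi1 + simi2) / 2.

    target_c1 / target_c2: the two target features (from algorithms S1 and S2).
    stored_s1 / stored_s2: the same user's stored features from S1 and S2.
    """
    simi1 = cosine_similarity(target_c1, stored_s1)
    simi2 = cosine_similarity(target_c2, stored_s2)
    return (simi1 + simi2) / 2
```

Each target feature is compared only with the stored feature produced by the same algorithm, so the two scores are on comparable inputs before they are averaged.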
When the average KL distance between two kinds of voiceprint feature vectors is large, the correlation between them is small. When voiceprint similarity is computed, calculating the similarity between each of the two kinds of voiceprint feature vectors and the corresponding voiceprint feature vector stored in the preset database avoids the error introduced by relying on a single voiceprint feature extraction algorithm, and thereby improves the accuracy of the similarity calculation.
In embodiments of the present invention, the user of social platform can will be used with upload pictures (i.e. facial image), social platform The photo that family uploads is stored into presetting database.When upload pictures, user provides ID card information, and platform, which carries out identity, to be recognized Card terminates if certification does not pass through;If certification passes through, the photo of user's upload is received.The photograph that platform uploads user Piece and the identity card picture of user compare, and judge whether to match, if it does, then the photo that user is uploaded saves;If no Matching then asks user to upload my photo or check whether photo passes through PS.To sum up, user is without serious PS's Photo can be just successfully saved in the photo library of the user;The photo of photo or serious distortion if not user is all It will not be saved in the photo library of the user, the photo in photo library to guarantee user is strictly the photo of user, And these photos can compare the appearance for being truly reflected user.
As an optional embodiment, the request issued by the target user carries not only voice data but also a first face image, which indicates that the target user wishes to find friends by voice plus photo. In this case, after screening out the recommendation list according to the voice, the platform further screens the users in the recommendation list according to the photo. Specifically, the similarity between the first face image and the face image of each user in the recommendation list is calculated, and every user whose face image has a similarity below the second preset similarity threshold is deleted from the recommendation list.
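A minimal sketch of this second screening step, assuming the face similarities have already been computed and are passed in as a dictionary. The function name, the dictionary layout, and the choice to drop users who have no stored face image are illustrative assumptions, not details given in the text.

```python
def prune_by_face(recommendation, face_similarity, second_threshold):
    """Delete users whose face similarity is below the second threshold.

    recommendation: user ids that passed the voiceprint screening.
    face_similarity: dict mapping user id -> similarity to the first
        face image (hypothetical, computed elsewhere).
    Users missing from face_similarity are treated as similarity 0.0,
    which is an assumption of this sketch.
    """
    return [
        user_id
        for user_id in recommendation
        if face_similarity.get(user_id, 0.0) >= second_threshold
    ]
```

The order of the surviving users is left unchanged, so any ranking produced by the voiceprint screening is preserved.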
The inventors found that, when searching for similar face images, the calculated similarity is relatively accurate if both face images being compared are frontal faces, and relatively inaccurate if one or both of them are side faces.
To improve the accuracy of the comparison, as an optional embodiment, the face images used when calculating similarity are frontal face images. A strong classifier or a recognition model may be used to identify whether a face image is a frontal face image. If the strong classifier or recognition model identifies that the first face image uploaded by the target user is not a frontal face image, the target user is reminded to upload a frontal face image until one is obtained.
The specific steps for building the strong classifier are: take frontal face images as positive training samples and side face images as negative training samples, extract integral channel features, and train the strong classifier from the extracted features using the Adaboost algorithm. The strong classifier scores a face: a low score indicates a side face image and a high score indicates a frontal face image.
The specific steps for building the recognition model are: preprocess the face sample pictures to obtain grayscale images of a uniform pixel size, then divide these grayscale images into frontal face images and side face images; use the frontal face images as the input of the unsupervised feature learning network PCANet to learn frontal face features without supervision; use the side face images as the input of a supervised learning CNN and, in combination with the frontal face features obtained, establish a mapping between side face features and frontal face features through the supervised CNN; use the mapping to obtain unified frontal face features, and feed the unified frontal face features into a support vector machine to train the recognition model.
Optionally, calculating the similarity between the first face image and the face image of each user in the recommendation list includes: performing illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, filtering out the low-frequency information of the first face image and retaining its high-frequency information, to obtain a Gaussian image; performing image histogram equalization on the Gaussian image to obtain an image with uniform gray values; calculating the feature vector of the image with uniform gray values and taking it as the feature vector of the first face image; and calculating the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
Image histogram equalization transforms the gray-level histogram of an image from a distribution concentrated in some gray-level interval into a uniform distribution over the whole gray range, which increases the local contrast of the image and makes local details clearer. Processing the first face image in this way before calculating similarity improves the accuracy of the similarity calculated between two face images.
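The equalization step can be illustrated with a small pure-Python sketch that uses the common cumulative-histogram mapping. The text does not specify an implementation, so the function name, the list-of-rows image layout, and the 8-bit gray range are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of ints.

    Each gray value v is mapped to
    round((cdf(v) - cdf_min) / (total - cdf_min) * (levels - 1)),
    which spreads the occupied gray values over the full range.
    """
    pixels = [value for row in image for value in row]
    if not pixels:
        return []

    # Histogram of gray values.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to spread out.
        return [list(row) for row in image]

    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[value] for value in row] for row in image]
```

In this sketch the input would be the Gaussian image produced by the difference-of-Gaussians preprocessing, quantized to integer gray levels.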
An embodiment of the present invention further provides a data matching device for executing the above data matching method. As shown in Fig. 2, the device includes a receiving unit 12, an extraction unit 14, a screening unit 16, and a display unit 18.
The receiving unit 12 is configured to receive a request issued by a target user, the request carrying voice data.
The extraction unit 14 is configured to extract the voiceprint feature of the voice data to obtain a target voiceprint feature.
The screening unit 16 is configured to screen the users stored in the preset database according to the target voiceprint feature to obtain a recommendation list. The recommendation list includes at least one user; the similarity between the voiceprint feature of every user in the recommendation list and the target voiceprint feature is greater than or equal to the first preset similarity threshold; and the preset database stores the voiceprint features of multiple users.
The display unit 18 is configured to display the recommendation list to the target user.
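The screening performed by the screening unit can be sketched as below. The similarity function is passed in as a parameter because this passage does not fix one; `toy_similarity` is a stand-in on scalar "features" used purely so the example runs, and sorting the list by descending similarity is an added assumption.

```python
def screen_users(target_feature, stored_features, first_threshold, similarity):
    """Return ids of users whose voiceprint similarity to the target
    is greater than or equal to the first preset threshold.

    stored_features: dict mapping user id -> stored voiceprint feature.
    similarity: callable(target, stored) -> float (hypothetical).
    """
    scored = []
    for user_id, feature in stored_features.items():
        score = similarity(target_feature, feature)
        if score >= first_threshold:
            scored.append((user_id, score))
    # Ordering by similarity is an assumption of this sketch.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [user_id for user_id, _ in scored]


def toy_similarity(a, b):
    """Stand-in similarity on scalars, for demonstration only."""
    return 1.0 / (1.0 + abs(a - b))
```

For example, `screen_users(1.0, {"u1": 1.0, "u2": 5.0, "u3": 1.5}, 0.5, toy_similarity)` keeps `u1` and `u3` and drops `u2`.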
Optionally, the extraction unit 14 includes an extraction subunit, a first calculation subunit, and a determination subunit. The extraction subunit is configured to extract N kinds of voiceprint feature vectors from the voice data, where N ≥ 2. The first calculation subunit is configured to calculate the average KL distance between every two kinds of voiceprint feature vectors among the N kinds. The determination subunit is configured to take the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
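Choosing the pair with the largest average KL distance amounts to a maximum over all unordered pairs of the N feature kinds. A minimal sketch, assuming the features are held in a dictionary keyed by a label for each kind and that an `average_kl` function is supplied by the caller (both are hypothetical names):

```python
from itertools import combinations


def select_target_features(features, average_kl):
    """Return the labels of the two feature kinds whose average KL
    distance is largest.

    features: dict mapping a label -> extracted voiceprint feature.
    average_kl: callable(feature_a, feature_b) -> float.
    """
    if len(features) < 2:
        raise ValueError("at least two kinds of features are required (N >= 2)")
    return max(
        combinations(features, 2),
        key=lambda pair: average_kl(features[pair[0]], features[pair[1]]),
    )
```

With N kinds of features this evaluates N(N-1)/2 pairs, which is cheap for the small N a handful of extraction algorithms would give.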
Optionally, the first calculation subunit includes an acquisition module, a first calculation module, a construction module, and a second calculation module. The acquisition module is configured to obtain a first voiceprint feature vector and a second voiceprint feature vector. The first calculation module is configured to calculate the mean and covariance of the distribution of the first voiceprint feature vector and of the second voiceprint feature vector, respectively. The construction module is configured to construct, from these means and covariances, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space. The second calculation module is configured to calculate the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector according to the probability distributions corresponding to the two vector spaces.
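One way to read these four modules is as fitting a Gaussian to each feature kind and averaging the two directed KL divergences. The sketch below makes several assumptions that this passage does not confirm: each feature kind is a list of frames of equal dimensionality, the covariance is taken as diagonal, and "average KL distance" means the mean of KL(p||q) and KL(q||p).

```python
import math


def mean_and_variance(frames, eps=1e-6):
    """Per-dimension mean and variance (diagonal covariance) of frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    # eps keeps the variance strictly positive for the log and division.
    var = [
        sum((frame[d] - mean[d]) ** 2 for frame in frames) / n + eps
        for d in range(dim)
    ]
    return mean, var


def kl_diag_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for Gaussians with diagonal covariance."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def average_kl_distance(frames_a, frames_b):
    """Mean of the two directed KL divergences (assumed definition)."""
    mean_a, var_a = mean_and_variance(frames_a)
    mean_b, var_b = mean_and_variance(frames_b)
    forward = kl_diag_gaussian(mean_a, var_a, mean_b, var_b)
    backward = kl_diag_gaussian(mean_b, var_b, mean_a, var_a)
    return 0.5 * (forward + backward)
```

Averaging the two directions makes the measure symmetric, so the result does not depend on which feature kind is called "first".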
Optionally, the preset database also stores the face images of multiple users and the request also carries a first face image, and the device further includes a calculation unit and a deletion unit. The calculation unit is configured to calculate, before the display unit 18 displays the recommendation list to the target user, the similarity between the first face image and the face image of each user in the recommendation list. The deletion unit is configured to delete from the recommendation list every user whose face image has a similarity below the second preset similarity threshold.
Optionally, the calculation unit includes a filtering subunit, an image histogram equalization subunit, a second calculation subunit, and a third calculation subunit. The filtering subunit is configured to perform illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, filtering out the low-frequency information of the first face image and retaining its high-frequency information, to obtain a Gaussian image. The image histogram equalization subunit is configured to perform image histogram equalization on the Gaussian image to obtain an image with uniform gray values. The second calculation subunit is configured to calculate the feature vector of the image with uniform gray values and take it as the feature vector of the first face image. The third calculation subunit is configured to calculate the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
An embodiment of the present invention provides a storage medium that includes a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the following steps: receiving a request issued by a target user, the request carrying voice data; extracting the voiceprint feature of the voice data to obtain a target voiceprint feature; screening the users stored in the preset database according to the target voiceprint feature to obtain a recommendation list, where the recommendation list includes at least one user, the similarity between the voiceprint feature of every user in the recommendation list and the target voiceprint feature is greater than or equal to the first preset similarity threshold, and the preset database stores the voiceprint features of multiple users; and displaying the recommendation list to the target user.
Optionally, when the program runs, the device on which the storage medium resides is further controlled to execute the following steps: extracting N kinds of voiceprint feature vectors from the voice data, where N ≥ 2; calculating the average KL distance between every two kinds of voiceprint feature vectors among the N kinds; and taking the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
Optionally, when the program runs, the device on which the storage medium resides is further controlled to execute the following steps: obtaining a first voiceprint feature vector and a second voiceprint feature vector; calculating the mean and covariance of the distribution of the first voiceprint feature vector and of the second voiceprint feature vector, respectively; constructing, from these means and covariances, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space; and calculating the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector according to the probability distributions corresponding to the two vector spaces.
Optionally, when the program runs, the device on which the storage medium resides is further controlled to execute the following steps: before the recommendation list is displayed to the target user, calculating the similarity between the first face image and the face image of each user in the recommendation list; and deleting from the recommendation list every user whose face image has a similarity below the second preset similarity threshold.
Optionally, when the program runs, the device on which the storage medium resides is further controlled to execute the following steps: performing illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, filtering out the low-frequency information of the first face image and retaining its high-frequency information, to obtain a Gaussian image; performing image histogram equalization on the Gaussian image to obtain an image with uniform gray values; calculating the feature vector of the image with uniform gray values and taking it as the feature vector of the first face image; and calculating the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
An embodiment of the present invention provides a computer device including a memory and a processor. The memory is used to store information including program instructions, and the processor is used to control the execution of the program instructions. When the program instructions are loaded and executed by the processor, the following steps are implemented: receiving a request issued by a target user, the request carrying voice data; extracting the voiceprint feature of the voice data to obtain a target voiceprint feature; screening the users stored in the preset database according to the target voiceprint feature to obtain a recommendation list, where the recommendation list includes at least one user, the similarity between the voiceprint feature of every user in the recommendation list and the target voiceprint feature is greater than or equal to the first preset similarity threshold, and the preset database stores the voiceprint features of multiple users; and displaying the recommendation list to the target user.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: extracting N kinds of voiceprint feature vectors from the voice data, where N ≥ 2; calculating the average KL distance between every two kinds of voiceprint feature vectors among the N kinds; and taking the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: obtaining a first voiceprint feature vector and a second voiceprint feature vector; calculating the mean and covariance of the distribution of the first voiceprint feature vector and of the second voiceprint feature vector, respectively; constructing, from these means and covariances, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space; and calculating the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector according to the probability distributions corresponding to the two vector spaces.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: before the recommendation list is displayed to the target user, calculating the similarity between the first face image and the face image of each user in the recommendation list; and deleting from the recommendation list every user whose face image has a similarity below the second preset similarity threshold.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: performing illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, filtering out the low-frequency information of the first face image and retaining its high-frequency information, to obtain a Gaussian image; performing image histogram equalization on the Gaussian image to obtain an image with uniform gray values; calculating the feature vector of the image with uniform gray values and taking it as the feature vector of the first face image; and calculating the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the present invention. As shown in Fig. 3, the computer device 50 of this embodiment includes a processor 51, a memory 52, and a computer program 53 that is stored in the memory 52 and can run on the processor 51. When executed by the processor 51, the computer program 53 implements the data matching method of the embodiments; to avoid repetition, the details are not repeated here. Alternatively, when executed by the processor 51, the computer program implements the functions of each module/unit of the data matching device of the embodiments; to avoid repetition, the details are not repeated here.
The computer device 50 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may include, but is not limited to, the processor 51 and the memory 52. Those skilled in the art will understand that Fig. 3 is only an example of the computer device 50 and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the computer device may also include input/output devices, network access devices, buses, and so on.
The processor 51 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 52 may be an internal storage unit of the computer device 50, such as a hard disk or internal memory of the computer device 50. The memory 52 may also be an external storage device of the computer device 50, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) fitted to the computer device 50. Further, the memory 52 may include both an internal storage unit and an external storage device of the computer device 50. The memory 52 is used to store the computer program and the other programs and data required by the computer device. The memory 52 may also be used to temporarily store data that has been output or is about to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, and some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (Processor) to execute some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A data matching method, characterized in that the method comprises:
receiving a request issued by a target user, the request carrying voice data;
extracting the voiceprint feature of the voice data to obtain a target voiceprint feature;
screening the users stored in a preset database according to the target voiceprint feature to obtain a recommendation list, wherein the recommendation list includes at least one user, the similarity between the voiceprint feature of every user in the recommendation list and the target voiceprint feature is greater than or equal to a first preset similarity threshold, and the preset database stores the voiceprint features of multiple users;
displaying the recommendation list to the target user.
2. The method according to claim 1, characterized in that extracting the voiceprint feature of the voice data to obtain the target voiceprint feature comprises:
extracting N kinds of voiceprint feature vectors from the voice data, where N ≥ 2;
calculating the average KL distance between every two kinds of voiceprint feature vectors among the N kinds of voiceprint feature vectors;
taking the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
3. The method according to claim 2, characterized in that calculating the average KL distance between two kinds of voiceprint feature vectors comprises:
obtaining a first voiceprint feature vector and a second voiceprint feature vector;
calculating the mean and covariance of the distribution of the first voiceprint feature vector and of the second voiceprint feature vector, respectively;
constructing, according to the means and covariances of the distributions of the first voiceprint feature vector and the second voiceprint feature vector, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space;
calculating the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector according to the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space.
4. The method according to claim 1, characterized in that the preset database also stores the face images of multiple users, the request also carries a first face image, and before the recommendation list is displayed to the target user, the method further comprises:
calculating the similarity between the first face image and the face image of each user in the recommendation list;
deleting from the recommendation list every user whose face image has a similarity below a second preset similarity threshold.
5. The method according to claim 4, characterized in that calculating the similarity between the first face image and the face image of each user in the recommendation list comprises:
performing illumination preprocessing on the first face image using a difference-of-Gaussians algorithm, filtering out the low-frequency information of the first face image and retaining the high-frequency information of the first face image, to obtain a Gaussian image;
performing image histogram equalization on the Gaussian image to obtain an image with uniform gray values;
calculating the feature vector of the image with uniform gray values and taking the calculated feature vector as the feature vector of the first face image;
calculating the similarity between the feature vector of the first face image and the feature vector of the face image of each user in the recommendation list.
6. A data matching device, characterized in that the device comprises:
a receiving unit, configured to receive a request issued by a target user, the request carrying voice data;
an extraction unit, configured to extract the voiceprint feature of the voice data to obtain a target voiceprint feature;
a screening unit, configured to screen the users stored in a preset database according to the target voiceprint feature to obtain a recommendation list, wherein the recommendation list includes at least one user, the similarity between the voiceprint feature of every user in the recommendation list and the target voiceprint feature is greater than or equal to a first preset similarity threshold, and the preset database stores the voiceprint features of multiple users;
a display unit, configured to display the recommendation list to the target user.
7. The device according to claim 6, characterized in that the extraction unit comprises:
an extraction subunit, configured to extract N kinds of voiceprint feature vectors from the voice data, where N ≥ 2;
a first calculation subunit, configured to calculate the average KL distance between every two kinds of voiceprint feature vectors among the N kinds of voiceprint feature vectors;
a determination subunit, configured to take the two kinds of voiceprint features with the largest average KL distance as the target voiceprint feature.
8. The device according to claim 7, characterized in that the first calculation subunit comprises:
an acquisition module, configured to obtain a first voiceprint feature vector and a second voiceprint feature vector;
a first calculation module, configured to calculate the mean and covariance of the distribution of the first voiceprint feature vector and of the second voiceprint feature vector, respectively;
a construction module, configured to construct, according to the means and covariances of the distributions of the first voiceprint feature vector and the second voiceprint feature vector, the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space;
a second calculation module, configured to calculate the average KL distance between the first voiceprint feature vector and the second voiceprint feature vector according to the probability distributions corresponding to the first voiceprint feature vector space and the second voiceprint feature vector space.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the data matching method of any one of claims 1 to 5.
10. A computer device including a memory and a processor, the memory being used to store information including program instructions and the processor being used to control the execution of the program instructions, characterized in that the steps of the data matching method of any one of claims 1 to 5 are implemented when the program instructions are loaded and executed by the processor.
CN201910651114.5A 2019-07-18 2019-07-18 Data matching method and device Pending CN110489659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910651114.5A CN110489659A (en) 2019-07-18 2019-07-18 Data matching method and device

Publications (1)

Publication Number Publication Date
CN110489659A true CN110489659A (en) 2019-11-22

Family

ID=68547404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910651114.5A Pending CN110489659A (en) 2019-07-18 2019-07-18 Data matching method and device

Country Status (1)

Country Link
CN (1) CN110489659A (en)


Similar Documents

Publication Publication Date Title
CN108197532B (en) Face recognition method, apparatus and computer device
CN110489659A (en) Data matching method and device
CN109284733B (en) Shopping guide negative behavior monitoring method based on yolo and multitask convolutional neural network
US10810870B2 (en) Method of processing passage record and device
WO2020258667A1 (en) Image recognition method and apparatus, and non-volatile readable storage medium and computer device
CN108197250B (en) Picture retrieval method, electronic equipment and storage medium
WO2021082118A1 (en) Person re-identification method and apparatus, and terminal and storage medium
CN111738357B (en) Junk picture identification method, device and equipment
CN108269254A (en) Image quality assessment method and apparatus
CN110222728B (en) Training method and system of article identification model and article identification method and equipment
CN110428399A (en) Method, apparatus, equipment and storage medium for detecting images
CN110163111A (en) Number-calling method and apparatus, electronic equipment and storage medium based on face recognition
CN112163637B (en) Image classification model training method and device based on unbalanced data
CN109871845A (en) Certificate image extracting method and terminal device
CN109902561A (en) Face recognition method and device applied to a robot, and robot
CN110689046A (en) Image recognition method, image recognition device, computer device, and storage medium
CN113515988A (en) Palm print recognition method, feature extraction model training method, device and medium
CN110245573A (en) Registration method, apparatus and terminal device based on face recognition
CN110287767A (en) Attack-resistant liveness detection method, device, computer equipment and storage medium
CN112329586A (en) Client return visit method and device based on emotion recognition and computer equipment
CN111784665A (en) OCT image quality assessment method, system and device based on Fourier transform
CN108090108A (en) Information processing method, device, electronic equipment and storage medium
CN113157956B (en) Picture searching method, system, mobile terminal and storage medium
CN112651333A (en) Silent liveness detection method and device, terminal equipment and storage medium
CN112614110A (en) Method and device for evaluating image quality and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination