CN110019921A - Method and apparatus for associating audio with attributes, and audio search method and apparatus - Google Patents

Method and apparatus for associating audio with attributes, and audio search method and apparatus

Info

Publication number
CN110019921A
Authority
CN
China
Prior art keywords
attribute
data
audio
effective
initial
Prior art date
Legal status
Granted
Application number
CN201711137185.0A
Other languages
Chinese (zh)
Other versions
CN110019921B
Inventor
孙浩华
王朝阳
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201711137185.0A (granted as CN110019921B)
Publication of CN110019921A
Application granted
Publication of CN110019921B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63: Querying
    • G06F16/635: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683: Retrieval characterised by using metadata automatically derived from the content
    • G06F16/685: Retrieval using automatically derived transcript of audio data, e.g. lyrics
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686: Retrieval using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings

Abstract

This application provides a method and apparatus for associating audio with attributes, and an audio search method and apparatus. The association method includes: obtaining at least two initial attributes of an audio item; extracting effective attributes from the initial attributes according to the similarity between the initial attributes; and establishing the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes. With the methods or apparatuses of the embodiments of this application, the associations of an audio item gain both breadth and depth. In addition, the embodiments of this application can also search for audio based on the established associations, so as to provide users with audio that is more likely to meet their needs.

Description

Method and apparatus for associating audio with attributes, and audio search method and apparatus
Technical field
This application relates to the field of Internet data processing, and in particular to a method and apparatus for associating audio with attributes, an audio search method and apparatus, a data processing method and apparatus, and a data retrieval method and apparatus.
Background technique
Currently, more and more users use playback software to listen to music, play videos, and so on, carrying out daily study or entertainment through various playback applications. If a user wants to find a song, in the usual case the user enters a keyword, a search is performed, and the songs whose titles contain the entered keyword are presented to the user as the search results. In some cases, keyword search can also be used to search album names, singer names, and so on.
The prior art cannot satisfy users' broader audio search needs, such as searching by electronic sub-genre or background instrument.
Summary of the invention
The inventors found during research that keyword retrieval in the prior art can only retrieve audio whose title, album, artist, lyrics, and so on contain the keyword. In other cases, for example when a user wants to find songs of a specific electronic sub-genre or with a specific background instrument, the prior art cannot return broader search results and therefore cannot satisfy users' wide-ranging audio search needs.
Based on this, this application provides a method for associating audio with audio attributes, an audio search method, a data processing method, and a data retrieval method. Effective attributes are extracted from the audio attributes to be associated according to the similarity between the audio attributes marked by users, and the association between the effective attributes and the audio is established with reference to the conflict relationships between the effective attributes. Based on the established associations, the various audio attributes that users give to an audio item, such as vocal decomposition, emotion tags, and main scene, can all be associated with the audio, so that the associations of the audio gain both breadth and depth. In addition, the embodiments of this application can also search for audio based on the established associations, so as to provide users with audio that is more likely to meet their needs.
This application also provides an apparatus for associating audio with audio attributes, an audio search apparatus, a data processing apparatus, and a data retrieval apparatus, to guarantee the implementation and application of the above methods in practice.
To solve the above problems, this application discloses a method for associating audio with attributes, the method comprising:
obtaining at least two initial attributes of an audio item;
extracting effective attributes from the initial attributes according to the similarity between the initial attributes;
establishing the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes.
Wherein extracting effective attributes from the initial attributes according to the similarity between the initial attributes comprises:
for each initial attribute, performing the following acquisition process: obtaining all audio items to which the initial attribute belongs, and determining the weight of each audio item relative to the initial attribute according to the intrinsic and extrinsic attributes of each audio item; and obtaining the number of times the initial attribute was marked on each of those audio items;
clustering the initial attributes with a multi-class model according to each audio item's weight relative to the initial attribute and the number of marks;
extracting effective attributes according to the clustering result and the distances between the initial attributes.
Wherein extracting effective attributes according to the clustering result and the distances between the attributes comprises:
deleting, from the clustering result, the clusters in which the initial attributes were marked fewer times than a preset threshold, to obtain the clusters to be processed;
selecting an optimal attribute from each cluster to be processed as the effective attribute according to user level and/or the number of marks, one effective attribute per cluster;
recording the transformation relationships between the other initial attributes in each cluster and the effective attribute.
Wherein establishing the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes comprises:
determining, according to the relationships between audio items and effective attributes, the audio items to be processed that include effective attributes;
determining, from the effective attributes of an audio item to be processed, the conflict attribute pairs whose members conflict with each other;
determining one target attribute from a conflict attribute pair according to the conflict parameters involved, the conflict parameters including user level and/or the number of marks;
establishing the association between the audio item and the target attribute.
Wherein the method further comprises:
correcting the associations according to the click information, obtained by the search system, for the audio corresponding to each effective attribute.
Wherein correcting the associations according to the click information for the audio corresponding to each attribute obtained by the search system comprises:
counting, from historical behavior data, the total number of clicks received by each audio item when each effective attribute was used as a search term;
judging whether the total number of clicks of an audio item is less than a preset threshold, and if so, deleting the association between the effective attribute and that audio item.
Wherein the method further comprises:
representing the associations between the effective attributes and the audio with a music graph.
This application also discloses an audio search method, the method comprising:
obtaining an input search term;
determining the audio associated with the search term according to pre-established associations between effective attributes and audio;
displaying the audio.
Wherein the associations are established as follows:
obtaining at least two initial attributes of an audio item;
extracting effective attributes from the initial attributes according to the similarity between the initial attributes;
establishing the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes.
This application also discloses a data processing method, comprising:
obtaining the mark data of object data;
performing similarity comparison on the mark data to obtain effective mark data;
obtaining the conflict relationships between the effective mark data;
selecting target mark data based on the conflict relationships;
establishing the association between the target mark data and the object data.
Wherein performing similarity comparison on the mark data to obtain effective mark data comprises:
obtaining the object data to which the mark data belongs, and determining the weight of the object data relative to the mark data according to the intrinsic and extrinsic attributes of the object data, as well as the number of times the mark data was marked on the object data to which it belongs;
clustering the mark data with a multi-class model according to the weights and numbers of marks of the object data;
extracting effective mark data according to the clustering result and the distances between the mark data.
Wherein selecting target mark data based on the conflict relationships comprises:
determining, according to the relationships between the object data and the effective mark data, the object data that includes the effective mark data;
determining, from the effective mark data of the determined object data, the conflicting data pairs whose members conflict with each other;
determining one piece of target mark data from a conflicting data pair according to the conflict parameters involved, the conflict parameters including user level and/or the number of marks.
This application also discloses a data retrieval method, comprising:
obtaining retrieval input data, where the retrieval input data is target mark data;
obtaining object data based on the target mark data;
feeding back the object data as the search result.
Wherein the association between the target mark data and the object data is established by:
obtaining the mark data of object data;
performing similarity comparison on the mark data to obtain effective mark data;
obtaining the conflict relationships between the effective mark data;
selecting target mark data based on the conflict relationships;
establishing the association between the target mark data and the object data.
This application also discloses an apparatus for associating audio with attributes, comprising:
a communication interface, configured to obtain the initial attributes of the audio to be associated;
a processor, configured to extract effective attributes from the initial attributes according to the similarity between the initial attributes, and to establish the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes.
Wherein the processor being configured to extract effective attributes from the initial attributes according to the similarity between the initial attributes comprises:
for each initial attribute, performing the following acquisition process: obtaining all audio items to which the initial attribute belongs, and determining the weight of each audio item relative to the initial attribute according to the intrinsic and extrinsic attributes of each audio item; and obtaining the number of times the initial attribute was marked on each of those audio items;
clustering the initial attributes with a multi-class model according to each audio item's weight relative to the initial attribute and the number of marks;
extracting effective attributes according to the clustering result and the distances between the initial attributes.
Wherein the processor being configured to extract effective attributes according to the clustering result and the distances between the attributes comprises:
deleting, from the clustering result, the clusters in which the initial attributes were marked fewer times than a preset threshold, to obtain the clusters to be processed;
selecting an optimal attribute from each cluster to be processed as the effective attribute according to user level and/or the number of marks, one effective attribute per cluster;
recording the transformation relationships between the other initial attributes in each cluster and the effective attribute.
Wherein the processor being configured to establish the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes comprises:
determining, according to the relationships between audio items and effective attributes, the audio items to be processed that include effective attributes;
determining, from the effective attributes of an audio item to be processed, the conflict attribute pairs whose members conflict with each other;
determining one target attribute from a conflict attribute pair according to the conflict parameters involved, the conflict parameters including user level and/or the number of marks;
establishing the association between the audio item and the target attribute.
Wherein the processor is further configured to:
correct the associations according to the click information, obtained by the search system, for the audio corresponding to each effective attribute.
Wherein the processor being configured to correct the associations according to the click information for the audio corresponding to each attribute obtained by the search system comprises:
counting, from historical behavior data, the total number of clicks received by each audio item when each effective attribute was used as a search term;
judging whether the total number of clicks of an audio item is less than a preset threshold, and if so, deleting the association between the effective attribute and that audio item.
Wherein the processor is further configured to:
represent the associations between the effective attributes and the audio with a music graph.
This application also discloses an audio search apparatus, comprising:
a communication interface, configured to obtain an input search term;
a processor, configured to determine the audio associated with the search term according to pre-established associations between effective attributes and audio;
a display screen, configured to display the audio.
Wherein the associations are established as follows:
obtaining at least two initial attributes of an audio item;
extracting effective attributes from the initial attributes according to the similarity between the initial attributes;
establishing the association between the effective attributes and the audio with reference to the conflict relationships between the effective attributes.
This application also discloses a data processing apparatus, comprising:
a communication interface, configured to obtain the mark data of object data;
a processor, configured to perform similarity comparison on the mark data to obtain effective mark data; obtain the conflict relationships between the effective mark data; select target mark data based on the conflict relationships; and establish the association between the target mark data and the object data.
Wherein the processor being configured to perform similarity comparison on the mark data to obtain effective mark data comprises:
obtaining the object data to which the mark data belongs, and determining the weight of the object data relative to the mark data according to the intrinsic and extrinsic attributes of the object data, as well as the number of times the mark data was marked on the object data to which it belongs;
clustering the mark data with a multi-class model according to the weights and numbers of marks of the object data;
extracting effective mark data according to the clustering result and the distances between the mark data.
Wherein the processor being configured to select target mark data based on the conflict relationships comprises:
determining, according to the relationships between the object data and the effective mark data, the object data that includes the effective mark data;
determining, from the effective mark data of the determined object data, the conflicting data pairs whose members conflict with each other;
determining one piece of target mark data from a conflicting data pair according to the conflict parameters involved, the conflict parameters including user level and/or the number of marks.
This application also provides a data retrieval apparatus, comprising:
a communication interface, configured to obtain retrieval input data, where the retrieval input data is target mark data;
a processor, configured to obtain object data based on the target mark data, and to feed back the object data as the search result.
Wherein the association between the target mark data and the object data is established by:
obtaining the mark data of object data;
performing similarity comparison on the mark data to obtain effective mark data;
obtaining the conflict relationships between the effective mark data;
selecting target mark data based on the conflict relationships;
establishing the association between the target mark data and the object data.
Compared with the prior art, the embodiments of this application have the following advantages:
In the embodiments of this application, because audio attributes can represent the unstructured information of audio, the audio attributes that users mark on each audio item are also used as search dimensions. As long as the content entered by a user belongs to one of an audio item's attributes, the audio can be retrieved through the established association even if the entered content does not appear in the audio's title or lyrics. This helps users find more related audio files, improves the depth and accuracy of audio retrieval, and improves the user's audio search experience.
Of course, any product implementing this application does not necessarily need to achieve all of the above advantages at the same time.
Detailed description of the invention
In order to more clearly explain the technical solutions in the embodiments of the present application, make required in being described below to embodiment Attached drawing is briefly described, it should be apparent that, the drawings in the following description are only some examples of the present application, for For those of ordinary skill in the art, without any creative labor, it can also be obtained according to these attached drawings His attached drawing.
Fig. 1 is the exemplary process diagram of the audio of the application and the correlating method embodiment of attribute;
Fig. 2 is the illustrative diagram of the music map of the application;
Fig. 3 is the exemplary process diagram of the audio search method embodiment of the application;
Fig. 4 is the exemplary process diagram of the data processing method embodiment of the application;
Fig. 5 is the exemplary process diagram of the data retrieval method embodiment of the application;
Fig. 6 is the exemplary block diagram of the Installation practice of the application;
Fig. 7 is the another exemplary structural block diagram of the Installation practice of the application.
Specific embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
In the embodiments of this application, an audio file in the music industry has attributes, which can include intrinsic attributes and extrinsic attributes. Intrinsic information may include, for example: user, record company, album, song, singer, composer, lyricist, lyrics, music video, and so on; these are all intrinsic attributes of the audio. Intrinsic attributes can be provided by the record company: when the audio goes online, its intrinsic attributes can be obtained from the data the record company provides, and intrinsic attributes are generally attributes that do not change.
Besides intrinsic attributes, audio also has extrinsic attributes, which are generally attributes that change. Extrinsic attributes may include: instrument, genre, vocal decomposition, information that users mark on the audio, and so on, where the information users mark on the audio is entered entirely by users themselves. Instrument refers to which instruments the main theme of the audio is performed with (the same song can be performed with different instruments); genre may include styles such as electronic, pop, or classical (this attribute can keep changing over time as sub-genres are continuously added); vocal decomposition may be bass, soprano, and so on (the same singer can also perform the same song in different ways). These extrinsic attributes can be obtained by machine-learning training, and the mark data is the attribute content that users assign to the audio while listening to it.
For the intrinsic attributes of audio, for example album, artist, lyrics, lyricist, and composer, the associations between the intrinsic attributes and each audio item can be constructed directly from the source data provided by record companies and the like. For extrinsic attributes, such as instrument, vocals, genre, rhythm, and user mark data, the associations between the extrinsic attributes and the audio can be constructed in the manner provided by the embodiments of this application. Associating audio with extrinsic attributes in this embodiment means associating each extrinsic attribute that users mark on audio with each audio item.
Referring to Fig. 1, which is a flowchart of an embodiment of a method for associating audio with attributes in this application, this embodiment may include the following steps 101 to 104:
Step 101: obtain at least two initial attributes of an audio item.
In this embodiment, the extrinsic attributes of all audio items can be obtained as initial attributes from the user data stored in a database. For example, an audio server obtains, from the database in which it stores user data, each extrinsic attribute involved in each audio item as the initial attributes to be associated. The extrinsic attributes may include the user data with which users commented on or marked the audio: if a user tags a sad love song with the label "sad", then "sad" is an attribute of that audio. Extrinsic attributes of the emotion class can be labeled manually; for example, attributes such as "sad" and "sorrowful" are labeled as emotions of the "sadness" class, a class identifier (such as 0) is set for that class, and the class identifier and the emotion attributes of the class are stored respectively. Similarly, the class identifier of the "happy" class attributes can be set to 1, the class identifier of the "excited" class attributes can be set to 2, and so on.
In practice, users sometimes mis-type and enter special characters and the like as attributes of the audio. To filter out attributes marked by mistake, such as stray symbols, or attributes unrelated to the audio itself, such as "today is a fine day", the attributes that were marked on an audio item fewer times than a certain threshold can be filtered out, and the attributes that remain after filtering are used as the initial attributes of the audio. A minimal sketch of this filtering step follows; the threshold value, field names, and data shapes are illustrative assumptions, not values taken from this application.
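```python
from collections import Counter

def initial_attributes(raw_marks, min_count=3):
    """Keep only marks that appear on the audio at least min_count times.

    raw_marks: list of attribute strings users entered for one audio item.
    The threshold of 3 is an illustrative value, not one taken from the text.
    """
    counts = Counter(m.strip() for m in raw_marks if m.strip())
    return [attr for attr, n in counts.items() if n >= min_count]

# Example: punctuation-only and one-off marks such as "today is a fine day" fall away.
marks = ["sad"] * 12 + ["sentimental"] * 7 + ["today is a fine day", ",",
                                              "electronic", "electronic", "electronic"]
print(initial_attributes(marks))   # ['sad', 'sentimental', 'electronic']
```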
Step 102: extract effective attributes from the initial attributes according to the similarity between the initial attributes.
In this step, effective attributes are extracted from the initial attributes according to the similarity between them, that is, the degree of similarity between each pair of attributes. For example, if the initial attributes "sentimental" and "sad" are grouped into the same class by a clustering algorithm, it can be confirmed that these two initial attributes are similar, i.e., they belong to the same emotion class, and one of the two is selected as the effective attribute. By contrast, "sentimental" and "happy" are not grouped into the same class by the clustering algorithm, so it can be confirmed that these two initial attributes are not similar, and "happy" can serve as the effective attribute of another emotion, and so on.
Accordingly, when effective attributes are extracted according to similarity, the clustering of the initial attributes can be implemented with a neural network model. Specifically, step 102 may include steps A1 to A4:
Step A1: for each initial attribute, perform the following acquisition process: obtain all audio items to which the initial attribute belongs, and determine the weight of each audio item relative to the initial attribute according to the intrinsic and extrinsic attributes of each audio item; and obtain the number of times the initial attribute was marked on each of those audio items.
For each initial attribute, the audio items involved in the initial attribute, i.e., all audio items on which users have marked the initial attribute, are first obtained from the user data and formed into an audio list. Then, for each audio item in the initial attribute's audio list, the weight of the audio item relative to the initial attribute is determined according to the intrinsic and extrinsic attributes of the audio item. The larger the weight, the more likely the audio item is to be associated with the initial attribute, or in other words, the more likely an association exists between the audio item and the initial attribute.
Specifically, the weight of an audio item can be determined from intrinsic attributes such as its lyrics, singer, and radio station, or from the extrinsic attributes marked by users. For example, if a singer often sings sentimental songs, then in the audio list of the attribute "sentimental", that singer's audio has a larger weight than other audio. Of course, this is only an example; those skilled in the art can determine the weight of each audio item from historical data and the like.
After the weight of each audio item in an initial attribute's audio list has been determined, the number of times the initial attribute was marked on each audio item it involves can be obtained from the user data. For example, in the audio list of the attribute "sentimental", audio A was marked with "sentimental" 1,568 times in total, and audio B was marked 186 times. A hedged sketch of this acquisition process is shown below; the weighting rule (boosting audio whose singer frequently appears under the attribute) and all field names are illustrative assumptions.
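```python
from collections import defaultdict

def per_attribute_stats(user_marks, audio_meta):
    """Build, for each initial attribute, its audio list with weights and mark counts.

    user_marks: list of (audio_id, attribute) pairs taken from user data.
    audio_meta: dict audio_id -> {'singer': ...}; stands in for intrinsic attributes.
    """
    mark_counts = defaultdict(lambda: defaultdict(int))   # attr -> audio_id -> times marked
    for audio_id, attr in user_marks:
        mark_counts[attr][audio_id] += 1

    stats = {}
    for attr, audios in mark_counts.items():
        # Toy weighting: singers that appear often under this attribute lend their audio extra weight.
        singer_freq = defaultdict(int)
        for audio_id in audios:
            singer_freq[audio_meta[audio_id]['singer']] += 1
        stats[attr] = {
            audio_id: {
                'weight': 1.0 + 0.1 * singer_freq[audio_meta[audio_id]['singer']],
                'marks': times,
            }
            for audio_id, times in audios.items()
        }
    return stats
```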
Step A2: cluster the initial attributes with a multi-class model according to each audio item's weight relative to the initial attribute and the number of marks.
The initial attributes are then clustered with a multi-class machine-learning model, for example a support vector machine (SVM), according to each audio item's weight relative to the initial attribute and the number of marks. Before clustering, longer initial attributes can also be segmented into words. For example, if the content of an initial attribute is "today's mood is very sad", the content can be segmented and the word "sad" obtained after segmentation is used as the clustering object; this not only speeds up clustering but also improves its accuracy. Word segmentation can use, for example, a string-matching-based segmentation method.
Specifically, clustering can be performed according to the category of the initial attributes or the emotion of the initial attributes. For example, the category of an initial attribute may be children's song, opera, pop, or rock, and the emotion of an initial attribute may be sentimental, happy, or excited. Because both the category and the emotion can serve as clustering dimensions, the same initial attribute may appear in the results of different clusterings. Specifically, initial attributes of the same emotion converge into one class, for example "happy" and "joyful", or "sad" and "sorrowful", while initial attributes of different emotions, such as "crying" and "happy", or "sadness" and "excitement", converge into different classes. After clustering, each class of initial attributes represents one emotion. A minimal sketch of this clustering step follows; the application refers to a multi-class model such as an SVM, so the KMeans clustering and the simple feature construction below are stand-in assumptions for illustration only.
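```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_initial_attributes(stats, n_clusters=3):
    """Group initial attributes by their per-audio weights and mark counts.

    stats: output of per_attribute_stats above (attr -> audio_id -> weight/marks).
    KMeans is used here only as an illustrative stand-in for the multi-class model
    named in the text; the feature design (weight * marks per audio) is an assumption.
    """
    attrs = sorted(stats)
    audio_ids = sorted({a for per_audio in stats.values() for a in per_audio})
    X = []
    for attr in attrs:
        row = []
        for a in audio_ids:
            entry = stats[attr].get(a, {'weight': 0.0, 'marks': 0})
            row.append(entry['weight'] * entry['marks'])   # zero if never marked on this audio
        X.append(row)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(np.array(X))
    clusters = {}
    for attr, label in zip(attrs, labels):
        clusters.setdefault(int(label), []).append(attr)
    return clusters                                         # cluster_id -> list of attributes
```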
Step A3: extract effective attributes according to the clustering result and the distances between the initial attributes.
Effective attributes are then extracted according to the clustering result and the distances between the initial attributes. The distance between initial attributes refers to the pairwise similarity score between attributes; two attributes with a high score can be merged. For example, the similarity score between "sad" and "sentimental" is high, so they can finally be merged into "sad" or "sentimental". In addition, clusters containing especially few initial attributes can be ignored, and one initial attribute is finally chosen from each remaining cluster as an effective attribute: because the emotions represented by the initial attributes of different clusters are all different, one optimal attribute is extracted from each class of initial attributes as the effective attribute.
Specifically, a predetermined number threshold, for example 5 or 10, can be preset for the clustering result. If a cluster contains fewer initial attributes than 5 or 10, that cluster is deleted. After deletion, the optimal attribute is selected from each remaining cluster to be processed as the effective attribute according to parameters such as user level and/or the number of initial attributes, one effective attribute per cluster.
For example, if 10 clusters remain after deleting the clusters whose number of initial attributes falls below the predetermined threshold, then from each cluster the initial attribute that best represents the emotion of that cluster is selected as the effective attribute, finally obtaining 10 effective attributes corresponding to the 10 clusters. Here, user level refers to the degree of trust placed in a user as a source of attribute content for audio; for example, because a senior user has been registered longer than an ordinary user, the initial attributes marked by senior users in the same cluster can be given a higher weight. The number of marks refers to the total number of times an initial attribute was marked by all users: within the same cluster, the more marks, the higher the weight of the corresponding initial attribute. Finally, the initial attribute with the highest weight in a cluster can be chosen as the effective attribute corresponding to that cluster. A short sketch of this selection follows; the minimum cluster size, the scoring rule, and the shape of the per-attribute statistics are assumptions.
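```python
def pick_effective_attributes(clusters, attr_info, min_cluster_size=5):
    """Pick one effective attribute per cluster.

    clusters: cluster_id -> list of initial attributes (as produced above).
    attr_info: attr -> {'marks': total times marked, 'user_level': average level of markers}.
    min_cluster_size and the score (marks weighted by user level) are illustrative assumptions.
    """
    effective = {}
    for cid, attrs in clusters.items():
        if len(attrs) < min_cluster_size:
            continue                       # drop clusters with too few initial attributes
        best = max(attrs, key=lambda a: attr_info[a]['marks'] * attr_info[a]['user_level'])
        effective[cid] = best
    return effective                       # cluster_id -> effective attribute
```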
Step A4: record the transformation relationships between the other initial attributes in each cluster and the effective attribute.
Meanwhile, for each cluster, besides the optimal attribute used as the effective attribute, the transformation relationships between the other initial attributes and the effective attribute can also be recorded. For example, suppose a cluster contains 10 initial attributes; one of them, "sad", finally becomes the effective attribute, and another has the content "sentimental". The transformation relationship between "sentimental" and "sad" can then be recorded, so that when a user later searches for "sentimental" or another audio attribute in that cluster, the search can be performed with "sad" in the background, covering the user's search more comprehensively according to the transformation relationship. A minimal sketch of recording such a mapping, under the assumed data shapes above, is shown below.
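```python
def record_transformations(clusters, effective):
    """Map every other initial attribute in a cluster to that cluster's effective attribute,
    so a later search for "sentimental" can be rewritten to "sad" in the background."""
    transform = {}
    for cid, attrs in clusters.items():
        target = effective.get(cid)
        if target is None:
            continue                        # cluster was dropped, nothing to record
        for attr in attrs:
            if attr != target:
                transform[attr] = target
    return transform

# transform.get(query, query) then yields the attribute actually used for the search.
```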
Step 103: establish the association between the effective attributes and the audio with reference to whether conflicts exist between the effective attributes.
After the effective attributes are extracted from the attributes to be associated, the associations between the effective attributes and the audio are established according to whether conflicts exist between the obtained effective attributes. A conflict between different effective attributes means that the emotion classes represented by the different attributes are inconsistent. For example, if the effective attributes include "sad", "joyful", and "excited", and "sad" belongs to the sadness emotion while "joyful" belongs to the happiness emotion, then "sad" and "joyful" conflict, whereas "joyful" does not conflict with "excited", which belongs to the excitement emotion. In practice, besides setting a class identifier for the attributes of each emotion class, a conflict identifier can also be set for two emotion classes that conflict with each other: if the conflict identifiers of the attributes of two emotion classes are the same (for example, both are 00), it is determined that the attributes of these two emotion classes conflict; otherwise it is determined that they do not conflict. Of course, the above is only an example and should not be construed as limiting this application.
In this step, the association between each effective attribute and the audio, i.e., which effective attributes each audio item should be associated with, is then established according to whether the effective attributes conflict pairwise. A small sketch of the class-identifier and conflict-identifier scheme just described follows; the identifiers 0/1/2 and the shared flag "00" follow the examples in the text, while the attribute lists themselves are assumptions.
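```python
# Illustrative lookup tables only: class ids group synonymous emotions, and two classes
# conflict when they share a conflict flag (e.g. sadness and happiness both carry "00").
EMOTION_CLASS = {"sad": 0, "sentimental": 0, "happy": 1, "joyful": 1, "excited": 2}
CONFLICT_FLAG = {0: "00", 1: "00", 2: "01"}

def conflicts(attr_a, attr_b):
    """True when the two attributes belong to different emotion classes that share a conflict flag."""
    ca, cb = EMOTION_CLASS[attr_a], EMOTION_CLASS[attr_b]
    return ca != cb and CONFLICT_FLAG[ca] == CONFLICT_FLAG[cb]

print(conflicts("sad", "joyful"))    # True  - sadness vs happiness
print(conflicts("joyful", "excited"))  # False - happiness vs excitement do not conflict
```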
Specifically, this step may include steps B1 to B4:
Step B1: determine, according to the relationships between audio items and effective attributes, the audio items to be processed that include effective attributes.
In this step, for each effective attribute, each audio item that includes one or more effective attributes is obtained from the user data stored in the database as an audio item to be processed.
Step B2: determine, from the effective attributes of the audio item to be processed, the conflict attribute pairs whose members conflict with each other.
For each audio item to be processed, because two mutually exclusive (conflicting) attributes may appear on the same audio item, conflict attribute pairs are determined from the effective attributes the audio item includes, where a conflict attribute pair consists of two attributes whose moods or emotions conflict, for example "sad" and "joyful". If no conflict attribute pair exists among the effective attributes of the audio item to be processed, the association between the audio item and the effective attributes can be established directly.
Step B3: determine one target attribute from the conflict attribute pair according to the conflict parameters involved.
In this step, for example, the attribute marked by higher-level users is selected as the target attribute; or the attribute marked by more users; or the attribute marked by more users recently; or the attribute marked by users whose occupational background is music-related.
One attribute is determined from the conflict attribute pair as the target attribute according to the user levels involved and the number of times each attribute in the pair was marked. For example, for the conflict attribute pair "sad" and "joyful", if more senior users consider the emotion of the audio to be "sad" rather than "joyful", and the attribute "sad" was marked more times than "joyful", then "sad" can be selected from that conflict attribute pair as the target attribute. The conflict parameters may also include user level and/or the number of attributes, and so on.
Step B4: establish the association between the audio item and the target attribute.
The association between the audio item and the target attribute is established, for example in key-value form; other forms can also be used, as long as the corresponding audio can be found through the target attribute, or the marked target attribute can be found through the audio. A hedged sketch of steps B1 to B4, reusing the conflict test above, is shown below; the scoring rule and the data shapes are assumptions.
Referring to Fig. 2, which is a schematic diagram of associating audio with attributes, suppose the full set of extrinsic attributes of an audio item is {electronic, excited, sentimental, today is a fine day, sadness, sad}. Filtering out "today is a fine day" gives the initial attributes of the audio: {electronic, excited, sentimental, sadness, sad}. Clustering the initial attributes and so on gives the effective attributes of the audio: {electronic, excited, sentimental}. Among these effective attributes, "excited" and "sentimental" form a conflict attribute pair, so one target attribute, "sentimental", is decided from that pair. The associations between "sentimental", "electronic", and the audio are then established.
After the associations are established in step 103, in practice the associations of step 103 can also be corrected to improve their accuracy. Step 103 may then be followed by:
Step 104: correct the associations according to the click information, obtained by the search system, for the audio corresponding to each effective attribute.
In this step, the click data in the database, which is part of the user data, reflects users' click information when searching for audio, for example: which audio items the user clicked to preview after searching an effective attribute and obtaining the audio results, which audio items the user did not click, how long the user stayed on a given audio item during preview, and which audio items the user downloaded. Taking the number of clicks as an example: suppose the effective attribute "sad" was searched by users and returned 10 audio results, of which 8 were clicked and previewed by users and 2 were not; it can then be considered that the 2 audio items not clicked by users should not be associated with the attribute "sad". Because the user data reflects the click information of all users, correcting the associations established in step 103 according to the click information better guarantees their accuracy.
Specifically, step 104 may be implemented as follows: first, count, from the historical behavior data (the historical user data stored in the database), the total number of times each audio item was clicked by users when each effective attribute was used as a search term; then judge whether the total number of clicks of the audio item is below a preset threshold, and if so, delete the association between the effective attribute and that search-result audio item. For example, set a click threshold of 5: suppose searching "sad" returned audio A, and audio A was clicked fewer than 5 times by all users; the association between audio A and the effective attribute "sad" is then deleted. A minimal sketch of this correction is shown below; the threshold of 5 follows the example in the text, while the click-log shape is an assumption.
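```python
def prune_by_clicks(index, click_log, min_clicks=5):
    """Drop attribute-audio associations whose audio was clicked too rarely
    when the attribute was used as the search term.

    index: attribute -> set of audio ids, as built by associate_audio above.
    click_log: (attribute, audio_id) -> total clicks across all users.
    """
    pruned = {}
    for attr, audio_ids in index.items():
        kept = {a for a in audio_ids if click_log.get((attr, a), 0) >= min_clicks}
        if kept:
            pruned[attr] = kept
    return pruned
```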
With the associations established in this embodiment, because attributes can represent the extrinsic attributes of audio, the attribute content that users mark on each audio item is also used as a search dimension. As long as the content entered by a user belongs to one of an audio item's attributes, the audio can be retrieved through the established association even if the entered content does not appear in the audio's title or lyrics. This helps users find more related audio files, improves the depth and accuracy of audio retrieval, and improves the user's audio search experience.
After step 103 or step 104, the obtained associations can also be represented with a music graph. The music graph represents both the attributes associated with each audio item and the intrinsic attributes of each audio item, so that the relationships between audio and intrinsic or extrinsic attributes can be expressed more visually. A minimal per-audio view of such a graph, under the assumed shapes used in the earlier sketches, is shown below.
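```python
def music_graph(index, intrinsic):
    """Merge intrinsic attributes and the learned extrinsic associations into one
    per-audio view, a minimal stand-in for the music graph of Fig. 2."""
    graph = {}
    for attr, audio_ids in index.items():
        for audio_id in audio_ids:
            node = graph.setdefault(audio_id, {'intrinsic': intrinsic.get(audio_id, {}),
                                               'extrinsic': set()})
            node['extrinsic'].add(attr)
    return graph
```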
Referring to Fig. 3, which shows a flowchart of an embodiment of an audio search method of this application, this embodiment may include the following steps:
Step 301: obtain an input search term.
The embodiments of this application can be applied on an audio server. Suppose a user opens the search page in an audio app or audio software and enters the attribute "sentimental", hoping to find sentimental songs to preview; "sentimental" is then the search term.
Step 302: determine the audio associated with the search term according to the pre-established associations between effective attributes and audio.
In this step, the associations between audio and effective attributes can be established in the manner shown in Fig. 1. The audio app can then send "sentimental" to the audio server, and the audio server, according to the pre-established associations between effective attributes and audio, looks up the audio marked with the attribute "sentimental" as the search result.
Step 303: display the audio.
The server can then send the search result to the result display page of the audio app to be shown to the user. A minimal sketch of the lookup in steps 301 to 303, reusing the names of the earlier sketches, is shown below; it is an assumption for illustration rather than the application's implementation.
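```python
def search(query, transform, index):
    """Rewrite the query through the recorded transformation relationships,
    then look up the associated audio ids in the key-value index."""
    effective = transform.get(query, query)     # e.g. "sentimental" -> "sad" if recorded so
    return sorted(index.get(effective, set()))

# Example call, assuming transform/index built as in the earlier sketches:
# results = search("sentimental", transform, index)
```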
It can be seen that in the embodiments of this application, as long as the pre-established associations include an audio item and an attribute marked by users, whether the user marked a music genre, a song emotion, a vocal decomposition, or anything else, it can be associated with the audio. Thus, when the search term entered by a user matches the corresponding attribute, the audio corresponding to the attribute can be shown to the user as a search result, making it possible to provide users with audio that better meets their needs.
Referring to Fig. 4, which shows a flowchart of an embodiment of a data processing method of this application, this embodiment may include the following steps:
Step 401: obtain the mark data of object data.
In the embodiments of this application, the object can be a commodity in the e-commerce field; the object data can then be commodity data, and the mark data can be users' comment data on the commodity. Alternatively, the object can be live-streamed audio or video; the object data can then be the attribute data of the audio or video itself, and the mark data can be the bullet-screen comments users post during the live stream. Therefore, for any object that has extrinsic attributes such as user comments, as long as its comment data can be obtained as mark data, the mark data can be associated with the object data using the method of this embodiment.
In this step, taking a commodity in the e-commerce field as an example, the intrinsic attributes of the commodity, such as its profile data, and the extrinsic data of the commodity, such as user comment data, are obtained; these intrinsic and extrinsic attributes all belong to the object data, and the user comment data is extracted from the object data as the mark data.
Step 402: perform similarity comparison on the mark data to obtain effective mark data.
In this step, effective mark data is determined from the mark data according to the similarity between the pieces of mark data, i.e., their degree of similarity. For example, the mark data "good quality" and "nice texture" can be grouped into the same class by a clustering algorithm, so these two pieces of mark data are determined to be similar and one of them is selected as effective mark data. Similarly, "good quality" and "poor fit" are not grouped into the same class by the clustering algorithm, so these two pieces of mark data are confirmed not to be similar, and "poor fit" can serve as another piece of effective mark data, and so on.
Specifically, this step can first obtain the object data (i.e., the commodity data) to which each piece of mark data (i.e., user comment data) belongs, and determine the weight of each piece of commodity data relative to the user comment data according to the intrinsic and extrinsic attributes in the commodity data, for example the weight of the commodity's size for "fits well", as well as the number of times the user comment data was marked on the commodity data it belongs to, for example, "fits well" was marked on 59 items of clothing in total.
Then, the user comment data is clustered with a multi-class model according to the weights and numbers of marks of the commodity data (the clustering can refer to the description of step A2 and is not repeated here), and effective mark data is extracted according to the clustering result and the distances between the user comment data (the extraction can refer to the description of step A3 and is not repeated here).
Step 403: obtain the conflict relationships between the effective mark data.
Among the effective mark data, i.e., the valid user comment data, conflicts may exist between some pieces of user comment data. For example, some user comment data is "fits well" and some is "fits poorly". A conflict between different pieces of valid user comment data indicates that the user opinions they represent are inconsistent. Obtaining the conflict relationships can refer to the description of step 103 and is not repeated here.
Step 404: select target mark data based on the conflict relationships.
After the conflict relationships are obtained in step 403, target mark data is selected based on them.
Specifically, first, the object data that includes effective mark data is determined according to the relationships between the object data and the effective mark data. For example, if the effective mark data is "fits well", "fits poorly", "good quality", and so on, then the commodity data that includes these pieces of effective mark data is obtained. Conflicting data pairs are then determined from the effective mark data; among the effective mark data above, "fits well" and "fits poorly" form a conflicting pair. Then, one piece of target mark data is determined from the conflicting pair according to the user levels involved and/or the number of times each piece was marked. For example, the effective mark data "fits well" was marked 59 times while "fits poorly" was marked 8 times, and the users who marked "fits well" are VIPs whose level is generally higher than that of the users who marked "fits poorly"; the target mark data selected from this conflicting pair is then "fits well".
Of course, the specific data above is only an example and should not be construed as limiting this application.
Step 405: establish the association between the target mark data and the object data.
This step establishes the association between the object data and the target mark data, for example in key-value form; other forms can also be used, as long as the corresponding object data can be found through the target mark data, or the marked target mark data can be found through the object data.
With the associations established in this embodiment, because user comment data and the like can represent the extrinsic attributes of an object, the attribute content users mark on each commodity is also used as a search dimension for commodities. As long as the content entered by a user belongs to one of a commodity's attributes, the commodity can be retrieved through the established association even if the entered content does not appear in the commodity's title or description. This helps users find more related commodities, improves the depth and accuracy of commodity retrieval, and improves the user's shopping search experience.
Referring to Fig. 5, which shows a flowchart of an embodiment of a data retrieval method of this application, this embodiment may include the following steps:
Step 501: obtain retrieval input data, where the retrieval input data is target mark data.
After the associations between mark data and object data are established with the embodiment shown in Fig. 4, the associations can be used to search for commodities and the like, or the bullet-screen comments of audio/video can be used to search audio/video. For example, when the search content a user enters for a commodity search is "fits well", "fits well" can serve as the target mark data.
Step 502: obtain the object data based on the target mark data.
The corresponding object data is then obtained according to the associations between the target mark data and the object data, where the associations can be established in the manner of the embodiment shown in Fig. 4, which is not repeated here.
Step 503: feed back the object data as the search result.
The commodities or commodity data found can then be fed back to the user as the search result.
In this embodiment, as long as the pre-established associations include a commodity and some mark data marked by users, whether the users marked fit, quality, or anything else, it can be associated with the commodity. Thus, when the search term a user enters matches the corresponding user mark data, the commodity or commodity data corresponding to the mark data can be shown to the user as a search result, making it possible to provide users with commodities that better meet their needs.
As for the foregoing method embodiments, for simplicity of description they are stated as a series of action combinations, but those skilled in the art should understand that this application is not limited by the described order of actions, because according to this application certain steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily required by this application.
It is corresponding with method provided by a kind of audio of above-mentioned the application and the correlating method embodiment of attribute, referring to Fig. 6, Present invention also provides the associated apparatus embodiments of a kind of audio and attribute, in the present embodiment, the apparatus may include:
Communication interface 601, for obtaining the initial attribute of audio to be associated;
Communication bus 602, and, processor 603, by reading the instruction stored in the memory 604 and/or number According to for performing the following operations: according to the similitude between the initial attribute, extracting valid genus from the initial attribute Property;And with reference to the conflict relationship between effective attribute, establish the incidence relation of effective attribute and audio.
The processor 603 being configured to extract effective attributes from the initial attributes according to the similarity between the initial attributes includes:
for each initial attribute, performing the following acquisition process: obtaining all audios to which the initial attribute belongs, and determining the weight of each audio relative to the initial attribute according to the intrinsic attributes and extrinsic attributes of each audio; and obtaining the number of times the initial attribute has been marked in each audio to which it belongs;
clustering the initial attributes by a multi-classification model according to the weight of each audio relative to the initial attribute and the number of markings; and
extracting the effective attributes according to the result of the clustering and the distances between the initial attributes.
The processor 603 being configured to extract the effective attributes according to the result of the clustering and the distances between the audio attributes includes:
deleting, from the result of the clustering, results in which the initial attribute has been marked fewer times than a preset threshold, to obtain clustering results to be processed;
selecting an optimum attribute from each clustering result to be processed as an effective attribute according to user grade and/or the number of markings, one clustering result corresponding to one effective attribute; and
recording the transformation relationship between the other initial attributes in each clustering result and the effective attribute.
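As a minimal sketch of the extraction step described above: the snippet below clusters initial attributes and keeps one representative per cluster. The application refers to a multi-classification model; k-means over two simple per-attribute features is used here only as a stand-in, and the feature layout, the threshold value, and the example data are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

from sklearn.cluster import KMeans  # assumption: scikit-learn is available

# Hypothetical input: initial attribute -> list of
# (weight of the audio relative to the attribute, times marked, marker's user grade)
initial_attributes: Dict[str, List[Tuple[float, int, int]]] = {
    "relaxing": [(0.9, 40, 5), (0.7, 25, 3)],
    "relax":    [(0.8, 6, 2)],
    "noisy":    [(0.2, 3, 1)],
}

MIN_MARKS = 5  # preset threshold on how often an attribute must have been marked


def extract_effective(attrs: Dict[str, List[Tuple[float, int, int]]], n_clusters: int = 2):
    names = list(attrs)
    # Two simple per-attribute features: mean weight and total mark count.
    feats = [[sum(w for w, _, _ in v) / len(v), float(sum(m for _, m, _ in v))]
             for v in attrs.values()]
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(feats)

    clusters = defaultdict(list)
    for name, label in zip(names, labels):
        total_marks = sum(m for _, m, _ in attrs[name])
        if total_marks >= MIN_MARKS:  # drop attributes marked too rarely
            clusters[label].append((name, total_marks, max(g for _, _, g in attrs[name])))

    effective, transforms = [], {}
    for members in clusters.values():
        # Pick the "optimum" attribute of the cluster by (user grade, mark count).
        best = max(members, key=lambda t: (t[2], t[1]))[0]
        effective.append(best)
        for name, _, _ in members:
            if name != best:
                transforms[name] = best  # record the transformation relationship
    return effective, transforms


print(extract_effective(initial_attributes))
```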
The processor 603 being configured to establish the association between the effective attributes and the audio with reference to the conflict relationship between the effective attributes includes:
determining, according to the relationship between audios and effective attributes, the audio to be processed that includes effective attributes;
determining, from the effective attributes of the audio to be processed, conflicting attribute pairs;
determining one target attribute from a conflicting attribute pair according to the conflict parameters involved in that pair, where the conflict parameters include user grade and/or the number of times marked; and
establishing an association between the audio and the target attribute.
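The following sketch illustrates this conflict-resolution step under assumptions made only for illustration: which attribute pairs conflict, and the (user grade, mark count) values used as conflict parameters, are hypothetical inputs rather than anything defined by this application.

```python
from typing import Dict, List, Set, Tuple

# Assumed set of mutually conflicting effective attributes.
CONFLICTS: Set[frozenset] = {frozenset({"cheerful", "sad"})}


def resolve_conflicts(audio_attrs: Dict[str, Tuple[int, int]]) -> List[str]:
    """audio_attrs maps each effective attribute of one audio to its conflict
    parameters (user grade, times marked). For every conflicting pair present
    on the audio, only the attribute with stronger parameters is kept."""
    kept = dict(audio_attrs)
    for pair in CONFLICTS:
        present = [a for a in pair if a in kept]
        if len(present) == 2:
            loser = min(present, key=lambda a: kept[a])  # weaker (grade, marks)
            del kept[loser]
    return sorted(kept)  # attributes to be associated with the audio


print(resolve_conflicts({"cheerful": (5, 30), "sad": (2, 8), "piano": (3, 12)}))
# -> ['cheerful', 'piano']
```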
The processor 603 may be further configured to:
modify the association according to click information, obtained by a search system, of the audio corresponding to each effective attribute.
The processor 603 being configured to modify the association according to the click information of the audio corresponding to each effective attribute obtained by the search system includes:
counting, according to historical behavior data, the total number of clicks received by each audio when each effective attribute is used as a search term; and
judging whether the total number of clicks of the audio is less than a preset threshold, and if so, deleting the association between the effective attribute and that audio.
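A minimal sketch of this correction step follows; the click-log layout and the threshold value are assumptions made for illustration only.

```python
from collections import Counter
from typing import List, Set, Tuple

CLICK_THRESHOLD = 3  # assumed preset threshold on total clicks


def revise(associations: Set[Tuple[str, str]],
           click_log: List[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """associations: (effective attribute, audio id) pairs.
    click_log: historical (search term, clicked audio id) records."""
    totals = Counter(click_log)  # total clicks per (attribute used as search term, audio)
    return {(attr, audio) for attr, audio in associations
            if totals[(attr, audio)] >= CLICK_THRESHOLD}


log = [("relaxing", "audio_1")] * 5 + [("relaxing", "audio_2")]
print(revise({("relaxing", "audio_1"), ("relaxing", "audio_2")}, log))
# -> {('relaxing', 'audio_1')}  (the rarely clicked association is deleted)
```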
The processor 603 may be further configured to:
represent the association between the effective attributes and the audio by using a music map.
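The application does not specify the internal form of the music map; as a hedged illustration only, one possible representation is a simple bipartite adjacency structure between attributes and audio, sketched below.

```python
from collections import defaultdict
from typing import List


class MusicMap:
    """A bipartite adjacency structure between effective attributes and audio."""

    def __init__(self) -> None:
        self.attr_to_audio = defaultdict(set)
        self.audio_to_attr = defaultdict(set)

    def associate(self, attribute: str, audio_id: str) -> None:
        self.attr_to_audio[attribute].add(audio_id)
        self.audio_to_attr[audio_id].add(attribute)

    def audios_for(self, attribute: str) -> List[str]:
        return sorted(self.attr_to_audio.get(attribute, ()))


music_map = MusicMap()
music_map.associate("relaxing", "song_1")
music_map.associate("piano", "song_1")
print(music_map.audios_for("relaxing"))  # -> ['song_1']
```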
Referring to Fig. 7, the present application further provides an embodiment of an audio search apparatus. In this embodiment, the apparatus may include:
a communication interface 701, configured to obtain an input search term;
a communication bus 702 and a processor 703, configured to perform the following operations by reading instructions and/or data stored in a memory 704: determining the audio associated with the search term according to the pre-established association between effective attributes and audio; and
a display screen 705, configured to display the audio.
The association may be established in the following manner:
obtaining at least two initial attributes of audio; extracting effective attributes from the initial attributes according to the similarity between the initial attributes; and establishing the association between the effective attributes and the audio with reference to the conflict relationship between the effective attributes.
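A brief illustrative sketch of this search flow is given below; the pre-built mapping, the example search term, and the audio identifiers are assumptions, and the print call merely stands in for the display screen.

```python
from typing import Dict, List, Set

# Assumed pre-established association between effective attributes and audio.
attribute_to_audio: Dict[str, Set[str]] = {
    "relaxing": {"song_1", "song_7"},
}


def search_audio(search_term: str) -> List[str]:
    """Processor 703: look the input term up in the pre-built association."""
    return sorted(attribute_to_audio.get(search_term, set()))


def show(audios: List[str]) -> None:
    """Stands in for the display screen 705."""
    for audio in audios:
        print("display:", audio)


show(search_audio("relaxing"))  # display: song_1 / display: song_7
```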
The present application further discloses a data processing apparatus, which may include:
a communication interface, configured to obtain flag data of object data; and a processor, configured to perform similarity comparison on the flag data to obtain effective flag data; obtain the conflict relationship between the effective flag data; select target flag data based on the conflict relationship; and establish an association between the target flag data and the object data.
The processor being configured to perform similarity comparison on the flag data to obtain effective flag data may include:
obtaining the object data to which the flag data belongs, determining the weight of the object data relative to the flag data according to the intrinsic attributes and extrinsic attributes of the object data, and obtaining the number of times the flag data has been marked in the object data to which it belongs;
clustering the flag data by a multi-classification model according to the weight of the object data and the number of markings; and
extracting the effective flag data according to the result of the clustering and the distances between the flag data.
The processor being configured to select the target flag data based on the conflict relationship may include:
determining, according to the relationship between the object data and the effective flag data, the object data that includes the effective flag data;
determining conflicting data pairs from the effective flag data of the determined object data; and
determining one piece of target flag data from a conflicting data pair according to the conflict parameters involved in that pair, where the conflict parameters include user grade and/or the number of times marked.
The present application further provides a data retrieval apparatus, which may include:
a communication interface, configured to obtain retrieval input data, where the retrieval input data is target flag data; and a processor, configured to obtain object data based on the target flag data, and to feed back the object data as a retrieval result.
The association between the target flag data and the object data may be established through the following steps:
obtaining flag data of object data; performing similarity comparison on the flag data to obtain effective flag data; obtaining the conflict relationship between the effective flag data; selecting target flag data based on the conflict relationship; and establishing the association between the target flag data and the object data.
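To tie the generalized flag-data pipeline together, the end-to-end sketch below is offered as an assumption-laden illustration: the helper names (normalize, build_associations), the data layout, and the placeholder tie-break rule are hypothetical, whereas the described method would use the conflict parameters (user grade and/or number of markings) to resolve conflicts.

```python
from typing import Dict, List, Set, Tuple


def normalize(flag: str, transforms: Dict[str, str]) -> str:
    """Map an initial flag to its effective flag via recorded transformations."""
    return transforms.get(flag, flag)


def build_associations(marks: List[Tuple[str, str]],
                       transforms: Dict[str, str],
                       conflicts: Set[frozenset]) -> Set[Tuple[str, str]]:
    """marks: (object id, flag data) pairs as entered by users."""
    per_object: Dict[str, Set[str]] = {}
    for obj, flag in marks:
        per_object.setdefault(obj, set()).add(normalize(flag, transforms))

    associations: Set[Tuple[str, str]] = set()
    for obj, flags in per_object.items():
        for pair in conflicts:
            both = pair & flags
            if len(both) == 2:
                # Placeholder tie-break; the described method would use the
                # conflict parameters (user grade and/or number of markings).
                flags.discard(sorted(both)[-1])
        associations.update((obj, f) for f in flags)
    return associations


print(build_associations(
    marks=[("item_1", "good wearing effect"), ("item_1", "poor wearing effect")],
    transforms={},
    conflicts={frozenset({"good wearing effect", "poor wearing effect"})}))
# -> {('item_1', 'good wearing effect')}
```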
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Since the apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the description of the method embodiments.
Finally, it should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or device. In the absence of further limitations, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
The method and apparatus for associating audio with attributes and the audio search method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is only intended to help in understanding the methods of the present application and their core ideas. Meanwhile, for those skilled in the art, changes may be made to the specific implementations and the scope of application according to the ideas of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (18)

1. A method for associating audio with attributes, wherein the method comprises:
obtaining at least two initial attributes of audio;
extracting effective attributes from the initial attributes according to the similarity between the initial attributes; and
establishing an association between the effective attributes and the audio with reference to a conflict relationship between the effective attributes.
2. The method according to claim 1, wherein the extracting effective attributes from the initial attributes according to the similarity between the initial attributes comprises:
for each initial attribute, performing the following acquisition process: obtaining all audios to which the initial attribute belongs, and determining the weight of each audio relative to the initial attribute according to the intrinsic attributes and extrinsic attributes of each audio; and obtaining the number of times the initial attribute has been marked in each audio to which it belongs;
clustering the initial attributes by a multi-classification model according to the weight of each audio relative to the initial attribute and the number of markings; and
extracting the effective attributes according to the result of the clustering and the distances between the initial attributes.
3. The method according to claim 2, wherein the extracting the effective attributes according to the result of the clustering and the distances between the audio attributes comprises:
deleting, from the result of the clustering, results in which the initial attribute has been marked fewer times than a preset threshold, to obtain clustering results to be processed;
selecting an optimum attribute from each clustering result to be processed as an effective attribute according to user grade and/or the number of markings, one clustering result corresponding to one effective attribute; and
recording the transformation relationship between the other initial attributes in each clustering result and the effective attribute.
4. The method according to claim 1, wherein the establishing an association between the effective attributes and the audio with reference to the conflict relationship between the effective attributes comprises:
determining, according to the relationship between audios and effective attributes, the audio to be processed that includes effective attributes;
determining, from the effective attributes of the audio to be processed, conflicting attribute pairs;
determining one target attribute from a conflicting attribute pair according to the conflict parameters involved in that pair, the conflict parameters comprising user grade and/or the number of times marked; and
establishing an association between the audio and the target attribute.
5. The method according to claim 1, further comprising:
modifying the association according to click information, obtained by a search system, of the audio corresponding to each effective attribute.
6. The method according to claim 5, wherein the modifying the association according to the click information of the audio corresponding to each effective attribute obtained by the search system comprises:
counting, according to historical behavior data, the total number of clicks received by each audio when each effective attribute is used as a search term; and
judging whether the total number of clicks of the audio is less than a preset threshold, and if so, deleting the association between the effective attribute and that audio.
7. The method according to claim 1, further comprising:
representing the association between the effective attributes and the audio by using a music map.
8. An audio search method, wherein the method comprises:
obtaining an input search term;
determining, according to a pre-established association between effective attributes and audio, the audio associated with the search term; and
displaying the audio.
9. The method according to claim 8, wherein the association is established in the following manner:
obtaining at least two initial attributes of audio;
extracting effective attributes from the initial attributes according to the similarity between the initial attributes; and
establishing the association between the effective attributes and the audio with reference to the conflict relationship between the effective attributes.
10. A data processing method, comprising:
obtaining flag data of object data;
performing similarity comparison on the flag data to obtain effective flag data;
obtaining a conflict relationship between the effective flag data;
selecting target flag data based on the conflict relationship; and
establishing an association between the target flag data and the object data.
11. The method according to claim 10, wherein the performing similarity comparison on the flag data to obtain effective flag data comprises:
obtaining the object data to which the flag data belongs, determining the weight of the object data relative to the flag data according to the intrinsic attributes and extrinsic attributes of the object data, and obtaining the number of times the flag data has been marked in the object data to which it belongs;
clustering the flag data by a multi-classification model according to the weight of the object data and the number of markings; and
extracting the effective flag data according to the result of the clustering and the distances between the flag data.
12. The method according to claim 10, wherein the selecting target flag data based on the conflict relationship comprises:
determining, according to the relationship between the object data and the effective flag data, the object data that includes the effective flag data;
determining conflicting data pairs from the effective flag data of the determined object data; and
determining one piece of target flag data from a conflicting data pair according to the conflict parameters involved in that pair, the conflict parameters comprising user grade and/or the number of times marked.
13. A data retrieval method, comprising:
obtaining retrieval input data, where the retrieval input data is target flag data;
obtaining object data based on the target flag data; and
feeding back the object data as a retrieval result.
14. The method according to claim 13, wherein the association between the target flag data and the object data is established by:
obtaining flag data of object data;
performing similarity comparison on the flag data to obtain effective flag data;
obtaining the conflict relationship between the effective flag data;
selecting target flag data based on the conflict relationship; and
establishing the association between the target flag data and the object data.
15. An apparatus for associating audio with attributes, comprising:
a communication interface, configured to obtain initial attributes of the audio to be associated; and
a processor, configured to extract effective attributes from the initial attributes according to the similarity between the initial attributes, and to establish an association between the effective attributes and the audio with reference to the conflict relationship between the effective attributes.
16. An audio search apparatus, comprising:
a communication interface, configured to obtain an input search term;
a processor, configured to determine, according to a pre-established association between effective attributes and audio, the audio associated with the search term; and
a display screen, configured to display the audio.
17. A data processing apparatus, comprising:
a communication interface, configured to obtain flag data of object data; and
a processor, configured to perform similarity comparison on the flag data to obtain effective flag data; obtain the conflict relationship between the effective flag data; select target flag data based on the conflict relationship; and establish an association between the target flag data and the object data.
18. A data retrieval apparatus, comprising:
a communication interface, configured to obtain retrieval input data, where the retrieval input data is target flag data; and
a processor, configured to obtain object data based on the target flag data, and to feed back the object data as a retrieval result.
CN201711137185.0A 2017-11-16 2017-11-16 Audio and attribute association method and device and audio searching method and device Active CN110019921B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711137185.0A CN110019921B (en) 2017-11-16 2017-11-16 Audio and attribute association method and device and audio searching method and device

Publications (2)

Publication Number Publication Date
CN110019921A true CN110019921A (en) 2019-07-16
CN110019921B CN110019921B (en) 2023-01-13

Family

ID=67185914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711137185.0A Active CN110019921B (en) 2017-11-16 2017-11-16 Audio and attribute association method and device and audio searching method and device

Country Status (1)

Country Link
CN (1) CN110019921B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101149950A (en) * 2007-11-15 2008-03-26 北京中星微电子有限公司 Media player for implementing classified playing and classified playing method
JP2008186504A (en) * 2007-01-29 2008-08-14 Matsushita Electric Ind Co Ltd Music file discrimination method and device
CN101925898A (en) * 2008-01-25 2010-12-22 Nxp股份有限公司 Method and apparatus for organizing media data in database
CN103383694A (en) * 2012-12-14 2013-11-06 李博文 Method and system for organizing, managing and marking music document
CN104951485A (en) * 2014-09-02 2015-09-30 腾讯科技(深圳)有限公司 Music file data processing method and music file data processing device
CN105550308A (en) * 2015-12-14 2016-05-04 联想(北京)有限公司 Information processing method, retrieval method and electronic device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112153448A (en) * 2020-10-08 2020-12-29 刘风华 Audio processing method and system in video communication
CN112153448B (en) * 2020-10-08 2021-12-14 杭州知聊信息技术有限公司 Audio processing method and system in video communication
WO2022161291A1 (en) * 2021-01-28 2022-08-04 百果园技术(新加坡)有限公司 Audio search method and apparatus, computer device, and storage medium

Similar Documents

Publication Publication Date Title
CN107507612B (en) Voiceprint recognition method and device
CN107918653B (en) Intelligent playing method and device based on preference feedback
CN108769772B (en) Direct broadcasting room display methods, device, equipment and storage medium
US10679063B2 (en) Recognizing salient video events through learning-based multimodal analysis of visual features and audio-based analytics
TWI631474B (en) Method and device for product identification label and method for product navigation
CN108197282B (en) File data classification method and device, terminal, server and storage medium
Mariooryad et al. Building a naturalistic emotional speech corpus by retrieving expressive behaviors from existing speech corpora
CN109844708A (en) Recommend media content by chat robots
US20140163980A1 (en) Multimedia message having portions of media content with audio overlay
JP2011528879A (en) Apparatus and method for providing a television sequence
US9684908B2 (en) Automatically generated comparison polls
CN107665188B (en) Semantic understanding method and device
CN105956053A (en) Network information-based search method and apparatus
CN112131472A (en) Information recommendation method and device, electronic equipment and storage medium
US20130318021A1 (en) Information processing apparatus, information processing method, and program
Thorogood et al. Computationally Created Soundscapes with Audio Metaphor.
Knees et al. Music retrieval and recommendation: A tutorial overview
CN107145509A (en) A kind of information search method and its equipment
Yang Research on Music Content Recognition and Recommendation Technology Based on Deep Learning.
CN110019921A (en) Correlating method and device, the audio search method and device of audio and attribute
CN112911331A (en) Music identification method, device and equipment for short video and storage medium
US9569532B1 (en) Melody recognition systems
CN108777804B (en) Media playing method and device
CN108460131A (en) A kind of tag along sort processing method and processing device
Tsukuda et al. SmartVideoRanking: video search by mining emotions from time-synchronized comments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40010890

Country of ref document: HK

GR01 Patent grant