CN113094543A - Music authentication method, device, equipment and medium - Google Patents

Music authentication method, device, equipment and medium

Info

Publication number
CN113094543A
Authority
CN
China
Prior art keywords
music
words
standard
determining
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110459549.7A
Other languages
Chinese (zh)
Other versions
CN113094543B (en)
Inventor
崔启明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Cloud Music Technology Co Ltd
Original Assignee
Hangzhou Netease Cloud Music Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Netease Cloud Music Technology Co Ltd
Priority to CN202110459549.7A
Publication of CN113094543A
Application granted
Publication of CN113094543B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a music authentication method, apparatus, device and medium for solving the low efficiency and poor accuracy of existing music authentication methods. An identification vocabulary library is configured in advance and includes at least one of a non-infringing vocabulary library and an infringing vocabulary library. After song information of the music to be identified is acquired, the target words contained in the song information are matched against the words in the pre-configured identification vocabulary library. When a matched word exists, the identification result pre-configured for that word is taken directly as the target identification result of the music. This improves the efficiency and accuracy of music authentication, reduces the cost of manual authentication, and avoids the influence of individual experience on the accuracy of the result.

Description

Music authentication method, device, equipment and medium
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a music authentication method, apparatus, device, and medium.
Background
Currently, with the development of the internet, music infringement is becoming increasingly complex. Examples include: uploading music to a music playing platform for profit without the permission of the sound recording producer; adapting music for revenue without the permission of the lyricist or composer; cover performances by official accounts of all types on public platforms without authorization from the copyright holder; and use of music as background music mixed into other audio and video works.
In the related art, whether music infringes is mainly identified by manual review. However, this approach depends heavily on the reviewer having deep knowledge of the music industry, and its labor cost is high. To obtain a reasonably accurate result, the reviewer must listen to the music repeatedly and search the relevant music libraries, so the workload is heavy, efficiency is low, and infringement is hard to detect in time.
Disclosure of Invention
The present disclosure provides a music authentication method, device, apparatus and medium, which are used to solve the problems of low efficiency and poor accuracy of the existing music authentication method.
The present disclosure provides a music authentication method, the method comprising:
acquiring song information of music to be identified;
matching target words contained in the song information with words contained in a pre-configured identification vocabulary library; the identification vocabulary library comprises at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and if the matched words exist, determining the identification result corresponding to the matched words as the target identification result of the music.
In some possible embodiments, the determining the identification result corresponding to the matched word as the target identification result of the music includes:
if the matched words belong to the non-infringement vocabulary library, determining that the identification result corresponding to the matched words is non-infringement, and determining the non-infringement as the target identification result;
and if the matched words belong to the infringement vocabulary library, determining that the identification result corresponding to the matched words is infringement, and determining the infringement as the target identification result.
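The matching step above can be sketched as follows. This is only an illustrative sketch: the library contents, the function name identify_by_vocabulary, and the rule that the first matching target word decides the result are assumptions made for the example and are not specified in the disclosure.

```python
from typing import Iterable, Optional

# Hypothetical library contents; real entries would be configured in advance.
NON_INFRINGING_WORDS = {"official", "authorized"}
INFRINGING_WORDS = {"cover", "remix"}


def identify_by_vocabulary(target_words: Iterable[str]) -> Optional[str]:
    """Return the result of the first matched word, or None if nothing matches.

    Assumption: target words are checked in order and the first hit wins.
    """
    for word in target_words:
        if word in NON_INFRINGING_WORDS:
            return "non-infringing"
        if word in INFRINGING_WORDS:
            return "infringing"
    return None


print(identify_by_vocabulary(["live", "cover"]))
```

A return value of None corresponds to the case where no matched word exists and a further matching step is needed.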
In some possible embodiments, the non-infringing vocabulary library includes a first non-infringing vocabulary library and a second non-infringing vocabulary library, the infringing vocabulary library includes a first infringing vocabulary library and a second infringing vocabulary library, and the determining the identification result corresponding to the matched word as the target identification result of the music includes:
if the matched word belongs to the first non-infringing vocabulary library, determining that the identification result corresponding to the matched word is non-infringing recording copyright, and determining non-infringing recording copyright as the target identification result; the first non-infringing vocabulary library comprises at least one preset word indicating that the recording copyright of standard music is not infringed;
if the matched word belongs to the second non-infringing vocabulary library, determining that the identification result corresponding to the matched word is non-infringing lyrics-and-composition copyright, and determining non-infringing lyrics-and-composition copyright as the target identification result; the second non-infringing vocabulary library comprises at least one preset word indicating that the lyrics-and-composition copyright of standard music is not infringed;
if the matched word belongs to the first infringing vocabulary library, determining that the identification result corresponding to the matched word is infringing recording copyright, and determining infringing recording copyright as the target identification result; the first infringing vocabulary library comprises at least one preset word indicating that the recording copyright of standard music is infringed;
if the matched word belongs to the second infringing vocabulary library, determining that the identification result corresponding to the matched word is infringing lyrics-and-composition copyright, and determining infringing lyrics-and-composition copyright as the target identification result; the second infringing vocabulary library comprises at least one preset word indicating that the lyrics-and-composition copyright of standard music is infringed.
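The four-library variant can be sketched as a mapping from each identification result to its sub-library. The entries, the dictionary layout and the function name identify_by_sublibrary are hypothetical and serve only to illustrate the lookup.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical sub-libraries, keyed by the identification result they imply.
LIBRARIES: Dict[str, Set[str]] = {
    "non-infringing recording copyright": {"original recording"},
    "non-infringing lyrics-and-composition copyright": {"original lyrics"},
    "infringing recording copyright": {"pirated rip"},
    "infringing lyrics-and-composition copyright": {"unauthorized cover"},
}


def identify_by_sublibrary(target_words: Iterable[str]) -> Optional[str]:
    """Return the result of the sub-library containing the first matched word."""
    for word in target_words:
        for result, words in LIBRARIES.items():
            if word in words:
                return result
    return None


print(identify_by_sublibrary(["studio", "pirated rip"]))
```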
In some possible embodiments, the method further comprises:
if no matched words exist, determining a first letter sequence of the target words;
determining whether the first letter sequence is matched with a second letter sequence corresponding to preset standard music; the second letter sequence corresponding to any standard music is determined by the letter sequence of the standard words of the standard music; the standard words are determined according to song information of the standard music;
and determining the target identification result according to the matching result.
In some possible embodiments, the determining the first letter sequence of the target word comprises:
if the target word is English, determining the target word as the first letter sequence;
if the target word is Chinese, determining the pinyin of each character contained in the target word; and determining the first letter sequence according to the pinyin of each character.
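A minimal sketch of deriving the first letter sequence is given below. The tiny PINYIN table is a hypothetical stand-in for a full character-to-pinyin dictionary, and treating a pure-ASCII word as English, as well as passing unknown characters through unchanged, are assumptions of this example.

```python
from typing import Dict

# Hypothetical table covering only the characters used in this example.
PINYIN: Dict[str, str] = {"晴": "qing", "天": "tian"}


def first_letter_sequence(word: str) -> str:
    """English words are used as-is; Chinese words are joined pinyin."""
    if word.isascii():  # assumption: pure-ASCII words are treated as English
        return word
    # Characters missing from the table are kept unchanged (assumption).
    return "".join(PINYIN.get(ch, ch) for ch in word)


print(first_letter_sequence("晴天"))
```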
In some possible embodiments, the method further comprises:
for each character contained in the target word, if the character matches a preset character to be replaced, replacing the character with the preset target character corresponding to that character to be replaced.
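The character replacement can be sketched with a lookup table. The table contents (a traditional-to-simplified pair and a full-width digit) and the function name normalize_characters are hypothetical; the disclosure does not state which characters are configured for replacement.

```python
from typing import Dict

# Hypothetical mapping: character to be replaced -> target character.
REPLACEMENTS: Dict[str, str] = {"愛": "爱", "０": "0"}


def normalize_characters(word: str) -> str:
    """Replace each configured character; leave all other characters as they are."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in word)


print(normalize_characters("愛情"))
```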
In some possible embodiments, the method further comprises:
if the first letter sequence does not match the third letter sequence of any preset stop word, executing the subsequent step of matching the first letter sequence against the preset second letter sequences.
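The stop-word check can be sketched as a filter applied before the later matching steps. The stop-word sequences and the function name sequences_to_match are hypothetical examples.

```python
from typing import Iterable, List

# Hypothetical third letter sequences of preset stop words.
STOP_WORD_SEQUENCES = {"de", "the", "feat"}


def sequences_to_match(first_sequences: Iterable[str]) -> List[str]:
    """Keep only the sequences that do not equal a stop-word sequence."""
    return [seq for seq in first_sequences if seq not in STOP_WORD_SEQUENCES]


print(sequences_to_match(["de", "qingtian", "the"]))
```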
In some possible embodiments, the first letter sequence and the second letter sequence include at least one of the following pairs: a first subsequence of song title words and a second subsequence of standard song title words; a third subsequence of artist name words and a fourth subsequence of standard artist name words; a fifth subsequence of album title words and a sixth subsequence of standard album title words; and a seventh subsequence of lyric words and an eighth subsequence of standard lyric words. Whether the first letter sequence matches the second letter sequence corresponding to preset standard music is determined by at least one of:
determining whether the first subsequence of song title words matches the second subsequence of standard song title words of each standard music, respectively;
determining whether the third subsequence of artist name words matches the fourth subsequence of standard artist name words of each standard music, respectively;
determining whether the fifth subsequence of album title words matches the sixth subsequence of standard album title words of each standard music, respectively;
determining whether the seventh subsequence of lyric words matches the eighth subsequence of standard lyric words of each standard music, respectively.
In some possible embodiments, the determining whether the first subsequence of song title words matches the second subsequence of standard song title words for each standard music, respectively, comprises:
determining a first similarity of the first subsequence to each second subsequence;
if the first similarity corresponding to any standard music meets a preset first requirement, determining that the song name of the standard music is matched with the song name of the music;
otherwise, determining that the song name of the music does not match the song name of any standard music.
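The song-title comparison can be sketched as a similarity score checked against a threshold. In this sketch the similarity is computed with Python's difflib.SequenceMatcher as a stand-in measure, and the threshold value 0.8 for the first requirement is a hypothetical choice; neither is stated in this passage of the disclosure.

```python
from difflib import SequenceMatcher
from typing import Dict, List

FIRST_REQUIREMENT = 0.8  # hypothetical threshold for the first similarity


def matching_titles(first_subsequence: str,
                    second_subsequences: Dict[str, str]) -> List[str]:
    """Return IDs of standard music whose title sequence is similar enough."""
    matched = []
    for music_id, second in second_subsequences.items():
        similarity = SequenceMatcher(None, first_subsequence, second).ratio()
        if similarity >= FIRST_REQUIREMENT:
            matched.append(music_id)
    return matched  # an empty list means no standard song title matches


print(matching_titles("qingtian", {"s1": "qingtian", "s2": "yequ"}))
```

The same pattern would apply to the album-title comparison with its own threshold.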
In some possible embodiments, the determining whether the third subsequence of artist name words matches the fourth subsequence of standard artist name words for each standard music, respectively, includes:
determining a second similarity of the third subsequence to each fourth subsequence; if the second similarity corresponding to any standard music meets a preset second requirement, determining that the artist name of the standard music is matched with the artist name of the music; otherwise, determining that the artist name of the music is not matched with the artist name of any standard music; or
If the third subsequence contains any fourth subsequence, or any fourth subsequence contains the third subsequence, determining that the artist name of the standard music corresponding to the fourth subsequence with the inclusion relation is matched with the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music.
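The containment alternative for artist names can be sketched with substring checks in both directions. The function name artist_matches and the choice to skip empty sequences are assumptions of this example.

```python
from typing import Dict, List


def artist_matches(third_subsequence: str,
                   fourth_subsequences: Dict[str, str]) -> List[str]:
    """Return IDs of standard music whose artist sequence contains, or is
    contained in, the artist sequence of the music being identified."""
    matched = []
    for music_id, fourth in fourth_subsequences.items():
        if not third_subsequence or not fourth:
            continue  # assumption: empty sequences never count as a match
        if fourth in third_subsequence or third_subsequence in fourth:
            matched.append(music_id)
    return matched


print(artist_matches("zhoujielunfeat", {"s1": "zhoujielun", "s2": "linjunjie"}))
```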
In some possible implementations, the determining whether the fifth subsequence of album title words matches the sixth subsequence of standard album title words for each standard music, respectively, includes:
determining a third similarity of the fifth subsequence to each of the sixth subsequences;
if the third similarity corresponding to any standard music meets a preset third requirement, determining that the album name of the standard music is matched with the album name of the music;
otherwise, determining that the album name of the music does not match the album name of any standard music.
In some possible embodiments, the determining whether the seventh subsequence of lyric words matches the eighth subsequence of standard lyric words for each standard music, respectively, comprises:
respectively determining the frequency of occurrence of each lyric word in the lyrics of the music;
determining each lyric word with the frequency meeting a preset fourth requirement as a target lyric word;
for each standard music, determining a first number, namely the number of seventh subsequences of the target lyric words that are consistent with the eighth subsequences corresponding to that standard music; each eighth subsequence is the letter sequence of a standard lyric word whose frequency in the lyrics of the standard music meets the fourth requirement;
if the first quantity corresponding to any standard music meets a preset fifth requirement, determining that the lyrics of the standard music are matched with the lyrics of the music;
otherwise, determining that the lyrics of the music do not match the lyrics of any standard music.
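The lyric comparison can be sketched with frequency counts. The inputs are assumed to be lists of letter sequences, one per lyric word; interpreting the fourth requirement as a minimum occurrence count and the fifth requirement as a minimum number of shared sequences, as well as both threshold values, are assumptions of this example.

```python
from collections import Counter
from typing import Iterable, Set

FOURTH_REQUIREMENT = 2  # hypothetical: minimum occurrences of a lyric word
FIFTH_REQUIREMENT = 2   # hypothetical: minimum number of shared sequences


def frequent_sequences(lyric_sequences: Iterable[str]) -> Set[str]:
    """Letter sequences of lyric words that occur often enough."""
    counts = Counter(lyric_sequences)
    return {seq for seq, n in counts.items() if n >= FOURTH_REQUIREMENT}


def lyrics_match(music_lyrics: Iterable[str],
                 standard_lyrics: Iterable[str]) -> bool:
    """Compare the frequent sequences of the music with those of one standard music."""
    first_number = len(frequent_sequences(music_lyrics)
                       & frequent_sequences(standard_lyrics))
    return first_number >= FIFTH_REQUIREMENT


standard = ["ai", "ni", "ai", "ni", "ta", "ta"]
print(lyrics_match(["ai", "ai", "ni", "ni", "wo"], standard))
```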
In some possible embodiments, the determining the target authentication result according to the matching result includes:
if the song name, the album name and the artist name of any standard music respectively match the song name, the album name and the artist name of the music, determining infringing recording copyright as the target identification result;
and if there is no standard music whose song name, album name and artist name respectively match the song name, the album name and the artist name of the music, determining non-infringing recording copyright as the target identification result.
In some possible embodiments, the determining the target authentication result according to the matching result includes:
if the song name and the lyrics of any standard music respectively match the song name and the lyrics of the music, determining infringing lyrics-and-composition copyright as the target identification result;
and if there is no standard music whose song name and lyrics respectively match the song name and the lyrics of the music, determining non-infringing lyrics-and-composition copyright as the target identification result.
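The two decision rules above can be sketched as follows, given per-field match flags for each standard music. The MatchResult structure and the function names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatchResult:
    """Match flags of the music against one standard music (hypothetical type)."""
    title: bool
    album: bool
    artist: bool
    lyrics: bool


def recording_result(results: List[MatchResult]) -> str:
    # Infringing if title, album and artist all match one standard music.
    hit = any(r.title and r.album and r.artist for r in results)
    return ("infringing recording copyright" if hit
            else "non-infringing recording copyright")


def lyrics_result(results: List[MatchResult]) -> str:
    # Infringing if title and lyrics both match one standard music.
    hit = any(r.title and r.lyrics for r in results)
    return ("infringing lyrics-and-composition copyright" if hit
            else "non-infringing lyrics-and-composition copyright")


results = [MatchResult(title=True, album=True, artist=True, lyrics=False)]
print(recording_result(results), "|", lyrics_result(results))
```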
In some possible embodiments, the method further comprises:
determining the music as sample music;
and if it is determined that a set number of sample music has been obtained, updating the identification vocabulary library according to the target identification result corresponding to each sample music and the target words contained in each sample music.
In some possible embodiments, the updating the identification vocabulary library according to the target identification result corresponding to each sample music and the target word included in each sample music includes:
for each target identification result, determining each sample music corresponding to that target identification result as target sample music; for each target word contained in the target sample music, determining a second number, namely the number of target sample music containing the target word; and adding the first N target words, in descending order of the second number, to the identification vocabulary library corresponding to the target identification result; wherein N is an integer not less than 1.
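The library update can be sketched as counting, per identification result, how many sample music items contain each target word, and then adding the top N words. The data layout (a list of result and word-set pairs), the function name update_libraries and the tie-breaking behaviour of Counter.most_common are assumptions of this example.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def update_libraries(samples: List[Tuple[str, Set[str]]],
                     libraries: Dict[str, Set[str]],
                     n: int = 1) -> Dict[str, Set[str]]:
    """Add the n most widespread target words of each result to its library."""
    per_result: Dict[str, Counter] = {}
    for result, words in samples:
        # Each sample contributes at most 1 per word because words is a set.
        per_result.setdefault(result, Counter()).update(words)
    for result, counts in per_result.items():
        top_words = [word for word, _ in counts.most_common(n)]
        libraries.setdefault(result, set()).update(top_words)
    return libraries


samples = [
    ("infringing", {"cover", "live"}),
    ("infringing", {"cover"}),
    ("non-infringing", {"official"}),
]
print(update_libraries(samples, {"infringing": set(), "non-infringing": set()}))
```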
The present disclosure provides a music authentication apparatus, the apparatus comprising:
an acquisition unit configured to acquire song information of music to be identified;
the processing unit is used for matching target words contained in the song information with words contained in a pre-configured identification vocabulary library; the identification vocabulary library comprises at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and the determining unit is used for determining the identification result corresponding to the matched word as the target identification result of the music if the matched word exists.
In some possible embodiments, the determining unit is specifically configured to determine that the authentication result corresponding to the matched word is non-infringement if the matched word belongs to the non-infringement vocabulary library, and determine that the non-infringement is the target authentication result; and if the matched words belong to the infringement vocabulary library, determining that the identification result corresponding to the matched words is infringement, and determining the infringement as the target identification result.
In some possible embodiments, the non-infringing vocabulary library includes a first non-infringing vocabulary library and a second non-infringing vocabulary library, and the infringing vocabulary library includes a first infringing vocabulary library and a second infringing vocabulary library; the determining unit is specifically configured to: if the matched word belongs to the first non-infringing vocabulary library, determine that the identification result corresponding to the matched word is non-infringing recording copyright, and determine non-infringing recording copyright as the target identification result, the first non-infringing vocabulary library comprising at least one preset word indicating that the recording copyright of standard music is not infringed; if the matched word belongs to the second non-infringing vocabulary library, determine that the identification result corresponding to the matched word is non-infringing lyrics-and-composition copyright, and determine non-infringing lyrics-and-composition copyright as the target identification result, the second non-infringing vocabulary library comprising at least one preset word indicating that the lyrics-and-composition copyright of standard music is not infringed; if the matched word belongs to the first infringing vocabulary library, determine that the identification result corresponding to the matched word is infringing recording copyright, and determine infringing recording copyright as the target identification result, the first infringing vocabulary library comprising at least one preset word indicating that the recording copyright of standard music is infringed; if the matched word belongs to the second infringing vocabulary library, determine that the identification result corresponding to the matched word is infringing lyrics-and-composition copyright, and determine infringing lyrics-and-composition copyright as the target identification result, the second infringing vocabulary library comprising at least one preset word indicating that the lyrics-and-composition copyright of standard music is infringed.
In some possible embodiments, the determining unit is further configured to determine a first letter sequence of the target word if there is no matching word; determining whether the first letter sequence is matched with a second letter sequence corresponding to preset standard music; the second letter sequence corresponding to any standard music is determined by the letter sequence of the standard words of the standard music; the standard words are determined according to song information of the standard music; and determining the target identification result according to the matching result.
In some possible embodiments, the determining unit is specifically configured to determine the target word as the first letter sequence if the target word is English; if the target word is Chinese, determine the pinyin of each character contained in the target word, and determine the first letter sequence according to the pinyin of each character.
In some possible embodiments, the apparatus further comprises: a first update unit;
the first updating unit is configured to, for each character contained in the target word, if the character matches a preset character to be replaced, replace the character with the preset target character corresponding to that character to be replaced.
In some possible embodiments, the determining unit is further configured to perform the subsequent step of matching the first letter sequence against the preset second letter sequences if the first letter sequence does not match the third letter sequence of any preset stop word.
In some possible embodiments, the first letter sequence and the second letter sequence include at least one of the following pairs: a first subsequence of song title words and a second subsequence of standard song title words; a third subsequence of artist name words and a fourth subsequence of standard artist name words; a fifth subsequence of album title words and a sixth subsequence of standard album title words; and a seventh subsequence of lyric words and an eighth subsequence of standard lyric words. The determining unit is specifically configured to determine whether the first letter sequence matches the second letter sequence corresponding to preset standard music by at least one of:
determining whether the first subsequence of song title words matches the second subsequence of standard song title words of each standard music, respectively;
determining whether the third subsequence of artist name words matches the fourth subsequence of standard artist name words of each standard music, respectively;
determining whether the fifth subsequence of album title words matches the sixth subsequence of standard album title words of each standard music, respectively;
determining whether the seventh subsequence of lyric words matches the eighth subsequence of standard lyric words of each standard music, respectively.
In some possible embodiments, the determining unit is specifically configured to determine a first similarity between the first subsequence and each of the second subsequences; if the first similarity corresponding to any standard music meets a preset first requirement, determining that the song name of the standard music is matched with the song name of the music; otherwise, determining that the song name of the music does not match the song name of any standard music.
In some possible embodiments, the determining unit is specifically configured to determine a second similarity between the third subsequence and each of the fourth subsequences; if the second similarity corresponding to any standard music meets a preset second requirement, determining that the artist name of the standard music is matched with the artist name of the music; otherwise, determining that the artist name of the music is not matched with the artist name of any standard music; or if the third subsequence is determined to contain any fourth subsequence, or any fourth subsequence contains the third subsequence, determining that the artist name of the standard music corresponding to the fourth subsequence with the containing relationship is matched with the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music.
In some possible embodiments, the determining unit is specifically configured to determine a third similarity between the fifth subsequence and each of the sixth subsequences; if the third similarity corresponding to any standard music meets a preset third requirement, determining that the album name of the standard music is matched with the album name of the music; otherwise, determining that the album name of the music does not match the album name of any standard music.
In some possible embodiments, the determining unit is specifically configured to determine the frequency of occurrence of each lyric word in the lyrics of the music; determine each lyric word whose frequency meets a preset fourth requirement as a target lyric word; for each standard music, determine a first number, namely the number of seventh subsequences of the target lyric words that are consistent with the eighth subsequences corresponding to that standard music, each eighth subsequence being the letter sequence of a standard lyric word whose frequency in the lyrics of the standard music meets the fourth requirement; if the first number corresponding to any standard music meets a preset fifth requirement, determine that the lyrics of the standard music match the lyrics of the music; otherwise, determine that the lyrics of the music do not match the lyrics of any standard music.
In some possible embodiments, the determining unit is specifically configured to determine infringing recording copyright as the target identification result if the song name, the album name and the artist name of any standard music respectively match the song name, the album name and the artist name of the music; and if there is no standard music whose song name, album name and artist name respectively match the song name, the album name and the artist name of the music, determine non-infringing recording copyright as the target identification result.
In some possible embodiments, the determining unit is specifically configured to determine infringing lyrics-and-composition copyright as the target identification result if the song name and the lyrics of any standard music respectively match the song name and the lyrics of the music; and if there is no standard music whose song name and lyrics respectively match the song name and the lyrics of the music, determine non-infringing lyrics-and-composition copyright as the target identification result.
In some possible embodiments, the apparatus further comprises: a second updating unit;
the second updating unit is configured to determine the music as sample music; and if it is determined that a set number of sample music has been obtained, update the identification vocabulary library according to the target identification result corresponding to each sample music and the target words contained in each sample music.
In some possible embodiments, the second updating unit is specifically configured to, for each target identification result, determine each sample music corresponding to that target identification result as target sample music; for each target word contained in the target sample music, determine a second number, namely the number of target sample music containing the target word; and add the first N target words, in descending order of the second number, to the identification vocabulary library corresponding to the target identification result; wherein N is an integer not less than 1.
The present disclosure provides an electronic device comprising at least a processor and a memory, the processor being adapted to carry out the steps of the music authentication method as described in any one of the above when executing a computer program stored in the memory.
The present disclosure provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music authentication method as described in any one of the above.
In the present disclosure, an identification vocabulary library is configured in advance and includes at least one of a non-infringing vocabulary library and an infringing vocabulary library. After song information of the music to be identified is acquired, the target words contained in the song information are matched against the words in the pre-configured identification vocabulary library. When a matched word exists, the identification result pre-configured for that word is taken directly as the target identification result of the music. This improves the efficiency and accuracy of music authentication, reduces the cost of manual authentication, and avoids the influence of individual experience on the accuracy of the result.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the disclosure. The objectives and other advantages of the disclosure may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without inventive efforts.
Fig. 1 is a schematic diagram of a music authentication process provided in an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a specific process for updating the identification vocabulary library according to an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a specific music authentication process according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a music authentication apparatus according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure will be described in further detail below with reference to the attached drawings, and it should be understood that the described embodiments are only a part of the embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort, shall fall within the scope of protection of the present disclosure.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
For convenience of understanding, some concepts involved in the embodiments of the present disclosure are explained below:
Recording copyright: the exclusive right that a recording producer enjoys by law in the recording products it produces; it is a neighboring right of copyright.
Word-and-song copyright: also called lyrics-and-composition copyright, it is the right that the creator of a musical work enjoys by law in the work he or she creates. It mainly comprises property rights, such as the rights of performance, reproduction, broadcasting and network transmission of the musical work, and moral rights, such as the right of attribution and the right to protect the integrity of the work.
Audio fingerprinting technology: a specific algorithm extracts the unique digital features of a piece of audio in the form of identifiers, which are used to identify massive numbers of sound samples or to track and locate the position of a sample in a database. As the core algorithm of automatic content identification, it is widely applied in music recognition, monitoring of copyrighted content in broadcasts, de-duplication of content libraries, television second-screen interaction, and the like.
Remix: a song or an album that has been re-arranged and re-mixed from an existing one.
Text cosine similarity algorithm: each text is segmented into words or characters and represented as a frequency vector, and the similarity of the two texts is evaluated by the cosine of the angle between the two vectors. The formula is as follows:
$$\cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
where $n$ is the total length of each vector, $A_i$ is the frequency with which the word or character represented by the $i$-th element of the vector for text A occurs in text A, and $B_i$ is the frequency with which the word or character represented by the $i$-th element of the vector for text B occurs in text B.
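The text cosine similarity defined above can be illustrated with a minimal sketch. It assumes character-level segmentation (the description also allows word-level segmentation), and the function name `cosine_similarity` and the choice to return 0.0 for empty input are assumptions made for illustration, not part of the disclosure.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the character-frequency vectors of two texts."""
    freq_a = Counter(text_a)
    freq_b = Counter(text_b)
    # The union of characters fixes the vector length n and the element order.
    vocab = sorted(set(freq_a) | set(freq_b))
    dot = sum(freq_a[t] * freq_b[t] for t in vocab)
    norm_a = math.sqrt(sum(freq_a[t] ** 2 for t in vocab))
    norm_b = math.sqrt(sum(freq_b[t] ** 2 for t in vocab))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: empty text is treated as dissimilar
    return dot / (norm_a * norm_b)
```

Because only frequencies are compared, two texts containing the same characters in a different order receive a similarity of 1; this is a property of the measure itself rather than of this sketch.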
The following introduces design ideas of the embodiments of the present disclosure.
In the related art, music authentication mainly relies on manual review, but this requires reviewers with deep industry experience, and it has high labor cost and low efficiency. To reduce labor cost and improve the efficiency of music authentication, methods such as direct text comparison and audio fingerprint comparison have been proposed. For example, infringement identification for recording copyright combines direct text comparison with audio fingerprint comparison, and infringement identification for word-and-song copyright uses direct text comparison of the lyrics.
The direct text comparison method determines whether one piece of music infringes another by checking whether the text characters of their song information are exactly identical. It enables quick screening, but it works well only for music that directly reuses the song information of the original music. Once the song information of an infringing piece has been modified from that of the original music, the method cannot accurately identify the infringement, so infringing music is missed and the accuracy of music authentication cannot be guaranteed.
For example, most music platforms normalize the text characters in song information according to the characteristics of their own pages, so some character strings in the song information of infringing music are replaced. The text of that song information then no longer matches the text of the song information of the original music, and when music authentication is later performed by direct text comparison, the infringing music is judged not to infringe the original music because the two texts differ, which reduces the accuracy of music authentication.
For example, suppose the song information of an infringing piece is the song information of the original music with the Chinese (full-width) parenthesis "（" replaced by the English (half-width) parenthesis "(". If music authentication is performed by direct text comparison, the infringing music may be judged not to infringe the original music merely because the English parenthesis "(" in its song information differs from the Chinese parenthesis "（" in the song information of the original music.
For another example, more and more users produce music themselves and upload it to a music library, especially cover versions, and modify the text of the song information of the original music: prefixes and suffixes are added, words are replaced with synonyms or homophones, or descriptive information is added. The modified song information then differs from that of the original music, and when music authentication is later performed by direct text comparison, the infringing music is judged not to infringe the original music because the two texts differ, which reduces the accuracy of music authentication.
The audio fingerprint comparison method identifies infringement mainly from the audio features of music. To use it, the music must be downloaded locally and parsed, its audio features extracted, and the similarity between those features and the previously extracted audio features of other music calculated; whether the music infringes other music is then determined from the calculated similarities. However, when an infringing piece applies operations such as reverberation or cover singing to the audio of the original music, its audio features may differ from those of the original music, so the similarity between the two is low, the infringement cannot be determined, and the accuracy of music authentication is reduced.
When the resources of the electronic device are relatively limited, the above methods suffer from low accuracy and efficiency and from incomplete infringement identification, so they are generally used only to assist a reviewer. They reduce the manual workload and improve the efficiency of infringement identification to some extent, but the manual workload remains large and the efficiency still depends on the speed of manual review.
The present disclosure provides music authentication methods, apparatuses, devices, and media. An identification vocabulary library is configured in advance and includes at least one of a non-infringing vocabulary library and an infringing vocabulary library. After song information of music to be identified is acquired, the target words contained in the song information are matched with the words contained in the pre-configured identification vocabulary library. When a matched word exists, the target identification result of the music is determined directly from the identification result pre-configured for the matched word, which improves the efficiency and accuracy of music authentication, reduces the cost of manual authentication, and avoids the influence of individual experience on the accuracy and efficiency of music authentication.
Fig. 1 is a schematic diagram of a music authentication process provided in an embodiment of the present disclosure, where the process includes:
s101: song information of music to be authenticated is acquired.
The music authentication method provided by the disclosure is applied to electronic equipment, and the electronic equipment can be intelligent equipment such as a smart phone, a smart tablet and the like, and can also be a server.
In one example, when a piece of music needs to be authenticated, a user may input, through an intelligent device, an operation requesting authentication of that music. The operation may be input in many ways: by clicking the music displayed on the display screen of the intelligent device, by inputting voice information, or by entering, through the display screen, text information requesting authentication of the music. In a specific implementation this can be set flexibly according to actual requirements and is not specifically limited herein. After receiving the operation, the intelligent device sends request information for authenticating the music to the electronic device. The request information carries the song information of the music to be authenticated.
The electronic device may be the intelligent device itself or another electronic device. It can be understood that the electronic device may obtain the song information of the music to be authenticated either from another device or according to an operation of the user.
It should be noted that the song information includes one or more of a song name, an album name, an artist name, and a lyric.
S102: matching target words contained in the song information with words contained in a pre-configured identification vocabulary library; the identification vocabulary library includes at least one of a non-infringing vocabulary library and an infringing vocabulary library.
In order to accurately and quickly identify music to be identified, an identification vocabulary library is configured in advance, and the identification vocabulary library contains words. The identification vocabulary library may include only a non-infringing vocabulary library including words included in song information of music other than infringing music, or may include only an infringing vocabulary library including words included in song information of infringing music. Of course, the identification vocabulary library may also include both the non-infringing vocabulary library and the infringing vocabulary library.
The words in the initialized identification vocabulary library are configured by staff, as needed, from words contained in the song information of protected music (referred to as standard music for convenience of description). While the electronic device subsequently identifies music with the music authentication method of the present disclosure, it can update the words in the identification vocabulary library according to the song information of the music to be identified and the target identification result.
After the song information of the music to be identified is acquired in step S101, word segmentation may be performed on the song information to obtain each target word it contains. Each target word is then matched with the words contained in the pre-configured identification vocabulary library, and the identification result of the music to be identified, namely whether it is infringing music, is determined from the matching result of each target word. Here, matching means checking whether a word identical to the target word is stored in the identification vocabulary library.
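The segmentation and exact-match lookup of step S102 might look like the sketch below. The library contents, the names `segment` and `find_matches`, the lowercasing, and the regular-expression segmentation are all assumptions made for illustration; the disclosure does not specify a segmentation algorithm, and Chinese text would need a real word segmenter.

```python
import re
from typing import List, Tuple

# Hypothetical library contents, for illustration only.
NON_INFRINGING_LIBRARY = {"licensed"}
INFRINGING_LIBRARY = {"remix"}


def segment(song_info: str) -> List[str]:
    """Naive segmentation: lowercase, then split on punctuation and whitespace.

    This stand-in only handles space-delimited text; it is not a
    Chinese word segmenter.
    """
    return [w for w in re.split(r"[\W_]+", song_info.lower()) if w]


def find_matches(song_info: str) -> List[Tuple[str, str]]:
    """Return (target word, library name) for every exact match."""
    matches = []
    for word in segment(song_info):
        if word in NON_INFRINGING_LIBRARY:
            matches.append((word, "non-infringing"))
        if word in INFRINGING_LIBRARY:
            matches.append((word, "infringing"))
    return matches
```

An empty result corresponds to the case in which no matched word exists and a further step is needed.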
S103: and if the matched words exist, determining the identification result corresponding to the matched words as the target identification result of the music.
After the target words included in the song information are matched with the words included in the pre-configured identification vocabulary library based on the above embodiment, if a word matching a target word is found, whether the music to be identified is infringing can be determined, and the identification result corresponding to the matched word is taken as the target identification result of the music to be identified.
In a possible implementation manner, the determining the identification result corresponding to the matched word as the target identification result of the music includes:
if the matched words belong to the non-infringement vocabulary library, determining that the identification result corresponding to the matched words is non-infringement, and determining the non-infringement as the target identification result;
and if the matched words belong to the infringement vocabulary library, determining that the identification result corresponding to the matched words is infringement, and determining the infringement as the target identification result.
After the words matched with the target words are acquired from the identification vocabulary library based on the above embodiment, determining the identification result corresponding to the matched words as the target identification result of the music to be identified includes the following cases:
In case one, the identification vocabulary library includes a non-infringing vocabulary library and the word matched with the target word belongs to it; the identification result corresponding to the matched word is then determined to be non-infringement, and the target identification result of the music to be identified is determined to be non-infringement, that is, the music is non-infringing music.
Since the copyright of music includes recording copyright and word-and-song copyright, the non-infringement vocabulary library may include a recording copyright non-infringement vocabulary library (referred to as the first non-infringement vocabulary library for convenience of description) and/or a word-and-song copyright non-infringement vocabulary library (referred to as the second non-infringement vocabulary library for convenience of description). The first non-infringement vocabulary library is determined from words contained in the song information of music that does not infringe the recording copyright of protected music, that is, of the preset standard music, and it contains at least one word that does not infringe the recording copyright of the preset standard music. The second non-infringement vocabulary library is determined from words contained in the song information of music that does not infringe the word-and-song copyright of protected music, that is, of the preset standard music, and it contains at least one word that does not infringe the word-and-song copyright of the preset standard music.
When the matched word is determined to belong to the first non-infringement vocabulary library, the music does not infringe the recording copyright of the preset standard music; the identification result corresponding to the matched word is determined to be non-infringement of recording copyright, and this is taken as the target identification result. When the matched word is determined to belong to the second non-infringement vocabulary library, the music does not infringe the word-and-song copyright of the preset standard music; the identification result corresponding to the matched word is determined to be non-infringement of word-and-song copyright, and this is taken as the target identification result.
In case two, the identification vocabulary library includes an infringing vocabulary library and the word matched with the target word belongs to it; the identification result corresponding to the matched word is then determined to be infringement, and the target identification result of the music to be identified is determined to be infringement, that is, the music is infringing music.
Since the copyright of music includes recording copyright and word-and-song copyright, the infringement vocabulary library may include a recording copyright infringement vocabulary library (referred to as the first infringement vocabulary library for convenience of description) and/or a word-and-song copyright infringement vocabulary library (referred to as the second infringement vocabulary library for convenience of description). The first infringement vocabulary library is determined from words contained in the song information of music that infringes the recording copyright of protected music, that is, of the preset standard music, and it contains at least one word that infringes the recording copyright of the preset standard music. The second infringement vocabulary library is determined from words contained in the song information of music that infringes the word-and-song copyright of protected music, that is, of the preset standard music, and it contains at least one word that infringes the word-and-song copyright of the preset standard music.
When the matched word is determined to belong to the first infringement vocabulary library, the music infringes the recording copyright of the preset standard music; the identification result corresponding to the matched word is determined to be infringement of recording copyright, and this is taken as the target identification result. When the matched word is determined to belong to the second infringement vocabulary library, the music infringes the word-and-song copyright of the preset standard music; the identification result corresponding to the matched word is determined to be infringement of word-and-song copyright, and this is taken as the target identification result.
Because the non-infringement vocabulary library is subdivided into the first and second non-infringement vocabulary libraries, and the infringement vocabulary library is subdivided into the first and second infringement vocabulary libraries, when a target word matches a word in the identification vocabulary library, the identification result corresponding to the matched word determines not only whether the music to be identified is infringing music but also, accurately, whether it infringes the word-and-song copyright and/or the recording copyright of the preset standard music.
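The four-way subdivision described above can be sketched as a mapping from each sub-library to its identification result. The `Verdict` enum, the `identify` function, the library contents, and the rule of returning the first match found are hypothetical; the disclosure does not state how several matches with different results would be reconciled.

```python
from enum import Enum
from typing import Dict, Iterable, Optional, Set


class Verdict(Enum):
    NO_RECORDING_INFRINGEMENT = "does not infringe recording copyright"
    NO_WORD_AND_SONG_INFRINGEMENT = "does not infringe word-and-song copyright"
    RECORDING_INFRINGEMENT = "infringes recording copyright"
    WORD_AND_SONG_INFRINGEMENT = "infringes word-and-song copyright"


# Hypothetical contents of the four sub-libraries.
LIBRARIES: Dict[Verdict, Set[str]] = {
    Verdict.NO_RECORDING_INFRINGEMENT: {"licensed"},
    Verdict.NO_WORD_AND_SONG_INFRINGEMENT: {"original"},
    Verdict.RECORDING_INFRINGEMENT: {"remix"},
    Verdict.WORD_AND_SONG_INFRINGEMENT: {"cover"},
}


def identify(target_words: Iterable[str]) -> Optional[Verdict]:
    """Return the result of the first matched word, or None if nothing matches."""
    for word in target_words:
        for verdict, words in LIBRARIES.items():
            if word in words:
                return verdict  # assumption: first match wins
    return None
```

A return value of `None` corresponds to the case handled in step S104, where no matched word exists.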
S104: and if no matched words exist, determining the target identification result of the music to be identified.
In a possible implementation manner, matching the target word with a word included in a pre-configured identification vocabulary library may cause a situation that a word matching the target word does not exist in the identification vocabulary library, so that the target identification result of the music to be identified cannot be determined according to the identification result corresponding to the matching word. In order to accurately determine the target authentication result of the music to be authenticated, the target authentication result of the music to be authenticated may be determined by means of manual authentication.
In another possible embodiment, the method further comprises:
if no matched words exist, determining a first letter sequence of the target words;
determining whether the first letter sequence is matched with a second letter sequence corresponding to preset standard music; the second letter sequence corresponding to any standard music is determined by the letter sequence of the standard words of the standard music; the standard words are determined according to song information of the standard music;
and determining the target identification result according to the matching result.
To reduce the manual workload and improve the efficiency and accuracy of music authentication, when it is determined that no word matching the target word exists in the identification vocabulary library, the letter sequence of the target word (referred to as the first letter sequence for convenience of description) is determined. The first letter sequence is then matched against the letter sequence corresponding to each preset standard music (referred to as the second letter sequence for convenience of description). This avoids the situation in which simple text replacement applied to the song information of the original music, for example replacing Chinese characters with pinyin or with homophones, prevents the words in the song information of infringing music from matching the words in the song information of the preset standard music.
The second letter sequence corresponding to any standard music is determined by the letter sequence of the standard words of the standard music, and the standard words are determined according to the words contained in the song information of the standard music.
It should be noted that the second letter sequence may be configured in advance by the staff, and the second letter sequence may be updated in real time subsequently in the music authentication process.
In one possible embodiment, the language of the target word may be considered in determining the first letter sequence of the target word.
For example, if the language of the target word is Chinese, the pinyin of each character in the target word may be determined, and the first letter sequence determined from the pinyin of each character.
For another example, if the language of the target word is English, the target word can be used directly as the first letter sequence.
It should be noted that the above example of determining the first letter sequence of the target word according to the language of the target word is only for convenience of description, and is not a limitation to determining the first letter sequence according to the language of the target word in the present disclosure.
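A minimal sketch of deriving the first letter sequence by language is given below. The four-entry `PINYIN` table and the function name `first_letter_sequence` are hypothetical; a real system would need a complete pinyin dictionary and a rule for characters with several readings, neither of which is specified in the disclosure.

```python
# Tiny hypothetical pinyin table, for illustration only.
PINYIN = {"梦": "meng", "想": "xiang", "天": "tian", "空": "kong"}


def first_letter_sequence(target_word: str) -> str:
    """Chinese characters become their pinyin; other characters are kept as-is."""
    parts = []
    for ch in target_word:
        # Characters outside the table (e.g. English letters) pass through,
        # so an English word is used directly as its own letter sequence.
        parts.append(PINYIN.get(ch, ch))
    return "".join(parts)
```

With this table, the Chinese word "梦想" and its pinyin spelling "mengxiang" yield the same letter sequence, which is the effect the pinyin-replacement case relies on.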
As a possible implementation, the method further comprises:
and aiming at each character contained in the target word, if the character is determined to be matched with the preset character to be replaced, replacing the character according to the preset target character corresponding to the matched character to be replaced.
Infringing music may escape identification because its song information was obtained from the song information of the original music by simplified/traditional conversion, by changing the case of English letters, by replacing symbols with similar symbols, or by replacing characters with pinyin, which reduces the accuracy of music authentication. To avoid this and further improve accuracy, each character in the target word may be normalized, that is, represented in a preset form: for example, English letters in the target word are lowercase, Chinese characters are simplified, and each symbol is one of several preset symbols. To this end, characters to be replaced and the normalized character corresponding to each are configured in advance, where a character to be replaced is a character that needs normalization. For example, if the normalized character is a simplified character, the character to be replaced is the corresponding traditional character; if the normalized character is a lowercase letter, the character to be replaced is the corresponding uppercase letter.
For each character in each target word, the character is matched against the pre-configured characters to be replaced. If it matches one, the character needs normalization and is replaced by the normalized character (referred to as the target character for convenience of description) pre-configured for the matched character to be replaced; if it matches none, the character needs no normalization and the next character is processed directly.
For example, if the character "夢" matches a pre-configured character to be replaced that is a traditional character and therefore needs normalization, it is replaced with the corresponding target character "梦" (dream).
For example, if the character "(" matches a pre-configured character to be replaced that is an English symbol, it is replaced with the Chinese symbol "（", the target character pre-configured for the matched character to be replaced.
For another example, if the character "A" matches a pre-configured character to be replaced that is an uppercase English letter, it is replaced with the lowercase English letter "a", the target character pre-configured for the matched character to be replaced.
It should be noted that the character to be replaced may be configured in advance by a worker, and the character to be replaced may be updated in real time in the music authentication process.
After each character in the target word has been normalized, the first letter sequence determined from the normalized target word is more accurate. This solves the problem that, when the song information of infringing music differs from that of the original music by simplified/traditional conversion, by case changes to English letters, by replacement of symbols with similar symbols, or by replacement of characters with pinyin, the first letter sequence fails to match the pre-configured second letter sequence and the accuracy of music authentication is reduced.
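The per-character normalization can be sketched as a lookup in a replacement table. The table below is a hypothetical, heavily abbreviated example (uppercase English letters, one traditional character, and one pair of parentheses); the name `normalize` is likewise an assumption.

```python
import string

# Hypothetical table: character to be replaced -> target character.
REPLACEMENTS = {c: c.lower() for c in string.ascii_uppercase}
REPLACEMENTS.update({
    "夢": "梦",  # traditional -> simplified
    "(": "（",  # English symbol -> Chinese symbol
    ")": "）",
})


def normalize(target_word: str) -> str:
    """Replace every character that matches a configured character to be replaced."""
    # Characters without a table entry need no normalization and are kept.
    return "".join(REPLACEMENTS.get(ch, ch) for ch in target_word)
```

Keeping the rules in a plain table mirrors the description, in which staff configure the characters to be replaced in advance and may update them during authentication.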
As another possible embodiment, the method further includes:
and if the first letter sequence is not matched with the third letter sequence of the preset stop word, executing the subsequent step of respectively matching the first letter sequence with the preset second letter sequence.
Modifier characters or words may be added to the song information of the original music, which affects the accuracy of music authentication. To further improve accuracy, stop words can therefore be configured in advance from the meaningless characters or words found in collected infringing music, for example additions to the song information of the original music such as "(tremble version)", "(boy version)", "-sweet girl version", "a song that is a muddy chicken skin knot before playing", and the like. After the first letter sequence is obtained based on the above embodiment, it may be determined whether the first letter sequence is the letter sequence of a pre-configured stop word (referred to as the third letter sequence for convenience of description), that is, the first letter sequence is matched against the third letter sequences of the pre-configured stop words. If the first letter sequence matches a pre-configured third letter sequence, the target word is a stop word; the target word and its first letter sequence are filtered out, and the subsequent step of matching the first letter sequence with the pre-configured second letter sequences is not performed. If the first letter sequence matches no pre-configured third letter sequence, the target word is not a stop word, and the subsequent step of matching the first letter sequence with the pre-configured second letter sequences is performed.
It should be noted that the stop word may be configured in advance by the staff, and the stop word may be updated in real time subsequently in the music authentication process.
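The stop-word check amounts to a set-membership filter applied before the comparison with the second letter sequences. The stop-word letter sequences and the function name below are hypothetical examples, not values from the disclosure.

```python
from typing import List

# Hypothetical third letter sequences of pre-configured stop words.
STOP_WORD_SEQUENCES = {"liveversion", "djversion"}


def filter_stop_words(first_sequences: List[str]) -> List[str]:
    """Keep only the first letter sequences that are not stop words."""
    return [s for s in first_sequences if s not in STOP_WORD_SEQUENCES]
```

Only the sequences that survive this filter would go on to be matched against the second letter sequences of the standard music.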
Since the song information includes one or more of the song name, the album name, the artist name, and the lyrics, when the first letter sequence is matched with the second letter sequence based on the above-described embodiment to determine the matching result, at least one of the following cases may be included:
In case one, the song information includes a song name, the first letter sequence includes a first subsequence of song title words, and the second letter sequence includes a second subsequence of standard song title words; the first subsequence of song title words is matched with the second subsequence of standard song title words of each standard music, respectively. The standard song title words are the words contained in the song title of the standard music.
In one possible embodiment, the determining whether the first subsequence of song title words matches the second subsequence of standard song title words for each standard music, respectively, comprises:
determining a first similarity of the first subsequence to each second subsequence;
if the first similarity corresponding to any standard music meets a preset first requirement, determining that the song name of the standard music is matched with the song name of the music;
otherwise, determining that the song name of the music does not match the song name of any standard music.
To determine whether the song name of the music to be identified matches the song name of a preset standard music, when the first subsequence of song title words of the music to be identified is matched with the second subsequence of standard song title words of each standard music, the similarity between the first subsequence and each second subsequence (referred to as the first similarity for convenience of description) is determined. The similarity may be expressed as cosine similarity, Euclidean distance, Hamming distance, and the like; the measure can be chosen flexibly as required in a specific implementation and is not specifically limited here. Whether a second subsequence matching the first subsequence exists is then determined from the first similarities.
For convenience and accuracy, a requirement (referred to as the first requirement for convenience of description) is preset for determining whether a second subsequence matching the first subsequence exists. The first requirement may be that the first similarity is greater than a preset similarity threshold, or that it is greater than the preset similarity threshold and is also the maximum among the first similarities, and the like. After the first similarities are obtained based on the above embodiment, for the first similarity corresponding to any standard music: if it meets the preset first requirement, the song name of that standard music is determined to match the song name of the music to be identified; if it does not meet the preset first requirement, the song name of that standard music is determined not to match the song name of the music to be identified.
It should be noted that each first similarity may be checked against the preset first requirement as soon as it is obtained; alternatively, after all the first similarities have been obtained, they may be checked against the preset first requirement sequentially or in random order.
With this method, the situation in which infringement of the original music by the music to be identified cannot be determined from the two song names, because the infringing music has tampered with the song name of the original music, can be effectively avoided, and the accuracy of music authentication is improved.
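The threshold-plus-maximum form of the first requirement can be sketched as follows. This is only an illustration under stated assumptions: the similarity is taken to be cosine similarity over letter-count vectors (one of several measures the text allows, and one that ignores letter order), and the function names, the dictionary layout, and the threshold value 0.9 are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(seq_a: str, seq_b: str) -> float:
    """Cosine similarity between the letter-count vectors of two letter sequences."""
    counts_a, counts_b = Counter(seq_a), Counter(seq_b)
    dot = sum(counts_a[ch] * counts_b[ch] for ch in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_song_title(first_subsequence, second_subsequences, threshold=0.9):
    """Return the id of the matching standard music, or None if none matches.

    second_subsequences maps a (hypothetical) standard-music id to the letter
    sequence of its song title. The first requirement is assumed here to be:
    similarity greater than the threshold and maximal among all first similarities.
    """
    best_id, best_score = None, threshold
    for music_id, second_subsequence in second_subsequences.items():
        score = cosine_similarity(first_subsequence, second_subsequence)
        if score > best_score:
            best_id, best_score = music_id, score
    return best_id
```

The same pattern would apply to the second and third similarities used for artist names and album names, with their own thresholds.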
Case two: if the song information includes artist name words, and the combination of the first letter sequence and the second letter sequence includes a third subsequence of artist name words and a fourth subsequence of standard artist name words, the third subsequence of artist name words is matched with the fourth subsequence of standard artist name words of each standard music, respectively. The standard artist name words are the words contained in the artist name of the standard music.
In one possible implementation, matching the third subsequence of artist name words with the fourth subsequence of standard artist name words of each standard music, respectively, includes:
determining a second similarity of the third subsequence to each fourth subsequence; if the second similarity corresponding to any standard music meets a preset second requirement, determining that the artist name of the standard music is matched with the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music.
In general, infringing music does not modify the artist name of the original music very much. For example, rather than directly replacing artist name A of the original music with artist name B, it replaces at least one character in artist name A with a homophone character. Therefore, when the third subsequence of the artist name of the music to be identified is matched with each fourth subsequence, the similarity between the third subsequence and each fourth subsequence (referred to as the second similarity for convenience of explanation) is determined. The similarity may be expressed as cosine similarity, Euclidean distance, Hamming distance, and the like; the measure can be chosen flexibly as required in a specific implementation and is not specifically limited here. Whether a fourth subsequence matching the third subsequence exists is then determined from the second similarities.
To make it convenient to determine whether the third subsequence matches any fourth subsequence, a requirement (referred to as the second requirement for convenience of explanation) is preset. The second requirement may be that the second similarity is greater than a preset similarity threshold, or that it is greater than the preset similarity threshold and is also the maximum among the second similarities, and so on. After the second similarities are obtained based on the above embodiment, for the second similarity corresponding to any standard music: if it meets the preset second requirement, the artist name of that standard music is determined to match the artist name of the music to be identified; if it does not meet the preset second requirement, the artist name of that standard music is determined not to match the artist name of the music to be identified.
It should be noted that each second similarity may be checked against the preset second requirement as soon as it is obtained; alternatively, after all the second similarities have been obtained, they may be checked against the preset second requirement sequentially or in random order.
In another possible implementation, matching the third subsequence of artist name words with the fourth subsequence of standard artist name words of each standard music, respectively, includes:
if the third subsequence contains any fourth subsequence, or any fourth subsequence contains the third subsequence, determining that the artist name of the standard music corresponding to the fourth subsequence with the inclusion relation is matched with the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music.
The artist name of the original music may be combined with several other names to form the artist name of the infringing music, or only some of the multiple artist names of the original music may be used as the artist name of the infringing music. For example, artist name A of the original music together with another artist name B is used as the artist name of the infringing music, or only artist name A out of artist names A, B, and C of the original music is used as the artist name of the infringing music. To handle these situations, for each fourth subsequence, it is determined whether the third subsequence is contained in the fourth subsequence or the fourth subsequence is contained in the third subsequence. If either holds, an inclusion relationship exists between the fourth subsequence and the third subsequence, and the artist name of the standard music corresponding to that fourth subsequence is determined to match the artist name of the music to be identified. If the third subsequence is not contained in the fourth subsequence and the fourth subsequence is not contained in the third subsequence, no inclusion relationship exists between them, and the artist name of the standard music corresponding to that fourth subsequence is determined not to match the artist name of the music to be identified.
With this method, the situation in which infringement of the original music cannot be determined from the artist name of the original music and the artist name of the music to be identified, because the infringing music has tampered with the artist name of the original music, can be effectively avoided, and the accuracy of music authentication is improved.
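The inclusion check on artist names can be sketched in a few lines. The function name and the dictionary layout are hypothetical, and skipping empty sequences is an added assumption (an empty string is trivially contained in every string and would otherwise match everything).

```python
def match_artist_by_inclusion(third_subsequence, fourth_subsequences):
    """Return the ids of standard music whose artist letter sequence contains,
    or is contained in, the artist letter sequence of the music to be identified.

    fourth_subsequences maps a (hypothetical) standard-music id to the letter
    sequence of its artist name.
    """
    if not third_subsequence:
        return []  # assumption: a missing artist name matches nothing
    matches = []
    for music_id, fourth_subsequence in fourth_subsequences.items():
        if not fourth_subsequence:
            continue
        if (third_subsequence in fourth_subsequence
                or fourth_subsequence in third_subsequence):
            matches.append(music_id)
    return matches
```

For instance, an artist field that appends a second name to the original artist still contains the original artist's letter sequence and is therefore reported as a match.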
Case three: if the song information includes album name words, and the combination of the first letter sequence and the second letter sequence includes a fifth subsequence of album name words and a sixth subsequence of standard album name words, the fifth subsequence of album name words is matched with the sixth subsequence of standard album name words of each standard music, respectively. The standard album name words are the words contained in the album name of the standard music.
In one possible implementation, matching the fifth subsequence of album name words with the sixth subsequence of standard album name words of each standard music, respectively, includes:
determining a third similarity of the fifth subsequence to each of the sixth subsequences;
if the third similarity corresponding to any standard music meets a preset third requirement, determining that the album name of the standard music is matched with the album name of the music;
otherwise, determining that the album name of the music does not match the album name of any standard music.
Infringing music generally does not modify the album name of the original music very much. For example, rather than directly replacing album name A of the original music with album name B, it replaces at least one character in album name A with a homophone character, and the like. Therefore, when the fifth subsequence of the album name of the music to be identified is matched with each sixth subsequence, the similarity between the fifth subsequence and each sixth subsequence (referred to as the third similarity for convenience of explanation) is determined. The third similarity may be expressed as cosine similarity, Euclidean distance, Hamming distance, and the like; the measure can be chosen flexibly as required in a specific implementation and is not specifically limited here. Whether a sixth subsequence matching the fifth subsequence exists is then determined from the third similarities.
To make it convenient to determine whether the fifth subsequence matches any sixth subsequence, a requirement (referred to as the third requirement for convenience of explanation) is preset. The third requirement may be that the third similarity is greater than a preset similarity threshold, or that it is greater than the preset similarity threshold and is also the maximum among the third similarities, and the like. After the third similarities are obtained based on the above embodiment, for the third similarity corresponding to any standard music: if it meets the preset third requirement, the album name of that standard music is determined to match the album name of the music to be identified; if it does not meet the preset third requirement, the album name of that standard music is determined not to match the album name of the music to be identified.
It should be noted that each third similarity may be checked against the preset third requirement as soon as it is obtained; alternatively, after all the third similarities have been obtained, they may be checked against the preset third requirement sequentially or in random order.
With this method, the situation in which infringement of the original music by the music to be identified cannot be determined from the two album names, because the infringing music has tampered with the album name of the original music, can be effectively avoided, and the accuracy of music authentication is improved.
Case four: if the song information includes lyric words, and the combination of the first letter sequence and the second letter sequence includes a seventh subsequence of lyric words and an eighth subsequence of standard lyric words, the seventh subsequence of lyric words is matched with the eighth subsequence of standard lyric words of each standard music, respectively. The standard lyric words are the words contained in the lyrics of the standard music.
In one possible embodiment, the determining whether the seventh subsequence of lyric words matches the eighth subsequence of standard lyric words of each standard music, respectively, comprises:
respectively determining the frequency of occurrence of each lyric word in the lyrics of the music;
determining each lyric word with the frequency meeting a preset fourth requirement as a target lyric word;
for each standard music, determining a first number, namely the number of seventh subsequences of the target lyric words that are consistent with the eighth subsequences corresponding to that standard music; each eighth subsequence is the letter sequence of a standard lyric word whose frequency in the lyrics of the standard music meets the fourth requirement;
if the first quantity corresponding to any standard music meets a preset fifth requirement, determining that the lyrics of the standard music are matched with the lyrics of the music;
otherwise, determining that the lyrics of the music do not match the lyrics of any standard music.
On an infringing platform, so that users can still read the lyrics, the content of the lyrics is usually not changed greatly; instead, the lyrics may be altered by changing sentence breaks, line breaks, and the like. Moreover, because the main content of lyrics is carried by the words that occur most frequently in them, the lyrics of infringing music generally contain the high-frequency words of the lyrics of the original music, and those words also occur frequently in the lyrics of the infringing music, so that the two sets of lyrics express the same content. Therefore, when the seventh subsequence of lyric words of the music to be identified is matched with the eighth subsequence of standard lyric words of each standard music, the frequency with which each lyric word occurs in the lyrics of the music to be identified can be determined, and each lyric word whose frequency meets the preset fourth requirement is selected (referred to as a target lyric word for convenience of description). Whether the lyrics of the music to be identified infringe the lyrics-and-composition copyright of a standard music is then determined based on the seventh subsequence of each target lyric word and the eighth subsequences corresponding to each standard music. Each eighth subsequence corresponding to any standard music is the letter sequence of a standard lyric word whose frequency of occurrence in the lyrics of that standard music meets the preset fourth requirement.
It should be noted that meeting the preset fourth requirement may mean that the frequency is greater than a preset frequency threshold, or that the word is among a preset number of top-ranked words when the words are sorted in descending order of frequency.
In one possible embodiment, to determine whether the lyrics of the music to be identified infringe the lyrics-and-composition copyright of a standard music, a requirement (referred to as the fifth requirement for convenience of explanation) is preset. The fifth requirement may be that the first number is greater than a preset number threshold, or that it is greater than the preset number threshold and is also the maximum among the first numbers, and the like. The requirement can be set flexibly according to actual needs in a specific implementation and is not specifically limited here. For each standard music, the number of seventh subsequences of the target lyric words that are consistent with the eighth subsequences corresponding to that standard music is determined (referred to as the first number for convenience of explanation), and whether the first number corresponding to that standard music meets the preset fifth requirement is then determined.
If the first quantity corresponding to the standard music is determined to meet the preset fifth requirement, determining that the lyrics of the standard music are matched with the lyrics of the music; and if the first quantity corresponding to the standard music is determined not to meet the preset fifth requirement, determining that the lyrics of the standard music are not matched with the lyrics of the music.
With this method, whether the lyrics of the music to be identified plagiarize the lyrics of the original music can be effectively determined, and hence whether the music to be identified infringes the lyrics-and-composition copyright of the original music, which improves the accuracy of music authentication.
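The lyric comparison above can be sketched as follows. The sketch assumes the fourth requirement is "among the `top_k` most frequent words" and the fifth requirement is "first number greater than `number_threshold`"; both parameter names, the function names, and the input layout (lists of letter sequences, one per lyric word) are hypothetical.

```python
from collections import Counter


def frequent_words(letter_sequences, top_k=3):
    """Letter sequences of the top_k most frequent lyric words.

    Assumption: the fourth requirement is 'among the top_k words by frequency'.
    """
    return {word for word, _ in Counter(letter_sequences).most_common(top_k)}


def match_lyrics(lyric_sequences, standard_lyrics, top_k=3, number_threshold=1):
    """Return the ids of standard music whose lyrics are judged to match.

    standard_lyrics maps a (hypothetical) standard-music id to the list of
    letter sequences of its lyric words. Assumption: the fifth requirement is
    'first number greater than number_threshold'.
    """
    targets = frequent_words(lyric_sequences, top_k)  # seventh subsequences
    matched = []
    for music_id, standard_sequences in standard_lyrics.items():
        eighth = frequent_words(standard_sequences, top_k)
        first_number = len(targets & eighth)  # consistent letter sequences
        if first_number > number_threshold:
            matched.append(music_id)
    return matched
```

Because only word frequencies are compared, moving sentence breaks or line breaks in the lyrics does not change the outcome.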
After the first letter sequence is matched with the second letter sequence corresponding to the preset standard music based on the above embodiment, the target identification result of the music to be identified is determined according to the obtained matching result.
In a possible implementation manner, determining the target authentication result of the music to be authenticated according to the obtained matching result includes any one of the following cases:
Case 1: the matching results between the song name, album name, and artist name of the music to be identified and the song name, album name, and artist name of each standard music, respectively, are obtained.
If the song name, the album name, and the artist name of a certain standard music respectively match the song name, the album name, and the artist name of the music to be identified, the music to be identified infringes the recording copyright of that standard music, and infringement of the recording copyright is determined as the target identification result.
If the song name of every standard music does not match the song name of the music to be identified, or the album name of every standard music does not match the album name of the music to be identified, or the artist name of every standard music does not match the artist name of the music to be identified, the music to be identified does not infringe the recording copyright of any standard music, and non-infringement of the recording copyright is determined as the target identification result.
In the case where no word matching the target word is found in the identification vocabulary library, whether the music to be identified is infringing music can still be determined by checking whether the song name, album name, and artist name of a standard music match those of the music to be identified. Whether the music to be identified infringes the recording copyright of the standard music can thus be determined accurately, which further improves the accuracy and granularity of music authentication, makes the authentication process more comprehensive, allows the target identification result to be determined quickly, and reduces the workload of staff.
Case 2: the matching results between the song name and lyrics of the music to be identified and the song name and lyrics of each standard music, respectively, are obtained.
If it is determined based on the above embodiment that the song name and the lyrics of a certain standard music respectively match the song name and the lyrics of the music to be identified, the music to be identified may infringe the lyrics-and-composition copyright of that standard music, and infringement of the lyrics-and-composition copyright is determined as the target identification result.
If it is determined based on the above embodiment that the song name of every standard music does not match the song name of the music to be identified, or the lyrics of every standard music do not match the lyrics of the music to be identified, the music to be identified may not infringe the lyrics-and-composition copyright of any standard music, and non-infringement of the lyrics-and-composition copyright is determined as the target identification result.
In the case where no word matching the target word is found in the identification vocabulary library, whether the music to be identified is infringing music can still be determined by checking whether the song name and lyrics of a standard music match those of the music to be identified. Whether the music to be identified infringes the lyrics-and-composition copyright of the standard music can thus be determined accurately, which further improves the accuracy and granularity of music authentication, makes the authentication process more comprehensive, allows the target identification result to be determined quickly, and reduces the manual workload.
Case 3: at least one of the song name, album name, artist name, and lyrics may be missing from the song information of the music to be identified, so that the methods in case 1 and case 2 cannot be used to determine whether the music to be identified infringes the recording copyright or the lyrics-and-composition copyright of a standard music. In this case, a prompt that the music to be identified requires manual identification can be issued in a preset notification mode.
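A minimal sketch of how the three cases could be combined is given below. The field names, result labels, and the `decide` function are hypothetical. One assumption goes slightly beyond the text: the non-infringement branch is taken whenever no single standard music matches on every required field, which also covers mixed situations (different fields matching different standard music) that the text does not explicitly address.

```python
RECORDING_FIELDS = ("song_name", "album_name", "artist_name")  # case 1
LYRICS_FIELDS = ("song_name", "lyrics")                         # case 2


def decide(available_fields, match_results, fields, infringing_label, clean_label):
    """Decide the target identification result for one type of copyright.

    available_fields: fields present in the song information of the music
        to be identified.
    match_results: {standard_music_id: {field: bool}} per-field matching results.
    fields: the fields required for this type of copyright.
    """
    if not set(fields) <= set(available_fields):
        return "manual_review"  # case 3: a required field is missing
    for result in match_results.values():
        if all(result.get(field, False) for field in fields):
            return infringing_label  # one standard music matches on every field
    return clean_label
```

Calling `decide` once with `RECORDING_FIELDS` and once with `LYRICS_FIELDS` would yield separate results for the recording copyright and the lyrics-and-composition copyright.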
S105: updating the identification vocabulary library according to the music to be identified and the corresponding target identification result.
When the target authentication result of the music to be authenticated is determined based on the above-described embodiment, the music to be authenticated may be determined as sample music. And then updating the identification vocabulary library according to the target identification result of the sample music and the target words contained in the song information of the sample music.
In one possible embodiment, since the music to be discriminated is massive, in order to reduce the number of sample music that needs to be processed, only music that is discriminated manually may be determined as the sample music.
In one possible embodiment, in order to avoid frequent updating of the authentication vocabulary library, an updating condition may be preset, and the updating condition may be a preset period or a preset set number. Taking the set number preset as an example, when it is determined that the set number of sample music has been obtained, the identification vocabulary library may be updated according to the target identification results respectively corresponding to the preset number of sample music and the target words included in the song information respectively corresponding to the preset number of sample music.
As a possible implementation manner, the updating the identification vocabulary library according to the target identification result corresponding to each sample music and the target words contained in each sample music includes:
determining each sample music corresponding to the target identification result as target sample music according to the target identification result corresponding to each sample music; for each target word contained in the target sample music, determining a second number of target sample music containing the target word; adding the first N target words into an identification word library corresponding to the target identification result according to the sequence of the second number from large to small; wherein N is an integer not less than 1.
The words contained in the identification vocabulary library are words that commonly appear in the song information of many pieces of infringing music, or words that commonly appear in the song information of many pieces of non-infringing music. Therefore, when the set number of sample music has been acquired, the sample music corresponding to a given target identification result (referred to as target sample music for convenience of explanation) can be determined from the target identification results of the set number of sample music. Then, for each target word contained in the target sample music, the number of target sample music containing that target word (referred to as the second number) is determined. The identification vocabulary library corresponding to the target identification result is updated according to each target word and its second number.
In a possible implementation manner, the top N target words, in descending order of the second number, are added to the identification vocabulary library corresponding to the target identification result, where N is an integer not less than 1.
For example, if the target identification result is non-infringement, the sample music whose target identification result is non-infringement is determined to be the target sample music. For each target word contained in the song information of the target sample music, the second number of target sample music containing that target word is determined. The top 3 target words, in descending order of the second number, are added to the non-infringing vocabulary library.
Similarly, the sample music whose target identification result is infringement of the lyrics-and-composition copyright is determined to be the target sample music. For each target word contained in the song information of the target sample music, the second number of target sample music containing that target word is determined. The top 3 target words, in descending order of the second number, are added to the vocabulary library for infringement of the lyrics-and-composition copyright.
As a possible implementation manner, in order to avoid adding repeated target words to the identification vocabulary library, after the first N target words are acquired based on the above-described embodiment, it is determined whether the target words are included in the identification vocabulary library for the N target words. If the target word is determined to be contained, the target word is not added into the identification word library; if the target word is determined not to be included, the target word is added to the identification vocabulary library.
As another possible implementation manner, in order to improve the accuracy of music authentication by identifying words in the vocabulary library, a number threshold may be preset, and based on the foregoing embodiment, after acquiring N target words ranked in the front, it is determined whether the second number corresponding to the target word is greater than the preset number threshold for the N target words. And adding the target words of which the second number is greater than a preset number threshold value in the N target words into an identification word library corresponding to the target identification result.
It should be noted that the number threshold set for different target authentication results may be the same or different.
As another possible implementation manner, after acquiring N target words ranked in the front, it may be determined whether a second number corresponding to the target word is greater than a preset number threshold for the N target words, and whether a word matching the target word is stored in the identification vocabulary library. And adding the target words of which the second number is greater than a preset number threshold value and which have no matched words in the identification word library into the identification word library corresponding to the target identification result.
By dynamically updating the identification vocabulary library through the method, the capability of identifying the music to be identified through the identification vocabulary library can be further improved, and the self-adaptability of the electronic equipment in the music authentication process is improved.
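The update procedure, including the optional number threshold and the duplicate check, can be sketched as follows. The function and parameter names are hypothetical, and breaking ties alphabetically is an added assumption made only to keep the ranking deterministic.

```python
from collections import Counter


def update_vocabulary(vocabulary, target_samples, top_n=3, number_threshold=0):
    """Add frequent target words of the target sample music to a vocabulary library.

    vocabulary: set of words for one target identification result (updated in place).
    target_samples: list of word collections, one per target sample music.
    Returns the list of words that were actually added.
    """
    # Second number: how many target sample music contain each word
    # (each sample counts at most once per word, hence the set()).
    second_number = Counter()
    for words in target_samples:
        second_number.update(set(words))
    # Rank by second number, descending; ties broken alphabetically (assumption).
    ranked = sorted(second_number.items(), key=lambda item: (-item[1], item[0]))
    added = []
    for word, count in ranked[:top_n]:
        if count > number_threshold and word not in vocabulary:
            vocabulary.add(word)
            added.append(word)
    return added
```

A separate vocabulary set would be kept for each target identification result, and the threshold may differ between them.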
Because the identification vocabulary library is configured in advance and includes at least one of a non-infringing vocabulary library and an infringing vocabulary library, after the song information of the music to be identified is acquired, the target words contained in the song information are matched with the words contained in the pre-configured identification vocabulary library. When a matching word exists, the target identification result of the music is determined directly from the identification result pre-configured for the matching word, which improves the efficiency and accuracy of music authentication, reduces the cost of manual authentication, and avoids the influence of individual experience on the accuracy of the identification result.
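The vocabulary lookup can be illustrated with a short sketch. It assumes each vocabulary library is a plain set of words and that, for each target word, the infringing library is consulted before the non-infringing one; the function name and result labels are hypothetical.

```python
def lookup_result(target_words, infringing_words, non_infringing_words):
    """Return an identification result taken from the vocabulary libraries,
    or None when no target word matches and further matching is needed."""
    for word in target_words:
        if word in infringing_words:
            return "infringing"
        if word in non_infringing_words:
            return "non_infringing"
    return None
```

A `None` result corresponds to falling through to the letter-sequence matching against standard music.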
The following describes a specific process for updating an authentication vocabulary library provided by the present disclosure by a specific embodiment, and fig. 2 is a schematic diagram of the specific process for updating the authentication vocabulary library provided by the embodiment of the present disclosure, where the process includes:
S201: determining the target authentication result of the music to be authenticated.
For convenience of describing the process of determining the target authentication result of the music to be authenticated in S201, the following detailed description is made with reference to fig. 3, where fig. 3 is a schematic diagram of a specific music authentication process provided in the embodiment of the present disclosure, and the process includes:
s301: song information of music to be authenticated is acquired.
S302: and performing word segmentation processing on the song information to obtain each target word contained in the song information.
The following steps S303 to S306 are performed for each target word included in the song information.
S303: matching the target word with the words contained in the infringing vocabulary library; if a matching word exists, executing S310, and if no matching word exists, executing S304.
The infringing vocabulary library may include a first infringing vocabulary library and a second infringing vocabulary library. The first infringing vocabulary library stores at least one preset word indicating infringement of the recording copyright of standard music, and the second infringing vocabulary library stores at least one preset word indicating infringement of the lyrics-and-composition copyright of standard music.
S304: the target word is matched with the words included in the non-infringement vocabulary library, and if there is a matching word, S311 is executed, and if there is no matching word, S305 is executed.
The non-infringing vocabulary library may include a first non-infringing vocabulary library and a second non-infringing vocabulary library. The first non-infringing vocabulary library stores at least one preset word indicating non-infringement of the recording copyright of standard music, and the second non-infringing vocabulary library stores at least one preset word indicating non-infringement of the lyrics-and-composition copyright of standard music.
S305: Standardize the target word to obtain the first letter sequence of the standardized target word.
The standardization includes character replacement, conversion to letters, and stop word filtering.
First, character replacement is performed on the target word: for each character contained in the target word, if the character matches a preset character to be replaced, the character is replaced with the preset target character corresponding to that character to be replaced.
Then, the target word that has undergone character replacement is converted to letters according to its language. For example, if the target word is English, the target word itself is taken as its first letter sequence; if the target word is Chinese, the first letter sequence is determined from the pinyin of each character contained in the target word.
Finally, the obtained first letter sequence is matched against the third letter sequence of each preset stop word, that is, stop word processing is performed. If the first letter sequence matches none of the third letter sequences, the subsequent step S306 is executed; if it matches the third letter sequence of a preset stop word, the first letter sequence is filtered out and the next target word is obtained.
Because the first letter sequence is determined from the standardized target word, it is more accurate. This avoids the loss of authentication accuracy that occurs when the first letter sequence fails to match the preconfigured second letter sequence because the song information of infringing music has been altered relative to that of the original music, for example by conversion between simplified and traditional Chinese, by changing the case of English letters, by substituting similar-looking symbols, or by replacing characters with pinyin.
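The three standardization stages of S305 can be illustrated with a minimal Python sketch. The lookup tables `CHAR_REPLACEMENTS`, `PINYIN` and `STOP_WORD_SEQUENCES`, the function names, and the lower-casing of letters are assumptions made for illustration only; the disclosure does not specify their contents, and a real system would load much larger tables.

```python
from typing import Optional

# Hypothetical tables (not from the disclosure).
CHAR_REPLACEMENTS = {"愛": "爱", "（": "(", "）": ")"}   # character to be replaced -> target character
PINYIN = {"爱": "ai", "情": "qing", "的": "de"}          # Chinese character -> pinyin
STOP_WORD_SEQUENCES = {"de", "the"}                      # third letter sequences of stop words


def replace_characters(word: str) -> str:
    """Stage 1: replace each character that matches a preset character to be replaced."""
    return "".join(CHAR_REPLACEMENTS.get(ch, ch) for ch in word)


def to_letter_sequence(word: str) -> str:
    """Stage 2: Chinese characters become pinyin; other characters are kept, lower-cased (assumption)."""
    return "".join(PINYIN.get(ch, ch.lower()) for ch in word)


def standardize(word: str) -> Optional[str]:
    """Return the first letter sequence, or None if stage 3 filters it out as a stop word."""
    sequence = to_letter_sequence(replace_characters(word))
    return None if sequence in STOP_WORD_SEQUENCES else sequence
```

With these tables, a traditional-character title word and its simplified form yield the same letter sequence, which is the effect the paragraph above describes.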
S306: Match the first letter sequence against the second letter sequence corresponding to the preset standard music to obtain a matching result.
The first letter sequence and the second letter sequence comprise at least one of the following pairs: a first subsequence of song title words and a second subsequence of standard song title words; a third subsequence of artist name words and a fourth subsequence of standard artist name words; a fifth subsequence of album title words and a sixth subsequence of standard album title words; and a seventh subsequence of lyric words and an eighth subsequence of standard lyric words. Whether the first letter sequence matches the second letter sequence corresponding to the preset standard music is determined in at least one of the following ways:
Determining a first similarity between the first subsequence of song title words and the second subsequence of standard song title words of each standard music; if the first similarity corresponding to any standard music meets a preset first requirement, determining that the song name of that standard music matches the song name of the music; otherwise, determining that the song name of the music does not match the song name of any standard music.
Determining a second similarity between the third subsequence of artist name words and the fourth subsequence of standard artist name words of each standard music; if the second similarity corresponding to any standard music meets a preset second requirement, determining that the artist name of that standard music matches the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music. Alternatively, if the third subsequence of artist name words contains the fourth subsequence of standard artist name words of any standard music, or any fourth subsequence contains the third subsequence, determining that the artist name of the standard music corresponding to the fourth subsequence in the containment relationship matches the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music.
Determining a third similarity between the fifth subsequence of album title words and the sixth subsequence of standard album title words of each standard music; if the third similarity corresponding to any standard music meets a preset third requirement, determining that the album name of that standard music matches the album name of the music; otherwise, determining that the album name of the music does not match the album name of any standard music.
Determining the frequency with which each lyric word appears in the lyrics of the music; determining each lyric word whose frequency meets a preset fourth requirement as a target lyric word; for each standard music, determining a first number, namely the number of seventh subsequences of the target lyric words that are identical to eighth subsequences corresponding to that standard music, where an eighth subsequence is the letter sequence of a standard lyric word whose frequency in the lyrics of the standard music meets the fourth requirement; if the first number corresponding to any standard music meets a preset fifth requirement, determining that the lyrics of that standard music match the lyrics of the music; otherwise, determining that the lyrics of the music do not match the lyrics of any standard music.
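The similarity-based comparisons listed above (song title, artist name, album title) share one shape: compute a similarity against every standard music and keep those that meet a preset requirement. The sketch below illustrates this for song titles. The disclosure does not name a similarity measure, so the use of `difflib.SequenceMatcher`, the threshold value 0.8, and the function names are assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two letter sequences (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def match_titles(first_subsequence: str,
                 second_subsequences: Dict[str, str],
                 threshold: float = 0.8) -> List[str]:
    """Return the ids of standard music whose title sequence meets the (assumed) first requirement."""
    return [
        music_id
        for music_id, second in second_subsequences.items()
        if similarity(first_subsequence, second) >= threshold
    ]
```

An empty result corresponds to the "otherwise" branch: the song name of the music does not match the song name of any standard music.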
In this way, whether the music to be identified is a copy of a preset standard music can be determined effectively from the song information of both, so that it can subsequently be determined whether the music to be identified infringes the lyrics-and-composition copyright and/or the recording copyright of the preset standard music, which improves the accuracy of music authentication.
S307: Determine whether matching results have been obtained between the song name, album name, artist name and lyrics of the music to be identified and the song name, album name, artist name and lyrics of each standard music, respectively. If so, execute S308; otherwise, execute S309.
S308: Determine the target identification result of the music to be identified according to each obtained matching result.
Determining the target identification result of the music to be identified according to each obtained matching result includes: if the song name, album name and artist name of any standard music respectively match the song name, album name and artist name of the music, determining infringement of the recording copyright as the target identification result; if no standard music exists whose song name, album name and artist name respectively match the song name, album name and artist name of the music, determining non-infringement of the recording copyright as the target identification result.
When no word matching the target word is found in the identification vocabulary library, whether the music to be identified infringes the recording copyright of the standard music can still be determined accurately by checking whether the song name, album name and artist name of any standard music match those of the music to be identified. This improves the accuracy and granularity of music authentication, makes the authentication process more comprehensive, allows the target identification result to be determined quickly, and reduces manual workload.
Determining the target identification result of the music to be identified according to each obtained matching result further includes: if the song name and lyrics of any standard music respectively match the song name and lyrics of the music, determining infringement of the lyrics-and-composition copyright as the target identification result; if no standard music exists whose song name and lyrics respectively match the song name and lyrics of the music, determining non-infringement of the lyrics-and-composition copyright as the target identification result.
When no word matching the target word is found in the identification vocabulary library, whether the music to be identified infringes the lyrics-and-composition copyright of the standard music can still be determined accurately by checking whether the song name and lyrics of any standard music match those of the music to be identified. This improves the accuracy and granularity of music authentication, makes the authentication process more comprehensive, allows the target identification result to be determined quickly, and reduces manual workload.
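The two decision rules of S308 can be sketched as follows. The `MatchResult` record, the `decide` function and the result labels are hypothetical names introduced for illustration; the sketch assumes the per-field matching results for every standard music have already been obtained in S306.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MatchResult:
    """Per-field matching results against one standard music (hypothetical structure)."""
    title: bool
    album: bool
    artist: bool
    lyrics: bool


def decide(results: Dict[str, MatchResult]) -> Dict[str, str]:
    """Apply the two S308 rules over all standard music."""
    # Recording copyright: title, album and artist of one standard music all match.
    recording = any(r.title and r.album and r.artist for r in results.values())
    # Lyrics-and-composition copyright: title and lyrics of one standard music both match.
    composition = any(r.title and r.lyrics for r in results.values())
    return {
        "recording": "infringing" if recording else "non-infringing",
        "lyrics_and_composition": "infringing" if composition else "non-infringing",
    }
```

Note that each rule requires all of its fields to match the same standard music; a title match against one standard music and an album match against another do not combine.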
S309: the target authentication result of the music to be authenticated is determined manually.
S310: infringement is determined as a target authentication result of music to be authenticated.
S311: and determining non-infringement as a target authentication result of the music to be authenticated.
S202: Determine the music to be identified as sample music, and store the song information of the sample music together with the corresponding target identification result.
S203: When the number of stored sample music is determined to be greater than a preset number, collect the target identification result corresponding to each sample music.
The following steps S204 to S207 are executed for the target identification result corresponding to each sample music:
S204: Determine each sample music corresponding to the target identification result as target sample music, and perform word segmentation on the song information of each target sample music to obtain the target words contained in each target sample music.
S205: For each target word obtained in S204, determine a second number, namely the number of target sample music containing the target word.
S206: Determine the top N target words in descending order of the second number.
S207: Among the top N target words, determine those whose second number is greater than a preset number threshold and for which no matching word is stored in the identification vocabulary library, and store them in the identification vocabulary library corresponding to the target identification result.
Steps S204 to S207 dynamically update the identification vocabulary library, which further improves the ability to identify music through the library and makes the electronic device more adaptive during music authentication.
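Steps S205 to S207 amount to a document-frequency count followed by a top-N selection and a threshold filter. A minimal sketch, assuming word segmentation (S204) has already produced a list of target words per target sample music; the function name `update_library` and the default values of `top_n` and `count_threshold` are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List, Set


def update_library(samples: Iterable[Iterable[str]],
                   library: Set[str],
                   top_n: int = 3,
                   count_threshold: int = 1) -> List[str]:
    """Add frequent new target words to `library` (mutated in place); return the words added."""
    second_number = Counter()
    for words in samples:
        # S205: count each word once per sample music, however often it repeats inside it.
        second_number.update(set(words))
    added = []
    # S206: top N words by descending second number.
    for word, count in second_number.most_common(top_n):
        # S207: keep words above the threshold that the library does not already hold.
        if count > count_threshold and word not in library:
            library.add(word)
            added.append(word)
    return added
```

The function would be called once per target identification result, each time with the identification vocabulary library that corresponds to that result.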
The present disclosure also provides a music authentication apparatus, and fig. 4 is a schematic structural diagram of a music authentication apparatus provided in an embodiment of the present disclosure, where the apparatus includes:
an acquisition unit 41 for acquiring song information of music to be authenticated;
a processing unit 42, configured to match target words included in the song information with words included in a pre-configured identification vocabulary library; the identification vocabulary library comprises at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and the determining unit 43 is configured to determine, if there is a matching term, an authentication result corresponding to the matching term as a target authentication result of the music.
Because the music authentication apparatus solves the problem on a principle similar to that of the music authentication method, its implementation may refer to the implementation of the method, and repeated details are omitted.
In some possible embodiments, the determining unit 43 is specifically configured to determine that the authentication result corresponding to the matched word is non-infringement if the matched word belongs to the non-infringement vocabulary library, and determine that the non-infringement is the target authentication result; and if the matched words belong to the infringement vocabulary library, determining that the identification result corresponding to the matched words is infringement, and determining the infringement as the target identification result.
In some possible embodiments, the non-infringing vocabulary library includes a first non-infringing vocabulary library and a second non-infringing vocabulary library, the infringing vocabulary library includes a first infringing vocabulary library and a second infringing vocabulary library, and the determining unit 43 is specifically configured to: if the matched word belongs to the first non-infringing vocabulary library, determine that the identification result corresponding to the matched word is non-infringement of the recording copyright, and determine non-infringement of the recording copyright as the target identification result, where the first non-infringing vocabulary library contains at least one preset word indicating that the recording copyright of standard music is not infringed; if the matched word belongs to the second non-infringing vocabulary library, determine that the identification result corresponding to the matched word is non-infringement of the lyrics-and-composition copyright, and determine non-infringement of the lyrics-and-composition copyright as the target identification result, where the second non-infringing vocabulary library contains at least one preset word indicating that the lyrics-and-composition copyright of standard music is not infringed; if the matched word belongs to the first infringing vocabulary library, determine that the identification result corresponding to the matched word is infringement of the recording copyright, and determine infringement of the recording copyright as the target identification result, where the first infringing vocabulary library contains at least one preset word indicating that the recording copyright of standard music is infringed; if the matched word belongs to the second infringing vocabulary library, determine that the identification result corresponding to the matched word is infringement of the lyrics-and-composition copyright, and determine infringement of the lyrics-and-composition copyright as the target identification result, where the second infringing vocabulary library contains at least one preset word indicating that the lyrics-and-composition copyright of standard music is infringed.
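The four-library lookup described above can be pictured as a mapping from identification result to a set of words. In the sketch below, the example words, the dictionary keys, the function name `identify`, and the order in which the libraries are consulted are all assumptions for illustration; the disclosure only states that each library holds at least one preset word.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical library contents, keyed by the identification result they imply.
LIBRARIES: Dict[str, Set[str]] = {
    "infringing recording copyright": {"cover"},
    "infringing lyrics-and-composition copyright": {"adapted"},
    "non-infringing recording copyright": {"official"},
    "non-infringing lyrics-and-composition copyright": {"original"},
}


def identify(target_words: Iterable[str],
             libraries: Dict[str, Set[str]]) -> Optional[str]:
    """Return the result of the first library containing a target word, or None if none match."""
    for word in target_words:
        for result, vocabulary in libraries.items():
            if word in vocabulary:
                return result
    # No matching word: fall through to letter-sequence matching against standard music.
    return None
```

A `None` return corresponds to the case handled by the following embodiment, in which the first letter sequence of the target word is determined and compared with the standard music.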
In some possible embodiments, the determining unit 43 is further configured to determine a first letter sequence of the target word if there is no matching word; determining whether the first letter sequence is matched with a second letter sequence corresponding to preset standard music; the second letter sequence corresponding to any standard music is determined by the letter sequence of the standard words of the standard music; the standard words are determined according to song information of the standard music; and determining the target identification result according to the matching result.
In some possible embodiments, the determining unit 43 is specifically configured to determine the target word as the first letter sequence if the target word is English; if the target word is Chinese, determine the pinyin of each character contained in the target word, and determine the first letter sequence according to the pinyin of each character.
In some possible embodiments, the apparatus further comprises: a first updating unit;
the first updating unit is configured to, for each character contained in the target word, if the character matches a preset character to be replaced, replace the character with the preset target character corresponding to the matched character to be replaced.
In some possible embodiments, the determining unit 43 is further configured to perform a subsequent step of matching the first letter sequence with the preconfigured second letter sequence, respectively, if the first letter sequence does not match the preconfigured third letter sequence of the stop word.
In some possible embodiments, the determining unit 43 is specifically configured to determine whether the first letter sequence matches the second letter sequence corresponding to the preset standard music by at least one of the following methods:
determining whether said first subsequence of song title terms respectively matches said second subsequence of standard song title terms for each standard music;
determining whether the third subsequence of artist name words matches the fourth subsequence of standard artist name words for each standard music, respectively;
determining whether the fifth subsequence of album title words respectively matches the sixth subsequence of standard album title words for each of the standard music;
determining whether the seventh subsequence of lyric words matches the eighth subsequence of standard lyric words for each standard music, respectively.
In some possible embodiments, the determining unit 43 is specifically configured to determine a first similarity between the first subsequence and each of the second subsequences; if the first similarity corresponding to any standard music meets a preset first requirement, determining that the song name of the standard music is matched with the song name of the music; otherwise, determining that the song name of the music does not match the song name of any standard music.
In some possible embodiments, the determining unit 43 is specifically configured to determine a second similarity between the third subsequence and each of the fourth subsequences; if the second similarity corresponding to any standard music meets a preset second requirement, determining that the artist name of the standard music is matched with the artist name of the music; otherwise, determining that the artist name of the music is not matched with the artist name of any standard music; or if the third subsequence is determined to contain any fourth subsequence, or any fourth subsequence contains the third subsequence, determining that the artist name of the standard music corresponding to the fourth subsequence with the containing relationship is matched with the artist name of the music; otherwise, determining that the artist name of the music does not match the artist name of any standard music.
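The containment alternative for artist names is simple enough to show directly. The sketch below treats the third and fourth subsequences as plain strings; the function names are hypothetical, and the guard against empty sequences is an added assumption (an empty string is trivially contained in every string, which would otherwise match everything).

```python
from typing import Dict, List


def artist_contained(third: str, fourth: str) -> bool:
    """True if either letter sequence contains the other; empty sequences never match (assumption)."""
    if not third or not fourth:
        return False
    return third in fourth or fourth in third


def matching_artists(third: str, fourth_subsequences: Dict[str, str]) -> List[str]:
    """Ids of standard music whose artist-name sequence is in a containment relationship with `third`."""
    return [
        music_id
        for music_id, fourth in fourth_subsequences.items()
        if artist_contained(third, fourth)
    ]
```

Containment catches cases where extra words are appended to, or dropped from, an artist name, which a strict similarity threshold might miss.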
In some possible embodiments, the determining unit 43 is specifically configured to determine a third similarity between the fifth sub-sequence and each sixth sub-sequence; if the third similarity corresponding to any standard music meets a preset third requirement, determining that the album name of the standard music is matched with the album name of the music; otherwise, determining that the album name of the music does not match the album name of any standard music.
In some possible embodiments, the determining unit 43 is specifically configured to determine the frequency with which each lyric word appears in the lyrics of the music; determine each lyric word whose frequency meets a preset fourth requirement as a target lyric word; for each standard music, determine a first number, namely the number of seventh subsequences of the target lyric words that are identical to eighth subsequences corresponding to that standard music, where an eighth subsequence is the letter sequence of a standard lyric word whose frequency in the lyrics of the standard music meets the fourth requirement; if the first number corresponding to any standard music meets a preset fifth requirement, determine that the lyrics of that standard music match the lyrics of the music; otherwise, determine that the lyrics of the music do not match the lyrics of any standard music.
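The lyric comparison reduces to intersecting two sets of high-frequency letter sequences and counting the overlap. A minimal sketch follows, in which the fourth requirement is assumed to be "appears at least `min_count` times" and the fifth requirement "at least `min_shared` shared sequences"; both interpretations, the default values and the function names are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Set


def frequent_sequences(lyric_sequences: Iterable[str], min_count: int = 2) -> Set[str]:
    """Letter sequences of lyric words whose frequency meets the (assumed) fourth requirement."""
    counts = Counter(lyric_sequences)
    return {sequence for sequence, count in counts.items() if count >= min_count}


def lyrics_match(seventh: Iterable[str],
                 standard: Iterable[str],
                 min_count: int = 2,
                 min_shared: int = 2) -> bool:
    """Compare the music's frequent lyric sequences with those of one standard music."""
    targets = frequent_sequences(seventh, min_count)
    eighth = frequent_sequences(standard, min_count)
    first_number = len(targets & eighth)   # sequences present in both sets
    return first_number >= min_shared      # (assumed) fifth requirement
```

Comparing only frequent words keeps the check cheap and tolerant of small edits to the lyrics, at the cost of ignoring word order.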
In some possible embodiments, the determining unit 43 is specifically configured to determine infringement of the recording copyright as the target identification result if the song name, album name and artist name of any standard music respectively match the song name, album name and artist name of the music; and to determine non-infringement of the recording copyright as the target identification result if no standard music exists whose song name, album name and artist name respectively match the song name, album name and artist name of the music.
In some possible embodiments, the determining unit 43 is specifically configured to determine infringement of the lyrics-and-composition copyright as the target identification result if the song name and lyrics of any standard music respectively match the song name and lyrics of the music; and to determine non-infringement of the lyrics-and-composition copyright as the target identification result if no standard music exists whose song name and lyrics respectively match the song name and lyrics of the music.
In some possible embodiments, the apparatus further comprises: a second updating unit;
the second updating unit is configured to determine the music as sample music; and, if it is determined that a set number of sample music has been obtained, update the identification vocabulary library according to the target identification result corresponding to each sample music and the target words contained in each sample music.
In some possible embodiments, the second updating unit is specifically configured to, for a target identification result corresponding to each sample music, determine each sample music corresponding to the target identification result as a target sample music; for each target word contained in the target sample music, determining a second number of target sample music containing the target word; adding the first N target words into an identification word library corresponding to the target identification result according to the sequence of the second number from large to small; wherein N is an integer not less than 1.
Fig. 5 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure. On the basis of the foregoing embodiments, an embodiment of the present disclosure further provides an electronic device which, as shown in Fig. 5, includes a processor 51, a communication interface 52, a memory 53 and a communication bus 54, where the processor 51, the communication interface 52 and the memory 53 communicate with each other through the communication bus 54;
the memory 53 has stored therein a computer program which, when executed by the processor 51, causes the processor 51 to perform the steps of:
acquiring song information of music to be identified;
matching target words contained in the song information with words contained in a pre-configured identification vocabulary library; the identification vocabulary library comprises at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and if the matched words exist, determining the identification result corresponding to the matched words as the target identification result of the music.
Because the principle of the electronic device for solving the problem is similar to the music authentication method, the implementation of the electronic device can refer to the implementation of the method, and repeated details are not repeated.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface 52 is used for communication between the above-described electronic apparatus and other apparatuses.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
On the basis of the foregoing embodiments, the embodiments of the present disclosure further provide a computer-readable storage medium, in which a computer program executable by a processor is stored, and when the program runs on the processor, the processor is caused to execute the following steps:
acquiring song information of music to be identified;
matching target words contained in the song information with words contained in a pre-configured identification vocabulary library; the identification vocabulary library comprises at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and if the matched words exist, determining the identification result corresponding to the matched words as the target identification result of the music.
Since the computer-readable storage medium solves the problem on a principle similar to that of the music authentication method, its specific implementation may refer to the implementation of the music authentication method, and repeated details are omitted.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the present disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims (10)

1. A music authentication method, characterized in that the method comprises:
acquiring song information of music to be identified;
matching target words contained in the song information with words contained in a pre-configured identification vocabulary library; the identification vocabulary library comprises at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and if the matched words exist, determining the identification result corresponding to the matched words as the target identification result of the music.
2. The method of claim 1, wherein the determining the identification result corresponding to the matched word as the target identification result of the music comprises:
if the matched words belong to the non-infringement vocabulary library, determining that the identification result corresponding to the matched words is non-infringement, and determining the non-infringement as the target identification result;
and if the matched words belong to the infringement vocabulary library, determining that the identification result corresponding to the matched words is infringement, and determining the infringement as the target identification result.
3. The method of claim 1, further comprising:
if no matched words exist, determining a first letter sequence of the target words;
determining whether the first letter sequence is matched with a second letter sequence corresponding to preset standard music; the second letter sequence corresponding to any standard music is determined by the letter sequence of the standard words of the standard music; the standard words are determined according to song information of the standard music;
and determining the target identification result according to the matching result.
4. The method of claim 3, wherein the first letter sequence and the second letter sequence comprise at least one of the following pairs: a first subsequence of song title words and a second subsequence of standard song title words; a third subsequence of artist name words and a fourth subsequence of standard artist name words; a fifth subsequence of album title words and a sixth subsequence of standard album title words; and a seventh subsequence of lyric words and an eighth subsequence of standard lyric words; and wherein whether the first letter sequence matches the second letter sequence corresponding to the preset standard music is determined by at least one of:
determining whether said first subsequence of song title terms respectively matches said second subsequence of standard song title terms for each standard music;
determining whether the third subsequence of artist name words matches the fourth subsequence of standard artist name words for each standard music, respectively;
determining whether the fifth subsequence of album title words respectively matches the sixth subsequence of standard album title words for each of the standard music;
determining whether the seventh subsequence of lyric words matches the eighth subsequence of standard lyric words for each standard music, respectively.
5. The method of claim 4, wherein determining the target authentication result according to the matching result comprises:
if the song name, the album name and the artist name of any standard music are respectively matched with the song name, the album name and the artist name of the music, determining infringement recording copyright as the target identification result;
and if no standard music exists whose song name, album name and artist name respectively match the song name, album name and artist name of the music, determining the non-infringement recording copyright as the target identification result.
6. The method of claim 4, wherein determining the target authentication result according to the matching result comprises:
if the song name and the lyrics of any standard music respectively match the song name and the lyrics of the music, determining infringement of the lyrics-and-composition copyright as the target identification result;
and if no standard music exists whose song name and lyrics respectively match the song name and the lyrics of the music, determining non-infringement of the lyrics-and-composition copyright as the target identification result.
7. The method according to any one of claims 3-6, further comprising:
determining the music as sample music;
and if it is determined that a set number of sample music items have been obtained, updating the identification vocabulary library according to the target identification result corresponding to each sample music item and the target words contained in each sample music item.
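Claim 7 states that the vocabulary library is updated once a set number of samples has been collected, but it does not specify the update rule. The sketch below is one possible reading, under a loudly stated assumption: a word is added to the infringing library if it appears in at least `min_count` infringing samples and in no non-infringing sample of the batch, and symmetrically for the non-infringing library. The class `VocabularyUpdater` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


class VocabularyUpdater:
    """Collects authenticated samples and updates the libraries in batches."""

    def __init__(self, batch_size: int, min_count: int = 2) -> None:
        self.batch_size = batch_size      # the "set number" of samples
        self.min_count = min_count        # assumed frequency threshold
        self.samples: List[Tuple[Set[str], bool]] = []
        self.infringing: Set[str] = set()
        self.non_infringing: Set[str] = set()

    def add_sample(self, target_words: Iterable[str], infringing: bool) -> bool:
        """Store one sample; return True if this call triggered an update."""
        self.samples.append((set(target_words), infringing))
        if len(self.samples) >= self.batch_size:
            self._update()
            self.samples.clear()
            return True
        return False

    def _update(self) -> None:
        # Count, per word, how many samples of each result contain it.
        bad: Counter = Counter()
        good: Counter = Counter()
        for words, infringing in self.samples:
            (bad if infringing else good).update(words)
        # Assumed rule: keep only words seen often enough on one side
        # and never on the other side within this batch.
        for word, n in bad.items():
            if n >= self.min_count and word not in good:
                self.infringing.add(word)
        for word, n in good.items():
            if n >= self.min_count and word not in bad:
                self.non_infringing.add(word)
```

Excluding words that occur on both sides keeps ambiguous words out of either library. A production system would likely need a more careful rule.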
8. A music authentication apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to acquire song information of music to be identified;
a processing unit configured to match target words contained in the song information with words contained in a pre-configured identification vocabulary library, the identification vocabulary library comprising at least one of a non-infringing vocabulary library and an infringing vocabulary library;
and a determining unit configured to determine, if a matched word exists, the identification result corresponding to the matched word as the target identification result of the music.
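The three units of claim 8 can be sketched as methods of a single class. This is an illustration under stated assumptions, not the patented implementation: target words are obtained by lowercasing and whitespace-splitting the song information fields, and when several words match, the result of the first matched word is used. The class name `MusicAuthenticator`, its method names, and the result labels are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


class MusicAuthenticator:
    """Acquisition, processing, and determining units in one object."""

    def __init__(self, infringing_words: Iterable[str],
                 non_infringing_words: Iterable[str]) -> None:
        # Map every library word to the identification result it implies.
        self.vocabulary: Dict[str, str] = {}
        for word in non_infringing_words:
            self.vocabulary[word.lower()] = "non-infringing"
        for word in infringing_words:
            self.vocabulary[word.lower()] = "infringing"

    def acquire(self, song_info: Dict[str, str]) -> List[str]:
        """Acquisition unit: extract target words from the song information.

        Assumption: simple lowercase whitespace tokenization.
        """
        words: List[str] = []
        for value in song_info.values():
            words.extend(value.lower().split())
        return words

    def match(self, target_words: Iterable[str]) -> List[str]:
        """Processing unit: keep the target words found in the library."""
        return [w for w in target_words if w in self.vocabulary]

    def determine(self, matched: List[str]) -> Optional[str]:
        """Determining unit: result of the first matched word, if any."""
        if not matched:
            return None
        return self.vocabulary[matched[0]]

    def authenticate(self, song_info: Dict[str, str]) -> Optional[str]:
        return self.determine(self.match(self.acquire(song_info)))
```

A return value of `None` means no library word matched, in which case the further matching against standard music described in the dependent method claims would apply.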
9. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to carry out the steps of the music authentication method according to any of claims 1-7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the music authentication method according to any one of claims 1 to 7.
CN202110459549.7A 2021-04-27 2021-04-27 Music authentication method, device, equipment and medium Active CN113094543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110459549.7A CN113094543B (en) 2021-04-27 2021-04-27 Music authentication method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN113094543A (en) 2021-07-09
CN113094543B (en) 2023-03-17

Family

ID=76680312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110459549.7A Active CN113094543B (en) 2021-04-27 2021-04-27 Music authentication method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113094543B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020065653A1 (en) * 2000-11-29 2002-05-30 International Business Machines Corporation Method and system for the automatic amendment of speech recognition vocabularies
JP2005122665A (en) * 2003-10-20 2005-05-12 Sony Corp Electronic equipment apparatus, method for updating related word database, and program
CN101552000A (en) * 2009-02-25 2009-10-07 北京派瑞根科技开发有限公司 Music similarity processing method
CN103064928A (en) * 2012-12-21 2013-04-24 北京二六三企业通信有限公司 Method and device for filtering junk files based on key words
US9147393B1 (en) * 2013-02-15 2015-09-29 Boris Fridman-Mintz Syllable based speech processing method
US20160283582A1 (en) * 2013-11-04 2016-09-29 Beijing Qihoo Technology Company Limited Device and method for detecting similar text, and application
CN106227746A (en) * 2016-07-14 2016-12-14 看见网络科技(上海)有限公司 Web information processing method and system
US9836619B1 (en) * 2017-02-13 2017-12-05 Tunego, Inc. Digital vault for music owners
CN108628822A (en) * 2017-03-24 2018-10-09 阿里巴巴集团控股有限公司 Method and device for recognizing text without semantics
CN109344570A (en) * 2018-09-30 2019-02-15 真相网络科技(北京)有限公司 Method and system for determining internet music infringement
CN111899762A (en) * 2020-06-30 2020-11-06 平安科技(深圳)有限公司 Melody similarity evaluation method and device, terminal equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569086A (en) * 2021-08-05 2021-10-29 深圳墨世科技有限公司 Method, device, terminal equipment and readable storage medium for aggregating music libraries
CN113569086B (en) * 2021-08-05 2024-01-26 深圳墨世科技有限公司 Method, device, terminal equipment and readable storage medium for aggregating music libraries
CN117493532A (en) * 2023-12-29 2024-02-02 深圳智汇创想科技有限责任公司 Text processing method, device, equipment and storage medium
CN117493532B (en) * 2023-12-29 2024-03-29 深圳智汇创想科技有限责任公司 Text processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113094543B (en) 2023-03-17

Similar Documents

Publication Publication Date Title
US7921116B2 (en) Highly meaningful multimedia metadata creation and associations
CN101558591B (en) Content management system
CN110929125B (en) Search recall method, device, equipment and storage medium thereof
CN111737499B (en) Data searching method based on natural language processing and related equipment
US9251248B2 (en) Using context to extract entities from a document collection
CN113094543B (en) Music authentication method, device, equipment and medium
WO2015176431A1 (en) Method and device for generating test data
CN109359026A (en) Log reporting method, device, electronic equipment and computer readable storage medium
US20140041037A1 (en) Detecting pirated applications
CN111046221A (en) Song recommendation method and device, terminal equipment and storage medium
CN112214984B (en) Content plagiarism identification method, device, equipment and storage medium
CN111767393A (en) Text core content extraction method and device
US20120167748A1 (en) Automatically acquiring feature segments in a music file
CN111090813A (en) Content processing method and device and computer readable storage medium
US20170011480A1 (en) Data analysis system, data analysis method, and data analysis program
CN105095304A (en) Log template generation method and equipment
CN115291836A (en) Automatic threat modeling identification system and method based on STRIDE method
CN113656575B (en) Training data generation method and device, electronic equipment and readable medium
CN113407775B (en) Video searching method and device and electronic equipment
CN117763510A (en) Webpage identification method, device, equipment, medium and program product
CN112163415A (en) User intention identification method and device for feedback content and electronic equipment
CN111813964B (en) Data processing method based on ecological environment and related equipment
CN111353301B (en) Auxiliary secret determination method and device
CN114490929A (en) Bidding information acquisition method and device, storage medium and terminal equipment
CN113886263A (en) System testing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant