CN110489588B

CN110489588B - Audio detection method, device, server and storage medium

Info

Publication number: CN110489588B
Application number: CN201910791492.3A
Authority: CN
Inventors: 牛闯
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2019-08-26
Filing date: 2019-08-26
Publication date: 2021-11-02
Anticipated expiration: 2039-08-26
Also published as: CN110489588A

Abstract

The disclosure relates to an audio detection method, an audio detection device, a server and a storage medium, and belongs to the technical field of networks. The method comprises the following steps: acquiring an audio file uploaded by a user and a certification file of the audio file; carrying out first detection on the audio file according to a plurality of original audio data to obtain a first detection result; and when the first detection result indicates that the audio file is the original audio, performing second detection on the audio file according to the certification file to obtain a second detection result, wherein the second detection result is used for indicating whether the audio file is the original audio of the user. The present disclosure may improve the efficiency and accuracy of audio detection.

Description

Audio detection method, device, server and storage medium

Technical Field

The present disclosure relates to the field of network technologies, and in particular, to an audio detection method, an audio detection device, a server, and a storage medium.

Background

As the music market becomes increasingly regulated, people are more and more aware of song copyright. Only original songs can obtain copyright protection, so detecting the originality of songs is very important. An original work is a newly created, brand-new work, as opposed to a work produced by copying, adaptation, plagiarism, imitation, or secondary creation.

In the related art, after a musician independently uploads a song to a music platform, an operator may manually review the song. For example, the operator searches each music platform and determines from the search results whether the song is the musician's original song, for instance determining that it is the musician's original song if the song is found under a performer of the same name.

With this technique, manually reviewing the originality of songs requires a large number of search operations and is inefficient; moreover, manual judgment is highly subjective, so accuracy is poor.

Disclosure of Invention

The present disclosure provides an audio detection method, apparatus, server and storage medium to at least solve the problems of low efficiency and poor accuracy in the related art. The technical scheme of the disclosure is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided an audio detection method, including:

acquiring an audio file uploaded by a user and a certification file of the audio file;

carrying out first detection on the audio file according to a plurality of original audio data to obtain a first detection result;

and when the first detection result indicates that the audio file is the original audio, performing second detection on the audio file according to the certification file to obtain a second detection result, wherein the second detection result is used for indicating whether the audio file is the original audio of the user.

In one possible implementation manner, the performing a first detection on the audio file according to a plurality of original audio data to obtain a first detection result includes:

detecting a repetition degree between the audio file and each original audio data according to the audio information of the audio file and the audio information of the plurality of original audio data, wherein the audio information comprises audio texts and audio melodies;

and when the repetition degree between the audio file and each original audio data is less than or equal to a repetition degree threshold value, determining that the audio file is the original audio.

In one possible implementation, after the detecting the repetition degree between the audio file and each original audio data, the method further includes:

when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold value and the audio file contains human voice, detecting the human voice similarity between the audio file and that original audio data;

and when the human voice similarity is greater than a similarity threshold value, determining that the audio file is original audio.

In one possible implementation, after detecting the human voice similarity between the audio file and the any original audio data, the method further includes:

and when the human voice similarity is less than or equal to the similarity threshold value, determining that the audio file is non-original audio.

In one possible implementation, after the detecting the repetition degree between the audio file and each original audio data, the method further includes:

when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold value and the audio file does not contain human voice, determining that the audio file is non-original audio.

In one possible implementation, the certification file includes at least one of a certificate picture and a target file, the target file including a plurality of audio track files of the audio file;

the second detection of the audio file according to the certification file to obtain a second detection result includes:

and performing second detection on the audio file according to at least one of a matching result of the certificate picture and a preset database and the similarity between the target file and the audio file to obtain a second detection result.

In a possible implementation manner, before performing the second detection on the audio file according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, and obtaining the second detection result, the method further includes at least one of:

identifying the certificate picture, extracting copyright information in the certificate picture, wherein the copyright information comprises a registration number, a work name and author information, and inquiring a copyright database according to the registration number, wherein the copyright database is used for storing a plurality of copyright certificates;

and detecting the similarity between the audio file and the target file according to the audio information of the audio track files in the target file and the audio information of the audio file, wherein the audio information comprises duration and melody.

In a possible implementation manner, the performing, according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, a second detection on the audio file to obtain a second detection result includes any one of:

when the author information of the copyright certificate inquired according to the registration number is the same as the author information on the certificate picture and the real-name authentication information of the user, determining that the audio file is the original audio of the user;

when the similarity between the target file and the audio file is larger than a similarity threshold value, determining that the audio file is the original audio of the user;

and when the author information of the copyright certificate inquired according to the registration number is the same as the author information on the certificate picture and the real-name authentication information of the user, and the similarity between the target file and the audio file is greater than a similarity threshold value, determining that the audio file is the original audio of the user.

when a copyright certificate is not inquired according to the registration number on the certificate picture, determining that the audio file is not the original audio of the user;

when the author information of the copyright certificate inquired according to the registration number is different from the author information on the certificate picture or the real-name authentication information of the user, determining that the audio file is not the original audio of the user;

when the similarity between the target file and the audio file is smaller than or equal to the similarity threshold value, determining that the audio file is not the original audio of the user.
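The decision rules listed above can be sketched roughly as follows. This is a minimal illustration rather than the claimed implementation: the names `CopyrightInfo`, `certificate_matches` and `second_detection`, the dictionary standing in for the copyright database, and the default threshold of 0.8 are all assumptions. The sketch also assumes that every piece of evidence the user actually supplied must pass.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CopyrightInfo:
    """Copyright information as extracted from a certificate picture or stored record."""
    registration_number: str
    work_name: str
    author: str


def certificate_matches(
    picture_info: CopyrightInfo,
    copyright_db: Mapping[str, CopyrightInfo],  # hypothetical stand-in for the copyright database
    real_name: str,
) -> bool:
    """True when a certificate is found and its author equals both the author
    on the picture and the user's real-name authentication information."""
    record = copyright_db.get(picture_info.registration_number)
    if record is None:
        # No copyright certificate is found for the registration number.
        return False
    return record.author == picture_info.author == real_name


def second_detection(
    certificate_ok: Optional[bool],      # None when no certificate picture was uploaded
    target_similarity: Optional[float],  # None when no target file was uploaded
    similarity_threshold: float = 0.8,   # assumed value, not taken from the disclosure
) -> bool:
    """Return True when the audio file is treated as the user's original audio."""
    checks = []
    if certificate_ok is not None:
        checks.append(certificate_ok)
    if target_similarity is not None:
        checks.append(target_similarity > similarity_threshold)
    # At least one kind of evidence is required, and all supplied evidence must pass.
    return bool(checks) and all(checks)
```

With only a certificate picture supplied, the result follows the certificate match alone; with both kinds of evidence supplied, a failing target file similarity leads to a negative result even if the certificate matches.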

In one possible implementation manner, before the obtaining of the audio file uploaded by the user and the certification file of the audio file, the method further includes:

receiving a registration request of the user;

performing real-name authentication on the user to obtain real-name authentication information of the user;

and storing the real-name authentication information of the user.

In one possible implementation manner, after receiving the registration request of the user, the method further includes:

carrying out identity detection on the user to obtain a first identity characteristic of the user;

storing a first identity characteristic of the user.

In one possible implementation manner, the obtaining an audio file uploaded by a user and a certification file of the audio file includes:

receiving an uploading request of the user, wherein the uploading request is used for requesting to upload the audio file and the certification file;

carrying out identity detection on the user to obtain a second identity characteristic of the user;

and when the similarity between the second identity feature and the first identity feature is greater than a similarity threshold value, acquiring the audio file and the certification file uploaded by the user.

In a possible implementation manner, after performing the second detection on the audio file according to the certification file and obtaining a second detection result, the method further includes:

when the second detection result indicates that the audio file is the original audio of the user, marking the audio file as the original audio and storing the original audio in an audio database.

According to a second aspect of the embodiments of the present disclosure, there is provided an audio detection apparatus comprising:

an acquisition unit configured to acquire an audio file uploaded by a user and a certification file of the audio file;

the first detection unit is configured to perform first detection on the audio file according to a plurality of original audio data to obtain a first detection result;

and the second detection unit is configured to perform second detection on the audio file according to the certification file when the first detection result indicates that the audio file is the original audio, so as to obtain a second detection result, wherein the second detection result is used for indicating whether the audio file is the original audio of the user.

In one possible implementation, the first detection unit is configured to perform:

In one possible implementation, the first detection unit is further configured to perform:

In one possible implementation manner, the first detection unit is further configured to perform determining that the audio file is non-original audio when the human voice similarity is less than or equal to the similarity threshold.

In one possible implementation, the first detection unit is further configured to perform determining that the audio file is non-original audio when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold and the audio file does not contain human voice.

the second detection unit is configured to perform second detection on the audio file according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, so as to obtain a second detection result.

In one possible implementation, the second detection unit is further configured to perform at least one of:

In one possible implementation, the second detection unit is configured to perform any one of:

In one possible implementation, the apparatus further includes:

a receiving unit configured to perform receiving a registration request of the user;

the authentication unit is configured to perform real-name authentication on the user to obtain real-name authentication information of the user;

a first storage unit configured to perform storing of real-name authentication information of the user.

In one possible implementation, the apparatus further includes:

a third detection unit configured to perform identity detection on the user, obtaining a first identity characteristic of the user;

the first storage unit is further configured to perform storing a first identity feature of the user.

In one possible implementation, the obtaining unit is configured to perform:

In one possible implementation, the apparatus further includes:

a second storage unit configured to perform, when the second detection result indicates that the audio file is original audio of the user, marking the audio file as original audio and storing the original audio in an audio database.

According to a third aspect of the embodiments of the present disclosure, there is provided a server, including:

one or more processors;

one or more memories for storing the one or more processor-executable instructions;

wherein the one or more processors are configured to execute the instructions to implement the audio detection method as described in the first aspect or any one of the possible implementations of the first aspect.

According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having instructions that, when executed by a processor of a server, enable the server to perform the audio detection method according to the first aspect or any one of the possible implementations of the first aspect.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, wherein the instructions of the computer program product, when executed by a processor of a server, enable the server to perform the audio detection method according to the first aspect or any one of the possible implementations of the first aspect.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

after the audio file and the certification file are uploaded by the user, the audio file uploaded by the user is detected according to the original audio data, and after the audio file uploaded by the user is detected to be the original audio, the audio file uploaded by the user is further detected according to the certification file, so that whether the audio file uploaded by the user is the original audio of the user is detected. The technical scheme combines audio detection and copyright detection to realize automatic detection of audio originality, and can improve the efficiency and accuracy of audio detection.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a flowchart illustrating an audio detection method according to an exemplary embodiment.

Fig. 2 is a flowchart illustrating an audio detection method according to an exemplary embodiment.

Fig. 3 is a block diagram illustrating an audio detection device according to an exemplary embodiment.

Fig. 4 is a block diagram illustrating a server 400 according to an exemplary embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims. The user information to which the present disclosure relates may be information authorized by the user or sufficiently authorized by each party.

Fig. 1 is a flowchart illustrating an audio detection method according to an exemplary embodiment. The audio detection method is used in a server and, as shown in Fig. 1, includes the following steps:

In step S11, an audio file uploaded by the user and a certification file of the audio file are acquired.

In step S12, a first detection is performed on the audio file according to the plurality of original audio data, and a first detection result is obtained.

In step S13, when the first detection result indicates that the audio file is the original audio, performing a second detection on the audio file according to the certification file to obtain a second detection result, where the second detection result is used to indicate whether the audio file is the original audio of the user.

According to the method provided by the embodiment of the disclosure, after the audio file and the certification file are uploaded by the user, the audio file uploaded by the user is detected according to the original audio data, and after the audio file uploaded by the user is detected to be the original audio, the audio file uploaded by the user is further detected according to the certification file, so that whether the audio file uploaded by the user is the original audio of the user is detected. The technical scheme combines audio detection and copyright detection to realize automatic detection of audio originality, and can improve the efficiency and accuracy of audio detection.
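The flow of steps S11 to S13 can be sketched as below. This is an illustrative outline only: the function and type names are hypothetical, and the two detections are passed in as callables because the disclosure leaves their internals to the later implementations.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DetectionOutcome:
    first_is_original: bool            # first detection result (step S12)
    is_users_original: Optional[bool]  # second detection result (step S13); None if skipped


def detect_audio(
    audio_file: bytes,
    certification_file: bytes,
    first_detection: Callable[[bytes], bool],
    second_detection: Callable[[bytes, bytes], bool],
) -> DetectionOutcome:
    """Run the two-stage detection on an uploaded audio file (acquired in S11)."""
    # S12: compare the upload against the stored original audio data.
    first_is_original = first_detection(audio_file)
    if not first_is_original:
        # S13 runs only when the first detection reports original audio.
        return DetectionOutcome(False, None)
    # S13: use the certification file to decide whether it is this user's original audio.
    return DetectionOutcome(True, second_detection(audio_file, certification_file))
```

Keeping the second detection behind the first means the certification file is examined only for uploads that already passed the comparison with the original audio data.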

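In one possible implementation, performing the first detection on the audio file according to the plurality of original audio data to obtain the first detection result includes:

detecting the repetition degree between the audio file and each original audio data according to the audio information of the audio file and the audio information of the plurality of original audio data, wherein the audio information comprises audio texts and audio melodies;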

and when the repetition degree between the audio file and each original audio data is less than or equal to the repetition degree threshold value, determining that the audio file is the original audio.

In one possible implementation, after detecting the degree of duplication between the audio file and each original audio data, the method further includes:

when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold value and the audio file contains voice, detecting the voice similarity between the audio file and any original audio data;

and when the voice similarity is greater than the similarity threshold value, determining that the audio file is the original audio.

In one possible implementation manner, after detecting the human voice similarity between the audio file and the any original audio data, the method further includes:

and when the human voice similarity is less than or equal to the similarity threshold value, determining that the audio file is non-original audio.

In one possible implementation, after detecting the repetition degree between the audio file and each original audio data, the method further includes:

when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold value and the audio file does not contain human voice, determining that the audio file is non-original audio.

In one possible implementation, the certification file includes at least one of a certificate picture and a target file, the target file including a plurality of audio track files of the audio file;

the second detection is performed on the audio file according to the certification file to obtain a second detection result, and the method comprises the following steps:

In a possible implementation manner, before the obtaining the second detection result according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, the method further includes at least one of:

In a possible implementation manner, the obtaining the second detection result according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file includes any one of the following:

and when the similarity between the target file and the audio file is less than or equal to the similarity threshold, determining that the audio file is not the original audio of the user.

receiving a registration request of the user;

the real name authentication information of the user is stored.

In one possible implementation, after receiving the registration request of the user, the method further includes:

a first identity characteristic of the user is stored.

and when the similarity between the second identity characteristic and the first identity characteristic is greater than a similarity threshold value, acquiring the audio file and the certification file uploaded by the user.

when the second detection result indicates that the audio file is the original audio of the user, the audio file is marked as the original audio and is stored in an audio database.

Fig. 2 is a flowchart illustrating an audio detection method according to an exemplary embodiment. The audio detection method is used in a server and, as shown in Fig. 2, includes the following steps.

In step S21, an audio file uploaded by the user and a certification file of the audio file are acquired.

The audio file may be a song file, and the certification file may include at least one of a certificate picture and a target file, the target file including a plurality of audio track files of the audio file. The certificate picture may be a picture obtained by photographing or scanning a copyright certificate. The target file may be a split-track file or a project file: the split-track file is obtained by separating the audio tracks of the individual musical instruments in the same audio into separate tracks, and the project file is obtained by arranging the audio tracks of the individual musical instruments in the same audio.

In the embodiment of the present disclosure, a user can perform an upload operation on a terminal, triggering the terminal to send the user's upload request to the server. The upload request is used for requesting to upload the audio file and the certification file and can carry both files, so that the server can obtain the audio file and the certification file from the upload request.

In a possible implementation manner, after receiving an upload request of a user, the server may perform identity detection on the user to obtain a second identity characteristic of the user; and when the similarity between the second identity characteristic and the first identity characteristic is greater than a similarity threshold value, acquiring the audio file and the certification file uploaded by the user.

The first identity characteristic and the second identity characteristic are obtained by carrying out identity detection on the user on different occasions. The first identity characteristic may be an identity characteristic that the server acquires and stores in advance. For example, when the user registers, the server may receive a registration request of the user, perform identity detection on the user to obtain the first identity characteristic of the user, and store it. The second identity characteristic is the identity characteristic obtained by the server through identity detection of the user when the user requests to upload the audio file and the certification file. As for the identity detection mode, the server may adopt a biometric technology to perform identity detection on the user. For example, the server may recognize a face image uploaded by the user to obtain the face feature contained in the face image, and use the face feature as the identity characteristic of the user.

After the server acquires the second identity characteristic of the user, the server can calculate the similarity between the second identity characteristic and the first identity characteristic. If the similarity is greater than a similarity threshold value, the server can acquire the audio file and the certification file from the user's upload request. If the similarity is less than or equal to the similarity threshold, the server can return upload failure information, which can indicate that the user identities do not match and the upload cannot proceed. Carrying out identity detection on the user before uploading avoids the risk of another person uploading in the user's name and improves security.
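The identity check before upload can be illustrated with the sketch below. The disclosure does not name a similarity measure, so the use of cosine similarity over feature vectors, the function names, and the default threshold of 0.8 are assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero feature as matching nothing
    return dot / (norm_a * norm_b)


def may_upload(
    first_feature: Sequence[float],     # stored at registration
    second_feature: Sequence[float],    # obtained when the upload is requested
    similarity_threshold: float = 0.8,  # assumed value, not taken from the disclosure
) -> bool:
    """Accept the upload only when the similarity is strictly greater than the threshold."""
    return cosine_similarity(first_feature, second_feature) > similarity_threshold
```

When `may_upload` returns False, the server in the described flow would return the upload failure information instead of acquiring the files.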

In one possible implementation manner, before uploading the audio, the user may register on the server and perform real-name authentication. For example, the user may perform a registration operation on the terminal to trigger the terminal to send a registration request to the server. After receiving the registration request of the user, the server may perform real-name authentication on the user to obtain the real-name authentication information of the user, and store that information. The real-name authentication information may be information on the user's identity card, including the identity card number, name, and the like. After receiving the registration request, the server may return a real-name authentication page; the user may fill in the real-name authentication information on that page and click submit, triggering the terminal to submit the real-name authentication information to the server, so that the server obtains the real-name authentication information of the user. With real-name authentication, the server can subsequently use the real-name authentication information when detecting the originality of the audio, which improves the reliability of the detection result.

It should be noted that the server may also obtain the audio file and the certification file uploaded by the user in different steps. For example, the server may obtain only the audio file in step S21 and obtain the certification file in a subsequent step: after the server obtains the first detection result by performing step S22, if the first detection result indicates that the audio file is original audio, the server may then obtain the certification file uploaded by the user. Specifically, the server may send an upload prompt message to the terminal where the user is located to prompt the user to upload the certification file, and the user may perform an upload operation on the terminal to trigger the terminal to send the certification file to the server.

In step S22, a first detection is performed on the audio file uploaded by the user according to the plurality of original audio data, so as to obtain a first detection result.

The plurality of original audio data may be a plurality of audio files marked as original audio in an audio database, for example audio files that were uploaded by a plurality of users and have passed originality detection. The first detection in this step and the second detection in the subsequent step represent different detection modes.

In one possible implementation manner, the first detecting the audio file uploaded by the user according to the plurality of original audio data to obtain the first detection result may include at least two steps of the following steps a1 to a6:

Step a1, detecting the repetition degree between the audio file uploaded by the user and each original audio data according to the audio information of the audio file uploaded by the user and the audio information of the plurality of original audio data, wherein the audio information comprises audio text and audio melody.

Wherein the audio text may be lyrics. The server may employ audio matching techniques to detect the degree of duplication between the audio files uploaded by the user and each of the original audio data. For each original audio data, the server can determine a repeated audio segment between the audio file uploaded by the user and the original audio data according to the audio information of the audio file uploaded by the user and the audio information of the original audio data, wherein the repeated audio segment can be an audio segment with audio text similarity and audio melody similarity both greater than a similarity threshold, the server can calculate the percentage of the length of the repeated audio segment in the length of the whole audio file uploaded by the user, and the percentage is used as the repetition degree between the audio file uploaded by the user and the original audio data.
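The repetition degree described above, that is, the share of the uploaded file's length occupied by repeated audio, can be sketched as follows. How the repeated segments are found is left out; the sketch assumes they arrive as hypothetical `(start, end)` offsets in seconds within the uploaded file, merges overlapping segments so that no stretch is counted twice, and returns a fraction between 0 and 1 rather than a percentage.

```python
from typing import Iterable, Tuple


def repetition_degree(
    repeated_segments: Iterable[Tuple[float, float]],
    total_duration: float,
) -> float:
    """Fraction of the uploaded file's duration covered by repeated segments."""
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    covered = 0.0
    current_start = current_end = None
    for start, end in sorted(repeated_segments):
        # Clip each segment to the bounds of the uploaded file.
        start, end = max(0.0, start), min(total_duration, end)
        if end <= start:
            continue
        if current_end is None or start > current_end:
            # A new, non-overlapping run begins; bank the previous run.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent segment: extend the current run.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / total_duration
```

For example, segments `(0, 30)` and `(20, 60)` in a 240-second upload cover 60 seconds in total, giving a repetition degree of 0.25, which would then be compared with the repetition degree threshold.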

Step a2, when the repetition degree between the audio file uploaded by the user and each original audio data is less than or equal to the repetition degree threshold value, determining that the audio file uploaded by the user is the original audio.

For each original audio data, if the repetition degree between the audio file uploaded by the user and that original audio data is less than or equal to the repetition degree threshold, the audio file can be considered not to repeat that original audio data. If the audio file uploaded by the user does not repeat any of the original audio data, the server can determine that the audio file uploaded by the user is original audio.

Step a3, when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold value and the audio file contains human voice, detecting the human voice similarity between the audio file and that original audio data.

If the repetition degree between the audio file uploaded by the user and any original audio data is greater than the repetition degree threshold, the audio file can be considered to repeat that original audio data. In this case, the server can detect the type of the audio file uploaded by the user (audio containing human voice, or pure music), that is, judge whether the audio file contains human voice. If it does, the server detects whether the human voice in the audio file is the same as the human voice in the original audio data. For example, the server can obtain the voiceprint features of the audio file uploaded by the user and the voiceprint features of the original audio data through voiceprint detection, calculate the similarity between the two sets of voiceprint features, and use that similarity as the human voice similarity between the audio file uploaded by the user and the original audio data.

Step a4, when the voice similarity is larger than the similarity threshold, determining the audio file as original audio.

If the audio file uploaded by the user is repeated with any original audio data and the human voice similarity is high, the audio file currently uploaded by the user can be considered original audio that is being uploaded a second time. Whether to update the previously uploaded original audio in the audio database with the currently uploaded audio file can subsequently be determined according to sound quality.

Step a5, when the similarity of the human voice is less than or equal to the similarity threshold, determining the audio file is non-original audio.

If the audio file uploaded by the user is repeated with any original audio data and the human voice similarity is low, the audio file currently uploaded by the user can be considered a cover version, that is, the same work sung by a different person, and is not original audio.

Step a6, when the repetition degree between the audio file and any original audio data is greater than the repetition degree threshold and the audio file does not contain human voice, determining that the audio file is non-original audio.

If the audio file uploaded by the user has duplication with any original audio data, but the audio file uploaded by the user does not contain human voice (pure music), it can be determined that the audio file uploaded by the user is not the original audio.

The first detection process may include step a1 and step a2; or step a1, step a3, and step a4; or step a1, step a3, and step a5; or step a1 and step a6. The server performs audio detection on the audio file uploaded by the user according to the original audio data: it detects whether the audio file is repeated with the original audio data and, depending on the audio type, can further detect whether the human voice is the same, thereby determining whether the audio file uploaded by the user is original audio. In this way, originality can be detected automatically, and detection efficiency and accuracy are improved.
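The branching among steps a2 to a6 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `first_detection`, the threshold values 0.8 and 0.9, and the decision to check only the original audio data with the highest repetition degree are all assumptions. The repetition degrees of step a1 are assumed to have been computed already and are passed in as a mapping from a hypothetical audio identifier to a score.

```python
from typing import Callable, Mapping


def first_detection(repetition: Mapping[str, float],
                    contains_voice: bool,
                    voice_similarity: Callable[[str], float],
                    repetition_threshold: float = 0.8,   # assumed value
                    similarity_threshold: float = 0.9    # assumed value
                    ) -> bool:
    """Return True if the uploaded audio file is judged to be original audio."""
    # Keep only the original audio data that the upload is repeated with.
    repeated = {k: v for k, v in repetition.items() if v > repetition_threshold}
    if not repeated:
        return True    # step a2: not repeated with any original audio data
    if not contains_voice:
        return False   # step a6: repeated pure music
    # Step a3: compare voices with the most-repeated original audio data
    # (checking only the best match is an assumption of this sketch).
    best_match = max(repeated, key=repeated.get)
    # Step a4 (True) or step a5 (False).
    return voice_similarity(best_match) > similarity_threshold
```

Passing `voice_similarity` as a callable means the voiceprint comparison is only performed when step a3 is actually reached.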

In step S23, when the first detection result indicates that the audio file uploaded by the user is the original audio, performing a second detection on the audio uploaded by the user according to the certification file to obtain a second detection result, where the second detection result is used to indicate whether the audio uploaded by the user is the original audio of the user.

In the embodiment of the disclosure, after the server performs audio detection on the audio file uploaded by the user according to the original audio and determines that the audio file uploaded by the user is the original audio, the server may further perform copyright detection on the audio file uploaded by the user according to the certification file uploaded by the user and determine whether the audio file uploaded by the user is the original audio of the user.

In step S23, for the case where the certification file includes at least one of a certificate picture and a target file, performing the second detection on the audio file according to the certification file to obtain the second detection result includes: performing the second detection on the audio file according to at least one of the matching result between the certificate picture and a preset database and the similarity between the target file and the audio file, to obtain the second detection result. The preset database may be a copyright database.

The second detection process is described below for three cases: in the first case, the certification file includes a certificate picture; in the second case, the certification file includes a target file; and in the third case, the certification file includes both a certificate picture and a target file.

For the first case, in which the certification file includes the certificate picture, the server may perform the second detection on the audio file according to the matching result between the certificate picture and the preset database to obtain the second detection result, which may specifically include at least two of the following steps b1 to b4:

Step b1, identifying the certificate picture and extracting the copyright information therein, wherein the copyright information comprises a registration number, a work name and author information, and querying a copyright database according to the registration number, wherein the copyright database is used for storing a plurality of copyright certificates.

The server may identify the certificate picture by using an OCR (Optical Character Recognition) technology, and extract copyright information on the certificate picture. The copyright database can store the copyright certificate by taking the registration number as an index, and the server can query the copyright database according to the extracted registration number to see whether the copyright certificate corresponding to the registration number can be queried or not.

Step b2, when the author information of the copyright certificate inquired according to the registration number is the same as the author information on the certificate picture and the real name authentication information of the user, determining that the audio file is the original audio of the user.

If the server finds the corresponding copyright certificate according to the registration number on the certificate picture, the server can judge whether the author information on the retrieved copyright certificate is the same as both the author information on the certificate picture and the real-name authentication information of the user. If so, the server can determine that the audio file uploaded by the user is the original audio of the user.

Step b3, when the author information of the copyright certificate inquired according to the registration number is different from the author information on the certificate picture or the real name authentication information of the user, determining that the audio file is not the original audio of the user.

If the server finds the corresponding copyright certificate according to the registration number on the certificate picture, but the author information on the retrieved copyright certificate is different from the author information on the certificate picture or from the real-name authentication information of the user, the server can determine that the audio file uploaded by the user is not the original audio of the user. For example, if the author information on the retrieved copyright certificate is different from the author information on the certificate picture, the certificate picture uploaded by the user is likely to show a copyright certificate with forged author information. If the author information on the retrieved copyright certificate is different from the real-name authentication information of the user, the copyright certificate uploaded by the user may not belong to the user and may have been misappropriated from another person.

Step b4, when the copyright certificate is not inquired according to the registration number on the certificate picture, determining that the audio file is not the original audio of the user.

If the server does not find a corresponding copyright certificate according to the registration number on the certificate picture, this indicates that the certificate picture uploaded by the user shows a copyright certificate with a forged registration number, and the server can determine that the audio file uploaded by the user is not the original audio of the user.

In this first case, the second detection process may include step b1 and step b2, or step b1 and step b3, or step b1 and step b4.
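Steps b1 to b4 can be sketched as a lookup followed by two comparisons. The sketch below assumes that OCR has already produced the copyright information from the certificate picture, and it models the copyright database as an in-memory mapping indexed by registration number. The names `CopyrightInfo` and `check_certificate`, and the use of exact string equality for author matching, are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CopyrightInfo:
    registration_number: str
    work_name: str
    author: str


def check_certificate(extracted: CopyrightInfo,
                      real_name: str,
                      copyright_db: Mapping[str, CopyrightInfo]) -> bool:
    """Return True if the audio file is judged to be the user's original audio."""
    # Step b1: query the copyright database by registration number.
    record = copyright_db.get(extracted.registration_number)
    if record is None:
        return False  # step b4: no certificate found for this registration number
    # Step b2 (True) or step b3 (False): the author on record must match both
    # the author on the certificate picture and the user's real-name information.
    return record.author == extracted.author and record.author == real_name
```

A real deployment would likely need to normalize names before comparing them; that is outside the scope of this sketch.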

For the second case, in which the certification file includes the target file, the server may perform the second detection on the audio file according to the similarity between the target file and the audio file to obtain the second detection result, which may specifically include the following steps c1 to c3:

Step c1, according to the audio information of the audio track files in the target file and the audio information of the audio file, detecting the similarity between the audio file and the target file, wherein the audio information includes duration and melody.

The target file comprises a plurality of audio track files of the audio file. The server can first judge whether the audio durations of the plurality of audio tracks in the target file are the same as the audio duration of the audio file uploaded by the user. If the durations are the same, the server can then judge whether the melody of the plurality of audio tracks in the target file is similar to the melody of the audio file uploaded by the user. For example, the server can calculate the melody similarity between the melody of the target file and the melody of the audio file uploaded by the user, and use the melody similarity as the similarity between the target file and the audio file uploaded by the user.

Step c2, when the similarity between the target file and the audio file is greater than the similarity threshold, determining that the audio file is the original audio of the user.

If the similarity between the target file and the audio file uploaded by the user is high, the server can determine that the audio file uploaded by the user was synthesized from the audio track files in the target file, and thus can determine that the audio file uploaded by the user is the original audio of the user.

Step c3, when the similarity between the target file and the audio file is less than or equal to the similarity threshold, determining that the audio file is not the original audio of the user.

If the similarity between the target file and the audio file uploaded by the user is low, the server can determine that the audio file uploaded by the user was not synthesized from the audio track files in the target file, and thus can determine that the audio file uploaded by the user is not the original audio of the user.

In this second case, the second detection process may include step c1 and step c2, or step c1 and step c3.
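Steps c1 to c3 can be sketched as a duration check followed by a melody comparison. The disclosure does not specify how melodies are represented or compared, so this sketch makes several assumptions: a melody is a sequence of integer pitch values, the melody similarity is the ratio reported by Python's `difflib.SequenceMatcher`, durations are "the same" when they differ by at most a small tolerance in seconds, and the threshold is 0.9. The function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Sequence


def target_file_similarity(track_durations: Sequence[float],
                           target_melody: Sequence[int],
                           audio_duration: float,
                           audio_melody: Sequence[int],
                           duration_tolerance: float = 0.5) -> float:
    """Step c1: similarity between the target file and the uploaded audio file."""
    if not track_durations:
        return 0.0  # a target file without audio tracks proves nothing
    # Every audio track must have the same duration as the audio file.
    if any(abs(d - audio_duration) > duration_tolerance for d in track_durations):
        return 0.0
    # Melody similarity in [0, 1]; autojunk is disabled so that frequently
    # repeated pitches in long melodies are not ignored.
    matcher = SequenceMatcher(None, target_melody, audio_melody, autojunk=False)
    return matcher.ratio()


def is_users_original(similarity: float, similarity_threshold: float = 0.9) -> bool:
    """Step c2 (True) or step c3 (False)."""
    return similarity > similarity_threshold
```

Returning 0.0 on a duration mismatch lets the same threshold comparison handle both the duration and the melody outcome.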

For the third case, in which the certification file includes both the certificate picture and the target file, the server may perform the second detection on the audio file according to the matching result between the certificate picture and the preset database and the similarity between the target file and the audio file to obtain the second detection result, which may specifically include the following steps d1 to d3:

Step d1, identifying the certificate picture and extracting the copyright information therein, wherein the copyright information comprises a registration number, a work name and author information, and querying a copyright database according to the registration number.

The step d1 is the same as the step b1, and is not repeated.

Step d2, according to the audio information of the audio track files in the target file and the audio information of the audio file, detecting the similarity between the audio file and the target file, wherein the audio information includes duration and melody.

The step d2 is the same as the step c1, and is not repeated.

Step d3, when the author information of the copyright certificate inquired according to the registration number is the same as the author information on the certificate picture and the real name authentication information of the user, and the similarity between the target file and the audio file is greater than the similarity threshold, determining that the audio file is the original audio of the user, otherwise, determining that the audio file is not the original audio of the user.

If the server inquires the corresponding copyright certificate according to the registration number on the certificate picture, and the similarity between the target file and the audio file uploaded by the user is obtained through calculation, the server can judge whether the author information on the inquired copyright certificate is the same as the author information on the certificate picture and the real-name authentication information of the user, judge whether the similarity between the target file and the audio file uploaded by the user is larger than a similarity threshold value, and if the two judgment results are yes, the server can determine that the audio file uploaded by the user is the original audio of the user.

If the server does not find the corresponding copyright certificate according to the registration number on the certificate picture, or finds the copyright certificate but either of the two judgment results is negative, the server can determine that the audio file uploaded by the user is not the original audio of the user. For example, when no copyright certificate is found according to the registration number on the certificate picture, the server may determine that the audio file is not the original audio of the user. When the author information of the copyright certificate found according to the registration number is different from the author information on the certificate picture or from the real-name authentication information of the user, the server can determine that the audio file is not the original audio of the user. When the similarity between the target file and the audio file is less than or equal to the similarity threshold, the server may determine that the audio file is not the original audio of the user.
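The three cases of the second detection can be combined in one decision function, sketched below under stated assumptions. The certificate check and the target file similarity are assumed to have been computed beforehand, and `None` is used to mean that the corresponding item was not included in the certification file. The function name `second_detection` and the threshold value 0.9 are illustrative.

```python
from typing import Optional


def second_detection(certificate_matches: Optional[bool],
                     target_similarity: Optional[float],
                     similarity_threshold: float = 0.9) -> bool:
    """Return True if the audio file is judged to be the user's original audio.

    certificate_matches: result of the certificate picture check, or None if
        no certificate picture was uploaded.
    target_similarity: similarity between the target file and the audio file,
        or None if no target file was uploaded.
    """
    if certificate_matches is None and target_similarity is None:
        raise ValueError("the certification file must include at least one item")
    # First and third cases: a failed certificate check is decisive.
    if certificate_matches is False:
        return False
    # Second and third cases: the similarity must exceed the threshold.
    if target_similarity is not None and target_similarity <= similarity_threshold:
        return False
    # Every check that was provided has passed (step b2, c2 or d3).
    return True
```

In the third case both arguments are provided, so the function returns `True` only when both judgment results are positive, which matches step d3.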

In step S24, when the second detection result indicates that the audio file uploaded by the user is the original audio of the user, the audio file uploaded by the user is marked as original audio and stored in the audio database.

In the embodiment of the disclosure, after determining that the audio file uploaded by the user is the original audio of the user, the server may mark the audio file as original audio and store it in the audio database. In the case of step a4 in step S22, the audio file currently uploaded by the user is original audio uploaded for a second time, so the previously uploaded original audio is already stored in the audio database. The server may compare the sound quality of the currently uploaded original audio with that of the previously uploaded original audio. If the sound quality of the currently uploaded original audio is higher, the server may update the audio database with it, for example, by replacing the previously uploaded original audio with the currently uploaded original audio. In addition, the server may associate the user who uploaded the audio file with the certification file, for example, by establishing an association between the real-name authentication information of the user and the certification file.
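The storage and update behaviour of step S24 can be sketched as follows. The disclosure does not define how sound quality is measured, so the sketch assumes a single numeric score in which a higher value means better quality. The audio database is modelled as an in-memory mapping keyed by a hypothetical work identifier; `OriginalAudio` and `store_original` are illustrative names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class OriginalAudio:
    uploader: str         # real-name authentication information of the uploader
    sound_quality: float  # hypothetical score; higher means better quality


def store_original(audio_db: Dict[str, OriginalAudio],
                   work_id: str,
                   upload: OriginalAudio) -> bool:
    """Store the upload as original audio; return True if the database changed."""
    previous = audio_db.get(work_id)
    # A first upload is always stored. A repeated upload (step a4) replaces
    # the stored copy only when its sound quality is higher.
    if previous is None or upload.sound_quality > previous.sound_quality:
        audio_db[work_id] = upload
        return True
    return False
```

For example, uploading the same work first at a score of 128, then at 96, then at 320 would store the first copy, keep it on the second upload, and replace it on the third.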

Step S24 is optional. By storing the audio file in the audio database after it is determined to be original audio, the amount of original audio in the database is increased, so that more original audio is available for detecting the originality of other audio, and the accuracy of originality detection can be improved.

According to the technical scheme, the automatic process of audio detection is set up, the audio detection and the copyright detection are combined, the originality of the audio uploaded by the user is detected, and compared with manual auditing, the auditing efficiency, accuracy and objectivity of the audio detection are improved.

Fig. 3 is a block diagram illustrating an audio detection device according to an exemplary embodiment. Referring to fig. 3, the apparatus includes an acquisition unit 301, a first detection unit 302, and a second detection unit 303.

The acquisition unit 301 is configured to acquire an audio file uploaded by a user and a certification file of the audio file;

the first detection unit 302 is configured to perform a first detection on the audio file according to a plurality of original audio data, resulting in a first detection result;

the second detection unit 303 is configured to perform, when the first detection result indicates that the audio file is original audio, a second detection on the audio file according to the certification file to obtain a second detection result, where the second detection result is used to indicate whether the audio file is the original audio of the user.
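The cooperation of the units can be sketched as a small class in which each detection unit is a pluggable callable. This is only an illustration of the control flow: the class name `AudioDetectionApparatus`, the use of `bytes` for file contents, and the tuple return value are assumptions, and the acquisition unit is represented simply by the arguments passed to `detect`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class AudioDetectionApparatus:
    # First detection unit: audio file -> is it original audio?
    first_detection_unit: Callable[[bytes], bool]
    # Second detection unit: (audio file, certification file) -> is it the
    # user's original audio?
    second_detection_unit: Callable[[bytes, bytes], bool]

    def detect(self, audio_file: bytes,
               certification_file: bytes) -> Tuple[bool, Optional[bool]]:
        """Return (first detection result, second detection result or None)."""
        first_result = self.first_detection_unit(audio_file)
        if not first_result:
            # The second detection runs only for audio judged to be original.
            return first_result, None
        second_result = self.second_detection_unit(audio_file, certification_file)
        return first_result, second_result
```

Keeping the units as injected callables mirrors the description, in which each unit can be configured independently.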

In one possible implementation, the first detection unit 302 is configured to perform:

In one possible implementation, the first detection unit 302 is further configured to perform:

In one possible implementation, the first detecting unit 302 is further configured to perform determining that the audio file is non-original audio when the human voice similarity is less than or equal to the similarity threshold.

In one possible implementation, the first detecting unit 302 is further configured to determine that the audio file is non-original audio when a degree of duplication between the audio file and any original audio data is greater than the threshold degree of duplication and the audio file does not contain human voice.

the second detection unit 303 is configured to perform a second detection on the audio file according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, so as to obtain a second detection result.

In one possible implementation, the second detection unit 303 is further configured to perform at least one of:

In one possible implementation, the second detection unit 303 is configured to perform any one of:

In one possible implementation, the apparatus further includes:

the third detection unit is configured to perform identity detection on the user to obtain a first identity characteristic of the user;

the first storage unit is further configured to perform storing a first identity characteristic of the user.

In one possible implementation, the obtaining unit 301 is configured to perform:

In one possible implementation, the apparatus further includes:

and the second storage unit is configured to perform storage of the audio file marked as original audio in an audio database when the second detection result indicates that the audio file is the original audio of the user.

In the embodiment of the disclosure, after the audio file and the certification file are uploaded by the user, the audio file uploaded by the user is detected according to the plurality of original audio data, and after the audio file uploaded by the user is detected to be the original audio, the audio file uploaded by the user is further detected according to the certification file, so as to detect whether the audio file uploaded by the user is the original audio of the user. The technical scheme combines audio detection and copyright detection to realize automatic detection of audio originality, and can improve the efficiency and accuracy of audio detection.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 4 is a block diagram illustrating a server 400 according to an exemplary embodiment. The server 400 may vary considerably depending on its configuration or performance, and may include one or more processors (CPUs) 401 and one or more memories 402, where the memory 402 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 401 to implement the methods provided by the above method embodiments. The server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may further include other components for implementing the functions of the device, which are not described herein again.

In an exemplary embodiment, a storage medium comprising instructions is also provided, such as the memory 402 comprising instructions, where the instructions are executable by the processor 401 of the server 400 to perform the above-described method. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

In an exemplary embodiment, a computer program product is also provided, in which instructions are executable by the processor 401 of the server 400 to perform the above-described method.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. An audio detection method, comprising:

acquiring an audio file uploaded by a user and a certification file of the audio file, wherein the certification file comprises at least one of a certificate picture and a target file, the certificate picture is a picture obtained by shooting or scanning a copyright certificate, and the target file comprises a plurality of audio track files of the audio file;

detecting the repetition degree between the audio file and each original audio data according to the audio information of the audio file and the audio information of a plurality of original audio data, wherein the audio information comprises audio texts and audio melodies;

when the repetition degree between the audio file and each original audio data is less than or equal to a repetition degree threshold value, determining that the audio file is an original audio;

when the repetition degree between the audio file and any original audio data is larger than the repetition degree threshold value and the audio file contains voice, detecting the voice similarity between the audio file and any original audio data; when the voice similarity is larger than a similarity threshold value, determining that the audio file is original audio;

and when the audio file is the original audio, performing second detection on the audio file according to at least one of a matching result of the certificate picture and a preset database and the similarity between the target file and the audio file to obtain a second detection result, wherein the second detection result is used for indicating whether the audio file is the original audio of the user.

2. The audio detection method of claim 1, wherein after detecting the human voice similarity between the audio file and the any original audio data, the method further comprises:

3. The audio detection method of claim 1, wherein after detecting the degree of duplication between the audio file and each original audio data, the method further comprises:

4. The audio detection method according to claim 1, wherein before performing the second detection on the audio file according to at least one of a matching result of the certificate picture with a preset database and a similarity between the target file and the audio file, the method further comprises at least one of:

5. The audio detection method according to claim 4, wherein the second detection is performed on the audio file according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, so as to obtain the second detection result, and the method includes any one of the following steps:

6. The audio detection method according to claim 4, wherein the second detection is performed on the audio file according to at least one of a matching result of the certificate picture and a preset database and a similarity between the target file and the audio file, so as to obtain the second detection result, and the method includes any one of the following steps:

when a copyright certificate is not inquired according to the registration number, determining that the audio file is not the original audio of the user;

7. The audio detection method according to claim 1, wherein before the obtaining the audio file uploaded by the user and the certification file of the audio file, the method further comprises:

receiving a registration request of the user;

and storing the real-name authentication information of the user.

8. The audio detection method of claim 7, wherein after receiving the registration request of the user, the method further comprises:

storing a first identity characteristic of the user.

9. The audio detection method according to claim 8, wherein the obtaining of the audio file uploaded by the user and the certification file of the audio file comprises:

10. The audio detection method according to claim 1, wherein after performing the second detection on the audio file according to the certification file to obtain a second detection result, the method further comprises:

11. An audio detection apparatus, comprising:

an acquisition unit configured to perform acquisition of an audio file uploaded by a user and a certification file of the audio file, the certification file including at least one of a certificate picture and a target file, the certificate picture being a picture obtained by shooting or scanning a copyright certificate, the target file including a plurality of audio track files of the audio file;

a first detection unit configured to detect a degree of duplication between the audio file and each original audio data, based on audio information of the audio file and audio information of a plurality of original audio data, the audio information including an audio text and an audio melody; when the repetition degree between the audio file and each original audio data is less than or equal to a repetition degree threshold value, determining that the audio file is an original audio; when the repetition degree between the audio file and any original audio data is larger than the repetition degree threshold value and the audio file contains voice, detecting the voice similarity between the audio file and any original audio data; when the voice similarity is larger than a similarity threshold value, determining that the audio file is original audio;

and the second detection unit is configured to perform second detection on the audio file according to at least one of a matching result of the certificate picture and a preset database and the similarity between the target file and the audio file when the audio file is the original audio, so as to obtain a second detection result, wherein the second detection result is used for indicating whether the audio file is the original audio of the user.

12. The audio detection apparatus according to claim 11, wherein the first detection unit is further configured to perform determining that the audio file is non-original audio when the human voice similarity is less than or equal to the similarity threshold.

13. The audio detection apparatus according to claim 11, wherein the first detection unit is further configured to perform determining that the audio file is non-original audio when a degree of duplication between the audio file and any original audio data is greater than the threshold degree of duplication, and the audio file does not contain human voice.

14. The audio detection apparatus according to claim 11, characterized in that the second detection unit is further configured to perform at least one of the following:

15. The audio detection apparatus according to claim 14, characterized in that the second detection unit is configured to perform any of the following:

16. The audio detection apparatus according to claim 14, characterized in that the second detection unit is configured to perform any of the following:

17. The audio detection apparatus of claim 11, further comprising:

18. The audio detection device of claim 17, further comprising:

19. The audio detection apparatus according to claim 18, wherein the acquisition unit is configured to perform:

20. The audio detection apparatus of claim 11, further comprising:

21. A server, comprising:

one or more processors;

wherein the one or more processors are configured to execute the instructions to implement the audio detection method of any of claims 1 to 10.

22. A storage medium in which instructions, when executed by a processor of a server, enable the server to perform the audio detection method of any of claims 1 to 10.