CN116644753A - Song named entity identification method and system based on big data - Google Patents


Publication number
CN116644753A
Authority
CN
China
Prior art keywords
song
search
fuzzy
songs
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310630300.7A
Other languages
Chinese (zh)
Inventor
白晓东 (Bai Xiaodong)
李曼曼 (Li Manman)
王亮 (Wang Liang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Minzu University filed Critical Dalian Minzu University
Priority to CN202310630300.7A priority Critical patent/CN116644753A/en
Publication of CN116644753A publication Critical patent/CN116644753A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/638 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/64 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to the technical field of big data, and in particular to a big-data-based song named entity recognition method and system. The method comprises the following steps: constructing an associated database; acquiring fuzzy search information and determining the fuzzy search content; determining a search priority order; performing an in-library search, and if in-library songs exist in the associated database, displaying their song names in order; if no in-library song exists in the associated database, performing a deep search based on big data to obtain deep search content; processing the deep search content to determine out-of-library songs; and displaying in order the song names of the out-of-library songs together with a listening platform and web page links. With the application, a search can be performed in the associated database using a fuzzy song name or other information; if the song is not in the associated database, the entry information can be searched through big data, and the platforms or websites on which the song can be listened to are displayed together with web page links, which improves search precision while greatly saving time.

Description

Song named entity identification method and system based on big data
Technical Field
The application relates to the technical field of big data, and in particular to a big-data-based song named entity recognition method and system.
Background
Modern society develops at high speed: technology is advanced, information circulates freely, communication between people grows ever closer, and daily life becomes more and more convenient. Big data is a product of this high-tech age.
When a user hears a piece of music that sounds good but whose name is unknown, and forgets to use a listening-based song recognition function, the user often resorts afterwards to entering remembered keywords on each major music platform. However, a music platform's search can only retrieve music within that platform's own database, and the user must screen the results one by one, so the desired song cannot be found quickly.
Disclosure of Invention
The application provides a big-data-based song named entity recognition method and system, which can solve the problem that existing music search platforms cannot search for music outside their own databases.
The first technical scheme of the application is a big-data-based song named entity recognition method, comprising the following steps:
S1: constructing an associated database comprising a number of in-library songs;
S2: acquiring fuzzy search information about the song to be identified, and determining the fuzzy search content corresponding to that information;
S3: determining the search priority order corresponding to the fuzzy search content;
S4: searching the associated database for the fuzzy search content according to the search priority order, and if several in-library songs corresponding to the fuzzy search content exist in the associated database, displaying their song names in order;
S5: if no in-library song corresponding to the fuzzy search content exists in the associated database, performing a deep search based on big data to obtain deep search content comprising out-of-library songs, listening platforms and web page links;
sorting and screening the deep search content in turn to determine several out-of-library songs corresponding to the fuzzy search content;
displaying in order the song names of the out-of-library songs, together with the listening platform and web page link corresponding to each.
The second technical scheme of the application is a big-data-based song named entity recognition system, comprising: a database module, an input module, an identification module, a retrieval module and an output module;
the database module is used to construct an associated database comprising a number of in-library songs;
the input module is used to acquire fuzzy search information about the song to be identified and to determine the fuzzy search content corresponding to that information;
the identification module is used to determine the search priority order corresponding to the fuzzy search content;
the retrieval module is used to search the associated database for the fuzzy search content according to the search priority order;
the output module is used to display, in order, the song names of the in-library songs when several in-library songs corresponding to the fuzzy search content exist in the associated database;
the retrieval module is further used, when no in-library song corresponding to the fuzzy search content exists in the associated database, to perform a deep search based on big data to obtain deep search content comprising out-of-library songs, listening platforms and web page links,
and to sort and screen the deep search content in turn to determine several out-of-library songs corresponding to the fuzzy search content;
the output module is further used to display, in order, the song names of the out-of-library songs, together with the listening platform and web page link corresponding to each.
The beneficial effects are as follows:
according to the application, the information of which the user's impression is deepest is selected for priority display based on a fuzzy song name, lyrics or other remembered information, so that the songs matching the priority ordering appear earlier in the search results; if the song is not in the associated database, a member user can search the entry information through big data, and the platforms or websites on which the associated song can be listened to are displayed, together with web page links that jump to those platforms or websites, so that the user can quickly pick out which song matches the impression, improving search precision while greatly saving time;
the application therefore reduces the time spent downloading multiple platforms and searching them one by one, ensuring the effectiveness of retrieval on the platform and increasing user stickiness;
in summary, the application solves the problem that existing music search platforms cannot search for music outside their own databases.
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of the big-data-based song named entity recognition method in an embodiment of the present application;
FIG. 2 is a schematic diagram of a training process for a music recognition model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the big-data-based song named entity recognition system in an embodiment of the present application;
FIG. 4 is a schematic diagram of a login module according to an embodiment of the present application;
in the figure, 1-login module; 11-a normal user login unit; 12-member user login unit; 2-a database module;
3-an input module; 31-an audio recognition unit; 4-an identification module; 41-a priority identification unit;
5-a retrieval module; 51-a custom retrieval unit; 52-a common retrieval unit; 53-a depth retrieval unit; 54-a history retrieval unit;
6-an output module; 61-a correlation recognition unit; 7-feedback module.
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The embodiments described below do not represent all embodiments consistent with the application; they are merely examples of systems and methods consistent with aspects of the application as set forth in the claims.
Named entity recognition, as part of natural language processing, is the basis for correctly understanding text; its main task is to recognize proper nouns such as person names, place names and organization names in the text to be processed. Named entity recognition is increasingly important for quickly understanding the information in massive text data and for efficiently and accurately acquiring and analyzing knowledge. Applying named entity recognition to the music field is an important basis for structuring unstructured text in that field: the information extracted by a named entity recognition method can automatically yield singer, song and related information, from which a search engine and an intelligent question-answering system for the music field can be built.
Example 1
The application provides a big-data-based song named entity recognition method. As shown in FIG. 1, which is a flow diagram of the method in an embodiment of the present application, the method comprises the following steps:
S1: an associated database comprising a number of in-library songs is constructed.
Specifically, the associated database is used to update, delete and store song information, and grants different usage permissions to different user types: an ordinary user is limited to the associated database when searching for songs, while a member user can additionally access, outside the associated database, the out-of-library associated information that can be queried through a big data platform.
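The tiered access control described above can be sketched as follows; this is a minimal illustration, and the function and scope names are assumptions, not part of the patent text:

```python
def allowed_search_scopes(user_type):
    """Return the search scopes available to a user type.

    Ordinary users may only query the associated (in-library) database;
    member users may additionally trigger big-data (out-of-library) search.
    """
    scopes = {"in_library"}          # every logged-in user can search the library
    if user_type == "member":
        scopes.add("big_data")       # members may also search outside the library
    return scopes

print(sorted(allowed_search_scopes("ordinary")))  # ['in_library']
print(sorted(allowed_search_scopes("member")))    # ['big_data', 'in_library']
```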
Wherein, step S1 includes:
S11: a basic database comprising a number of in-library songs is constructed based on MySQL, Oracle or a NoSQL database.
S12: related information, including the song name, singer name and song lyrics corresponding to each in-library song, is determined,
and a data table covering the in-library songs is determined from the related information.
S13: keywords are extracted from the song lyrics of the in-library songs, yielding the keywords corresponding to each in-library song.
S14: an index is built over the song names, singer names and keywords corresponding to the in-library songs,
and the basic database and the index are associated to obtain an associated database that can be used for song name retrieval.
Specifically, a relational database such as MySQL or Oracle may be used, or a NoSQL database such as MongoDB.
A data table is created in the database containing fields for song title, singer name, lyrics, etc. Data is stored according to song name and singer name, and lyric text is stored in one field separately.
The lyrics text field requires word segmentation. A Chinese word segmenter such as jieba or the IK analyzer can be used to split the text into words and to remove useless tokens such as stop words.
An index is set in the database to quickly find the data. For song name and singer name fields, a common index or a unique index may be created; for lyrics text fields, full text indexes or word segmentation indexes may be used to better support text searches.
Song data is inserted into the data table, and song names, artist names, and lyric text are stored in the corresponding fields.
After the database is built, the data can be queried through SQL statements, such as searching by song name and singer name, or performing a text search on the lyrics. If fuzzy search is to be supported, LIKE clauses or regular expressions may be used in order to match more results.
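The table, index and fuzzy LIKE query described in S11 to S14 can be sketched with Python's built-in sqlite3 module, used here as a stand-in for the MySQL/Oracle/NoSQL options mentioned above; the schema, column names and sample rows are illustrative:

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL/Oracle; illustrative schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE songs (
    song_name   TEXT,
    singer_name TEXT,
    lyrics      TEXT)""")
# Index on song name and singer name to speed up exact lookups (cf. S14).
conn.execute("CREATE INDEX idx_song ON songs(song_name, singer_name)")
conn.executemany("INSERT INTO songs VALUES (?, ?, ?)", [
    ("Shape of You", "Ed Sheeran", "the club isn't the best place to find a lover"),
    ("Perfect",      "Ed Sheeran", "I found a love for me"),
])

# Fuzzy search with LIKE, matching partial remembered lyric information.
rows = conn.execute(
    "SELECT song_name FROM songs WHERE lyrics LIKE ?", ("%found a love%",)
).fetchall()
print(rows)  # [('Perfect',)]
```

A production system would add a full-text or word-segmentation index for the lyrics column, which SQLite provides via its FTS extensions and MySQL via FULLTEXT indexes.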
S2: acquiring fuzzy search information about songs to be identified, and determining fuzzy search content corresponding to the fuzzy search information.
Specifically, the embodiment of the application can also provide listening-based song identification when the user listens to music at the web end or from an external playing source. In this step, the lyric information is confirmed first and the tune information is then checked, so that multiple versions of the same song can be identified, such as the original version, a remix, or other arrangements. The ranking of the search results is as follows: the music currently being identified by listening comes first and is marked as the currently identified song, followed by the original version, with other arrangements ranked after it.
As shown in fig. 2, fig. 2 is a schematic flow chart of training a music recognition model according to an embodiment of the present application, and the corresponding steps are as follows:
(1) The input speech is sampled and quantized to obtain a digitized audio signal.
(2) The raw speech data and the test data are preprocessed, including filtering, pre-emphasis, windowing and framing.
(3) The note onset is detected, the data set is truncated from the note onset, and the interference of silent-segment noise is eliminated.
(4) The time-series audio signal is converted, using an appropriate audio feature vector representation method, into a two-dimensional vector representation that can be modeled and computed as input to a neural network.
(5) The input layer receives a mel-spectrogram of the audio signal as input.
(6) Several convolution layers learn the local features of the audio signal, yielding an audio feature map.
(7) Several recurrent layers summarize and learn the sequential features formed as the audio signal changes over time.
(8) A Softmax activation function yields the probability distribution over candidate songs for the input audio signal.
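The early preprocessing of step (2), pre-emphasis followed by framing with a Hamming window, can be sketched in pure Python; the coefficient, frame length and hop size below are illustrative placeholders (a real pipeline would use numpy/librosa and much longer frames):

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """Pre-emphasis filter: y[n] = x[n] - alpha * x[n-1]."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_and_window(signal, frame_len=4, hop=2):
    """Split the signal into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

samples = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]  # toy digitized signal
emphasized = pre_emphasize(samples)
frames = frame_and_window(emphasized)
print(len(frames))  # 3 overlapping frames of length 4
```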
Wherein, step S2 includes:
s21: fuzzy search information about songs to be identified is obtained.
Specifically, at search time, one or more pieces of vaguely remembered information, such as a singer name or song lyrics, may be entered in the search field.
S22: and aiming at the fuzzy search information, sequentially performing blank and line feed symbol removal processing, punctuation mark removal processing, lowercase processing, stop word removal processing, stem extraction processing and encoding processing to obtain the processing search information.
Specifically, the steps for determining the fuzzy search content from the fuzzy search information are as follows:
(1) Removing spaces and line breaks: the spaces and line breaks at the beginning and end of a string can be removed with the strip() method.
(2) Removing punctuation: all punctuation can be removed using string.punctuation from the Python string module.
(3) Lowercasing: all terms can be converted to lower case with the lower() method, which facilitates subsequent processing.
(4) Removing stop words: a common stop-word list is stored in a file, which is then read to remove the stop words from the term list.
(5) Extracting stems: stemming with the stem module of the NLTK library merges different forms of a term into the same stem, reducing the number of features and noise interference.
(6) Encoding: terms containing Chinese need to be encoded into Unicode for convenient subsequent processing.
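Steps (1) to (5) above can be sketched as a single pipeline. The stop-word list is illustrative, and a crude suffix-stripping stemmer stands in for NLTK's stemmers (which the text names but which may not be installed everywhere):

```python
import string

STOP_WORDS = {"the", "a", "of", "is"}  # illustrative stop-word list

def preprocess(query):
    """Apply steps (1)-(5): strip, de-punctuate, lowercase, de-stop, stem."""
    # (1) remove leading/trailing spaces and line breaks
    text = query.strip()
    # (2) remove punctuation via str.translate and string.punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    # (3) lowercase and split into terms
    terms = text.lower().split()
    # (4) drop stop words
    terms = [t for t in terms if t not in STOP_WORDS]
    # (5) crude suffix-stripping stemmer standing in for NLTK's stemmers
    def stem(t):
        for suffix in ("ing", "ed", "s"):
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                return t[: -len(suffix)]
        return t
    return [stem(t) for t in terms]

print(preprocess("  The Sound of Silence, singing!\n"))
# ['sound', 'silence', 'sing']
```

In Python 3, step (6) is largely automatic: str objects are already Unicode, so Chinese terms need explicit encoding only when written to byte-oriented storage.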
S23: and determining fuzzy search contents corresponding to the processing search information according to the processing search information.
S24: rights information corresponding to general rights/member rights of the fuzzy search contents is determined.
Specifically, login is required when using a platform that implements the method of the embodiment. After login, users are divided into ordinary users and member users, whose usage permissions differ to a certain extent.
S3: a search priority corresponding to the fuzzy search content is determined based on the fuzzy search content corresponding to the fuzzy search information.
The fuzzy search content comprises a number of search terms relating to the song name, the singer name and/or the song lyrics. The search priority order is the order in which these search terms are arranged by priority.
And, step S3 includes:
S31: search terms relating to the song name/fuzzy song name, the singer name, and the song lyrics/fuzzy lyrics are identified in the fuzzy search content on the basis of song names, singer names and song lyrics.
S32: if the fuzzy search content includes a search term for a song name and a search term for fuzzy lyrics, the search priority is determined to be the song name first, then the fuzzy lyrics.
S33: if the fuzzy search content includes search terms for a song name, a singer name and song lyrics, the search priority is determined to be the singer name first, then the song name and song lyrics corresponding to it.
S34: if the fuzzy search content includes a search term for a fuzzy song name and a search term for song lyrics, the search priority is determined to be the song lyrics first, then the fuzzy song name.
Specifically, before the search starts, the search terms need to be sorted.
(1) Song-name priority: when the entry information searched by the user contains both a song name and vaguely remembered lyric information, the result best matching the song name is ranked first, and the song is marked in relation to the searched entry information (song name) for the user to confirm.
(2) Singer priority: when the entry information contains a singer, a song name and lyric information at the same time, the result matching the song and lyric information for that singer is ranked first, and the song is marked in relation to the searched entry information (singer) for the user to confirm.
(3) Lyric priority: when the entry information contains a fuzzy song name and lyric information, the result best matching the lyric information is ranked first, and the song is marked in relation to the searched entry information (vaguely remembered lyrics) for the user to confirm.
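The rules S32 to S34 can be written as a small dispatch function; the field names are illustrative labels for the kinds of entry information the text describes:

```python
def search_priority(fields):
    """Return the search order for the query fields present, per S32-S34.

    `fields` is a set drawn from {"song_name", "fuzzy_song_name",
    "singer_name", "lyrics", "fuzzy_lyrics"}.
    """
    if fields == {"song_name", "fuzzy_lyrics"}:
        return ["song_name", "fuzzy_lyrics"]            # S32: song name first
    if fields == {"song_name", "singer_name", "lyrics"}:
        return ["singer_name", "song_name", "lyrics"]   # S33: singer first
    if fields == {"fuzzy_song_name", "lyrics"}:
        return ["lyrics", "fuzzy_song_name"]            # S34: lyrics first
    return sorted(fields)  # fallback ordering, not specified in the text

print(search_priority({"song_name", "fuzzy_lyrics"}))
# ['song_name', 'fuzzy_lyrics']
```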
S4: and according to the search priority, carrying out in-library search on the fuzzy search content in the associated database, and if a plurality of in-library songs corresponding to the fuzzy search content exist in the associated database, sequentially displaying song names of the plurality of in-library songs.
S5: and if the related database does not have the in-library songs corresponding to the fuzzy search content, carrying out deep search based on the big data to obtain the deep search content comprising the out-library songs, the listening platform and the webpage links.
The deep search content is then sorted and screened in turn to determine several out-of-library songs corresponding to the fuzzy search content.
Song names of a plurality of off-library songs corresponding to the fuzzy search content, and listening platforms and web page links corresponding to the plurality of off-library songs, respectively, are sequentially displayed.
Wherein, step S5 includes:
s51: if there is no in-library song corresponding to the fuzzy search content in the associated database and the authority information corresponding to the fuzzy search content is the ordinary authority, displaying the search content without the in-library song corresponding to the fuzzy search content.
S52: and if the related database does not have the in-library songs corresponding to the fuzzy search content and the authority information corresponding to the fuzzy search content is the member authority, carrying out deep search on the fuzzy search content based on the big data to obtain the deep search content comprising the out-library songs, the listening platform and the webpage links.
Specifically, at the start of a search, a member user selects a sorting option in the priority identification unit and can choose to search within the associated database, outside it, or both at the same time.
When an ordinary user enters song-related entry information in the search field, the results show the user the song information in the associated database that relates to that entry information; the entry information can be one of, or a combination of, a singer name, a song name, an album name or a passage of lyrics, and only this in-library result is shown, which is the system's default search mode.
If a member user enters song-related entry information in the search field and no related song information exists in the associated database, or the related songs deviate greatly from the entered information, the entry information is searched through big data and the related songs are ranked and displayed; the highest-ranked results are the music platforms and music websites holding the song's copyright, and each song's information carries a web page link that jumps to that platform or website.
In the embodiment of the application, song information, singer information, lyric information or other entry information which is searched by the user in the past can be stored.
S53: and sorting the deep search content according to the arrangement sequence of the original music platform with the song copyright, the original music website with the song copyright and the related music platform or website to obtain the deep sorting content.
S54: the de-duplication and filtering process is performed on the depth-ordered content to determine depth-filtered content comprising a number of off-library songs corresponding to the fuzzy search content.
S55: and carrying out link adding processing on a plurality of out-of-library songs in the deep screening content to obtain a plurality of out-of-library songs with a listening platform and web page links.
S56: song names of a plurality of off-library songs corresponding to the fuzzy search content, and listening platforms and web page links corresponding to the plurality of off-library songs, respectively, are sequentially displayed.
Specifically, after the search completes, the results are screened: only links to copyrighted original music platforms and original music websites are retained, and links to non-copyrighted network music platforms and websites are deleted.
The remaining results are ranked with the following priority: the original music platform holding the song's copyright, then the original music website holding the song's copyright, then related music platforms or websites.
Links of the same priority may be ordered by their weight, traffic, and so on. A search engine API is called, a results page is generated from the ranking, and the page is shown to the user.
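The screening and ranking of S53 and S54 can be sketched as a filter plus a two-level sort key; the source-type labels, weights and sample hits are hypothetical:

```python
# Tier priorities follow the order stated above: original platform with the
# song's copyright, then original website with the copyright, then related sites.
PRIORITY = {"original_platform": 0, "original_website": 1, "related": 2}

results = [  # hypothetical deep-search hits
    {"song": "A", "source": "related",           "copyrighted": True,  "weight": 9},
    {"song": "A", "source": "original_platform", "copyrighted": True,  "weight": 5},
    {"song": "A", "source": "pirate_site",       "copyrighted": False, "weight": 99},
    {"song": "A", "source": "original_website",  "copyrighted": True,  "weight": 7},
]

ranked = sorted(
    (r for r in results if r["copyrighted"]),            # drop non-copyrighted links
    key=lambda r: (PRIORITY[r["source"]], -r["weight"]), # tier first, weight second
)
print([r["source"] for r in ranked])
# ['original_platform', 'original_website', 'related']
```

Note that within a tier, higher weight/traffic ranks earlier, which is why the weight is negated in the sort key.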
After retrieval, the user can give feedback on the operation of the system, its interface and its functions, and can confirm whether an identified song is the one being sought. When other users later perform the same search, songs confirmed more often are ranked higher and are marked in their song information as songs chosen by many users.
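The confirmation-count re-ranking described above amounts to a frequency sort; the confirmation log below is hypothetical:

```python
from collections import Counter

# Hypothetical confirmation log: each entry is a song a user confirmed as
# "the one I was looking for" after running an identical search.
confirmations = ["Song B", "Song A", "Song B", "Song C", "Song B", "Song A"]

# Songs confirmed more often rank higher for later users of the same search.
ranking = [song for song, _ in Counter(confirmations).most_common()]
print(ranking)  # ['Song B', 'Song A', 'Song C']
```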
Example 2
The application also provides a big-data-based song named entity recognition system. As shown in FIG. 3, which is a schematic structural diagram of the system in an embodiment of the present application, the system comprises: a login module 1, a database module 2, an input module 3, an identification module 4, a retrieval module 5, an output module 6 and a feedback module 7.
A login module 1, configured to allow a user to login and determine authority information of a common authority/member authority of the user.
Specifically, as shown in fig. 4, which is a schematic structural diagram of the login module in an embodiment of the present application, the login module 1 includes an ordinary user login unit 11 and a member user login unit 12; the two differ to a certain extent in their permissions to use the system.
A database module 2 for constructing an associated database comprising a number of songs in the library.
Specifically, the associated database is used to update, delete and store song information, and grants different usage permissions to different user types: an ordinary user is limited to the associated database when searching for songs, while a member user can additionally access, outside the associated database, the out-of-library associated information that can be queried through a big data platform.
The database is established as follows:
relational databases such as MySQL, oracle, etc. may be used, and NoSQL databases such as mongo db, etc. may also be used.
A data table is created in the database containing fields for song title, singer name, lyrics, etc. Data is stored according to song name and singer name, and lyric text is stored in one field separately.
The lyrics text field requires word segmentation. A Chinese word segmenter such as jieba or the IK analyzer can be used to split the text into words and to remove useless tokens such as stop words.
An index is set in the database to quickly find the data. For song name and singer name fields, a common index or a unique index may be created; for lyrics text fields, full text indexes or word segmentation indexes may be used to better support text searches.
Song data is inserted into the data table, and song names, artist names, and lyric text are stored in the corresponding fields.
After the database is built, the data can be queried through SQL statements, such as searching by song name and singer name, or performing a text search on the lyrics. If fuzzy search is to be supported, LIKE clauses or regular expressions may be used in order to match more results.
An input module 3 for acquiring fuzzy search information about songs to be identified and determining fuzzy search content corresponding to the fuzzy search information.
Specifically, at retrieval time, one or more pieces of vaguely remembered information, such as a singer name or song lyrics, may be entered in the search field.
The fuzzy search content is determined from the fuzzy search information as follows:
(1) Removing spaces and line breaks: the spaces and line breaks at the beginning and end of a string can be removed using the strip() function of the str type.
(2) Removing punctuation marks: all punctuation can be removed using string.punctuation from the Python string module.
(3) Lower-casing: all terms can be converted to lower case using the lower() function, which simplifies subsequent processing.
(4) Removing stop words: a list of common stop words is stored in a file, which is then read to remove the stop words from the term list.
(5) Stem extraction: stemming is performed with the stem module of the nltk library, so that different forms of a term are merged into the same stem, reducing the number of features and noise interference.
(6) Encoding: terms containing Chinese need to be converted to Unicode for subsequent processing.
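The preprocessing steps above can be sketched as follows. The inline stop-word list is a small stand-in for the file mentioned in step (4), and the suffix-stripping stemmer is a crude substitute for nltk's PorterStemmer in step (5):

```python
import string

# Small illustrative stop-word list; in practice it would be read from a file.
STOP_WORDS = {"the", "a", "an", "of", "is"}

def preprocess(query):
    """Steps (1)-(5) above: strip, de-punctuate, lowercase, drop stop words, stem."""
    text = query.strip()                                              # (1) spaces / line breaks
    text = text.translate(str.maketrans("", "", string.punctuation))  # (2) punctuation
    terms = text.lower().split()                                      # (3) lower case
    terms = [t for t in terms if t not in STOP_WORDS]                 # (4) stop words

    def stem(t):
        # (5) crude suffix stripping as a stand-in for nltk's PorterStemmer
        for suf in ("ing", "ed", "s"):
            if t.endswith(suf) and len(t) > len(suf) + 2:
                return t[: -len(suf)]
        return t

    return [stem(t) for t in terms]

print(preprocess("  The Sound of Silence!\n"))  # ['sound', 'silence']
```

Step (6), Unicode conversion, is implicit here since Python 3 strings are already Unicode.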
Wherein the input module 3 comprises: an audio recognition unit 31.
The audio recognition unit 31 stores an audio recognition model and is used for receiving real-time playing music from the web end or an external playing source and determining the fuzzy search content from that music. It is also used to confirm the tune information of the real-time playing music.
Specifically, when a user listens to music at the web end or from an external playing source, the audio recognition unit 31 listens to the music; in this step the lyric information is confirmed first and the tune information is checked afterwards, so that different versions of the same song, such as the original, a remix, or another adaptation, can be distinguished. In the ranking of the subsequent song search, the version currently being listened to is placed first and marked as the currently identified song, followed by the original version and then the other adapted versions.
The training steps of the audio recognition model are as follows:
(1) For input speech, a sampling and quantization process is required to obtain a digitized audio signal.
(2) And carrying out certain preprocessing on the original voice data and the test data, including filtering, pre-emphasis, windowing and framing and the like.
(3) The note onset is detected and the data set is truncated from the onset, eliminating interference from silent-segment noise.
(4) The time-series audio signal is converted, using an appropriate audio feature representation method, into a two-dimensional vector representation that can be modeled and computed, serving as the input to a neural network.
(5) The input layer receives the mel spectrogram of the audio signal as input.
(6) Several convolution layers learn local features of the audio signal, producing audio feature maps.
(7) Several recurrent layers summarize and learn the sequential features formed as the audio signal changes over time.
(8) A Softmax activation function yields the probability distribution over candidate songs for the input audio signal.
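A toy forward pass through such a network can be sketched with NumPy as follows. The weights are random and untrained, and the recurrent layers of step (7) are replaced by simple mean pooling over time for brevity; this only illustrates the data flow from mel spectrogram to Softmax probabilities, not the patent's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy dimensions: 40 mel bands x 100 frames, 3 candidate songs.
N_MELS, N_FRAMES, N_SONGS = 40, 100, 3
mel = rng.standard_normal((N_MELS, N_FRAMES))  # stand-in mel spectrogram (step 5)

# One random 3x3 kernel standing in for the convolution layers (step 6).
kernel = rng.standard_normal((3, 3))
feat = np.zeros((N_MELS - 2, N_FRAMES - 2))
for i in range(feat.shape[0]):
    for j in range(feat.shape[1]):
        feat[i, j] = np.sum(mel[i:i + 3, j:j + 3] * kernel)
feat = np.maximum(feat, 0)  # ReLU non-linearity

# Mean over time as a crude substitute for the recurrent layers (step 7).
summary = feat.mean(axis=1)

# Dense layer + Softmax gives a probability per candidate song (step 8).
W = rng.standard_normal((N_SONGS, summary.size))
probs = softmax(W @ summary)
print(probs)  # three non-negative values summing to 1
```

In practice each stage would be a trained layer in a deep-learning framework rather than random NumPy arrays.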
The identification module 4 is used for determining the search priority order corresponding to the fuzzy search content according to the fuzzy search content corresponding to the fuzzy search information.
Wherein the identification module 4 comprises: the priority identifying unit 41.
The priority identifying unit 41 is used for determining search items about song names/fuzzy song names, singer names, song lyrics/fuzzy lyrics in fuzzy search contents based on song names, singer names and song lyrics.
If the fuzzy search content includes a search term for song names and a search term for fuzzy lyrics, determining that the search is prioritized in order of song names and fuzzy lyrics.
If the fuzzy search content includes a search term for a song title, a search term for a singer title, and a search term for song lyrics, it is determined that the search priority is in order of singer title, song title, and song lyrics.
If the fuzzy search content comprises a search term related to the fuzzy song name and a search term related to song lyrics, determining that the search is prioritized to be the song lyrics and the fuzzy song name in sequence.
Specifically, the priority identification unit 41 has song-name prioritization, singer prioritization and lyrics prioritization functions.
The song-name prioritization function is used when the term information entered by the user contains a song name together with vaguely remembered lyric information: the search result best matching the song name is ranked first, and the song is marked against the searched term information (song name) for the user to confirm.
The singer prioritization function is used when the term information entered by the user simultaneously contains a singer, a song name and lyric information: the singer corresponding to the song and the lyric information are marked in the search result, and the song is marked against the searched term information (singer) for the user to confirm.
The lyrics prioritization function is used when the term information entered by the user contains a fuzzy song name together with lyric information: the search result best matching the song lyrics is ranked first, and the song is marked against the searched term information (vaguely remembered lyrics) for the user to confirm.
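The three prioritization rules can be sketched as a small dispatch function; the term-type labels ('song', 'fuzzy_lyrics', etc.) are illustrative assumptions, not identifiers from the described system:

```python
def search_priority(terms):
    """Return the search priority order for the detected term types.

    `terms` is the set of term types detected in the fuzzy search content,
    drawn from: 'song', 'fuzzy_song', 'singer', 'lyrics', 'fuzzy_lyrics'.
    Encodes the three rules of the priority identification unit 41.
    """
    if {"singer", "song", "lyrics"} <= terms:
        return ["singer", "song", "lyrics"]      # singer prioritization
    if {"song", "fuzzy_lyrics"} <= terms:
        return ["song", "fuzzy_lyrics"]          # song-name prioritization
    if {"fuzzy_song", "lyrics"} <= terms:
        return ["lyrics", "fuzzy_song"]          # lyrics prioritization
    return sorted(terms)                         # fallback: no special rule applies

print(search_priority({"song", "fuzzy_lyrics"}))  # ['song', 'fuzzy_lyrics']
```

Note the rule order matters: the full singer/song/lyrics case is checked first so it is not shadowed by the two-term rules.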
And the retrieval module 5 is used for retrieving the fuzzy retrieval content in the associated database according to the retrieval priority order.
It is also used, when no in-library song corresponding to the fuzzy search content exists in the associated database, for performing a deep search based on big data to obtain deep search content comprising out-of-library songs, listening platforms and web page links.
It then sequentially sorts and screens the deep search content to determine a number of out-of-library songs corresponding to the fuzzy search content.
Wherein the retrieval module 5 comprises: a custom search unit 51, a general search unit 52, a depth search unit 53, and a history search unit 54.
The custom search unit 51 is used for allowing the user with member authority to select to perform in-library search through the common search unit 52 and/or to perform deep search through the deep search unit 53.
The general search unit 52 is configured to search the fuzzy search content in the association database according to the search priority, and if there are a plurality of in-library songs corresponding to the fuzzy search content in the association database, send the in-library search result to the output module 6.
And a depth retrieval unit 53 for performing, when no in-library song corresponding to the fuzzy search content exists in the associated database and the authority information corresponding to the fuzzy search content is the member authority, a deep search on the fuzzy search content based on big data to obtain deep search content comprising out-of-library songs, a listening platform and web page links.
It is also used for sorting the deep search content according to the arrangement order of original music platforms holding the song copyright, original music websites holding the song copyright, and related music platforms or websites, obtaining deep-sorted content.
It is also used for de-duplicating and screening the deep-sorted content, determining deep-screened content comprising a number of out-of-library songs corresponding to the fuzzy search content.
It is also used for adding links to the out-of-library songs in the deep-screened content, obtaining out-of-library songs with listening platforms and web page links.
It is also used for delivering these out-of-library songs to the output module 6.
And a history retrieval unit 54 for storing the fuzzy search information.
Specifically, the custom search unit 51 is associated with the priority identification unit: through it the user selects the sorting options of the priority identification unit, and can also choose a specific search mode, namely in-system search, big-data search, or both together;
the general search unit 52 is configured to display to the user, after the user inputs term information related to a song in the search bar, the related song information found in the associated database; the term information may be one or more of singer name, song name, album name and song lyrics. Displaying only the results of the general search unit 52 is the system's default search mode.
The deep search unit 53 is configured to search, after the user inputs the term information of a related song in the search bar, through big data when no related song information exists in the associated database or when the retrieved songs deviate greatly from the entered term information, so as to obtain deep search content including out-of-library songs, listening platforms and web page links.
Specifically, when specifically retrieving, the input item is term information term; the output items are a prioritized list of song information and a jump link.
The detailed steps are as follows:
cleaning and processing the term to remove irrelevant information and special characters such as blank spaces, punctuation marks and the like;
II, establishing an index by using a search engine or a database, and taking song names, singers and lyrics as search keywords;
searching song information related to term including song title, singer name, song duration, album name, album cover, lyrics, etc.
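Steps I to III above can be sketched over an in-memory song list; the record fields and sample data are illustrative, and a real implementation would query an indexed search engine or database:

```python
# Toy song records standing in for the indexed big-data search; field names
# are illustrative assumptions.
SONGS = [
    {"title": "Hey Jude", "artist": "The Beatles", "album": "Hey Jude",
     "lyrics": "take a sad song and make it better", "link": "https://example.com/1"},
    {"title": "Imagine", "artist": "John Lennon", "album": "Imagine",
     "lyrics": "you may say I am a dreamer", "link": "https://example.com/2"},
]

def clean(term):
    """Step I: strip spaces and simple punctuation from the entered term."""
    return term.strip().strip(".,!?;:").lower()

def deep_search(term):
    """Steps II-III: match the cleaned term against title, artist and lyrics."""
    t = clean(term)
    return [s for s in SONGS
            if t in s["title"].lower()
            or t in s["artist"].lower()
            or t in s["lyrics"].lower()]

print([s["title"] for s in deep_search(" dreamer! ")])  # ['Imagine']
```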
The history retrieval unit 54 is used to store song information, singer information, lyric information, or other entry information that the user has previously retrieved.
And an output module 6, configured to sequentially display song names of the songs in the plurality of libraries when the plurality of libraries corresponding to the fuzzy search content exist in the associated database.
It is also used for sequentially displaying the song names of the out-of-library songs corresponding to the fuzzy search content, together with the listening platform and web page link corresponding to each out-of-library song.
It is further used for displaying, when several in-library songs corresponding to the real-time playing music exist in the associated database, their song names in the order of the currently identified song, the original song and related adapted versions.
Specifically, after a song is identified through the output module 6, the user can mark whether the identified song is the desired one; when other users search in the same way, songs confirmed more often are ranked higher, and such songs are marked in the song information as songs selected by many users.
When screening and ranking search results, ranking is performed according to the following priorities:
(1) Legitimate music platforms holding the song copyright, such as QQ Music and NetEase Cloud Music.
(2) Copyrighted music websites, such as Kugou Music and Xiami Music.
(3) Related network music platforms or websites.
And I, carrying out de-duplication and screening on the search results, and removing repeated song information and unavailable links.
And II, adding a jump link for each song information, and linking to a corresponding music platform or website.
And III, outputting the ordered song list and the corresponding links as a final result.
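The screening and ranking steps above can be sketched as follows; the platform categories, their ranks and the link URL scheme are illustrative assumptions:

```python
# Lower rank value = higher priority, mirroring the (1)-(3) ordering above.
PLATFORM_RANK = {"original_platform": 1, "original_website": 2, "related_site": 3}

results = [
    {"title": "Hey Jude", "artist": "The Beatles", "source": "related_site"},
    {"title": "Hey Jude", "artist": "The Beatles", "source": "original_platform"},
    {"title": "Imagine", "artist": "John Lennon", "source": "original_website"},
]

def rank_and_dedupe(results):
    """Sort by platform priority, keep the best copy of each song, add links.

    Steps I-III: de-duplicate, attach a jump link (hypothetical URL scheme),
    and return the ordered final list.
    """
    ordered = sorted(results, key=lambda r: PLATFORM_RANK[r["source"]])
    seen, out = set(), []
    for r in ordered:
        key = (r["title"], r["artist"])
        if key not in seen:
            seen.add(key)
            out.append(dict(r, link=f"https://{r['source']}.example.com/{r['title']}"))
    return out

final = rank_and_dedupe(results)
print([(r["title"], r["source"]) for r in final])
# [('Hey Jude', 'original_platform'), ('Imagine', 'original_website')]
```

Sorting before de-duplication ensures the surviving copy of each song is the one from the highest-priority source.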
The output module 6 includes: the correlation degree identifying unit 61.
The relevance identifying unit 61 is used for displaying a plurality of songs in the library or displaying song names, listening platforms and web page links of a plurality of songs outside the library according to the order of the matching degree with the fuzzy search content from high to low.
Specifically, when the user queries the song through the term information, the relevance identifying unit 61 is configured to rank the song with higher matching degree with the term information in the search result more forward.
After the search is completed, the search results are screened: only links to copyrighted original music platforms and original music websites are kept, and links to non-copyrighted network music platforms and websites are deleted.
The remaining search results are ranked according to the following priority: original music platforms holding the song copyright first, then copyrighted music websites, and finally related network music platforms or websites.
Links belonging to the same priority may be ordered according to their weight, access amount, etc. And calling a search engine API, generating a search result page according to the sequencing result, and displaying the search result page to a user.
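One simple stand-in for the matching-degree ranking of the relevance identification unit 61 is SequenceMatcher from Python's standard difflib module; the actual system may use a different similarity measure:

```python
from difflib import SequenceMatcher

def match_degree(query, song_title):
    """Similarity in [0, 1] between the entered term and a song title."""
    return SequenceMatcher(None, query.lower(), song_title.lower()).ratio()

def rank_by_match(query, titles):
    """Order songs from highest to lowest matching degree, as unit 61 does."""
    return sorted(titles, key=lambda t: match_degree(query, t), reverse=True)

titles = ["Let It Be", "Let It Go", "Let Her Go"]
print(rank_by_match("let it be", titles))  # 'Let It Be' ranks first
```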
And the feedback module 7 is used for the user to feed back in terms of the use experience and the retrieval experience.
Specifically, the feedback module 7 is configured for the user to give feedback on the operation, the operation interface and the functions of the system. After a song is identified, the user can mark whether it is the song being sought; when other users search in the same way, songs confirmed more often are ranked higher, and such songs are marked in the song information as songs selected by many users.
The embodiments of the present application have been described in detail above, but they are merely preferred embodiments and should not be construed as limiting the scope of the present application. All equivalent changes and modifications made within the scope of the present application shall fall within the protection scope of the present application.

Claims (10)

1. A song named entity recognition method based on big data, comprising:
s1: constructing an associated database comprising a plurality of songs in the library;
s2: acquiring fuzzy search information about songs to be identified, and determining fuzzy search content corresponding to the fuzzy search information;
s3: determining a search priority order corresponding to the fuzzy search content according to the fuzzy search content corresponding to the fuzzy search information;
s4: according to the search priority, searching in the library aiming at the fuzzy search content in the associated database, and if a plurality of songs in the library corresponding to the fuzzy search content exist in the associated database, sequentially displaying song names of the songs in the plurality of libraries;
s5: if the related database does not have the in-library songs corresponding to the fuzzy search content, carrying out deep search based on big data to obtain deep search content comprising out-library songs, a listening platform and web page links;
Sequentially carrying out sorting treatment and screening treatment on the deep search content to determine a plurality of out-of-library songs corresponding to the fuzzy search content;
song names of a plurality of off-library songs corresponding to the fuzzy search content, and listening platforms and web page links corresponding to the plurality of off-library songs, respectively, are sequentially displayed.
2. The song named entity recognition method based on big data according to claim 1, wherein step S1 comprises:
s11: constructing a basic database comprising a plurality of in-library songs based on MySQL, Oracle or a NoSQL database;
s12: determining relevant information including song title, artist name and song lyrics corresponding to songs in the library;
determining a data table covering songs in a plurality of libraries according to the related information;
s13: extracting keywords aiming at song lyrics of songs in a plurality of libraries respectively to obtain keywords corresponding to the songs in the libraries respectively;
s14: establishing indexes according to song names of a plurality of songs, singer names and keywords corresponding to the songs in a plurality of libraries respectively;
and carrying out association processing on the basic database and the index to obtain an association database which can be used for song name retrieval.
3. The song named entity recognition method based on big data according to claim 1, wherein step S2 comprises:
s21: acquiring fuzzy search information about songs to be identified;
s22: for fuzzy search information, sequentially performing blank and line feed symbol removal processing, punctuation mark removal processing, lowercase processing, stop word removal processing, stem extraction processing and encoding processing to obtain processing search information;
s23: and determining fuzzy search contents corresponding to the processing search information according to the processing search information.
4. The song named entity recognition method based on big data according to claim 1, wherein the fuzzy search content includes a number of search terms for song name/singer name/song lyrics;
the search priority is ordered into an order of arranging a plurality of search items corresponding to the fuzzy search content according to the priority;
the step S3 includes:
s31: determining search terms related to song name/fuzzy song name, singer name, and song lyrics/fuzzy lyrics in the fuzzy search content based on song name, singer name and song lyrics;
s32: if the fuzzy search content comprises a search term related to song names and a search term related to fuzzy lyrics, determining that the search priority is sequentially the song names and the fuzzy lyrics;
S33: if the fuzzy search content includes a search term for the song name, a search term for the singer name and a search term for the song lyrics, determining that the search priority is in turn singer name, song name and song lyrics;
s34: if the fuzzy search content comprises a search term related to the fuzzy song name and a search term related to song lyrics, determining that the search is prioritized to be song lyrics and the fuzzy song name in sequence.
5. The song named entity recognition method based on big data according to claim 1, wherein step S2 further comprises:
s24: determining rights information corresponding to general rights/member rights of the fuzzy search contents;
and, the step S5 includes:
s51: if the associated database has no in-library songs corresponding to the fuzzy search content and the authority information corresponding to the fuzzy search content is the common authority, displaying the search content without the in-library songs corresponding to the fuzzy search content;
s52: if the related database does not have the in-library songs corresponding to the fuzzy search content and the authority information corresponding to the fuzzy search content is the member authority, carrying out deep search on the fuzzy search content based on big data to obtain deep search content comprising out-library songs, a listening platform and a webpage link;
S53: according to the arrangement sequence of the original music platform with the song copyright, the original music website with the song copyright and the related music platform or website, carrying out sorting processing on the deep search content to obtain deep sorting content;
s54: performing de-duplication and screening processing on the depth-ordered content, and determining depth-screened content comprising a plurality of extra-library songs corresponding to the fuzzy retrieval content;
s55: performing link adding processing on a plurality of off-library songs in the deep screening content to obtain a plurality of off-library songs with a listening platform and web page links;
s56: song names of a plurality of off-library songs corresponding to the fuzzy search content, and listening platforms and web page links corresponding to the plurality of off-library songs, respectively, are sequentially displayed.
6. A song named entity recognition system based on big data, comprising: a database module, an input module, an identification module, a retrieval module and an output module;
the database module is used for constructing an associated database comprising a plurality of songs in the database;
the input module is used for acquiring fuzzy search information about songs to be identified and determining fuzzy search contents corresponding to the fuzzy search information;
The identification module is used for determining the search priority sequence corresponding to the fuzzy search content according to the fuzzy search content corresponding to the fuzzy search information;
the retrieval module is used for retrieving the fuzzy retrieval content in the associated database according to the retrieval priority order;
the output module is used for sequentially displaying song names of the songs in the plurality of libraries when the songs in the plurality of libraries corresponding to the fuzzy search content exist in the associated database;
the retrieval module is also used for carrying out deep retrieval based on big data to obtain deep retrieval contents comprising songs outside the library, a listening platform and web page links when the related database has no in-library songs corresponding to the fuzzy retrieval contents;
sequentially carrying out sorting treatment and screening treatment on the deep search content to determine a plurality of out-of-library songs corresponding to the fuzzy search content;
the output module is also used for sequentially displaying song names of a plurality of songs outside the library corresponding to the fuzzy search content, and a listening platform and a webpage link corresponding to the plurality of songs outside the library respectively.
7. The song named entity recognition system based on big data according to claim 6, wherein the input module comprises: an audio recognition unit;
The audio recognition unit is used for receiving real-time playing music from a web end or an external playing source and determining fuzzy retrieval content according to the real-time playing music; the method is also used for confirming the tune information of the real-time playing music;
the output module is used for displaying song names of the songs in the plurality of libraries when the songs in the plurality of libraries corresponding to the real-time playing music exist in the associated database and according to the sequence of the currently identified songs, original edition songs and related recomposition versions.
8. The song named entity recognition system based on big data according to claim 6, wherein the identification module comprises: a priority identification unit;
the priority identification unit is used for determining search items related to song names/fuzzy song names, singer names and song lyrics in fuzzy search contents based on song names, singer names and song lyrics;
if the fuzzy search content comprises a search term related to song names and a search term related to fuzzy lyrics, determining that the search priority is sequentially the song names and the fuzzy lyrics;
if the fuzzy search content comprises a search term about a song title, a search term about a singer title and a search term about song lyrics, determining that the search priority is sequentially the singer title, the song title and the song lyrics;
If the fuzzy search content comprises a search term related to the fuzzy song name and a search term related to song lyrics, determining that the search is prioritized to be the song lyrics and the fuzzy song name in sequence.
9. The song named entity recognition system based on big data according to claim 6, wherein the system further comprises: a login module;
the login module is used for a user to log in and determining the authority information of the common authority/member authority of the user;
and, the retrieval module comprises: the system comprises a custom search unit, a common search unit, a depth search unit and a history search unit;
the user-defined searching unit is used for users with member rights to select to search in a library through the common searching unit and/or to search deeply through the deep searching unit;
the common searching unit is used for searching in the associated database aiming at the fuzzy searching content according to the searching priority order, and if a plurality of songs in the database corresponding to the fuzzy searching content exist in the associated database, the searching result in the database is transmitted to the output module;
the depth retrieval unit is used for performing, when no in-library song corresponding to the fuzzy search content exists in the associated database and the authority information corresponding to the fuzzy search content is the member authority, a deep search on the fuzzy search content based on big data to obtain deep search content comprising out-of-library songs, a listening platform and web page links;
The method is also used for carrying out sorting processing on the deep search content according to the arrangement sequence of the original music platform with the song copyright, the original music website with the song copyright and the related music platform or website to obtain deep sorting content;
the method is also used for carrying out de-duplication and screening processing on the depth-ordered content, and determining depth-screened content comprising a plurality of songs outside the library corresponding to the fuzzy retrieval content;
the method is also used for carrying out link adding processing on a plurality of off-library songs in the deep screening content to obtain a plurality of off-library songs with a listening platform and web page links;
the device is also used for transmitting a plurality of songs outside the library to the output module;
the history retrieval unit is used for storing fuzzy search information;
and, the system further comprises: a feedback module;
and the feedback module is used for the user to feed back in the aspects of using experience and searching experience.
10. The song named entity recognition system based on big data according to claim 6, wherein the output module comprises: a relevance identification unit;
the relativity identifying unit is used for displaying a plurality of songs in the library or displaying song names, listening platforms and web page links of a plurality of songs outside the library according to the sequence of the matching degree with the fuzzy search content from high to low.
CN202310630300.7A 2023-05-30 2023-05-30 Song named entity identification method and system based on big data Pending CN116644753A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310630300.7A CN116644753A (en) 2023-05-30 2023-05-30 Song named entity identification method and system based on big data


Publications (1)

Publication Number Publication Date
CN116644753A true CN116644753A (en) 2023-08-25

Family

ID=87624332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310630300.7A Pending CN116644753A (en) 2023-05-30 2023-05-30 Song named entity identification method and system based on big data

Country Status (1)

Country Link
CN (1) CN116644753A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination