CN117235300B - Song recommendation method, system and storage medium of intelligent K song system - Google Patents

Song recommendation method, system and storage medium of intelligent K song system

Info

Publication number
CN117235300B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311498913.6A
Other languages
Chinese (zh)
Other versions
CN117235300A (en
Inventor
赵鑫
隋阳
岳平安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhishang Information Technology Co ltd
Original Assignee
Shenzhen Zhishang Information Technology Co ltd
Application filed by Shenzhen Zhishang Information Technology Co ltd
Priority to CN202311498913.6A
Publication of CN117235300A
Application granted
Publication of CN117235300B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

The invention relates to a song recommendation method, system and storage medium for an intelligent karaoke (K song) system, and belongs to the technical field of song recommendation. The invention analyzes the user's vocal range from singing-training voice data and monitors how that vocal range changes within a preset time, so that the recommended songs are better suited to the user's singing. This personalizes the service for the user, improves the recommendation precision of the intelligent karaoke system, and improves the user's singing experience. In addition, by configuring the singing training scene, the invention improves the accuracy with which the user's singing vocal range is acquired.

Description

Song recommendation method, system and storage medium of intelligent K song system
Technical Field
The invention relates to the technical field of song recommendation, and in particular to a song recommendation method, system and storage medium for an intelligent karaoke (K song) system.
Background
When singing, what matters most to the singer is performing the song properly. Surveys show that the main reason most singers cannot perform a song properly is that the chosen track is pitched either too high or too low for the singer's ability, so that the singer can neither reach the high notes nor get down to the low ones. A music recommendation method is therefore needed that helps users select songs that fit their vocal range and so avoid this situation. Existing music recommendation methods, however, do not solve this problem. In addition, the user's vocal range is acquired with some deviation, which reduces the recommendation accuracy of an intelligent karaoke system.
Disclosure of Invention
The invention overcomes the defects of the prior art and provides a song recommendation method, system and storage medium for an intelligent karaoke system.
To achieve the above purpose, the invention adopts the following technical solution:
A first aspect of the invention provides a song recommendation method for an intelligent karaoke system, comprising the following steps:
acquiring singing-training voice data of a user, and generating, based on that data, vocal range information the user is good at and vocal range information the user is not good at;
constructing time-series-based vocal range feature data from the vocal range information the user is good at and the vocal range information the user is not good at, and updating both kinds of vocal range information by learning the time-series-based vocal range feature data;
performing an initial search according to the vocal range information the user is good at to obtain initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information the user is not good at to obtain secondarily processed song data;
and obtaining song demand data of the user, and generating song recommendation information of the intelligent karaoke system according to the song demand data and the secondarily processed song data.
Further, in the method, acquiring the singing-training voice data of the user and generating, based on that data, the vocal range information the user is good at and the vocal range information the user is not good at specifically comprises:
acquiring singing-training song information of the user, decomposing it into voice data of a plurality of sub-segments, and performing vocal range analysis on the voice data of the sub-segments to obtain first vocal range analysis data for each sub-segment;
configuring a singing training scene, acquiring the singing-training voice data of the user according to the singing-training song information and the singing training scene, and analyzing the singing-training voice data to generate second vocal range analysis data for a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first vocal range analysis data with the second vocal range analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
and when the deviation rate is greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information the user is good at.
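The vocal range analysis of sub-segments can be pictured with a minimal sketch. It assumes, purely for illustration, that a pitch tracker has already produced one fundamental-frequency value per frame (with `None` for unvoiced frames) and that a sub-segment is a fixed number of frames; the names `hz_to_midi`, `segment_ranges` and `frames_per_segment` are hypothetical and do not come from the patent.

```python
import math

def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def segment_ranges(pitch_track_hz, frames_per_segment):
    """Split a frame-level pitch track into sub-segments and return the
    (lowest, highest) MIDI note of each one, or None if nothing is voiced."""
    ranges = []
    for start in range(0, len(pitch_track_hz), frames_per_segment):
        frames = pitch_track_hz[start:start + frames_per_segment]
        voiced = [hz_to_midi(f) for f in frames if f is not None and f > 0]
        # A segment without any voiced frame has no measurable range.
        ranges.append((min(voiced), max(voiced)) if voiced else None)
    return ranges

# Hypothetical pitch track: two frames at A3, two at A4, two unvoiced.
track = [220.0, 220.0, 440.0, 440.0, None, None]
print(segment_ranges(track, frames_per_segment=2))
```

Running the same function on the reference song and on the user's recording would give comparable per-segment data for the first and second analyses.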
Further, in the method, constructing the time-series-based vocal range feature data from the vocal range information the user is good at and the vocal range information the user is not good at comprises:
constructing time stamps, introducing a graph neural network, and initializing a first graph node and a second graph node, the first graph node representing the category the user is good at and the second graph node representing the category the user is not good at;
constructing a plurality of third graph nodes from the vocal range information the user is good at and the vocal range information the user is not good at, constructing a directed edge description, pointing the third graph nodes that correspond to vocal range information the user is good at toward the first graph node to construct a first topological structure graph, and fusing the time stamp with the first topological structure graph at each moment to generate time-series-based vocal range feature data the user is good at;
according to the directed edge description, pointing the third graph nodes that correspond to vocal range information the user is not good at toward the second graph node to construct a second topological structure graph, and fusing the time stamp with the second topological structure graph at each moment to generate time-series-based vocal range feature data the user is not good at;
and generating the time-series-based vocal range feature data from the time-series-based vocal range feature data the user is good at and the time-series-based vocal range feature data the user is not good at.
Further, in the method, updating the vocal range information the user is good at and the vocal range information the user is not good at by learning the time-series-based vocal range feature data specifically comprises:
constructing a user vocal range preference recognition model based on an LSTM, incorporating a local outlier detection algorithm, and using the local outlier detection algorithm to perform outlier detection on the time-series-based vocal range feature data within a preset time, obtaining an outlier detection value for each vocal range feature;
presetting an outlier detection threshold, and when the outlier detection value of a vocal range feature is greater than the outlier detection threshold, removing that vocal range feature from the time-series-based vocal range feature data to obtain the filtered time-series-based vocal range feature data;
inputting the filtered time-series-based vocal range feature data into the user vocal range preference recognition model for encoding and learning, obtaining a trained user vocal range preference recognition model;
and obtaining the vocal range information the user is good at and the vocal range information the user is not good at through the trained user vocal range preference recognition model, and updating both periodically.
Further, in the method, performing the initial search according to the vocal range information the user is good at to obtain the initially retrieved song data, and performing the secondary processing on the initially retrieved song data based on the vocal range information the user is not good at to obtain the secondarily processed song data, specifically comprises:
performing an initial search according to the vocal range information the user is good at to obtain the initially retrieved song data, introducing a decision tree model, and constructing a splitting criterion from the vocal range information the user is not good at;
taking the initially retrieved song data as the root node, and splitting the root node according to the splitting criterion to generate new leaf nodes;
when song data matching the vocal range information the user is not good at no longer appears in a new leaf node, ending the splitting, generating the final leaf node, and obtaining from the final leaf node the song data set corresponding to the vocal range information the user is good at;
and generating the secondarily processed song data from the song data set corresponding to the vocal range information the user is good at, and outputting the secondarily processed song data.
Further, in the method, obtaining the song demand data of the user and generating the song recommendation information of the intelligent karaoke system according to the song demand data and the secondarily processed song data specifically comprises:
obtaining the song demand data of the user, extracting features from it to obtain the song feature data the user requires, and constructing a feature sequence from that song feature data;
obtaining song description feature information of the secondarily processed song data, and when the song description feature information of a secondarily processed song matches the song feature data in the feature sequence, taking that song as a song recommendation item;
obtaining historical singing evaluation information of the song recommendation item, and deriving from it singing evaluation difficulty information for each section of the song;
and when the singing evaluation difficulty information of a section is greater than a preset singing evaluation difficulty, generating the song recommendation information of the intelligent karaoke system from the corresponding sections, and displaying the song recommendation information in a preset manner.
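The final recommendation step can be sketched as follows. This is one possible reading, not the patented implementation: it assumes song features are plain tags, that a song "matches" when it carries every requested tag, and that a per-section difficulty score between 0 and 1 has already been derived from historical singing evaluations. The function name `recommend` and the dictionary keys are hypothetical.

```python
def recommend(candidates, required_features, difficulty_threshold):
    """Keep the songs whose description features cover the requested ones and
    list the sections whose difficulty exceeds the preset threshold."""
    recommendations = []
    for song in candidates:
        # Subset test: every requested feature must be present on the song.
        if not required_features <= song["features"]:
            continue
        hard_sections = [name
                         for name, difficulty in song["section_difficulty"].items()
                         if difficulty > difficulty_threshold]
        recommendations.append({"title": song["title"],
                                "hard_sections": hard_sections})
    return recommendations

# Hypothetical secondarily processed song data.
candidates = [
    {"title": "Song X", "features": {"pop", "mandarin"},
     "section_difficulty": {"verse": 0.3, "chorus": 0.8}},
    {"title": "Song Y", "features": {"rock"},
     "section_difficulty": {"verse": 0.2}},
]
print(recommend(candidates, {"pop"}, difficulty_threshold=0.5))
```

Returning the difficult sections alongside each title lets the display layer highlight them in whatever preset manner the system uses.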
A second aspect of the invention provides a song recommendation system for an intelligent karaoke system. The system comprises a memory and a processor, the memory stores a song recommendation method program of the intelligent karaoke system, and the program, when executed by the processor, implements the following steps:
acquiring singing-training voice data of a user, and generating, based on that data, vocal range information the user is good at and vocal range information the user is not good at;
constructing time-series-based vocal range feature data from the vocal range information the user is good at and the vocal range information the user is not good at, and updating both kinds of vocal range information by learning the time-series-based vocal range feature data;
performing an initial search according to the vocal range information the user is good at to obtain initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information the user is not good at to obtain secondarily processed song data;
and obtaining song demand data of the user, and generating song recommendation information of the intelligent karaoke system according to the song demand data and the secondarily processed song data.
Further, in the system, acquiring the singing-training voice data of the user and generating, based on that data, the vocal range information the user is good at and the vocal range information the user is not good at specifically comprises:
acquiring singing-training song information of the user, decomposing it into voice data of a plurality of sub-segments, and performing vocal range analysis on the voice data of the sub-segments to obtain first vocal range analysis data for each sub-segment;
configuring a singing training scene, acquiring the singing-training voice data of the user according to the singing-training song information and the singing training scene, and analyzing the singing-training voice data to generate second vocal range analysis data for a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first vocal range analysis data with the second vocal range analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
and when the deviation rate is greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information the user is good at.
Further, in the system, updating the vocal range information the user is good at and the vocal range information the user is not good at by learning the time-series-based vocal range feature data specifically comprises:
constructing a user vocal range preference recognition model based on an LSTM, incorporating a local outlier detection algorithm, and using the local outlier detection algorithm to perform outlier detection on the time-series-based vocal range feature data within a preset time, obtaining an outlier detection value for each vocal range feature;
presetting an outlier detection threshold, and when the outlier detection value of a vocal range feature is greater than the outlier detection threshold, removing that vocal range feature from the time-series-based vocal range feature data to obtain the filtered time-series-based vocal range feature data;
inputting the filtered time-series-based vocal range feature data into the user vocal range preference recognition model for encoding and learning, obtaining a trained user vocal range preference recognition model;
and obtaining the vocal range information the user is good at and the vocal range information the user is not good at through the trained user vocal range preference recognition model, and updating both periodically.
A third aspect of the invention provides a computer-readable storage medium containing a song recommendation method program of an intelligent karaoke system which, when executed by a processor, implements the steps of any one of the song recommendation methods described above.
The invention remedies the defects described in the background and has the following beneficial effects:
The invention acquires singing-training voice data of the user and generates from it the vocal range information the user is good at and the vocal range information the user is not good at. It constructs time-series-based vocal range feature data from both kinds of vocal range information and updates them by learning that feature data. It then performs an initial search according to the vocal range information the user is good at to obtain initially retrieved song data, and performs secondary processing on that data based on the vocal range information the user is not good at. Finally, it obtains the user's song demand data and generates the song recommendation information of the intelligent karaoke system from the song demand data and the secondarily processed song data. The invention analyzes the user's vocal range from singing-training voice data and monitors how that vocal range changes within a preset time, so that the recommended songs are better suited to the user's singing. This personalizes the service for the user, improves the recommendation precision of the intelligent karaoke system, and improves the user's singing experience. In addition, by configuring the singing training scene, the invention improves the accuracy with which the user's singing vocal range is acquired.
Drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 shows an overall flow chart of the song recommendation method of the intelligent karaoke system;
FIG. 2 shows a first flow chart of the song recommendation method of the intelligent karaoke system;
FIG. 3 shows a second flow chart of the song recommendation method of the intelligent karaoke system;
FIG. 4 shows a system block diagram of the song recommendation system of the intelligent karaoke system.
Detailed Description
So that the above objects, features and advantages of the invention can be understood more clearly, the invention is described in further detail below with reference to the drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other.
Numerous specific details are set forth in the following description to provide a thorough understanding of the invention. The invention may, however, be practiced in ways other than those described here, so its scope is not limited to the specific embodiments disclosed below.
As shown in FIG. 1, a first aspect of the invention provides a song recommendation method for an intelligent karaoke system, comprising the following steps:
S102, acquiring singing-training voice data of a user, and generating, based on that data, vocal range information the user is good at and vocal range information the user is not good at;
S104, constructing time-series-based vocal range feature data from the vocal range information the user is good at and the vocal range information the user is not good at, and updating both kinds of vocal range information by learning the time-series-based vocal range feature data;
S106, performing an initial search according to the vocal range information the user is good at to obtain initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information the user is not good at to obtain secondarily processed song data;
S108, obtaining song demand data of the user, and generating song recommendation information of the intelligent karaoke system according to the song demand data and the secondarily processed song data.
The user's vocal range is analyzed from the singing-training voice data, and the change in the user's vocal range within a preset time is monitored, so that the recommended songs are better suited to the user's singing. This personalizes the service for the user, improves the recommendation precision of the intelligent karaoke system, and improves the user's singing experience. In addition, by configuring the singing training scene, the invention improves the accuracy with which the user's singing vocal range is acquired.
As shown in FIG. 2, further, in the method, acquiring the singing-training voice data of the user and generating, based on that data, the vocal range information the user is good at and the vocal range information the user is not good at specifically comprises:
S202, acquiring singing-training song information of the user, decomposing it into voice data of a plurality of sub-segments, and performing vocal range analysis on the voice data of the sub-segments to obtain first vocal range analysis data for each sub-segment;
S204, configuring a singing training scene, acquiring the singing-training voice data of the user according to the singing-training song information and the singing training scene, and analyzing the singing-training voice data to generate second vocal range analysis data for a plurality of sub-segments;
S206, presetting a deviation rate threshold, comparing the first vocal range analysis data with the second vocal range analysis data to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
S208, when the deviation rate is greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information the user is good at.
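The threshold comparison in S206 and S208 can be sketched as below. The patent does not define how the deviation rate is computed, so the relative-difference formula here is an assumption, as are the names `deviation_rate` and `classify_ranges` and the choice of one representative pitch value (in Hz) per song progress position.

```python
def deviation_rate(reference_value, sung_value):
    """Relative deviation between reference and sung pitch (assumed formula)."""
    return abs(sung_value - reference_value) / abs(reference_value)

def classify_ranges(first_analysis, second_analysis, threshold):
    """Both arguments map a song progress position to a representative pitch.
    Returns (good_at, not_good_at) lists of positions."""
    good_at, not_good_at = [], []
    for position, reference_value in first_analysis.items():
        if position not in second_analysis:
            continue  # nothing was sung at this position
        rate = deviation_rate(reference_value, second_analysis[position])
        # Strictly greater than the threshold means "not good at" (S208).
        if rate > threshold:
            not_good_at.append(position)
        else:
            good_at.append(position)
    return good_at, not_good_at

# Hypothetical analysis data for three song positions.
first = {0: 220.0, 1: 330.0, 2: 440.0}    # reference song
second = {0: 222.0, 1: 331.0, 2: 400.0}   # user's recording
good, weak = classify_ranges(first, second, threshold=0.05)
print(good, weak)
```

In this example only position 2 deviates by more than 5 percent, so it is the one classified as a range the user is not good at.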
It should be noted that the vocal range information the user is good at and the vocal range information the user is not good at can be obtained by the analysis above. Configuring the singing training scene specifically comprises the following steps:
acquiring, through big data, the acquisition accuracy of vocal range data under each environmental factor, constructing a vocal range acquisition accuracy knowledge graph, and inputting the acquisition accuracy information for each environmental factor into the knowledge graph;
acquiring the environmental factors of the current singing training venue, inputting them into the vocal range acquisition accuracy knowledge graph, and obtaining the related vocal range data acquisition accuracy information;
judging whether the related vocal range data acquisition accuracy information is greater than a preset vocal range data acquisition accuracy threshold;
and if it is not greater than the preset threshold, acquiring the optimal environmental factors through big data, and adjusting the environmental factors of the current singing training venue according to the optimal environmental factors to generate the singing training scene.
It should be noted that the acquisition of vocal range data is affected by environmental factors. Ambient noise, for example, can interfere with the processing of the voice data and cause anomalies in the vocal range data identified from the audio, which lowers the acquisition accuracy and makes the acquired vocal range data differ from the user's real vocal range. Configuring the singing training scene reduces this deviation.
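The scene configuration logic can be illustrated with a small sketch in which the knowledge graph is reduced to a lookup table from an environment label to an acquisition accuracy. That simplification, the function name `configure_training_scene`, the environment labels and the treatment of an unknown environment as accuracy 0.0 are all assumptions made for illustration.

```python
def configure_training_scene(accuracy_by_environment, current_environment, threshold):
    """Return the environment to use for the singing training scene.

    accuracy_by_environment stands in for the knowledge graph: it maps an
    environment label to the vocal range acquisition accuracy observed there.
    """
    # Assumption: an environment missing from the table is treated as accuracy 0.0.
    accuracy = accuracy_by_environment.get(current_environment, 0.0)
    if accuracy > threshold:
        return current_environment  # current venue is already good enough
    # Otherwise switch to the environment with the best known accuracy.
    return max(accuracy_by_environment, key=accuracy_by_environment.get)

# Hypothetical accuracy figures.
accuracy = {"quiet_room": 0.95, "open_hall": 0.80, "noisy_bar": 0.55}
print(configure_training_scene(accuracy, "noisy_bar", threshold=0.9))
```

A real system would adjust individual factors such as noise level or reverberation rather than swap a whole environment label, but the threshold test is the same.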
Further, in the method, constructing the time-series-based vocal range feature data from the vocal range information the user is good at and the vocal range information the user is not good at comprises:
constructing time stamps, introducing a graph neural network, and initializing a first graph node and a second graph node, the first graph node representing the category the user is good at and the second graph node representing the category the user is not good at;
constructing a plurality of third graph nodes from the vocal range information the user is good at and the vocal range information the user is not good at, constructing a directed edge description, pointing the third graph nodes that correspond to vocal range information the user is good at toward the first graph node to construct a first topological structure graph, and fusing the time stamp with the first topological structure graph at each moment to generate time-series-based vocal range feature data the user is good at;
according to the directed edge description, pointing the third graph nodes that correspond to vocal range information the user is not good at toward the second graph node to construct a second topological structure graph, and fusing the time stamp with the second topological structure graph at each moment to generate time-series-based vocal range feature data the user is not good at;
and generating the time-series-based vocal range feature data from the time-series-based vocal range feature data the user is good at and the time-series-based vocal range feature data the user is not good at.
It should be noted that the user's vocal cords may be damaged or may recover, which can affect the vocal range suited to the user. With this method, the user's vocal range feature data within a preset time can be obtained.
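The graph construction can be sketched with plain data structures. The sketch only builds the time-stamped directed edges; it does not include a graph neural network, and the node names, the range labels such as "C3-G3" and the functions `build_topology` and `build_feature_series` are hypothetical.

```python
GOOD_NODE = "good_at"          # first graph node (category the user is good at)
NOT_GOOD_NODE = "not_good_at"  # second graph node (category the user is not good at)

def build_topology(timestamp, range_labels, category_node):
    """Directed edges from each vocal range node (a 'third graph node')
    to one category node, fused with a time stamp."""
    return {"timestamp": timestamp,
            "edges": [(label, category_node) for label in range_labels]}

def build_feature_series(observations):
    """observations: iterable of (timestamp, good_ranges, weak_ranges).
    Returns one entry per time stamp, ordered by time."""
    series = []
    for timestamp, good, weak in sorted(observations, key=lambda o: o[0]):
        series.append({
            "timestamp": timestamp,
            "good_at": build_topology(timestamp, good, GOOD_NODE),
            "not_good_at": build_topology(timestamp, weak, NOT_GOOD_NODE),
        })
    return series

# Hypothetical observations from two training sessions (time stamps 1 and 2).
series = build_feature_series([
    (2, ["C3-G3", "G3-C4"], ["A4-C5"]),
    (1, ["C3-G3"], ["G3-C4", "A4-C5"]),
])
print(series[0]["good_at"]["edges"], series[1]["good_at"]["edges"])
```

Comparing consecutive entries shows how a range can move between the two categories over time, which is the change the later learning step is meant to track.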
As shown in FIG. 3, further, in the method, updating the vocal range information the user is good at and the vocal range information the user is not good at by learning the time-series-based vocal range feature data specifically comprises:
S302, constructing a user vocal range preference recognition model based on an LSTM, incorporating a local outlier detection algorithm, and using the local outlier detection algorithm to perform outlier detection on the time-series-based vocal range feature data within a preset time, obtaining an outlier detection value for each vocal range feature;
S304, presetting an outlier detection threshold, and when the outlier detection value of a vocal range feature is greater than the outlier detection threshold, removing that vocal range feature from the time-series-based vocal range feature data to obtain the filtered time-series-based vocal range feature data;
S306, inputting the filtered time-series-based vocal range feature data into the user vocal range preference recognition model for encoding and learning, obtaining a trained user vocal range preference recognition model;
S308, obtaining the vocal range information the user is good at and the vocal range information the user is not good at through the trained user vocal range preference recognition model, and updating both periodically.
Over a long time series, the user's vocal cords may be damaged, may recover, or may show pathological features. The local outlier detection algorithm removes vocal range feature data that is not representative of the user from the long time series, so that the vocal range information the user is good at and the vocal range information the user is not good at can be adjusted stage by stage according to the user's vocal range features, which improves the recommendation precision of the intelligent karaoke system.
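The outlier removal in S302 and S304 can be sketched with a small pure-Python local outlier factor (LOF). The patent names only "a local outlier detection algorithm", so treating it as LOF is an assumption, as are the feature layout (lowest and highest MIDI note per session), the neighbourhood size `k` and the threshold value. The LSTM training step is not shown.

```python
import math

def lof_scores(points, k):
    """Local outlier factor of each point; requires len(points) > k."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])              # k nearest, ties cut arbitrarily
        k_distance.append(dist[i][order[k - 1]])  # distance to the k-th neighbour
    lrd = []                                      # local reachability density
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(1.0 / max(sum(reach) / k, 1e-12))
    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

def remove_outliers(features, k, threshold):
    """Drop every feature whose outlier detection value exceeds the threshold."""
    scores = lof_scores(features, k)
    return [f for f, s in zip(features, scores) if s <= threshold]

# Hypothetical (lowest, highest) MIDI notes per session; the last one is atypical.
sessions = [(48.0, 67.0), (48.5, 67.0), (47.5, 66.5),
            (48.0, 66.0), (49.0, 67.5), (40.0, 80.0)]
print(remove_outliers(sessions, k=3, threshold=1.5))
```

Scores close to 1 indicate a point about as dense as its neighbours, while a much larger score marks an isolated point, which is why a threshold somewhat above 1 is a common starting choice.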
Further, in the method, performing an initial search according to the voice domain information that the user is good at to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information that the user is not good at to obtain song data after the secondary processing, specifically includes:
performing an initial search according to the voice domain information that the user is good at to obtain the initially retrieved song data, introducing a decision tree model, and constructing a splitting criterion according to the voice domain information that the user is not good at;
taking the initially retrieved song data as a root node, and splitting the root node according to the splitting criterion to generate new leaf nodes;
when song data corresponding to the voice domain information that the user is not good at no longer appears in the song data in the new leaf nodes, ending the splitting, generating the final leaf nodes, and acquiring, through the final leaf nodes, the song data set corresponding to the voice domain information that the user is good at;
and generating the song data after the secondary processing according to the song data set corresponding to the voice domain information that the user is good at, and outputting the song data after the secondary processing.
This method further improves the precision of recommending songs suited to the user.
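The two-stage retrieval above can be sketched as follows. This is a simplified stand-in under stated assumptions, not the decision tree model of the invention: each split is a fixed binary test ("does the song touch this band?") rather than a learned criterion, voice domain information is represented as hypothetical `(low, high)` bands of MIDI note numbers, and the names `Song`, `initial_search`, and `split_until_clean` are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Song:
    title: str
    low: int   # lowest melody note (MIDI number)
    high: int  # highest melody note (MIDI number)

def overlaps(song, band):
    """True if the song's melody range intersects the (low, high) band."""
    lo, hi = band
    return song.low <= hi and song.high >= lo

def initial_search(catalog, good_band):
    """Primary search: songs whose range touches the band the user sings well."""
    return [s for s in catalog if overlaps(s, good_band)]

def split_until_clean(songs, bad_bands):
    """Secondary processing: split once per 'not good at' band, keep the clean branch."""
    node = list(songs)  # root node
    for band in bad_bands:
        # Binary split: the branch that touches the band is discarded.
        node = [s for s in node if not overlaps(s, band)]
    return node  # final leaf: no song touches any 'not good at' band

catalog = [
    Song("A", 57, 65),
    Song("B", 60, 70),
    Song("C", 45, 60),
    Song("D", 70, 80),
]
primary = initial_search(catalog, good_band=(55, 67))
final = split_until_clean(primary, bad_bands=[(68, 72), (40, 47)])
print([s.title for s in final])
```

Splitting stops when every "not good at" band has been tested, which matches the stopping condition described above: no song in the final leaf touches such a band.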
Further, in the method, acquiring the song demand data of the user, and generating the song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after the secondary processing, specifically includes:
acquiring the song demand data of the user, extracting features from the song demand data of the user to obtain the song feature data required by the user, and constructing a feature sequence according to the song feature data required by the user;
acquiring song description feature information of the song data after the secondary processing, and when the song description feature information matches the song feature data in the feature sequence, taking the corresponding song data after the secondary processing as a song recommendation item;
acquiring historical singing evaluation information of the song recommendation item, and acquiring singing evaluation difficulty information for each segment of the song according to the historical singing evaluation information of the song recommendation item;
when the singing evaluation difficulty information is greater than the preset singing evaluation difficulty, generating the song recommendation information of the intelligent K song system from the corresponding singing segments, and displaying the song recommendation information in a preset manner.
It should be noted that the song feature data required by the user includes, but is not limited to, classification labels of songs (cheerful, calm, sentimental, etc.) and music genres (such as pop music and rock music). The method can generate the song recommendation information of the intelligent K song system from the segments of a song whose singing evaluation difficulty information is greater than the preset singing evaluation difficulty, so as to prompt the user. The singing evaluation difficulty information includes easy, moderately difficult, highly difficult, and so on, and the user can set the preset singing evaluation difficulty.
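The matching and difficulty-prompt steps can be sketched as follows. The data layout, the subset test used for "matches the feature sequence", the three-level difficulty scale, and the names `DIFFICULTY` and `recommend` are all assumptions chosen for illustration; the source does not specify them.

```python
# Hypothetical ordinal scale for singing evaluation difficulty.
DIFFICULTY = {"easy": 1, "moderate": 2, "hard": 3}

def recommend(candidates, wanted_features, preset_difficulty="moderate"):
    """Return matching songs together with the segments to warn the user about."""
    limit = DIFFICULTY[preset_difficulty]
    results = []
    for song in candidates:
        # A song matches when it carries every feature the user asked for.
        if not wanted_features <= song["features"]:
            continue
        # Segments rated above the preset difficulty are flagged as prompts.
        flagged = [
            name
            for name, level in song["segment_difficulty"].items()
            if DIFFICULTY[level] > limit
        ]
        results.append({"title": song["title"], "hard_segments": flagged})
    return results

candidates = [
    {
        "title": "Song A",
        "features": {"cheerful", "pop"},
        "segment_difficulty": {"verse": "easy", "chorus": "hard"},
    },
    {
        "title": "Song B",
        "features": {"calm", "rock"},
        "segment_difficulty": {"verse": "moderate", "chorus": "moderate"},
    },
]
print(recommend(candidates, {"cheerful", "pop"}))
```

Here the per-segment difficulty labels stand in for values that would be derived from historical singing evaluations.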
In addition, the method can further comprise the following steps:
acquiring the song data corresponding to the voice domain information that the user is good at in the leaf nodes, constructing a covariance matrix from that song data, and introducing a singular value decomposition algorithm; decomposing the covariance matrix through the singular value decomposition algorithm to generate a feature matrix composed of feature vectors and an orthogonal matrix, and calculating the included angles formed between the feature vectors in the feature matrix; judging whether each included angle value is greater than a preset included angle value, and when it is, collecting the feature vector comparison groups whose included angle value is greater than the preset included angle value so as to obtain the abnormal feature vectors in those comparison groups; and acquiring the leaf node where each abnormal feature vector is located, re-splitting that leaf node until the included angle value is no longer greater than the preset included angle value, and outputting the new leaf nodes after splitting.
It should be noted that the singular value decomposition algorithm is introduced to decompose the sample data in the leaf nodes, which reduces computational complexity. The included angles formed between the feature vectors in the feature matrix are then calculated; an included angle value greater than the preset included angle value indicates a classification error. The method thereby further improves the classification precision of songs, and in turn the song recommendation precision of the intelligent K song system.
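The angle check described above can be sketched as follows. The singular value decomposition itself is not shown; the feature vectors are assumed to have been extracted already and to be non-zero. The names `angle_deg` and `pairs_over_angle`, the sample vectors, and the 45 degree preset are assumptions for illustration, and how the abnormal vector is chosen within a flagged pair is left open because the source does not specify it.

```python
import math
from itertools import combinations

def angle_deg(u, v):
    """Included angle between two non-zero vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos))

def pairs_over_angle(vectors, max_angle_deg):
    """Index pairs (comparison groups) whose included angle exceeds the preset value."""
    return [
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(vectors), 2)
        if angle_deg(u, v) > max_angle_deg
    ]

# Hypothetical feature vectors of songs in one leaf node.
vectors = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0)]
print(pairs_over_angle(vectors, max_angle_deg=45.0))
```

In this sample the third vector forms a large angle with both of the others, so it appears in every flagged pair and would be the candidate for re-splitting.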
As shown in Fig. 4, the second aspect of the present invention provides a song recommendation system 4 of an intelligent K song system. The system 4 includes a memory 41 and a processor 42; the memory 41 includes a song recommendation method program of the intelligent K song system, and when the program is executed by the processor 42, the following steps are implemented:
acquiring singing training voice data information of a user, and generating voice domain information that the user is good at and voice domain information that the user is not good at based on the singing training voice data information of the user;
constructing time-series-based voice domain feature data according to the voice domain information that the user is good at and the voice domain information that the user is not good at, and updating the voice domain information that the user is good at and the voice domain information that the user is not good at by learning the time-series-based voice domain feature data;
performing an initial search according to the voice domain information that the user is good at to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information that the user is not good at to obtain song data after the secondary processing;
and acquiring song demand data of the user, and generating song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after the secondary processing.
Further, in the present system, acquiring the singing training voice data information of the user, and generating the voice domain information that the user is good at and the voice domain information that the user is not good at based on the singing training voice data information of the user, specifically includes:
acquiring singing training song information of the user, decomposing the singing training song information of the user to generate voice data of a plurality of sub-segments, and performing voice domain analysis on the voice data of the sub-segments to acquire first voice domain analysis data of each sub-segment;
configuring a singing training scene, acquiring the singing training voice data information of the user according to the singing training song information of the user and the singing training scene, and analyzing the singing training voice data information of the user to generate second voice domain analysis data of a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first voice domain analysis data with the second voice domain analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
and when the deviation rate is greater than the deviation rate threshold, taking the corresponding voice domain information as voice domain information that the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding voice domain information as voice domain information that the user is good at.
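The comparison step above can be sketched as follows. The source does not define how the deviation rate is computed, so this sketch assumes a relative difference between a reference pitch and the sung pitch (both in Hz) at the same song progress position; the names `deviation_rate` and `classify_segments`, the 5% threshold, and the sample values are hypothetical.

```python
def deviation_rate(reference_hz, sung_hz):
    """Relative deviation of the sung pitch from the reference pitch (assumed formula)."""
    return abs(sung_hz - reference_hz) / reference_hz

def classify_segments(reference, sung, threshold=0.05):
    """Split sub-segments into 'good at' and 'not good at' by deviation rate.

    reference and sung map a song progress position to a pitch in Hz.
    """
    good, poor = [], []
    for position, ref_hz in reference.items():
        if position not in sung:
            continue  # no sung data at this position, nothing to compare
        rate = deviation_rate(ref_hz, sung[position])
        (poor if rate > threshold else good).append((position, ref_hz))
    return good, poor

# First voice domain analysis data (from the song) and second (from the user).
reference = {0: 220.0, 1: 440.0, 2: 880.0}
sung = {0: 222.0, 1: 441.0, 2: 780.0}
good, poor = classify_segments(reference, sung)
print(good, poor)
```

With these sample values the user tracks the two lower positions closely and misses the highest one, so only the highest pitch is recorded as voice domain information that the user is not good at.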
Further, in the present system, updating the voice domain information that the user is good at and the voice domain information that the user is not good at by learning the time-series-based voice domain feature data specifically includes:
constructing a user voice domain preference identification model based on LSTM and incorporating a local outlier detection algorithm, and performing outlier detection on the time-series-based voice domain feature data within a preset time through the local outlier detection algorithm to obtain an outlier detection value for each voice domain feature;
presetting an outlier detection threshold, and when the outlier detection value of a voice domain feature is greater than the outlier detection threshold, removing that voice domain feature from the time-series-based voice domain feature data to obtain the filtered time-series-based voice domain feature data;
inputting the filtered time-series-based voice domain feature data into the user voice domain preference identification model for encoding and learning to obtain the trained user voice domain preference identification model;
and acquiring the voice domain information that the user is good at and the voice domain information that the user is not good at through the trained user voice domain preference identification model, and periodically updating the voice domain information that the user is good at and the voice domain information that the user is not good at.
A third aspect of the present invention provides a computer readable storage medium including a song recommendation method program for an intelligent K-song system therein, which when executed by a processor, implements the steps of the song recommendation method for an intelligent K-song system of any one of the above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by hardware instructed by program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Alternatively, the above-described integrated units of the present invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods of the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.
The foregoing is merely illustrative embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (6)

1. A song recommendation method of an intelligent K song system, characterized by comprising the following steps:
acquiring singing training voice data information of a user, and generating voice domain information that the user is good at and voice domain information that the user is not good at based on the singing training voice data information of the user;
constructing time-series-based voice domain feature data according to the voice domain information that the user is good at and the voice domain information that the user is not good at, and updating the voice domain information that the user is good at and the voice domain information that the user is not good at by learning the time-series-based voice domain feature data;
performing an initial search according to the voice domain information that the user is good at to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information that the user is not good at to obtain song data after the secondary processing;
acquiring song demand data of the user, and generating song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after the secondary processing;
wherein acquiring the singing training voice data information of the user, and generating the voice domain information that the user is good at and the voice domain information that the user is not good at based on the singing training voice data information of the user, specifically comprises:
acquiring singing training song information of the user, decomposing the singing training song information of the user to generate voice data of a plurality of sub-segments, and performing voice domain analysis on the voice data of the sub-segments to acquire first voice domain analysis data of each sub-segment;
configuring a singing training scene, acquiring the singing training voice data information of the user according to the singing training song information of the user and the singing training scene, and analyzing the singing training voice data information of the user to generate second voice domain analysis data of a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first voice domain analysis data with the second voice domain analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
when the deviation rate is greater than the deviation rate threshold, taking the corresponding voice domain information as voice domain information that the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding voice domain information as voice domain information that the user is good at;
wherein configuring the singing training scene specifically includes:
acquiring voice domain data acquisition accuracy information under each environmental factor through big data, constructing a voice domain acquisition accuracy knowledge graph, and inputting the voice domain data acquisition accuracy information under each environmental factor into the voice domain acquisition accuracy knowledge graph;
acquiring the environmental factors of the current singing training venue, inputting the environmental factors of the current singing training venue into the voice domain acquisition accuracy knowledge graph, and acquiring the related voice domain data acquisition accuracy information;
judging whether the related voice domain data acquisition accuracy information is greater than a preset voice domain data acquisition accuracy threshold;
if the related voice domain data acquisition accuracy information is not greater than the preset voice domain data acquisition accuracy threshold, acquiring the optimal environmental factors through big data, and adjusting the environmental factors of the current singing training venue according to the optimal environmental factors to generate the singing training scene;
wherein updating the voice domain information that the user is good at and the voice domain information that the user is not good at by learning the time-series-based voice domain feature data specifically comprises:
constructing a user voice domain preference identification model based on LSTM and incorporating a local outlier detection algorithm, and performing outlier detection on the time-series-based voice domain feature data within a preset time through the local outlier detection algorithm to obtain an outlier detection value for each voice domain feature;
presetting an outlier detection threshold, and when the outlier detection value of a voice domain feature is greater than the outlier detection threshold, removing that voice domain feature from the time-series-based voice domain feature data to obtain the filtered time-series-based voice domain feature data;
inputting the filtered time-series-based voice domain feature data into the user voice domain preference identification model for encoding and learning to obtain the trained user voice domain preference identification model;
acquiring the voice domain information that the user is good at and the voice domain information that the user is not good at through the trained user voice domain preference identification model, and periodically updating the voice domain information that the user is good at and the voice domain information that the user is not good at.
2. The song recommendation method of an intelligent K song system according to claim 1, wherein constructing the time-series-based voice domain feature data according to the voice domain information that the user is good at and the voice domain information that the user is not good at comprises the following steps:
constructing a time stamp, introducing a graph neural network, initializing a first graph node and a second graph node, taking the first graph node as the graph node of the category the user is good at, and taking the second graph node as the graph node of the category the user is not good at;
constructing a plurality of third graph nodes from the voice domain information that the user is good at and the voice domain information that the user is not good at, constructing a directed edge description, pointing the third graph nodes corresponding to the voice domain information that the user is good at to the first graph node to construct a first topological structure diagram, and fusing the time stamp with the first topological structure diagram at each moment to generate time-series-based voice domain feature data that the user is good at;
according to the directed edge description, pointing the third graph nodes corresponding to the voice domain information that the user is not good at to the second graph node to construct a second topological structure diagram, and fusing the time stamp with the second topological structure diagram at each moment to generate time-series-based voice domain feature data that the user is not good at;
and generating the time-series-based voice domain feature data from the time-series-based voice domain feature data that the user is good at and the time-series-based voice domain feature data that the user is not good at.
3. The song recommendation method of an intelligent K song system according to claim 1, wherein performing the initial search according to the voice domain information that the user is good at to obtain the song data after the primary search, and performing the secondary processing on the song data after the primary search based on the voice domain information that the user is not good at to obtain the song data after the secondary processing, specifically comprises:
performing an initial search according to the voice domain information that the user is good at to obtain the initially retrieved song data, introducing a decision tree model, and constructing a splitting criterion according to the voice domain information that the user is not good at;
taking the initially retrieved song data as a root node, and splitting the root node according to the splitting criterion to generate new leaf nodes;
when song data corresponding to the voice domain information that the user is not good at no longer appears in the song data in the new leaf nodes, ending the splitting, generating the final leaf nodes, and acquiring, through the final leaf nodes, the song data set corresponding to the voice domain information that the user is good at;
and generating the song data after the secondary processing according to the song data set corresponding to the voice domain information that the user is good at, and outputting the song data after the secondary processing.
4. The song recommendation method of the intelligent K song system according to claim 1, wherein acquiring the song demand data of the user, and generating the song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after the secondary processing, specifically comprises:
acquiring the song demand data of the user, extracting features from the song demand data of the user to obtain the song feature data required by the user, and constructing a feature sequence according to the song feature data required by the user;
acquiring song description feature information of the song data after the secondary processing, and when the song description feature information matches the song feature data in the feature sequence, taking the corresponding song data after the secondary processing as a song recommendation item;
acquiring historical singing evaluation information of the song recommendation item, and acquiring singing evaluation difficulty information for each segment of the song according to the historical singing evaluation information of the song recommendation item;
when the singing evaluation difficulty information is greater than the preset singing evaluation difficulty, generating the song recommendation information of the intelligent K song system from the corresponding singing segments, and displaying the song recommendation information in a preset manner.
5. A song recommendation system of an intelligent K song system, characterized by comprising a memory and a processor, wherein the memory includes a song recommendation method program of the intelligent K song system, and when the song recommendation method program of the intelligent K song system is executed by the processor, the following steps are implemented:
acquiring singing training voice data information of a user, and generating voice domain information that the user is good at and voice domain information that the user is not good at based on the singing training voice data information of the user;
constructing time-series-based voice domain feature data according to the voice domain information that the user is good at and the voice domain information that the user is not good at, and updating the voice domain information that the user is good at and the voice domain information that the user is not good at by learning the time-series-based voice domain feature data;
performing an initial search according to the voice domain information that the user is good at to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information that the user is not good at to obtain song data after the secondary processing;
acquiring song demand data of the user, and generating song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after the secondary processing;
wherein acquiring the singing training voice data information of the user, and generating the voice domain information that the user is good at and the voice domain information that the user is not good at based on the singing training voice data information of the user, specifically comprises:
acquiring singing training song information of the user, decomposing the singing training song information of the user to generate voice data of a plurality of sub-segments, and performing voice domain analysis on the voice data of the sub-segments to acquire first voice domain analysis data of each sub-segment;
configuring a singing training scene, acquiring the singing training voice data information of the user according to the singing training song information of the user and the singing training scene, and analyzing the singing training voice data information of the user to generate second voice domain analysis data of a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first voice domain analysis data with the second voice domain analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
when the deviation rate is greater than the deviation rate threshold, taking the corresponding voice domain information as voice domain information that the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding voice domain information as voice domain information that the user is good at;
wherein configuring the singing training scene specifically includes:
acquiring voice domain data acquisition accuracy information under each environmental factor through big data, constructing a voice domain acquisition accuracy knowledge graph, and inputting the voice domain data acquisition accuracy information under each environmental factor into the voice domain acquisition accuracy knowledge graph;
acquiring the environmental factors of the current singing training venue, inputting the environmental factors of the current singing training venue into the voice domain acquisition accuracy knowledge graph, and acquiring the related voice domain data acquisition accuracy information;
judging whether the related voice domain data acquisition accuracy information is greater than a preset voice domain data acquisition accuracy threshold;
if the related voice domain data acquisition accuracy information is not greater than the preset voice domain data acquisition accuracy threshold, acquiring the optimal environmental factors through big data, and adjusting the environmental factors of the current singing training venue according to the optimal environmental factors to generate the singing training scene;
wherein updating the voice domain information that the user is good at and the voice domain information that the user is not good at by learning the time-series-based voice domain feature data specifically comprises:
constructing a user voice domain preference identification model based on LSTM and incorporating a local outlier detection algorithm, and performing outlier detection on the time-series-based voice domain feature data within a preset time through the local outlier detection algorithm to obtain an outlier detection value for each voice domain feature;
presetting an outlier detection threshold, and when the outlier detection value of a voice domain feature is greater than the outlier detection threshold, removing that voice domain feature from the time-series-based voice domain feature data to obtain the filtered time-series-based voice domain feature data;
inputting the filtered time-series-based voice domain feature data into the user voice domain preference identification model for encoding and learning to obtain the trained user voice domain preference identification model;
acquiring the voice domain information that the user is good at and the voice domain information that the user is not good at through the trained user voice domain preference identification model, and periodically updating the voice domain information that the user is good at and the voice domain information that the user is not good at.
6. A computer readable storage medium, characterized in that the computer readable storage medium comprises a song recommendation method program of an intelligent K song system, which when executed by a processor, implements the steps of the song recommendation method of the intelligent K song system as claimed in any one of claims 1-4.
CN202311498913.6A 2023-11-13 2023-11-13 Song recommendation method, system and storage medium of intelligent K song system Active CN117235300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311498913.6A CN117235300B (en) 2023-11-13 2023-11-13 Song recommendation method, system and storage medium of intelligent K song system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311498913.6A CN117235300B (en) 2023-11-13 2023-11-13 Song recommendation method, system and storage medium of intelligent K song system

Publications (2)

Publication Number Publication Date
CN117235300A CN117235300A (en) 2023-12-15
CN117235300B true CN117235300B (en) 2024-03-15

Family

ID=89098661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311498913.6A Active CN117235300B (en) 2023-11-13 2023-11-13 Song recommendation method, system and storage medium of intelligent K song system

Country Status (1)

Country Link
CN (1) CN117235300B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104715760A (en) * 2015-02-13 2015-06-17 朱威 KTV song matching analyzing method and system
KR20180043925A (en) * 2016-10-21 2018-05-02 정문일 Singing evaluation system, singing evaluation server and method thereof
JP2018205514A (en) * 2017-06-02 2018-12-27 株式会社第一興商 Recommendation system for karaoke songs
CN109710797A (en) * 2018-11-14 2019-05-03 腾讯科技(深圳)有限公司 Method for pushing, device, electronic device and the storage medium of audio file
CN115470371A (en) * 2022-08-15 2022-12-13 苏坤 Song recommendation method and device based on user voice characteristics and terminal

Also Published As

Publication number Publication date
CN117235300A (en) 2023-12-15

Similar Documents

Publication Publication Date Title
CN109800407B (en) Intention recognition method and device, computer equipment and storage medium
CN111916111B (en) Intelligent voice outbound method and device with emotion, server and storage medium
Dikmen et al. Sound event detection using non-negative dictionaries learned from annotated overlapping events
Panteli et al. Towards the characterization of singing styles in world music
BR112016020457B1 (en) METHOD OF SEARCHING AUDIO FINGERPRINTS STORED IN A DATABASE WITHIN AN AUDIO FINGERPRINT DETECTION SYSTEM
Somervuo et al. Bird song recognition based on syllable pair histograms
CN111428028A (en) Information classification method based on deep learning and related equipment
CN108549675B (en) Piano teaching method based on big data and neural network
CN110164417B (en) Language vector obtaining and language identification method and related device
CN107993636B (en) Recursive neural network-based music score modeling and generating method
CN111462761A (en) Voiceprint data generation method and device, computer device and storage medium
Benetos et al. Characterisation of acoustic scenes using a temporally-constrained shift-invariant model
Birla A robust unsupervised pattern discovery and clustering of speech signals
Sobieraj et al. Masked non-negative matrix factorization for bird detection using weakly labeled data
CN117235300B (en) Song recommendation method, system and storage medium of intelligent K song system
CN106503181B (en) Audio data processing method and device
Pikrakis et al. Unsupervised singing voice detection using dictionary learning
CN116578700A (en) Log classification method, log classification device, equipment and medium
CN111477212A (en) Content recognition, model training and data processing method, system and equipment
CN107133344B (en) Data processing method and device
CN114818651A (en) Text similarity determination method and device, storage medium and electronic device
CN115859191A (en) Fault diagnosis method and device, computer readable storage medium and computer equipment
CN115357720A (en) Multi-task news classification method and device based on BERT
CN114974221A (en) Speech recognition model training method and device and computer readable storage medium
CN112766368A (en) Data classification method, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant