CN117235300B - Song recommendation method, system and storage medium of intelligent K song system - Google Patents
Song recommendation method, system and storage medium of intelligent K song system
- Publication number
- CN117235300B (application number CN202311498913.6A)
- Authority
- CN
- China
- Prior art keywords
- user
- information
- song
- data
- voice domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Reverberation, Karaoke And Other Acoustics (AREA)
Abstract
The invention relates to a song recommendation method, system and storage medium for an intelligent K song system, and belongs to the technical field of song recommendation. The invention analyzes the user's vocal range from singing training voice data and monitors how that range changes within a preset time, so that the recommended songs are better suited to the user's singing. This personalizes recommendations for the user, improves the recommendation precision of the intelligent K song system, and improves the user's singing experience. In addition, by configuring a suitable singing training scene, the invention improves the accuracy with which the user's vocal range is acquired.
Description
Technical Field
The invention relates to the technical field of song recommendation, in particular to a song recommendation method, a song recommendation system and a storage medium of an intelligent K song system.
Background
In singing scenarios, what matters most to a singer is being able to perform the song properly. Surveys show that the main reason most singers cannot perform a song correctly is that the chosen track is pitched too high or too low for their ability, so that they can neither reach the high notes nor get down to the low ones. A music recommendation method is therefore needed that helps users choose songs that fit their vocal range and so avoid this situation. Existing music recommendation methods, however, do not solve this problem. Moreover, deviations arise when the user's vocal range is acquired, which lowers the recommendation accuracy of the intelligent K song system.
Disclosure of Invention
The invention overcomes the defects of the prior art and provides a song recommendation method, system and storage medium for an intelligent K song system.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A first aspect of the invention provides a song recommendation method of an intelligent K song system, comprising the following steps:
acquiring singing training voice data information of a user, and generating, based on it, vocal range information that the user is good at and vocal range information that the user is not good at;
constructing time-series-based vocal range feature data from the vocal range information that the user is good at and the vocal range information that the user is not good at, and updating both kinds of vocal range information by learning the time-series-based vocal range feature data;
performing an initial search according to the vocal range information that the user is good at to obtain initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information that the user is not good at to obtain secondarily processed song data;
and acquiring song demand data of the user, and generating song recommendation information of the intelligent K song system from the song demand data of the user and the secondarily processed song data.
Further, in the method, acquiring the singing training voice data information of the user and generating from it the vocal range information that the user is good at and the vocal range information that the user is not good at specifically comprises:
acquiring singing training song information of the user, decomposing it into voice data of a plurality of sub-segments, and performing vocal range analysis on the voice data of the sub-segments to obtain first vocal range analysis data for each sub-segment;
configuring a singing training scene, acquiring the singing training voice data information of the user according to the singing training song information and the singing training scene, and analyzing the singing training voice data information of the user to generate second vocal range analysis data for the plurality of sub-segments;
presetting a deviation rate threshold, comparing the first vocal range analysis data with the second vocal range analysis data at the same position in the song to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
and when the deviation rate is greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information that the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information that the user is good at.
Further, in the method, constructing the time-series-based vocal range feature data from the vocal range information that the user is good at and the vocal range information that the user is not good at comprises:
constructing time stamps, introducing a graph neural network, and initializing a first graph node and a second graph node, the first graph node representing the category the user is good at and the second graph node representing the category the user is not good at;
constructing a plurality of third graph nodes from the vocal range information that the user is good at and the vocal range information that the user is not good at, constructing directed edge descriptions, pointing the third graph nodes that correspond to vocal range information the user is good at to the first graph node to construct a first topological structure graph, and fusing the time stamp with the first topological structure graph at each moment to generate time-series-based feature data for the vocal range that the user is good at;
according to the directed edge descriptions, pointing the third graph nodes that correspond to vocal range information the user is not good at to the second graph node to construct a second topological structure graph, and fusing the time stamp with the second topological structure graph at each moment to generate time-series-based feature data for the vocal range that the user is not good at;
and generating the time-series-based vocal range feature data from the two sets of time-series-based feature data.
Further, in the method, updating the vocal range information that the user is good at and the vocal range information that the user is not good at by learning the time-series-based vocal range feature data specifically comprises:
constructing a user vocal range preference recognition model based on LSTM, incorporating a local outlier detection algorithm, and using the local outlier detection algorithm to perform outlier detection on the time-series-based vocal range feature data within a preset time, obtaining an outlier detection value for each vocal range feature;
presetting an outlier detection threshold, and removing from the time-series-based vocal range feature data every vocal range feature whose outlier detection value is greater than the outlier detection threshold, to obtain the filtered time-series-based vocal range feature data;
inputting the filtered time-series-based vocal range feature data into the user vocal range preference recognition model for encoding and learning, to obtain a trained user vocal range preference recognition model;
and acquiring the vocal range information that the user is good at and the vocal range information that the user is not good at through the trained user vocal range preference recognition model, and updating both periodically.
Further, in the method, performing the initial search according to the vocal range information that the user is good at to obtain the initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information that the user is not good at to obtain the secondarily processed song data, specifically comprises:
performing the initial search according to the vocal range information that the user is good at to obtain the initially retrieved song data, introducing a decision tree model, and constructing a splitting criterion from the vocal range information that the user is not good at;
taking the initially retrieved song data as the root node, and splitting the root node according to the splitting criterion to generate new leaf nodes;
ending the splitting when the song data in a new leaf node no longer contains songs that fall in the vocal range the user is not good at, generating a final leaf node, and obtaining from the final leaf node the song data set that corresponds to the vocal range information the user is good at;
and generating the secondarily processed song data from that song data set, and outputting the secondarily processed song data.
Further, in the method, acquiring the song demand data of the user and generating the song recommendation information of the intelligent K song system from the song demand data of the user and the secondarily processed song data specifically comprises:
acquiring the song demand data of the user, extracting features from it to obtain the song feature data the user requires, and constructing a feature sequence from that song feature data;
acquiring song description feature information of the secondarily processed song data, and taking the corresponding secondarily processed song data as a song recommendation item when its song description feature information matches the song feature data in the feature sequence;
acquiring historical singing evaluation information of the song recommendation item, and obtaining from it singing evaluation difficulty information for each section of the song;
and when the singing evaluation difficulty information of a section is greater than a preset singing evaluation difficulty, generating the song recommendation information of the intelligent K song system from the corresponding sections, and displaying the song recommendation information in a preset manner.
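The matching and difficulty steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how difficulty is derived from historical evaluations, so the sketch assumes scores in the range 0 to 1 and takes difficulty as one minus the mean score. The field names (`features`, `history`) and the function `recommend` are hypothetical, and treating "sections above the difficulty threshold" as flagged sections shown alongside the recommendation is one possible reading of the text.

```python
from statistics import mean

def section_difficulty(history):
    """Map each section to a difficulty in [0, 1].

    Assumption: difficulty = 1 - mean historical score, with scores in [0, 1].
    """
    return {section: 1 - mean(scores) for section, scores in history.items()}

def recommend(candidates, wanted_features, difficulty_threshold):
    """Keep songs whose description features cover the requested features,
    and flag the sections whose difficulty exceeds the preset threshold."""
    recommendations = []
    for song in candidates:
        if not wanted_features <= song["features"]:
            continue  # description features do not match the user's demand
        difficulty = section_difficulty(song["history"])
        flagged = [s for s, d in difficulty.items() if d > difficulty_threshold]
        recommendations.append({"title": song["title"], "flagged_sections": flagged})
    return recommendations

candidates = [
    {"title": "Song A", "features": {"pop", "mandarin"},
     "history": {"verse": [0.9, 0.8], "chorus": [0.4, 0.5]}},
    {"title": "Song B", "features": {"rock"},
     "history": {"verse": [0.7]}},
]
result = recommend(candidates, {"pop"}, difficulty_threshold=0.5)
```

Here `result` contains only "Song A", with its chorus flagged as the section that past singers found hard.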
A second aspect of the invention provides a song recommendation system of an intelligent K song system, the system comprising a memory and a processor, the memory containing a song recommendation method program of the intelligent K song system which, when executed by the processor, implements the following steps:
acquiring singing training voice data information of a user, and generating, based on it, vocal range information that the user is good at and vocal range information that the user is not good at;
constructing time-series-based vocal range feature data from the vocal range information that the user is good at and the vocal range information that the user is not good at, and updating both kinds of vocal range information by learning the time-series-based vocal range feature data;
performing an initial search according to the vocal range information that the user is good at to obtain initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information that the user is not good at to obtain secondarily processed song data;
and acquiring song demand data of the user, and generating song recommendation information of the intelligent K song system from the song demand data of the user and the secondarily processed song data.
Further, in the system, acquiring the singing training voice data information of the user and generating from it the vocal range information that the user is good at and the vocal range information that the user is not good at specifically comprises:
acquiring singing training song information of the user, decomposing it into voice data of a plurality of sub-segments, and performing vocal range analysis on the voice data of the sub-segments to obtain first vocal range analysis data for each sub-segment;
configuring a singing training scene, acquiring the singing training voice data information of the user according to the singing training song information and the singing training scene, and analyzing the singing training voice data information of the user to generate second vocal range analysis data for the plurality of sub-segments;
presetting a deviation rate threshold, comparing the first vocal range analysis data with the second vocal range analysis data at the same position in the song to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
and when the deviation rate is greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information that the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information that the user is good at.
Further, in the system, updating the vocal range information that the user is good at and the vocal range information that the user is not good at by learning the time-series-based vocal range feature data specifically comprises:
constructing a user vocal range preference recognition model based on LSTM, incorporating a local outlier detection algorithm, and using the local outlier detection algorithm to perform outlier detection on the time-series-based vocal range feature data within a preset time, obtaining an outlier detection value for each vocal range feature;
presetting an outlier detection threshold, and removing from the time-series-based vocal range feature data every vocal range feature whose outlier detection value is greater than the outlier detection threshold, to obtain the filtered time-series-based vocal range feature data;
inputting the filtered time-series-based vocal range feature data into the user vocal range preference recognition model for encoding and learning, to obtain a trained user vocal range preference recognition model;
and acquiring the vocal range information that the user is good at and the vocal range information that the user is not good at through the trained user vocal range preference recognition model, and updating both periodically.
A third aspect of the invention provides a computer-readable storage medium containing a song recommendation method program of an intelligent K song system which, when executed by a processor, implements the steps of any of the song recommendation methods of the intelligent K song system described above.
The invention remedies the shortcomings described in the background and has the following beneficial effects:
The invention acquires singing training voice data information of a user and generates from it the vocal range information that the user is good at and the vocal range information that the user is not good at. It constructs time-series-based vocal range feature data from both kinds of information and updates them by learning that feature data. It then performs an initial search according to the vocal range information that the user is good at, performs secondary processing on the initially retrieved song data based on the vocal range information that the user is not good at, and finally acquires the song demand data of the user and generates the song recommendation information of the intelligent K song system from the song demand data and the secondarily processed song data. By analyzing the user's vocal range from singing training voice data and monitoring how that range changes within a preset time, the invention recommends songs that are better suited to the user's singing, which personalizes recommendations, improves the recommendation precision of the intelligent K song system, and improves the user's singing experience. In addition, by configuring a suitable singing training scene, the invention improves the accuracy with which the user's vocal range is acquired.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 shows an overall method flow diagram of a song recommendation method of an intelligent K song system;
FIG. 2 shows a first method flow diagram of a song recommendation method of an intelligent K song system;
FIG. 3 shows a second method flow diagram of a song recommendation method of an intelligent K song system;
FIG. 4 shows a system block diagram of a song recommendation system of an intelligent K song system.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
As shown in FIG. 1, a first aspect of the invention provides a song recommendation method of an intelligent K song system, comprising the following steps:
S102, acquiring singing training voice data information of a user, and generating, based on it, vocal range information that the user is good at and vocal range information that the user is not good at;
S104, constructing time-series-based vocal range feature data from the vocal range information that the user is good at and the vocal range information that the user is not good at, and updating both kinds of vocal range information by learning the time-series-based vocal range feature data;
S106, performing an initial search according to the vocal range information that the user is good at to obtain initially retrieved song data, and performing secondary processing on the initially retrieved song data based on the vocal range information that the user is not good at to obtain secondarily processed song data;
S108, acquiring song demand data of the user, and generating song recommendation information of the intelligent K song system from the song demand data of the user and the secondarily processed song data.
By analyzing the user's vocal range from singing training voice data and monitoring how that range changes within a preset time, the method recommends songs that are better suited to the user's singing, which personalizes recommendations, improves the recommendation precision of the intelligent K song system, and improves the user's singing experience. In addition, by configuring a suitable singing training scene, the method improves the accuracy with which the user's vocal range is acquired.
As shown in FIG. 2, further, in the method, acquiring the singing training voice data information of the user and generating from it the vocal range information that the user is good at and the vocal range information that the user is not good at specifically comprises:
S202, acquiring singing training song information of the user, decomposing it into voice data of a plurality of sub-segments, and performing vocal range analysis on the voice data of the sub-segments to obtain first vocal range analysis data for each sub-segment;
S204, configuring a singing training scene, acquiring the singing training voice data information of the user according to the singing training song information and the singing training scene, and analyzing the singing training voice data information of the user to generate second vocal range analysis data for the plurality of sub-segments;
S206, presetting a deviation rate threshold, comparing the first vocal range analysis data with the second vocal range analysis data to obtain a deviation rate, and judging whether the deviation rate is greater than the deviation rate threshold;
S208, when the deviation rate is greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information that the user is not good at, and when the deviation rate is not greater than the deviation rate threshold, taking the corresponding vocal range information as vocal range information that the user is good at.
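As a minimal sketch of steps S206 and S208: the source does not define how the deviation rate is computed, so the example below assumes that each sub-segment's vocal range is summarized by its lowest and highest pitch in Hz, and that the deviation rate is the mean relative difference between the reference bounds (first analysis) and the user's bounds (second analysis). The `Segment` class, the function names and the threshold value of 0.05 are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    position: int        # index of the sub-segment within the song
    ref_low_hz: float    # reference range from the first analysis
    ref_high_hz: float
    user_low_hz: float   # range actually sung, from the second analysis
    user_high_hz: float

def deviation_rate(seg):
    """Assumed definition: mean relative deviation of the two range bounds."""
    low_dev = abs(seg.user_low_hz - seg.ref_low_hz) / seg.ref_low_hz
    high_dev = abs(seg.user_high_hz - seg.ref_high_hz) / seg.ref_high_hz
    return (low_dev + high_dev) / 2

def classify_segments(segments, threshold=0.05):
    """Split reference ranges into those the user is good at and not good at."""
    good, poor = [], []
    for seg in segments:
        target = poor if deviation_rate(seg) > threshold else good
        target.append((seg.ref_low_hz, seg.ref_high_hz))
    return good, poor

segments = [
    Segment(0, 200.0, 400.0, 202.0, 398.0),  # sung close to the reference
    Segment(1, 300.0, 800.0, 300.0, 640.0),  # high notes missed by a wide margin
]
good, poor = classify_segments(segments)
```

With these inputs the first segment's range lands in `good` and the second in `poor`, because the user fell well short of the 800 Hz upper bound.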
It should be noted that this method yields, by analysis, the vocal range information that the user is good at and the vocal range information that the user is not good at. Configuring the singing training scene specifically comprises:
acquiring, through big data, the vocal range data acquisition accuracy under each environmental factor, constructing a vocal range acquisition accuracy knowledge graph, and inputting the acquisition accuracy under each environmental factor into the knowledge graph;
acquiring the environmental factors of the current singing training venue, inputting them into the vocal range acquisition accuracy knowledge graph, and obtaining the corresponding vocal range data acquisition accuracy;
judging whether that vocal range data acquisition accuracy is greater than a preset vocal range data acquisition accuracy threshold;
and if it is not greater than the preset threshold, acquiring the optimal environmental factors through big data, and adjusting the environmental factors of the current singing training venue according to the optimal environmental factors to generate the singing training scene.
It should be noted that the capture of vocal range data is affected by environmental factors. Ambient noise, for example, can interfere with the processing of the voice data and introduce anomalies into the vocal range data identified in the audio, which lowers capture accuracy and causes the captured vocal range data to differ from the user's real vocal range.
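The scene configuration logic can be illustrated with a deliberately simplified sketch. It stands in for the knowledge graph with a plain dictionary that maps a tuple of environmental factors to an acquisition accuracy; the factor names, accuracy values and the function `configure_training_scene` are assumptions made for the example and do not come from the source.

```python
def configure_training_scene(current_env, accuracy_table, threshold):
    """Return the environment to use for singing training.

    accuracy_table stands in for the knowledge graph: it maps a tuple of
    environmental factors to the expected vocal range acquisition accuracy.
    """
    accuracy = accuracy_table.get(current_env, 0.0)  # unknown settings count as unreliable
    if accuracy > threshold:
        return current_env  # current venue is already accurate enough
    # Otherwise switch to the setting with the best known accuracy.
    return max(accuracy_table, key=accuracy_table.get)

accuracy_table = {
    ("low_noise", "near_microphone"): 0.95,
    ("low_noise", "far_microphone"): 0.85,
    ("high_noise", "near_microphone"): 0.60,
}
scene = configure_training_scene(("high_noise", "near_microphone"), accuracy_table, 0.9)
```

In this example the noisy venue falls below the 0.9 threshold, so the function returns the low-noise, near-microphone setting as the scene to configure.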
Further, in the method, constructing the time-series-based vocal range feature data from the vocal range information that the user is good at and the vocal range information that the user is not good at comprises:
constructing time stamps, introducing a graph neural network, and initializing a first graph node and a second graph node, the first graph node representing the category the user is good at and the second graph node representing the category the user is not good at;
constructing a plurality of third graph nodes from the vocal range information that the user is good at and the vocal range information that the user is not good at, constructing directed edge descriptions, pointing the third graph nodes that correspond to vocal range information the user is good at to the first graph node to construct a first topological structure graph, and fusing the time stamp with the first topological structure graph at each moment to generate time-series-based feature data for the vocal range that the user is good at;
according to the directed edge descriptions, pointing the third graph nodes that correspond to vocal range information the user is not good at to the second graph node to construct a second topological structure graph, and fusing the time stamp with the second topological structure graph at each moment to generate time-series-based feature data for the vocal range that the user is not good at;
and generating the time-series-based vocal range feature data from the two sets of time-series-based feature data.
It should be noted that the user's vocal cords may be damaged or may recover over time, which changes the vocal range that suits the user. This method captures the user's vocal range feature data within a preset time so that such changes can be tracked.
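The graph structure described above can be sketched without any graph library. The example below only builds the time-stamped directed edges (range node pointing to category node) and does not implement a graph neural network. The node labels, the dictionary layout and the function names are assumptions for illustration.

```python
GOOD, POOR = "good_at", "not_good_at"  # the first and second graph nodes

def build_snapshot(timestamp, good_ranges, poor_ranges):
    """One time-stamped graph: every range node points to its category node."""
    edges = [(r, GOOD) for r in good_ranges] + [(r, POOR) for r in poor_ranges]
    return {"timestamp": timestamp, "edges": edges}

def build_time_series(observations):
    """Order the snapshots by time stamp to form time-series feature data.

    observations: iterable of (timestamp, good_ranges, poor_ranges).
    """
    ordered = sorted(observations, key=lambda obs: obs[0])
    return [build_snapshot(t, good, poor) for t, good, poor in ordered]

series = build_time_series([
    (2, ["C4-G4"], []),
    (1, ["A3-E4"], ["C5-G5"]),
])
```

Each element of `series` is one snapshot, so a downstream sequence model could consume the snapshots in chronological order.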
As shown in fig. 3, in the present method, further, by learning the time-series-based gamut characteristic data, the gamut information that is good for the user and the gamut information that is bad for the user are updated, specifically including:
s302, constructing a user voice domain preference identification model based on LSTM, fusing a local outlier detection algorithm, and carrying out voice domain feature outlier detection on voice domain feature data based on a time sequence within a preset time through the local outlier detection algorithm to obtain an outlier detection value of each voice domain feature;
s304, presetting an outlier detection threshold, and when the outlier detection value of the gamut feature is larger than the outlier detection threshold, eliminating the gamut feature with the outlier detection value larger than the outlier detection threshold from the gamut feature data based on the time sequence, and acquiring the gamut feature data based on the time sequence after elimination;
s306, inputting the eliminated voice domain characteristic data based on the time sequence into a voice domain preference recognition model of a user for coding learning, and obtaining the voice domain preference recognition model of the user after training;
And S308, acquiring the soundfield information good for the user and the soundfield information bad for the user through the trained soundfield preference identification model of the user, and updating the soundfield information good for the user and the soundfield information bad for the user regularly.
Over a long time series, the user's vocal cords may be damaged, may recover, or may show pathological features. The local outlier detection algorithm removes the voice domain feature data that is not representative of the user, so that the voice domain information good for the user and the voice domain information poor for the user can be adjusted in stages according to the user's voice domain features, which improves the recommendation precision of the intelligent K song system.
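The outlier elimination step (S302 to S304) can be illustrated with a small pure-Python local outlier factor (LOF) computation. This is a sketch under stated assumptions: each voice domain feature is taken to be a pair (lowest MIDI note, highest MIDI note) per training session, the session values and the threshold of 1.5 are invented, ties in the neighbour search are truncated for brevity, and the LSTM training of S306 is not shown.

```python
import math


def lof_scores(points, k=3):
    """Return the local outlier factor of each point (values near 1 are normal)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # k nearest neighbours of each point, excluding the point itself
    nbrs = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
            for i in range(n)]
    # Distance from each point to its k-th nearest neighbour
    k_dist = [dist[i][nbrs[i][-1]] for i in range(n)]

    def local_reach_density(i):
        reach = [max(k_dist[j], dist[i][j]) for j in nbrs[i]]
        return 1.0 / max(sum(reach) / k, 1e-12)  # guard against duplicate points

    lrd = [local_reach_density(i) for i in range(n)]
    return [sum(lrd[j] for j in nbrs[i]) / (k * lrd[i]) for i in range(n)]


def remove_outliers(points, threshold, k=3):
    """Keep only the features whose outlier detection value is within the threshold."""
    scores = lof_scores(points, k)
    return [p for p, s in zip(points, scores) if s <= threshold]


# Hypothetical per-session ranges; the last one mimics a session with a strained voice.
sessions = [(48, 67), (48, 68), (47, 67), (49, 68), (48, 66), (55, 60)]
cleaned = remove_outliers(sessions, threshold=1.5)
```

With these values only the last session is eliminated; the remaining five would then be passed to the preference identification model for learning.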
Further, in the method, performing an initialization search according to the voice domain information good for the user to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information poor for the user to obtain song data after the secondary processing, specifically includes:
performing an initialization search according to the voice domain information good for the user to obtain the initially retrieved song data, introducing a decision tree model, and constructing a splitting standard according to the voice domain information poor for the user;
taking the initially retrieved song data as a root node, and splitting the root node according to the splitting standard to generate new leaf nodes;
when song data corresponding to the voice domain information poor for the user no longer appears in the song data of the new leaf nodes, ending the splitting, generating final leaf nodes, and acquiring, through the final leaf nodes, the song data set corresponding to the voice domain information good for the user;
and generating the song data after secondary processing according to the song data set corresponding to the voice domain information good for the user, and outputting the song data after secondary processing.
This method can further improve the precision of the songs recommended to the user.
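The splitting described above can be sketched as repeated partitioning of the retrieved songs, one split per poor voice domain band, keeping the branch that avoids the band. This is a simplified stand-in for a decision tree, offered as an assumption: the `overlaps` and `split_until_clean` helpers, the song tuples, and the MIDI note numbers are hypothetical.

```python
def overlaps(song, band):
    """True if the song's note range (low, high) touches the poor band."""
    _, low, high = song
    return low <= band[1] and high >= band[0]


def split_until_clean(root, poor_bands):
    """Split the node once per poor band and keep the branch that avoids it."""
    node = list(root)
    for band in poor_bands:  # each poor band acts as one splitting standard
        node = [song for song in node if not overlaps(song, band)]
    return node  # final leaf: no song touches any poor band


# Hypothetical result of the primary search: (title, lowest note, highest note).
retrieved = [("Song A", 48, 64), ("Song B", 50, 70), ("Song C", 52, 66)]
poor_bands = [(68, 72)]
final_leaf = split_until_clean(retrieved, poor_bands)
```

Here "Song B" reaches into the poor band and is split away, so the final leaf contains "Song A" and "Song C".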
Further, in the method, obtaining song demand data of a user, and generating song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after secondary processing, specifically includes:
obtaining the song demand data of the user, obtaining the song feature data required by the user by performing feature extraction on the song demand data of the user, and constructing a feature sequence according to the song feature data required by the user;
acquiring song description feature information of the song data after secondary processing, and taking the corresponding song data after secondary processing as a song recommendation item when its song description feature information matches the song feature data in the feature sequence;
acquiring historical singing evaluation information of the song recommendation item, and acquiring singing evaluation difficulty information of each segment of the song according to the historical singing evaluation information of the song recommendation item;
when the singing evaluation difficulty information is larger than the preset singing evaluation difficulty, generating the song recommendation information of the intelligent K song system from the corresponding singing segments, and displaying the song recommendation information in a preset manner.
It should be noted that the song feature data required by the user includes, but is not limited to, classification labels of songs (cheerful, calm, sad, etc.) and music genres (such as pop music, rock music, etc.). The method generates the song recommendation information of the intelligent K song system from the song segments whose singing evaluation difficulty is larger than the preset singing evaluation difficulty, so as to prompt the user. The singing evaluation difficulty information includes levels such as easy, moderate, and high difficulty, and the user can set the preset singing evaluation difficulty himself.
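A minimal sketch of the matching and prompting steps follows. It assumes that song features are plain tag sets and that difficulty is encoded as 1 (easy), 2 (moderate), 3 (high); the `recommend` function, the dictionary keys, and the sample songs are hypothetical.

```python
def recommend(candidates, wanted_tags, preset_difficulty):
    """Keep songs that carry every wanted tag and flag their hard segments."""
    results = []
    for song in candidates:
        if not wanted_tags <= song["tags"]:  # song must match the feature sequence
            continue
        # Indices of segments whose difficulty exceeds the user's preset level
        hard = [i for i, d in enumerate(song["segment_difficulty"])
                if d > preset_difficulty]
        results.append({"title": song["title"], "hard_segments": hard})
    return results


# Hypothetical song data after secondary processing.
candidates = [
    {"title": "Song A", "tags": {"pop", "cheerful"}, "segment_difficulty": [1, 3, 2]},
    {"title": "Song C", "tags": {"rock", "calm"}, "segment_difficulty": [2, 2, 2]},
]
picks = recommend(candidates, {"pop"}, preset_difficulty=2)
```

For a user asking for pop songs with a preset difficulty of moderate, only "Song A" is recommended, and its second segment (index 1) is flagged as a prompt.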
In addition, the method can further comprise the following steps:
obtaining the song data corresponding to the voice domain information good for the user in the leaf node, constructing a covariance matrix according to that song data, and introducing a singular value decomposition algorithm; decomposing the covariance matrix through the singular value decomposition algorithm to generate a feature matrix composed of feature vectors and an orthogonal matrix, and calculating the included angles formed between the feature vectors in the feature matrix; judging whether each included angle value is larger than a preset included angle value, and when it is, collecting the feature vector comparison group corresponding to that included angle value so as to obtain the abnormal feature vector in the feature vector comparison group; and obtaining the leaf node where the abnormal feature vector is located, re-splitting that leaf node until the included angle value is not larger than the preset included angle value, and outputting the new leaf nodes after splitting.
It should be noted that the singular value decomposition algorithm is introduced to decompose the sample data in the leaf nodes so as to reduce the computational complexity, and the included angles formed between the feature vectors in the feature matrix are then calculated. An included angle value greater than the preset included angle value indicates a classification error. The method can therefore further improve the classification precision of songs, thereby improving the song recommendation precision of the intelligent K song system.
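The included-angle check can be sketched on its own, using the standard relation cos θ = (u · v) / (‖u‖ ‖v‖). The decomposition step is not reproduced here; the vectors below are hypothetical stand-ins for feature vectors, and the `angle_deg` and `flag_pairs` names and the 45 degree threshold are assumptions.

```python
import math


def angle_deg(u, v):
    """Included angle between two non-zero vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))


def flag_pairs(vectors, preset_angle):
    """Return index pairs whose included angle exceeds the preset value."""
    return [(i, j)
            for i in range(len(vectors))
            for j in range(i + 1, len(vectors))
            if angle_deg(vectors[i], vectors[j]) > preset_angle]


# Hypothetical feature vectors; the third points in a very different direction.
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
flagged = flag_pairs(vectors, preset_angle=45.0)
```

Both flagged pairs contain the third vector, so it would be treated as the abnormal one and its leaf node re-split.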
As shown in fig. 4, the second aspect of the present invention provides a song recommendation system 4 of an intelligent K song system, where the system 4 includes a memory 41 and a processor 42, and the memory 41 includes a song recommendation method program of the intelligent K song system, and when the song recommendation method program of the intelligent K song system is executed by the processor 42, the following steps are implemented:
acquiring singing training voice data information of a user, and generating voice domain information good for the user and voice domain information poor for the user based on the singing training voice data information of the user;
constructing time-series-based voice domain feature data according to the voice domain information good for the user and the voice domain information poor for the user, and updating the voice domain information good for the user and the voice domain information poor for the user by learning the time-series-based voice domain feature data;
performing an initialization search according to the voice domain information good for the user to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information poor for the user to obtain song data after the secondary processing;
and obtaining song demand data of the user, and generating song recommendation information of the intelligent K song system according to the song demand data of the user and the song data after secondary processing.
Further, in the present system, acquiring the singing training voice data information of the user, and generating the voice domain information good for the user and the voice domain information poor for the user based on the singing training voice data information of the user, specifically includes:
acquiring singing training song information of the user, decomposing the singing training song information of the user to generate voice data of a plurality of sub-segments, and performing voice domain analysis on the voice data of the sub-segments to acquire first voice domain analysis data of each sub-segment;
configuring a singing training scene, obtaining the singing training voice data information of the user according to the singing training song information of the user and the singing training scene, analyzing the singing training voice data information of the user, and generating second voice domain analysis data of a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first voice domain analysis data with the second voice domain analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is larger than the deviation rate threshold;
and when the deviation rate is larger than the deviation rate threshold, taking the voice domain information corresponding to that deviation rate as voice domain information poor for the user, and when the deviation rate is not larger than the deviation rate threshold, taking the voice domain information corresponding to that deviation rate as voice domain information good for the user.
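The deviation-rate comparison above can be sketched as follows. The source does not define the voice domain analysis data or the deviation formula, so this example assumes one pitch value in Hz per sub-segment and a relative deviation |sung - reference| / reference; the `classify_segments` name, the pitch values, and the 5% threshold are hypothetical.

```python
def classify_segments(reference_hz, sung_hz, threshold):
    """Split sub-segment indices into good and poor by relative pitch deviation."""
    good, poor = [], []
    for i, (ref, sung) in enumerate(zip(reference_hz, sung_hz)):
        rate = abs(sung - ref) / ref  # deviation rate at the same song position
        (poor if rate > threshold else good).append(i)
    return good, poor


# Hypothetical first (reference) and second (user) analysis data per sub-segment.
reference = [220.0, 330.0, 440.0]
sung = [222.0, 325.0, 400.0]
good, poor = classify_segments(reference, sung, threshold=0.05)
```

With these values the first two sub-segments stay within the threshold and the third, where the user falls well short of the reference pitch, is classed as poor for the user.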
Further, in the present system, updating the voice domain information good for the user and the voice domain information poor for the user by learning the time-series-based voice domain feature data specifically includes:
constructing a user voice domain preference identification model based on LSTM, fusing a local outlier detection algorithm, and performing outlier detection on the time-series-based voice domain feature data within a preset time through the local outlier detection algorithm to obtain an outlier detection value for each voice domain feature;
presetting an outlier detection threshold, and when the outlier detection value of a voice domain feature is larger than the outlier detection threshold, eliminating that voice domain feature from the time-series-based voice domain feature data to obtain the time-series-based voice domain feature data after elimination;
inputting the time-series-based voice domain feature data after elimination into the user voice domain preference identification model for coding learning to obtain the trained user voice domain preference identification model;
and acquiring the voice domain information good for the user and the voice domain information poor for the user through the trained user voice domain preference identification model, and periodically updating the voice domain information good for the user and the voice domain information poor for the user.
A third aspect of the present invention provides a computer readable storage medium including a song recommendation method program for an intelligent K-song system therein, which when executed by a processor, implements the steps of the song recommendation method for an intelligent K-song system of any one of the above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by hardware under the control of program instructions. The foregoing program may be stored in a computer readable storage medium, and the program, when executed, performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Alternatively, the above-described integrated units of the present invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods of the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.
The foregoing describes merely illustrative embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.
Claims (6)
1. A song recommendation method of an intelligent K song system is characterized by comprising the following steps:
acquiring singing training voice data information of a user, and generating voice domain information good for the user and voice domain information poor for the user based on the singing training voice data information of the user;
constructing time-series-based voice domain feature data according to the voice domain information good for the user and the voice domain information poor for the user, and updating the voice domain information good for the user and the voice domain information poor for the user by learning the voice domain feature data based on the time series;
performing an initialization search according to the voice domain information good for the user to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information poor for the user to obtain song data after the secondary processing;
acquiring song demand data of a user, and generating song recommendation information of an intelligent K song system according to the song demand data of the user and the song data after secondary processing;
wherein acquiring the singing training voice data information of the user, and generating the voice domain information good for the user and the voice domain information poor for the user based on the singing training voice data information of the user, specifically comprises:
Acquiring singing training song information of a user, decomposing the singing training song information of the user to generate voice data of a plurality of sub-segments, and performing voice domain analysis on the voice data of the sub-segments to acquire first voice domain analysis data of each sub-segment;
configuring a singing training scene, obtaining the singing training voice data information of the user according to the singing training song information of the user and the singing training scene, analyzing the singing training voice data information of the user, and generating second voice domain analysis data of a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first voice domain analysis data with the second voice domain analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is larger than the deviation rate threshold;
when the deviation rate is larger than the deviation rate threshold, taking the voice domain information corresponding to that deviation rate as voice domain information poor for the user, and when the deviation rate is not larger than the deviation rate threshold, taking the voice domain information corresponding to that deviation rate as voice domain information good for the user;
wherein configuring the singing training scene specifically comprises:
acquiring voice domain data acquisition accuracy information under each environmental factor through big data, constructing a voice domain acquisition accuracy knowledge graph, and inputting the voice domain data acquisition accuracy information under each environmental factor into the voice domain acquisition accuracy knowledge graph;
acquiring the environmental factors of the current singing training site, inputting the environmental factors of the current singing training site into the voice domain acquisition accuracy knowledge graph, and acquiring the related voice domain data acquisition accuracy information;
judging whether the related voice domain data acquisition accuracy information is larger than a preset voice domain data acquisition accuracy threshold;
if the related voice domain data acquisition accuracy information is not larger than the preset voice domain data acquisition accuracy threshold, acquiring the optimal environmental factors through big data, and adjusting the environmental factors of the current singing training site according to the optimal environmental factors to generate the singing training scene;
wherein updating the voice domain information good for the user and the voice domain information poor for the user by learning the voice domain feature data based on the time sequence specifically comprises:
constructing a user voice domain preference identification model based on LSTM, fusing a local outlier detection algorithm, and carrying out voice domain feature outlier detection on the voice domain feature data based on the time sequence within preset time through the local outlier detection algorithm to obtain an outlier detection value of each voice domain feature;
presetting an outlier detection threshold, and when the outlier detection value of a voice domain feature is larger than the outlier detection threshold, eliminating that voice domain feature from the voice domain feature data based on the time sequence, and acquiring the eliminated voice domain feature data based on the time sequence;
inputting the eliminated voice domain characteristic data based on the time sequence into the voice domain preference identification model of the user for coding learning, and obtaining the voice domain preference identification model of the user after training;
acquiring the voice domain information good for the user and the voice domain information poor for the user through the trained user voice domain preference identification model, and periodically updating the voice domain information good for the user and the voice domain information poor for the user.
2. The song recommendation method of an intelligent K song system according to claim 1, wherein constructing the time-series-based voice domain feature data according to the voice domain information good for the user and the voice domain information poor for the user comprises the following steps:
constructing a time stamp, introducing a graph neural network, initializing a first graph node and a second graph node, taking the first graph node as the graph node of the category good for the user, and taking the second graph node as the graph node of the category poor for the user;
constructing a plurality of third graph nodes from the voice domain information good for the user and the voice domain information poor for the user, constructing a directed edge description, pointing the third graph nodes corresponding to the voice domain information good for the user to the first graph node to construct a first topological structure diagram, and fusing the time stamp with the first topological structure diagram at each moment to generate time-series-based voice domain feature data good for the user;
according to the directed edge description, pointing the third graph nodes corresponding to the voice domain information poor for the user to the second graph node to construct a second topological structure diagram, and fusing the time stamp with the second topological structure diagram at each moment to generate time-series-based voice domain feature data poor for the user;
and generating the voice domain characteristic data based on the time sequence according to the voice domain characteristic data which is good for the user based on the time sequence and the voice domain characteristic data which is poor for the user based on the time sequence.
3. The song recommendation method of an intelligent K song system according to claim 1, wherein the initializing search is performed according to the voice domain information good for the user to obtain song data after the first search, and the second processing is performed on the song data after the first search based on the voice domain information poor for the user to obtain song data after the second processing, specifically comprising:
performing an initialization search according to the voice domain information good for the user to obtain the initially retrieved song data, introducing a decision tree model, and constructing a splitting standard according to the voice domain information poor for the user;
taking the initially retrieved song data as a root node, and splitting the root node according to the splitting standard to generate new leaf nodes;
when song data corresponding to the voice domain information poor for the user no longer appears in the song data of the new leaf nodes, ending the splitting, generating final leaf nodes, and acquiring, through the final leaf nodes, the song data set corresponding to the voice domain information good for the user;
and generating song data after secondary processing according to the song data set corresponding to the voice domain information which is good for the user, and outputting the song data after secondary processing.
4. The song recommendation method of the intelligent K song system according to claim 1, wherein the song demand data of the user is obtained, and song recommendation information of the intelligent K song system is generated according to the song demand data of the user and the song data after the secondary processing, specifically comprising:
obtaining song demand data of a user, obtaining song characteristic data of the user demand by carrying out characteristic extraction on the song demand data of the user, and constructing a characteristic sequence according to the song characteristic data of the user demand;
Acquiring song description characteristic information of the secondarily processed song data, and taking the corresponding secondarily processed song data as a song recommendation item when the song description characteristic information of the secondarily processed song data accords with the song characteristic data in the characteristic sequence;
acquiring historical singing evaluation information of the song recommendation items, and acquiring singing evaluation difficulty information of each section of the song according to the historical singing evaluation information of the song recommendation items;
when the singing evaluation difficulty information is larger than the preset singing evaluation difficulty, generating the song recommendation information of the intelligent K song system from the corresponding singing segments, and displaying the song recommendation information in a preset manner.
5. The song recommendation system of the intelligent K song system is characterized by comprising a memory and a processor, wherein the memory comprises a song recommendation method program of the intelligent K song system, and when the song recommendation method program of the intelligent K song system is executed by the processor, the following steps are realized:
acquiring singing training voice data information of a user, and generating voice domain information good for the user and voice domain information poor for the user based on the singing training voice data information of the user;
Constructing time-series-based voice domain feature data according to the voice domain information good for the user and the voice domain information poor for the user, and updating the voice domain information good for the user and the voice domain information poor for the user by learning the voice domain feature data based on the time series;
performing an initialization search according to the voice domain information good for the user to obtain song data after the primary search, and performing secondary processing on the song data after the primary search based on the voice domain information poor for the user to obtain song data after the secondary processing;
acquiring song demand data of a user, and generating song recommendation information of an intelligent K song system according to the song demand data of the user and the song data after secondary processing;
wherein acquiring the singing training voice data information of the user, and generating the voice domain information good for the user and the voice domain information poor for the user based on the singing training voice data information of the user, specifically comprises:
acquiring singing training song information of a user, decomposing the singing training song information of the user to generate voice data of a plurality of sub-segments, and performing voice domain analysis on the voice data of the sub-segments to acquire first voice domain analysis data of each sub-segment;
configuring a singing training scene, obtaining the singing training voice data information of the user according to the singing training song information of the user and the singing training scene, analyzing the singing training voice data information of the user, and generating second voice domain analysis data of a plurality of sub-segments;
presetting a deviation rate threshold, comparing the first voice domain analysis data with the second voice domain analysis data at the same song progress position to obtain a deviation rate, and judging whether the deviation rate is larger than the deviation rate threshold;
when the deviation rate is larger than the deviation rate threshold, taking the voice domain information corresponding to that deviation rate as voice domain information poor for the user, and when the deviation rate is not larger than the deviation rate threshold, taking the voice domain information corresponding to that deviation rate as voice domain information good for the user;
wherein configuring the singing training scene specifically comprises:
acquiring voice domain data acquisition accuracy information under each environmental factor through big data, constructing a voice domain acquisition accuracy knowledge graph, and inputting the voice domain data acquisition accuracy information under each environmental factor into the voice domain acquisition accuracy knowledge graph;
acquiring the environmental factors of the current singing training site, inputting the environmental factors of the current singing training site into the voice domain acquisition accuracy knowledge graph, and acquiring the related voice domain data acquisition accuracy information;
judging whether the related voice domain data acquisition accuracy information is larger than a preset voice domain data acquisition accuracy threshold;
if the related voice domain data acquisition accuracy information is not larger than the preset voice domain data acquisition accuracy threshold, acquiring the optimal environmental factors through big data, and adjusting the environmental factors of the current singing training site according to the optimal environmental factors to generate the singing training scene;
wherein updating the voice domain information good for the user and the voice domain information poor for the user by learning the voice domain feature data based on the time sequence specifically comprises:
constructing a user voice domain preference identification model based on LSTM, fusing a local outlier detection algorithm, and carrying out voice domain feature outlier detection on the voice domain feature data based on the time sequence within preset time through the local outlier detection algorithm to obtain an outlier detection value of each voice domain feature;
presetting an outlier detection threshold, and when the outlier detection value of a voice domain feature is larger than the outlier detection threshold, eliminating that voice domain feature from the voice domain feature data based on the time sequence, and acquiring the eliminated voice domain feature data based on the time sequence;
Inputting the eliminated voice domain characteristic data based on the time sequence into the voice domain preference identification model of the user for coding learning, and obtaining the voice domain preference identification model of the user after training;
acquiring the voice domain information good for the user and the voice domain information poor for the user through the trained user voice domain preference identification model, and periodically updating the voice domain information good for the user and the voice domain information poor for the user.
6. A computer readable storage medium, characterized in that the computer readable storage medium comprises a song recommendation method program of an intelligent K song system, which when executed by a processor, implements the steps of the song recommendation method of the intelligent K song system as claimed in any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311498913.6A CN117235300B (en) | 2023-11-13 | 2023-11-13 | Song recommendation method, system and storage medium of intelligent K song system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117235300A (en) | 2023-12-15 |
CN117235300B (en) | 2024-03-15 |
Family
ID=89098661
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311498913.6A Active CN117235300B (en) | 2023-11-13 | 2023-11-13 | Song recommendation method, system and storage medium of intelligent K song system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117235300B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104715760A (en) * | 2015-02-13 | 2015-06-17 | 朱威 | KTV song matching analyzing method and system |
KR20180043925A (en) * | 2016-10-21 | 2018-05-02 | 정문일 | Singing evaluation system, singing evaluation server and method thereof |
JP2018205514A (en) * | 2017-06-02 | 2018-12-27 | 株式会社第一興商 | Recommendation system for karaoke songs |
CN109710797A (en) * | 2018-11-14 | 2019-05-03 | 腾讯科技(深圳)有限公司 | Method for pushing, device, electronic device and the storage medium of audio file |
CN115470371A (en) * | 2022-08-15 | 2022-12-13 | 苏坤 | Song recommendation method and device based on user voice characteristics and terminal |
Also Published As
Publication number | Publication date |
---|---|
CN117235300A (en) | 2023-12-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109800407B (en) | Intention recognition method and device, computer equipment and storage medium | |
CN111916111B (en) | Intelligent voice outbound method and device with emotion, server and storage medium | |
Dikmen et al. | Sound event detection using non-negative dictionaries learned from annotated overlapping events | |
Panteli et al. | Towards the characterization of singing styles in world music | |
BR112016020457B1 (en) | METHOD OF SEARCHING AUDIO FINGERPRINTS STORED IN A DATABASE WITHIN AN AUDIO FINGERPRINT DETECTION SYSTEM | |
Somervuo et al. | Bird song recognition based on syllable pair histograms | |
CN111428028A (en) | Information classification method based on deep learning and related equipment | |
CN108549675B (en) | Piano teaching method based on big data and neural network | |
CN110164417B (en) | Language vector obtaining and language identification method and related device | |
CN107993636B (en) | Recursive neural network-based music score modeling and generating method | |
CN111462761A (en) | Voiceprint data generation method and device, computer device and storage medium | |
Benetos et al. | Characterisation of acoustic scenes using a temporally-constrained shift-invariant model | |
Birla | A robust unsupervised pattern discovery and clustering of speech signals | |
Sobieraj et al. | Masked non-negative matrix factorization for bird detection using weakly labeled data | |
CN117235300B (en) | Song recommendation method, system and storage medium of intelligent K song system | |
CN106503181B (en) | Audio data processing method and device | |
Pikrakis et al. | Unsupervised singing voice detection using dictionary learning | |
CN116578700A (en) | Log classification method, log classification device, equipment and medium | |
CN111477212A (en) | Content recognition, model training and data processing method, system and equipment | |
CN107133344B (en) | Data processing method and device | |
CN114818651A (en) | Text similarity determination method and device, storage medium and electronic device | |
CN115859191A (en) | Fault diagnosis method and device, computer readable storage medium and computer equipment | |
CN115357720A (en) | Multi-task news classification method and device based on BERT | |
CN114974221A (en) | Speech recognition model training method and device and computer readable storage medium | |
CN112766368A (en) | Data classification method, equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |