CN117131224A - Playlist generation device and method and wearable device - Google Patents

Playlist generation device and method and wearable device

Info

Publication number
CN117131224A
Authority
CN
China
Prior art keywords
track, played, tracks, scene, group
Prior art date
Legal status
Pending
Application number
CN202310981866.4A
Other languages
Chinese (zh)
Inventor
张萌 (Zhang Meng)
肖杰伟 (Xiao Jiewei)
Current Assignee
Beijing Eswin Computing Technology Co Ltd
Original Assignee
Beijing Eswin Computing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Eswin Computing Technology Co Ltd filed Critical Beijing Eswin Computing Technology Co Ltd
Priority to CN202310981866.4A priority Critical patent/CN117131224A/en
Publication of CN117131224A publication Critical patent/CN117131224A/en

Classifications

    • G - PHYSICS
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
        • G06F16/00 - Information retrieval; database structures therefor; file system structures therefor
            • G06F16/60 - of audio data
                • G06F16/639 - Presentation of query results using playlists
                • G06F16/65 - Clustering; classification
                • G06F16/683 - Retrieval characterised by using metadata automatically derived from the content
                • G06F16/686 - Retrieval characterised by using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
                • G06F16/687 - Retrieval characterised by using geographical or spatial information, e.g. location
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N3/02 - Neural networks
            • G06N3/0442 - Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
            • G06N3/045 - Combinations of networks
            • G06N3/0464 - Convolutional networks [CNN, ConvNet]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Indexing, Searching, Synchronizing, And The Amount Of Synchronization Travel Of Record Carriers (AREA)

Abstract

The application discloses a playlist generation device and method and a wearable device, relating to the technical field of data processing. Its main aim is to generate a playlist that better fits both the scene where the user is located and the user's preferences. The main technical solution is as follows: the playlist generating apparatus includes a determining module, configured to determine the user preference weight and the scene adaptation weight corresponding to each to-be-played track, where the scene adaptation weight indicates the degree of fit between the corresponding to-be-played track and the scene where the user is currently located; and a generating module, configured to order the to-be-played tracks based on the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and to generate a playlist.

Description

Playlist generation device and method and wearable device
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a playlist generating device and method, and a wearable device.
Background
With the continuous development of electronic devices such as wearable devices, more and more users play music through electronic devices, which typically play music based on a playlist. Currently, a playlist is usually generated according to a track order manually set by the user according to the user's own preferences. This existing way of generating playlists can satisfy the user's listening preferences, but the playlist often fails to fit the scene where the user is located, which easily results in a poor listening experience.
Disclosure of Invention
In view of this, the present application provides a playlist generating device and method and a wearable device, with the main aim of generating a playlist that better fits both the scene where the user is located and the user's preferences.
In order to achieve the above purpose, the present application mainly provides the following technical solutions:
In a first aspect, the present application provides a playlist generating apparatus, including:
a determining module, configured to determine the user preference weight and the scene adaptation weight corresponding to each to-be-played track, where the scene adaptation weight indicates the degree of fit between the corresponding to-be-played track and the scene where the user is currently located; and
a generating module, configured to order the to-be-played tracks based on the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and to generate a playlist.
In some embodiments of the present application, the determining module includes: an extraction unit, configured to extract scene features based on scene parameters, and to determine, based on the scene features, scene information of the scene where the user is currently located through a scene recognition model, where the scene parameters are used for indicating the scene where the user is currently located; and a first determining unit, configured to determine, based on the scene information and the classification label corresponding to each to-be-played track, the scene adaptation weight corresponding to each to-be-played track through a weight recognition model.
In some embodiments of the present application, the scene parameters include a scene audio signal, and the extraction unit is specifically configured to process the scene audio signal into a mel spectrogram, and to determine the spatial features and temporal features obtained by processing the mel spectrogram as the scene features.
In some embodiments of the present application, the scene parameters include target parameters, and the target parameters include at least one of current location information, current time, current weather information, and current illumination intensity information; the extraction unit is specifically configured to analyze the target parameters and determine the parameter features obtained from the analysis as the scene features.
In some embodiments of the present application, the first determining unit is further configured to, for each to-be-played track whose classification label needs to be determined: clip, from the to-be-played track, a sample audio signal whose playing duration is a specified duration, and determine, based on the sample audio signal, the classification label corresponding to the to-be-played track through a classification label recognition model; the classification label recognition model is a bidirectional convolutional recurrent neural network model that integrates an attention mechanism.
In some embodiments of the present application, the scene recognition model is a parallel convolutional recurrent neural network model that fuses at least one group of depthwise separable convolutional networks with batch normalization and at least one group of bidirectional gated recurrent unit (Bi-GRU) layers.
In some embodiments of the present application, the generating module includes: a dividing unit, configured to divide the to-be-played tracks into at least two first track groups based on the user preference weight corresponding to each to-be-played track, where each first track group has a first ranking, and for any first track group followed by a next first track group in the first ranking, the lowest user preference weight in that group is higher than the highest user preference weight in the next group; an extraction unit, configured to set a target ranking of the to-be-played tracks in each first track group in descending order of their scene adaptation weights, and to perform, based on the target ranking of each first track group, track extraction processing on the at least two first track groups according to a first ratio to form at least one second track group, where the first ratio indicates the proportions of the numbers of to-be-played tracks extracted from the respective first track groups in the correspondingly formed second track group; and a generating unit, configured to arrange the to-be-played tracks in each second track group based on the second ranking of the at least one second track group, to form the playlist.
In some embodiments of the present application, the extraction unit is specifically configured to, for each track extraction processing: for each first track group, starting from the not-yet-extracted to-be-played track with the highest scene adaptation weight, extract a number of to-be-played tracks matching the first ratio, in descending order of the scene adaptation weights of the to-be-played tracks in the group; and, once the current track extraction processing has been completed for all the first track groups, arrange the to-be-played tracks extracted from each group according to the first ranking, to form the second track group corresponding to the current track extraction processing.
In some embodiments of the present application, the extraction unit is further configured to, when performing the current track extraction processing, if it is determined that the not-yet-extracted to-be-played tracks in the first track groups cannot satisfy the first ratio, determine the total number of not-yet-extracted to-be-played tracks in the first track group at the last position of the first ranking, and set a second ratio based on that total number; the second ratio indicates that the number of to-be-played tracks extracted from the last-ranked first track group in the correspondingly formed second track group equals that total number, and that the number extracted from each other first track group is not less than that total number.
In some embodiments of the present application, the extraction unit is further configured to, when performing the current track extraction processing, for each first track group: if the total number of not-yet-extracted to-be-played tracks in the group matches the second ratio, extract those tracks; if the total number of not-yet-extracted to-be-played tracks is less than the number matching the second ratio, select first tracks in descending order of scene adaptation weight, starting from the to-be-played track with the highest scene adaptation weight, and extract the first tracks together with the not-yet-extracted to-be-played tracks, where the total number of first tracks and not-yet-extracted tracks matches the second ratio; and if there are currently no not-yet-extracted to-be-played tracks in the group, select second tracks in descending order of scene adaptation weight, starting from the to-be-played track with the highest scene adaptation weight, and extract them, where the total number of second tracks matches the second ratio.
In some embodiments of the present application, the extraction unit is further configured to, when performing the current track extraction processing, if it is determined that the number of not-yet-extracted to-be-played tracks in each first track group cannot satisfy the first ratio, extract the not-yet-extracted to-be-played tracks in each first track group and arrange them according to the first ranking, to form the second track group corresponding to the current track extraction processing.
In some embodiments of the present application, the generating unit is specifically configured to, for each second track group: set a third ranking of the to-be-played tracks in the second track group, where the third ranking is a random ranking or a ranking based on the user preference weights of the to-be-played tracks; and determine the prioritization of the to-be-played tracks based on the second ranking and the third ranking corresponding to each second track group, and generate the playlist based on the prioritization.
In some embodiments of the present application, the apparatus further includes: a processing module, configured to monitor the user's heart rate; if a target to-be-played track that does not match the currently monitored heart rate exists in the playlist, hide the target to-be-played track in the playlist, where a hidden target to-be-played track cannot be played; and if the current heart rate again meets the requirement for resuming playback of the target to-be-played track, show the target to-be-played track in the playlist, where a shown target to-be-played track can be played.
In some embodiments of the present application, the generating unit is further configured to, before generating the playlist, if it is determined that short-term repeated to-be-played tracks exist in the prioritization, keep only the first occurrence of each short-term repeated to-be-played track in the prioritization; a short-term repeated to-be-played track is a to-be-played track that appears at least twice in the playlist with fewer than a preset number of tracks between any two of its occurrences.
In some embodiments of the present application, the determining module includes: a second determining unit, configured to determine the playing duration of each play of each target track within a specified time period, determine, based on the playing durations, the target tracks whose plays meet the requirement as to-be-played tracks, and determine the play count of each to-be-played track, where the play count indicates the cumulative number of times the corresponding to-be-played track was played within the specified time period; and a third determining unit, configured to determine the sum of the play counts of all to-be-played tracks, and to determine the ratio of each to-be-played track's play count to that sum as the user preference weight of the to-be-played track.
In a second aspect, the present application provides a wearable device, including the playlist generating apparatus of the first aspect and a playing device;
the playing device is configured to play tracks based on the playlist generated by the playlist generating apparatus.
In a third aspect, the present application provides a playlist generating method, including:
determining the user preference weight and the scene adaptation weight corresponding to each to-be-played track, where the scene adaptation weight indicates the degree of fit between the corresponding to-be-played track and the scene where the user is currently located; and
ordering the to-be-played tracks based on the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and generating a playlist.
In a fourth aspect, the present application provides a computer-readable storage medium including a stored program, where the program, when run, controls the device where the storage medium is located to perform the playlist generating method of the third aspect.
According to the playlist generating device and method and the wearable device provided by the present application, when a playlist needs to be generated, the user preference weight and the scene adaptation weight corresponding to each to-be-played track are first determined, and the to-be-played tracks are then ordered based on these weights to generate the playlist. A playlist generated by combining the user preference weights and scene adaptation weights of the to-be-played tracks thus fits both the user's listening preferences and the scene where the user is located, and playing music based on such a playlist gives the user a better listening experience.
The foregoing is only an overview of the technical solutions of the present application. To enable a clearer understanding of the technical means of the present application so that they can be implemented according to the contents of the specification, and to make the above and other objects, features, and advantages of the present application more comprehensible, specific embodiments of the present application are set forth below.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 shows a schematic structural diagram of a playlist generating apparatus according to an embodiment of the present application;
Fig. 2 shows a schematic structural diagram of a playlist generating apparatus according to another embodiment of the present application;
Fig. 3 shows a schematic diagram of a process of generating a playlist by a playlist generating apparatus according to an embodiment of the present application;
Fig. 4 shows a schematic structural diagram of a wearable device according to an embodiment of the present application;
Fig. 5 shows a flowchart of a playlist generating method according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Currently, a playlist is generally generated according to a track order manually set by the user according to the user's own preferences, and the electronic device then plays music based on the generated playlist. This existing way of generating playlists considers only the user's listening preferences and not how well the playlist fits the scene where the user is located, so the generated playlist often fails to fit that scene, which easily leads to a poor listening experience.
The inventors found that any track can be measured in two dimensions. The first is the user preference dimension, which can be represented by a user preference weight indicating the user's degree of preference for the track. The second is the scene fit dimension, which can be represented by a scene adaptation weight indicating the degree of fit between the track and a given scene. Note that a track fits different scenes to different degrees, so it has a different scene adaptation weight for each scene. Based on these findings, the inventors propose a technical solution for playlist generation, specifically: when a playlist needs to be generated, first determine the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and then order the to-be-played tracks based on these weights to generate the playlist. The generated playlist thus fits both the user's listening preferences and the scene where the user is located, and playing music based on it gives the user a better listening experience.
The technical scheme for generating the play list provided by the embodiment of the application can be applied to any electronic equipment needing play list generation, and the embodiment of the application does not limit the specific type of the electronic equipment. By way of example, electronic devices may include, but are not limited to, wearable devices, speakers, hosts with music playing capabilities, internet of things devices, and the like.
Based on the technical scheme of playlist generation provided by the embodiment of the application, the embodiment of the application particularly provides a playlist generation device and method and wearable equipment. The playlist generating device and method and the wearable device provided by the embodiment of the application are specifically described below.
As shown in fig. 1, an embodiment of the present application provides a playlist generating apparatus, which mainly includes: a determining module 11 and a generating module 12.
The determining module 11 is configured to determine the user preference weight and the scene adaptation weight corresponding to each to-be-played track, where the scene adaptation weight indicates the degree of fit between the corresponding to-be-played track and the scene where the user is currently located.
The generating module 12 is configured to order the to-be-played tracks based on the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and to generate a playlist.
The specific structure of, and the interactions among, the modules of the playlist generating apparatus are described below.
Determining module 11:
the determining module 11 is mainly configured to determine a user preference weight and a scene adaptation weight corresponding to each track to be played. The to-be-played tracks are tracks selected based on tracks played in a specified period of time, and the to-be-played tracks conform to the listening interests of the user. The specified time period may be determined based on the service requirement, and in an exemplary case where it is determined that the playlist needs to be generated, the specified time period is a time period corresponding to the last 10 days. The user preference weight is used for indicating the preference degree of the user on the corresponding to-be-played tracks in all to-be-played tracks. The scene adaptation weight is used for indicating the fitting degree of the corresponding to-be-played track and the scene where the user is currently located.
The following describes a specific technical scheme for determining the scene adaptation weight and the user preference weight corresponding to each track to be played.
Determination of scene adaptation weights:
as shown in fig. 2, the determining module 11 includes an extracting unit 111 and a first determining unit 112, and determines a scene corresponding to each track to be played through interaction of the extracting unit 111 and the first determining unit 112
The weights are adapted. An extraction unit 111 for extracting scene features based on scene parameters; based on scene characteristics, determining scene information of a scene where a user is currently located through a scene identification model; the scene parameter is used for indicating the scene in which the user is currently located. The first determining unit 112 is configured to determine, based on the scene information and the classification label corresponding to each track to be played, a scene adaptation weight corresponding to each track to be played through a weight recognition model.
The scene where the user is located is determined based on scene parameters, which indicate the scene where the user is currently located and may include, but are not limited to, at least one of the following: a scene audio signal, current location information, current time, current weather information, and current illumination intensity information. The scene audio signal is obtained by capturing the sound in the space where the user is located.
When the scene parameters include a scene audio signal, the extraction unit 111 is specifically configured to process the scene audio signal into a mel spectrogram, and to determine the spatial features and temporal features obtained by processing the mel spectrogram as scene features.
The mel spectrogram is highly discriminative and robust, and key features of an audio signal can be extracted from it, so the scene audio signal is processed into a mel spectrogram. The specific process may include the following steps: perform a short-time Fourier transform on the scene audio signal to obtain a target spectrogram, and pass the target spectrogram through a mel filter bank to obtain the mel spectrogram. In addition, before the short-time Fourier transform is performed, the scene audio signal can be framed and windowed, which conditions and strengthens the signal.
After the scene audio signal is processed into a mel spectrogram, the mel spectrogram is processed to extract the spatial features and temporal features corresponding to the scene audio signal. The specific process may include the following step: take the mel spectrogram as the input of a first feature extraction model, and extract the spatial features and temporal features corresponding to the scene audio signal based on that model. The spatial and temporal features indicate the space where the user is located and are an important basis for determining the scene where the user is located. Once obtained, the spatial and temporal features are determined as scene features.
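By way of an illustrative, non-limiting sketch of the above steps, the following Python code computes a mel spectrogram from a scene audio signal, assuming the librosa library; the sampling rate, frame length, hop length, and mel-band count are assumed values not fixed by the application.

```python
import librosa
import numpy as np

def scene_audio_to_mel(path, sr=16000, n_fft=1024, hop_length=512, n_mels=64):
    """Frame/window the scene audio signal, apply a short-time Fourier
    transform, and pass the target spectrogram through a mel filter bank.
    All numeric parameters are assumptions for illustration."""
    audio, sr = librosa.load(path, sr=sr, mono=True)
    # The Hann-windowed STFT performs the framing and windowing steps.
    stft = np.abs(librosa.stft(audio, n_fft=n_fft, hop_length=hop_length,
                               window="hann"))
    # The mel filter bank maps the target spectrogram to a mel spectrogram.
    mel = librosa.feature.melspectrogram(S=stft**2, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel)  # log-mel, shape (n_mels, n_frames)
```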
When the scene parameters include target parameters, where the target parameters include at least one of current location information, current time, current weather information, and current illumination intensity information, the extraction unit 111 is specifically configured to analyze the target parameters and determine the parameter features obtained from the analysis as scene features.
The specific process of analyzing the target parameters may include the following step: take the target parameters as the input of a second feature extraction model, and extract the parameter features corresponding to the target parameters based on that model. The parameter features indicate the space where the user is located and/or the user's current mood, and are an important basis for judging the scene where the user is located. Once obtained, the parameter features are determined as scene features.
After the scene features are determined, the extraction unit 111 determines the scene information of the scene where the user is currently located through the scene recognition model, based on all the determined scene features. The scene recognition model is a convolutional recurrent neural network that fuses multiple optimization mechanisms; specifically, it is a parallel convolutional recurrent neural network model that fuses at least one group of depthwise separable convolutional networks with batch normalization and at least one group of bidirectional gated recurrent unit layers.
The specific process of determining the scene information of the scene where the user is currently located through the scene recognition model, based on all the determined scene features, may include the following steps: the scene recognition model runs at least one group (for example, 4 groups) of depthwise separable convolutional networks with batch normalization (BN) in parallel with at least one group (for example, 4 groups) of bidirectional gated recurrent unit (Bi-GRU) layers, fuses the extracted scene features into a new feature vector, feeds the new feature vector into a fully connected layer with a softmax classifier for classification, and finally outputs the scene information of the scene where the user is currently located. For example, for the spatial and temporal features derived from the scene audio signal, this parallel combination of depthwise separable convolution with batch normalization and Bi-GRU layers can model both the spatial structure and the time-frame sequence of the audio signal, which noticeably improves recognition accuracy compared with a single-structure model. Fusing batch normalization and using a Dropout layer accelerates model convergence and improves generalization, and further fusing a hierarchical attention mechanism makes it easier for the model to identify the key signal frames that matter most for recognition.
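The following PyTorch sketch illustrates one possible shape of such a parallel architecture; the group counts, channel sizes, use of the final Bi-GRU time step, and omission of the hierarchical attention mechanism are simplifications and assumptions, not details fixed by the application.

```python
import torch
import torch.nn as nn

class DSConvBlock(nn.Module):
    """One depthwise separable convolution group with batch normalization."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.bn, self.act, self.pool = nn.BatchNorm2d(out_ch), nn.ReLU(), nn.MaxPool2d(2)

    def forward(self, x):
        return self.pool(self.act(self.bn(self.pointwise(self.depthwise(x)))))

class SceneRecognizer(nn.Module):
    """Parallel CRNN sketch: a depthwise separable CNN branch models the
    spatial structure of the mel spectrogram, a Bi-GRU branch models the
    time-frame sequence, and the branches are fused into a new feature
    vector and classified through a fully connected layer (softmax)."""
    def __init__(self, n_mels=64, n_scenes=10, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                  # 4 groups of DS conv + BN
            DSConvBlock(1, 32), DSConvBlock(32, 64),
            DSConvBlock(64, 128), DSConvBlock(128, 128),
            nn.AdaptiveAvgPool2d(1),
        )
        self.gru = nn.GRU(n_mels, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(0.3)             # Dropout layer (see text)
        self.fc = nn.Linear(128 + 2 * hidden, n_scenes)

    def forward(self, mel):                        # mel: (B, n_mels, T)
        spatial = self.cnn(mel.unsqueeze(1)).flatten(1)        # (B, 128)
        temporal, _ = self.gru(mel.transpose(1, 2))            # (B, T, 2*hidden)
        fused = torch.cat([spatial, temporal[:, -1]], dim=1)   # fused feature vector
        return self.fc(self.dropout(fused))        # logits; softmax to classify
```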
After the scene information is obtained, the first determining unit 112 determines, based on the scene information and the classification label corresponding to each to-be-played track, the scene adaptation weight corresponding to each to-be-played track through the weight recognition model.
The classification label indicates the music type of the corresponding to-be-played track and can reflect the degree of fit between that track and a scene. Classification labels may include, but are not limited to, the following information: track style, track artist, track language, and track release date.
The classification label corresponding to a to-be-played track can be retrieved from a preset classification label set. A to-be-played track that has no corresponding classification label in the set is determined as a to-be-played track whose classification label needs to be determined. The first determining unit 112 is further configured to, for each such to-be-played track: clip, from the to-be-played track, a sample audio signal whose playing duration is a specified duration, and determine, based on the sample audio signal, the classification label corresponding to the to-be-played track through a classification label recognition model; the classification label recognition model is a bidirectional convolutional recurrent neural network model that integrates an attention mechanism.
For a to-be-played track with a long playing duration, determining the classification label from the audio signal of the entire track increases the data processing load. For a to-be-played track with a short playing duration, using the entire track keeps the processing load small, but the limited sample size may make the classification label determination insufficiently accurate. To balance these two situations, for any to-be-played track whose classification label needs to be determined, a sample audio signal whose playing duration is a specified duration can be clipped from the track, and the classification label is determined from that sample.
The specific process of clipping such a sample audio signal may include the following steps: for a to-be-played track whose playing duration is not shorter than the specified duration, clip a sample audio signal of the specified duration from the track; for a to-be-played track whose playing duration is shorter than the specified duration, sample the track repeatedly and concatenate the repeated audio signals to form the sample audio signal. The specified duration can be set flexibly based on service requirements, and the embodiments of the present application do not limit its value; for example, the specified duration is 30 seconds.
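A minimal sketch of this clipping rule is given below, assuming the track's raw samples are available as a NumPy array; clipping from the start of the track is an assumption, since the application does not specify where the segment is taken.

```python
import numpy as np

def clip_sample(audio, sr, target_sec=30):
    """Return a fixed-length sample audio signal for label determination."""
    target = int(target_sec * sr)
    if len(audio) >= target:
        return audio[:target]                  # assumed: clip from the start
    reps = int(np.ceil(target / len(audio)))   # short track: repeat-sample
    return np.tile(audio, reps)[:target]       # concatenate, then trim
```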
After the sample audio signal is determined, the classification label corresponding to the to-be-played track is determined through the classification label recognition model based on the sample audio signal. The specific process may include the following steps: process the sample audio signal into a mel spectrogram; determine the spatial features and temporal features obtained by processing the mel spectrogram as target features; and determine, based on the target features, the classification label corresponding to the to-be-played track through the classification label recognition model. The classification label recognition model is a bidirectional convolutional recurrent neural network model that integrates an attention mechanism: it combines an RGLU-SE convolution structure with a bidirectional recurrent neural network and applies an attention mechanism on top. The RGLU-SE convolution structure extracts local features from the deep spectrogram, and the bidirectional recurrent neural network summarizes the time-domain information, so the model can learn the temporal information in music and accurately determine the classification label of a to-be-played track.
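The PyTorch sketch below shows the overall shape of such a model (convolutional front end, bidirectional recurrent layer, attention pooling, classifier); a plain convolution block stands in for the RGLU-SE structure, whose internals the application does not detail, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Attention pooling: weight time steps by learned relevance scores."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h):                        # h: (B, T, dim)
        a = torch.softmax(self.score(h), dim=1)  # (B, T, 1)
        return (a * h).sum(dim=1)                # (B, dim)

class LabelClassifier(nn.Module):
    def __init__(self, n_mels=64, n_labels=20, hidden=128):
        super().__init__()
        # Stand-in for the RGLU-SE convolution structure (local features).
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((2, 1)),                # pool frequency, keep time
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        self.rnn = nn.GRU(64 * (n_mels // 4), hidden,
                          batch_first=True, bidirectional=True)
        self.pool = AttentionPool(2 * hidden)    # integrated attention mechanism
        self.fc = nn.Linear(2 * hidden, n_labels)

    def forward(self, mel):                      # mel: (B, n_mels, T)
        x = self.conv(mel.unsqueeze(1))          # (B, 64, n_mels//4, T)
        x = x.permute(0, 3, 1, 2).flatten(2)     # (B, T, 64 * n_mels//4)
        h, _ = self.rnn(x)                       # bidirectional time summary
        return self.fc(self.pool(h))             # classification label logits
```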
The first determining unit 112 determines, based on the scene information and the classification label corresponding to each to-be-played track, the scene adaptation weight corresponding to each to-be-played track through the weight recognition model. The weight recognition model is a convolutional neural network model; it can use one-dimensional convolution as its basic convolution structure and comprises an input layer, hidden layers, and an output layer. It performs correlation analysis on the scene information and the classification label corresponding to a to-be-played track to obtain the track's scene adaptation weight, which indicates the degree of fit between the to-be-played track and the scene where the user is located.
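A one-dimensional convolutional sketch of such a weight recognition model follows; the vector encodings of the scene information and the classification label, and all layer sizes, are assumptions for illustration.

```python
import torch
import torch.nn as nn

class WeightRecognizer(nn.Module):
    """Correlate a scene-information vector with a track's classification-label
    vector and output a scene adaptation weight in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.hidden = nn.Sequential(             # hidden layers (1-D conv)
            nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.out = nn.Sequential(nn.Linear(16, 1), nn.Sigmoid())

    def forward(self, scene_vec, label_vec):     # (B, S), (B, L)
        x = torch.cat([scene_vec, label_vec], dim=1).unsqueeze(1)  # input layer
        return self.out(self.hidden(x)).squeeze(1)  # weight in [0, 1]
```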
Determination of user preference weights:
as shown in fig. 2, the determining module 11 includes a second determining unit 113 and a third determining unit 114, and determines a user preference weight corresponding to each to-be-played track through interaction of the second determining unit 113 and the third determining unit 114. A second determining unit 113, configured to determine a playing duration corresponding to each playing of each target track in a specified period of time; determining the target track with the playing time meeting the requirements as a track to be played based on the playing time length of each target track; determining the playing times corresponding to each to-be-played track; the playing times are used for indicating the accumulated playing times of the corresponding tracks to be played in the appointed time period. A third determining unit 114, configured to determine a sum of playing times of all tracks to be played; and determining the ratio of the playing times of each to-be-played track to the sum of the playing times as the user preference weight corresponding to each to-be-played track.
The second determining unit 113 determines the specified time period based on the point in time at which the playlist needs to be generated, and then determines the target tracks played within that period, which are all the tracks the user listened to. The to-be-played tracks can be determined from the target tracks in either of two ways: first, all target tracks are determined as to-be-played tracks; second, the second determining unit 113 determines the playing duration of each play of each target track within the specified time period, and determines the target tracks whose plays meet the requirement as to-be-played tracks.
In the second way, for each target track: determine the playing duration of each play of the target track within the specified time period; the playing duration indicates the user's listening interest in the target track. Count the plays whose playing duration is shorter than a preset duration; if this cumulative count is smaller than a target count, determine the target track as a to-be-played track, and otherwise eliminate the target track.
After the to-be-played tracks are determined from the target tracks, the play count of each to-be-played track is determined; the play count indicates the cumulative number of times the corresponding to-be-played track was played within the specified time period. The third determining unit 114 is specifically configured to determine the sum of the play counts of all to-be-played tracks, and to determine the ratio of each to-be-played track's play count to that sum as the track's user preference weight. The user preference weight indicates the user's degree of preference for the corresponding track among all to-be-played tracks. For example, if to-be-played track 1 was played 10 times and the play counts of all to-be-played tracks sum to 100, the user preference weight of to-be-played track 1 is 10/100 = 0.1.
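A minimal sketch of this selection and weighting follows, assuming the play history is a list of (track_id, playing_duration_seconds) records; the 60-second preset duration and the target count of 3 are assumed values.

```python
from collections import Counter

def select_and_weight(plays, preset_sec=60, target_count=3):
    """Eliminate target tracks with too many short plays, then set each
    remaining track's user preference weight to its share of all plays."""
    short = Counter(t for t, d in plays if d < preset_sec)  # short plays per track
    counts = Counter(t for t, _ in plays)                   # cumulative play counts
    kept = {t: n for t, n in counts.items() if short[t] < target_count}
    total = sum(kept.values())
    return {t: n / total for t, n in kept.items()}          # e.g. 10 / 100 = 0.1
```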
Generating module 12:
The generating module 12 is mainly configured to order the to-be-played tracks based on the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and to generate a playlist.
As shown in fig. 2, the generating module 12 includes a dividing unit 121, an extraction unit 122, and a generating unit 123, and the playlist is generated through their interaction. The dividing unit 121 is configured to divide the to-be-played tracks into at least two first track groups based on the user preference weight corresponding to each to-be-played track; each first track group has a first ranking, and for any first track group followed by a next first track group in the first ranking, the lowest user preference weight in that group is higher than the highest user preference weight in the next group. The extraction unit 122 is configured to set a target ranking of the to-be-played tracks in each first track group in descending order of their scene adaptation weights, and to perform, based on the target rankings, track extraction processing on the at least two first track groups according to a first ratio to form at least one second track group; the first ratio indicates the proportions of the numbers of to-be-played tracks extracted from the respective first track groups in the correspondingly formed second track group. The generating unit 123 is configured to arrange the to-be-played tracks in each second track group based on the second ranking of the at least one second track group, to form the playlist.
The specific process by which the dividing unit 121 divides the at least two first track groups based on the user preference weights may include the following steps: order the to-be-played tracks by user preference weight from largest to smallest, with tracks having the same user preference weight adjacent in the resulting order, and then divide at least two first track groups based on that order. There are three specific ways of dividing the at least two first track groups: first, set the number of first track groups and divide the to-be-played tracks evenly into that many groups; second, set a group size and divide the to-be-played tracks into at least two first track groups, each containing that number of tracks; third, set both the number of first track groups and the proportions of the numbers of to-be-played tracks in the groups, and divide based on the set number and proportions.
As an example, fig. 3 is a schematic diagram of the process by which the playlist generating apparatus generates a playlist. The 20 to-be-played tracks numbered 1 to 20 are ordered based on their user preference weights, forming the order in dashed box A in fig. 3. With three first track groups and a 1:1:1 proportion of to-be-played tracks across groups, the 20 to-be-played tracks are divided into first track groups B1, B2, and B3. The first track groups have the first ranking "B1, B2, B3", and for any first track group followed by a next group in the first ranking, the lowest user preference weight in that group is higher than the highest user preference weight in the next group.
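A sketch of this division (the first way, combined with the example's 1:1:1 proportion); the helper names are assumptions.

```python
def divide_first_track_groups(tracks, pref_w, n_groups=3):
    """Order to-be-played tracks by user preference weight (descending) and
    divide them into first track groups of near-equal size; the order of
    the returned groups is the first ranking."""
    ordered = sorted(tracks, key=lambda t: pref_w[t], reverse=True)
    size = -(-len(ordered) // n_groups)          # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

# e.g. tracks 1..20 with the weights of fig. 3 yield B1 (7), B2 (7), B3 (6).
```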
After the dividing unit 121 divides the first track groups, the extraction unit 122 performs, for each first track group: set the target ranking of the to-be-played tracks in the group in descending order of their scene adaptation weights.
For example, as shown in fig. 3, for the first track group B1, the target ranking of its to-be-played tracks is set to "2, 19, 7, 20, 14, 17, 13" in descending order of their scene adaptation weights. Target rankings are set in the same way for the first track groups B2 and B3.
After the target rankings of the to-be-played tracks in the first track groups are set, the extraction unit 122 performs track extraction processing on the at least two first track groups according to the first ratio, based on the target ranking of each group, to form at least one second track group. The first ratio indicates the proportions of the numbers of to-be-played tracks extracted from the respective first track groups in the correspondingly formed second track group.
The extraction unit 122 is specifically configured to, for each track extraction processing: for each first track group, starting from the not-yet-extracted to-be-played track with the highest scene adaptation weight, extract a number of to-be-played tracks matching the first ratio, in descending order of the scene adaptation weights of the to-be-played tracks in the group; and, once the current track extraction processing has been completed for all the first track groups, arrange the tracks extracted from each group according to the first ranking, to form the second track group corresponding to the current track extraction processing.
As an example, the first ratio is set to m1:m2:m3, obtained through repeated trials, with m1 >= m2 >= m3; specifically, the first ratio m1:m2:m3 is 6:5:4. For the first track extraction processing, to-be-played tracks are extracted from each first track group in the order B1, B2, B3. As shown in fig. 3, 6 to-be-played tracks "2, 19, 7, 20, 14, 17" are extracted from the first track group B1; 5 to-be-played tracks "18, 5, 4, 3, 9" are extracted from the first track group B2; and 4 to-be-played tracks "1, 10, 8, 12" are extracted from the first track group B3. After the current track extraction processing has been completed for all the first track groups, all the extracted to-be-played tracks are arranged according to the first ranking to form the second track group C1.
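A sketch of one such track extraction processing pass; the cursor bookkeeping and function names are assumptions, and the supplement rules for the second ratio (described below) are omitted here.

```python
def extract_round(groups, scene_w, cursor, ratio):
    """One pass: from each first track group (in first-ranking order), take
    the next `ratio[i]` not-yet-extracted tracks in descending order of
    scene adaptation weight; `cursor[i]` counts tracks already extracted."""
    second_group = []
    for i, group in enumerate(groups):
        by_scene = sorted(group, key=lambda t: scene_w[t], reverse=True)
        take = by_scene[cursor[i]:cursor[i] + ratio[i]]
        cursor[i] += len(take)
        second_group.extend(take)     # arranged according to the first ranking
    return second_group

# e.g. extract_round([B1, B2, B3], scene_w, cursor=[0, 0, 0], ratio=(6, 5, 4))
# forms the second track group C1 of fig. 3.
```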
While the extraction unit 122 is forming second track groups, if, when performing the current track extraction processing, the not-yet-extracted to-be-played tracks in the first track groups cannot satisfy the first ratio, the current track extraction processing can be performed by either of two methods:
First, the extraction unit 122 is further configured to, when performing the current track extraction processing, if it is determined that the not-yet-extracted to-be-played tracks in the first track groups cannot satisfy the first ratio, determine the total number of not-yet-extracted to-be-played tracks in the first track group at the last position of the first ranking, and set a second ratio based on that total number; the second ratio indicates that the number of tracks extracted from the last-ranked first track group in the correspondingly formed second track group equals that total number, and that the number extracted from each other first track group is larger than that total number.
In general, the closer a first track group is to the end of the first ranking, the smaller the share of tracks the first ratio assigns it in the correspondingly formed second track group. Therefore, when the not-yet-extracted to-be-played tracks in the first track groups cannot satisfy the first ratio, the groups near the end of the first ranking have the most not-yet-extracted tracks remaining. To finish with as few track extraction processing passes as possible, the total number of not-yet-extracted tracks in the last-ranked first track group is determined and the second ratio is set based on it: the last group contributes exactly that total number to the correspondingly formed second track group, while every other group contributes more than that number, so that one further track extraction processing pass can complete the extraction of all not-yet-extracted tracks in all the first track groups.
As an example, as shown in fig. 3, after the second track group C1 is formed, it is determined that the not-yet-extracted to-be-played tracks in the first track groups cannot satisfy the first ratio "6:5:4". The total number of not-yet-extracted to-be-played tracks in the first track group B3, at the last position of the first ranking, is determined to be 2, so the second ratio is set based on this total number "2". The second ratio is set according to the principle m+2 : m+1 : m, where m is the total number of not-yet-extracted to-be-played tracks in the last-ranked first track group B3. Based on this principle, the second ratio is set to 4:3:2.
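A sketch of this rule for the three-group case, reusing the `groups`/`cursor` bookkeeping from the pass above; the application states the principle only for this example.

```python
def second_ratio(groups, cursor):
    """Set the second ratio m+2 : m+1 : m from the number m of
    not-yet-extracted tracks in the last-ranked first track group."""
    m = len(groups[-1]) - cursor[-1]
    return (m + 2, m + 1, m)          # e.g. m = 2 gives 4:3:2
```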
After the second ratio is set, the extraction unit 122 continues generating the second track group according to the second ratio. When performing the current track extraction processing, the extraction unit 122 operates on each first track group, and there are three possible cases:
In the first case, if it is determined that the total number of not-yet-extracted to-be-played tracks in the first track group matches the second ratio, those tracks are extracted.
For example, as shown in fig. 3, for the first track group B3, the total number of not-yet-extracted to-be-played tracks is 2, which matches the second ratio, so the not-yet-extracted to-be-played tracks "15" and "6" are extracted.
In the second case, if the total number of not-yet-extracted to-be-played tracks in the first track group is less than the number matching the second ratio, first tracks are selected in descending order of scene adaptation weight, starting from the to-be-played track with the highest scene adaptation weight, and the first tracks are extracted together with the not-yet-extracted to-be-played tracks; the total number of first tracks plus not-yet-extracted tracks matches the second ratio.
When the total number of not-yet-extracted to-be-played tracks in the first track group is less than the number matching the second ratio, that total alone cannot match the second ratio. To match it, first tracks are selected from the group's to-be-played tracks in descending order of scene adaptation weight, starting from the track with the highest weight, and are extracted together with the not-yet-extracted to-be-played tracks, so that, with the supplement of the first tracks, the number of to-be-played tracks extracted from the group meets the requirement of the second ratio.
As shown in fig. 3, for the first track group B1, the total number of not-yet-extracted to-be-played tracks is 1, which does not match the second ratio. The first tracks "2" and "19" are therefore selected in descending order of the scene adaptation weights of the to-be-played tracks in the group, and the first tracks "2" and "19" are extracted together with the not-yet-extracted to-be-played track "13", so that, with the supplement of the first tracks, the number of tracks extracted from the first track group B1 meets the requirement of the second ratio. The to-be-played tracks of the first track group B2 are extracted in the same way. After extraction based on the second ratio has been performed on all the first track groups "B1 to B3", the second track group C2 is formed.
In the third case, if it is determined that there are currently no not-yet-extracted to-be-played tracks in the first track group, second tracks are selected in descending order of the scene adaptation weights of the to-be-played tracks in the group, starting from the track with the highest weight, and are extracted; the total number of second tracks matches the second ratio.
If there are currently no not-yet-extracted to-be-played tracks in the first track group, all of the group's to-be-played tracks have already been extracted into earlier second track groups. In that case, to ensure that the newly formed second track group still covers to-be-played tracks from every first track group, a number of second tracks matching the second ratio are selected in descending order of scene adaptation weight, starting from the track with the highest weight, and extracted.
Second, the extracting unit 122 is further configured to, in the case where the current track extraction process is performed, extract the currently unextracted tracks to be played in each first track group if it is determined that the number of currently unextracted tracks to be played in each first track group does not satisfy the first ratio, and arrange the tracks to be played extracted from each first track group according to the first ordering, so as to form the second track group corresponding to the current track extraction process.

If it is determined that the number of currently unextracted tracks to be played in each first track group does not satisfy the first ratio, only a small number of unextracted tracks to be played remain across all the first track groups. To finish extracting them as soon as possible, the currently unextracted tracks to be played in each first track group are all extracted, and the tracks to be played extracted from each first track group are arranged according to the first ordering, forming the second track group corresponding to the current track extraction process.
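For illustration only, the following Python sketch gathers the above extraction rules into a single routine; the function and variable names are hypothetical and not part of the patent, and the per-group quotas are passed in rather than derived, since a normal round follows the first ratio while the final round follows the second ratio. Each first track group is assumed to be a list of (track_id, scene_weight) pairs already sorted in the target ordering (descending scene adaptation weight), and pointers marks the first not-yet-extracted position in each group.

    def extract_round(groups, per_group_counts, pointers):
        # Forms one second track group; concatenating the per-group results
        # in group order preserves the first ordering.
        second_group = []
        for gi, group in enumerate(groups):
            want = per_group_counts[gi]
            start = pointers[gi]
            remaining = group[start:]  # currently unextracted tracks
            if len(remaining) >= want:
                # Case 1: enough unextracted tracks; take the next `want`.
                taken = [tid for tid, _ in remaining[:want]]
                pointers[gi] = start + want
            elif remaining:
                # Case 2: too few; take them all and top up with the
                # highest-weight tracks of the group (the "first tracks").
                short = want - len(remaining)
                taken = ([tid for tid, _ in remaining]
                         + [tid for tid, _ in group[:short]])
                pointers[gi] = len(group)
            else:
                # Case 3: group exhausted; re-select the highest-weight
                # tracks (the "second tracks") so the group stays covered.
                taken = [tid for tid, _ in group[:want]]
            second_group.extend(taken)
        return second_group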
After determining that the track extraction processes are all completed, the generation unit 12 may generate a playlist based on all the second track groups formed by the track extraction processes. The specific process of generating the playlist may take either of the following two forms:
first, the generating unit 12 is specifically configured to determine a priority ranking of the tracks to be played based on the second ordering corresponding to the second track groups, and to generate a playlist based on the priority ranking. The second ordering indicates the order in which the second track groups were generated.
Second, the generating unit 12 is specifically configured to perform the following steps 12A to 12B for each second track group:
12A, setting a third ordering of the tracks to be played in the second track group.

The third ordering is the ordering of the tracks to be played within the second track group. The third ordering may take two forms: first, a random ordering; second, an ordering based on the user preference weights corresponding to the tracks to be played.

12B, determining a priority ranking of the tracks to be played based on the second ordering and the third ordering corresponding to each second track group, and generating a playlist based on the priority ranking.
Further, short-term repeated tracks in the playlist would give the user a poor listening experience. Therefore, the generating unit 12 is further configured to, before generating the playlist, if it is determined that short-term repeated tracks to be played exist in the priority ranking, retain for each short-term repeated track to be played only its first occurrence in the priority ranking.
Illustratively, as shown in fig. 3, the tracks to be played in each of the second track groups C1 and C2 are randomly ordered to form their respective third orderings. Then, according to the second ordering of the second track groups C1 and C2, all the tracks to be played in second track group C1 are ordered before those in second track group C2, with the tracks to be played in C1 arranged according to its third ordering and the tracks to be played in C2 arranged according to its third ordering, thereby forming the priority ranking. After removing short-term repeats from the priority ranking, the playlist D1 in fig. 3 is formed.
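As a minimal sketch of this step, assuming the random form of the third ordering (names are illustrative, not from the patent):

    import random

    def prioritize(second_groups, rng=None):
        # second_groups are given in their second ordering (order of
        # formation); tracks of earlier groups rank before later ones.
        rng = rng or random.Random()
        ranking = []
        for group in second_groups:
            tracks = list(group)
            rng.shuffle(tracks)  # random third ordering within the group
            ranking.extend(tracks)
        return ranking

The user-preference form of the third ordering would simply replace the shuffle with a sort by user preference weight.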
Further, although the generated playlist fits the user's preference and the scene in which the user is located, some tracks to be played in the playlist may not suit the user's current physical state, which could give the user an unpleasant listening experience. Therefore, as shown in fig. 2, the playlist generating apparatus may further include: a processing module 13 for monitoring the heart rate of the user; if target tracks to be played that do not match the currently monitored heart rate exist in the playlist, hiding the target tracks to be played in the playlist, a hidden target track to be played being unable to be played; and if the current heart rate meets the requirement for resuming playback of a target track to be played, displaying the target track to be played in the playlist again, a displayed target track to be played being playable.

If it is determined that target tracks to be played in the playlist do not match the currently monitored heart rate, playing those tracks would most likely give the user a poor experience. The target tracks to be played are therefore hidden in the playlist, and when tracks are played from the playlist, the hidden target tracks to be played are skipped and cannot be played.
Illustratively, as shown in fig. 3, if it is determined that the target tracks to be played "5", "12", "14" and "1" in the playlist do not match the currently monitored heart rate, these four tracks are hidden in the playlist.
Further, the processing module 13 is further configured to display the target track to be played in the playlist again if it is determined that the current heart rate meets the requirement for resuming playback of the target track to be played; a displayed target track to be played can be played.

If the current heart rate meets the requirement for resuming playback of the target track to be played, the user's current physical state has recovered to one suitable for listening to the target track to be played. The target track to be played is therefore displayed again in the playlist, and when playback proceeds based on the playlist, the target track to be played is played normally.
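A minimal sketch of the hide/display behaviour follows; the predicate matches_heart_rate is an assumption, since the patent does not specify how a track is matched to a heart rate:

    def playable_view(playlist, hidden, heart_rate, matches_heart_rate):
        # playlist: ordered track ids; hidden: set of currently hidden ids,
        # updated in place as the monitored heart rate changes.
        for track in playlist:
            if not matches_heart_rate(track, heart_rate):
                hidden.add(track)      # hide: the track will be skipped
            else:
                hidden.discard(track)  # display again: playback resumes
        return [t for t in playlist if t not in hidden]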
In the playlist generating apparatus provided by the embodiment of the application, when it is determined that a playlist needs to be generated, the user preference weight and the scene adaptation weight corresponding to each track to be played are first determined, and the tracks to be played are then ordered based on these weights to generate the playlist. A playlist generated by integrating the user preference weights and the scene adaptation weights of the tracks to be played fits both the user's listening preference and the scene in which the user is located, so that playing music based on this playlist gives the user a better listening experience.
Further, an embodiment of the present application further provides a wearable device, as shown in fig. 4, including: a playing device 21 and the playlist generating device 22 described above;

and the playing device 21 is configured to play tracks based on the playlist generated by the playlist generating device 22.
The playlist used by the playing device provided by the embodiment of the application is generated by a playlist generating device that integrates the user preference weights and the scene adaptation weights of the tracks to be played. The generated playlist fits the user's listening preference and the scene in which the user is located, so that playing music based on it gives the user a better listening experience.

In some embodiments of the present application, the wearable device may further include: an acquisition device for acquiring scene parameters for use by the playlist generating device; and a detection device for acquiring the user's heart rate for use by the playlist generating device.
Further, an embodiment of the present application further provides a playlist generating method, as shown in fig. 5, including the following steps 301 to 302:
301. Determining the user preference weight and the scene adaptation weight corresponding to each track to be played; the scene adaptation weight is used for indicating the degree of fit between the corresponding track to be played and the scene in which the user is currently located.

302. Ordering each track to be played based on the user preference weight and the scene adaptation weight corresponding to each track to be played, and generating a playlist.

In the playlist generating method provided by the embodiment of the application, when it is determined that a playlist needs to be generated, the user preference weight and the scene adaptation weight corresponding to each track to be played are first determined, and each track to be played is then ordered based on these weights to generate the playlist. A playlist generated by integrating the user preference weights and the scene adaptation weights of the tracks to be played fits both the user's listening preference and the scene in which the user is located, so that playing music based on this playlist gives the user a better listening experience.
In some embodiments of the present application, the specific implementation process of the step 301 "determining the scene adaptation weight corresponding to each track to be played" may include the following steps 301A to 301C:
301A, extracting scene characteristics based on scene parameters; the scene parameters are used for indicating the scene where the user is currently located.
301B, determining scene information of a scene where the user is currently located through a scene identification model based on the scene characteristics.
301C, determining the scene adaptation weight corresponding to each track to be played through a weight identification model based on the scene information and the classification label corresponding to each track to be played.
In some embodiments of the present application, where the scene parameters include a scene audio signal, the specific implementation of step 301A "extracting the scene features based on the scene parameters" may include the following steps: processing the scene audio signal into a mel spectrogram; and determining the spatial features and the temporal features obtained by processing the mel spectrogram as the scene features.
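As a sketch of this step, assuming the librosa library and assumed sample-rate and mel-band settings:

    import librosa
    import numpy as np

    def scene_mel_spectrogram(path, sr=16000, n_mels=64):
        # Load the scene audio signal and convert it to a log-mel
        # spectrogram; a convolutional front end would then yield the
        # spatial features and a recurrent back end the temporal features.
        y, sr = librosa.load(path, sr=sr)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        return librosa.power_to_db(mel, ref=np.max)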
In some embodiments of the application, the scene parameters include target parameters, and the target parameters include at least one of: the specific implementation process of step 301A "extracting the scene features based on the scene parameters" may then include the following steps: parsing the target parameters, and determining the parameter features obtained by parsing the target parameters as the scene features.
In some embodiments of the present application, the scene recognition model referred to in step 301B above is a parallel convolutional recurrent neural network model that fuses at least one group of depthwise separable convolutional networks with a batch normalization mechanism and at least one group of bidirectional gated recurrent unit layer networks.
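The patent does not disclose layer sizes, so the following Keras sketch is only one plausible reading of that architecture: a depthwise-separable convolutional branch with batch normalization in parallel with a bidirectional GRU branch, fused before the scene classifier. All dimensions and the scene-class count are assumptions.

    from tensorflow.keras import layers, Model

    def scene_recognition_model(n_mels=64, frames=128, n_scenes=6):
        inp = layers.Input(shape=(n_mels, frames, 1))  # log-mel spectrogram

        # Convolutional branch: depthwise-separable convolution with
        # batch normalization extracts spatial features.
        x = layers.SeparableConv2D(32, 3, padding="same",
                                   activation="relu")(inp)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D()(x)
        x = layers.GlobalAveragePooling2D()(x)

        # Recurrent branch: a bidirectional GRU over the time frames
        # extracts temporal features.
        t = layers.Permute((2, 1, 3))(inp)       # -> (frames, mels, 1)
        t = layers.Reshape((frames, n_mels))(t)
        t = layers.Bidirectional(layers.GRU(64))(t)

        out = layers.Dense(n_scenes, activation="softmax")(
            layers.concatenate([x, t]))
        return Model(inp, out)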
In some embodiments of the present application, the playlist generating method may further include the following step: for each track to be played for which a classification label needs to be determined, intercepting a sample audio signal of a designated playing duration from the track to be played, and determining the classification label corresponding to the track to be played through a classification label recognition model based on the sample audio signal; the classification label recognition model is a bidirectional convolutional recurrent neural network model that integrates an attention mechanism.
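In the same spirit, a hedged sketch of a bidirectional convolutional recurrent tagger with a simple attention pooling over time; the clip length, feature size and label vocabulary are assumptions, not from the patent:

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    def classification_label_model(frames=256, feat=64, n_labels=10):
        inp = layers.Input(shape=(frames, feat))  # frames of the sample clip
        x = layers.Conv1D(64, 5, padding="same", activation="relu")(inp)
        x = layers.Bidirectional(layers.GRU(64, return_sequences=True))(x)
        scores = layers.Dense(1, activation="tanh")(x)   # per-frame score
        weights = layers.Softmax(axis=1)(scores)         # attention weights
        context = layers.Lambda(
            lambda p: tf.reduce_sum(p[0] * p[1], axis=1))([x, weights])
        out = layers.Dense(n_labels, activation="softmax")(context)
        return Model(inp, out)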
In some embodiments of the present application, the specific implementation process of step 302 "ordering each track to be played based on the user preference weight and the scene adaptation weight corresponding to each track to be played to generate a playlist" may include the following steps 302A to 302C:
302A, dividing the tracks to be played into at least two first track groups based on the user preference weight corresponding to each track to be played; wherein each first track group has a first ordering, and for a first track group that is followed by a next first track group in the first ordering, the lowest user preference weight in the first track group is higher than the highest user preference weight in the next first track group (a sketch of this grouping step is given after step 302C);
302B, setting a target ordering of the tracks to be played in each first track group in descending order of their corresponding scene adaptation weights; and performing, based on the target ordering corresponding to each first track group, track extraction processing on the at least two first track groups according to a first ratio to form at least one second track group; wherein the first ratio is used for indicating the proportions of the numbers of tracks to be played extracted from each first track group in the correspondingly formed second track group;
302C, arranging the tracks to be played in each second track group based on a second ranking of the at least one second track group, so as to form the playlist.
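As indicated at step 302A above, a sketch of the grouping follows; the group count and equal-size splitting are assumptions, since the patent only requires that every weight in a group exceed every weight in the next group:

    def first_track_groups(pref_weights, n_groups=3):
        # pref_weights: dict mapping track_id -> user preference weight.
        ranked = sorted(pref_weights, key=pref_weights.get, reverse=True)
        size = -(-len(ranked) // n_groups)  # ceiling division
        return [ranked[i:i + size] for i in range(0, len(ranked), size)]

A split that falls between tied weights would violate the strict ordering requirement, so a fuller implementation would move ties into the same group.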
In some embodiments of the present application, the specific implementation of step 302B "performing, based on the target ordering corresponding to each first track group, track extraction processing on the at least two first track groups according to the first ratio to form at least one second track group" may include the following steps. Each time a track extraction process is performed: for each first track group, starting from the currently unextracted track to be played with the highest scene adaptation weight in the first track group, a number of tracks to be played matching the first ratio are extracted in descending order of the scene adaptation weights of the tracks to be played in the first track group; and if the current track extraction process has been completed for all the first track groups, the tracks to be played extracted from each first track group are arranged according to the first ordering to form the second track group corresponding to the current track extraction process.

In some embodiments of the present application, the playlist generating method may further include the following steps: in the case where the current track extraction process is performed, if it is determined that the currently unextracted tracks to be played in each first track group do not satisfy the first ratio, determining the total number of currently unextracted tracks to be played in the first track group located at the last position of the first ordering, and setting a second ratio based on that total number; wherein the second ratio indicates that the number of tracks to be played extracted from the last first track group of the first ordering into the correspondingly formed second track group is that total number, and that the number of tracks to be played extracted from each other first track group into the correspondingly formed second track group is not less than that total number.
In some embodiments of the present application, the playlist generating method may further include the steps of: in the case where the current track extraction process is performed, for each of the first track groups:
if the total number of currently unextracted tracks to be played in the first track group matches the second ratio, extracting the currently unextracted tracks to be played in the first track group;

if the total number of currently unextracted tracks to be played in the first track group is less than the number matched with the second ratio, selecting first tracks from the tracks to be played with the highest scene adaptation weight, in descending order of the scene adaptation weights of the tracks to be played in the first track group, and extracting the first tracks and the currently unextracted tracks to be played; wherein the total number of the first tracks and the currently unextracted tracks to be played together matches the second ratio;

if it is determined that no currently unextracted tracks to be played exist in the first track group, selecting second tracks from the tracks to be played with the highest scene adaptation weight, in descending order of the scene adaptation weights of the tracks to be played in the first track group, and extracting the second tracks; wherein the total number of the second tracks matches the second ratio.
In some embodiments of the present application, the playlist generating method may further include the following step: in the case where the current track extraction process is performed, if it is determined that the number of currently unextracted tracks to be played in each first track group does not satisfy the first ratio, extracting the currently unextracted tracks to be played in each first track group, and arranging the tracks to be played extracted from each first track group according to the first ordering to form the second track group corresponding to the current track extraction process.
In some embodiments of the present application, the step 302C "of arranging the tracks to be played in each of the second track groups based on the second order of the at least one second track group, and forming the playlist" may include the following steps: for each second track group: setting a third ranking of each track to be played in the second track group, wherein the third ranking is random ranking or ranking based on user preference weights corresponding to the tracks to be played; and determining the priority order of the tracks to be played based on the second order and the third order corresponding to each second track group, and generating a play list based on the priority order.
In some embodiments of the present application, the playlist generating method may further include the following steps: monitoring the heart rate of the user; if target tracks to be played that do not match the currently monitored heart rate exist in the playlist, hiding the target tracks to be played in the playlist, a hidden target track to be played being unable to be played; and if the current heart rate meets the requirement for resuming playback of a target track to be played, displaying the target track to be played in the playlist again, a displayed target track to be played being playable.
In some embodiments of the present application, the playlist generating method may further include the following step: before generating the playlist, if it is determined that short-term repeated tracks to be played exist in the priority ranking, retaining, for each short-term repeated track to be played, only its first occurrence in the priority ranking; wherein a short-term repeated track to be played is a track to be played that appears at least twice in the playlist with the number of interval tracks between any two of its occurrences smaller than a preset number.
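A sketch of this removal step, with the preset number of interval tracks as an assumed value:

    def drop_short_term_repeats(ranking, preset_gap=10):
        # A track is a short-term repeat if any two of its occurrences are
        # separated by fewer than preset_gap interval tracks.
        last_seen, flagged = {}, set()
        for i, track in enumerate(ranking):
            if track in last_seen and i - last_seen[track] - 1 < preset_gap:
                flagged.add(track)
            last_seen[track] = i
        kept, emitted = [], set()
        for track in ranking:
            if track in flagged:
                if track in emitted:
                    continue  # keep only the first occurrence
                emitted.add(track)
            kept.append(track)
        return kept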
In some embodiments of the present application, the specific implementation of the step 301 "determining the user preference weight corresponding to each to-be-played track" may include the following steps 301D to 301E:
301D, determining the playing duration corresponding to each play of each target track within a designated time period; determining, based on the playing durations of each target track, the target tracks whose plays meet the requirement as tracks to be played; and determining the play count corresponding to each track to be played, the play count indicating the accumulated number of plays of the corresponding track to be played within the designated time period.

301E, determining the sum of the play counts of all tracks to be played, and determining the ratio of the play count of each track to be played to that sum as the user preference weight corresponding to the track to be played.
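A sketch of steps 301D to 301E; the qualifying-duration threshold is an assumption, since the patent only says the playing duration must meet the requirements:

    from collections import Counter

    def user_preference_weights(play_log, min_seconds=30):
        # play_log: iterable of (track_id, seconds_played) events within
        # the designated time period; only sufficiently long plays count.
        counts = Counter(t for t, s in play_log if s >= min_seconds)
        total = sum(counts.values())
        return {t: c / total for t, c in counts.items()} if total else {}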
For the details of each step of the playlist generating method provided in the embodiment of the present application, reference may be made to the corresponding description of the playlist generating apparatus embodiment, which is not repeated here.
Further, an embodiment of the present application also provides a computer-readable storage medium, the storage medium including a stored program which, when run, controls the device on which the storage medium resides to execute the above playlist generating method.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
It will be appreciated that the relevant features of the methods and apparatus described above may be referenced to one another. In addition, "first", "second", and the like in the above embodiments are used to distinguish the embodiments and do not represent the relative merits of the embodiments.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein, and the structure required to construct such a system is apparent from the description above. In addition, the present application is not directed to any particular programming language. It should be appreciated that the teachings of the present application as described herein may be implemented in a variety of programming languages, and the foregoing descriptions of specific languages are provided to disclose preferred embodiments of the present application.
Furthermore, the memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, Phase-change RAM (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape/magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (17)

1. A playlist generating apparatus, characterized in that the playlist generating apparatus includes:
a determining module, configured to determine the user preference weight and the scene adaptation weight corresponding to each track to be played; the scene adaptation weight is used for indicating the degree of fit between the corresponding track to be played and the scene in which the user is currently located;

and a generating module, configured to order the tracks to be played based on the user preference weight and the scene adaptation weight corresponding to each track to be played, and to generate a playlist.
2. The apparatus of claim 1, wherein the determining module comprises:
an extraction unit for extracting scene features based on scene parameters; based on the scene characteristics, determining scene information of a scene where a user is currently located through a scene identification model; the scene parameters are used for indicating the scene where the user is currently located;
the first determining unit is used for determining the scene adaptation weight corresponding to each to-be-played track through the weight identification model based on the scene information and the classification label corresponding to each to-be-played track.
3. The apparatus according to claim 2, wherein the scene parameters comprise a scene audio signal, the extraction unit being specifically configured to process the scene audio signal into a mel spectrogram, and to determine the spatial features and the temporal features obtained by processing the mel spectrogram as the scene features.
4. A device according to claim 2 or 3, wherein the scene parameters comprise target parameters and the target parameters comprise at least one of: the extraction unit is specifically configured to analyze the target parameter, and determine a parameter feature obtained by analyzing the target parameter as the scene feature.
5. The apparatus according to claim 2, wherein the first determining unit is further configured to, for each track to be played for which a classification label needs to be determined: intercept a sample audio signal of a designated playing duration from the track to be played, and determine the classification label corresponding to the track to be played through a classification label recognition model based on the sample audio signal; the classification label recognition model is a bidirectional convolutional recurrent neural network model that integrates an attention mechanism;
and/or,
the scene recognition model is a parallel convolutional recurrent neural network model integrating at least one group of depthwise separable convolutional networks with a batch normalization mechanism and at least one group of bidirectional gated recurrent unit layer networks.
6. The apparatus of claim 1, wherein the generating module comprises:
the dividing unit is used for dividing at least two first track groups based on the user preference weight corresponding to each track to be played; wherein each of the first track groups has a first ranking, and for a first track group for which there is a next first track group in the first ranking, the lowest user preference weight in the first track group is higher than the highest user preference weight in the next first track group;
The extraction unit is used for setting the target sequence of the to-be-played tracks in each first track group from large to small based on the scene adaptation weight corresponding to the to-be-played tracks; based on the target sequence corresponding to each first track group, carrying out track extraction processing on the at least two first track groups according to a first proportion to form at least one second track group; the first proportion is used for indicating the quantity proportion of the to-be-played tracks extracted from each first track group in the second track group formed correspondingly;
and the generating unit is used for arranging the tracks to be played in each second track group based on the second ordering of the at least one second track group to form the play list.
7. The apparatus according to claim 6, wherein the extraction unit is specifically configured to perform the track extraction process each time: for each first track group, starting from the to-be-played track which is not currently extracted from the first track group and has the highest corresponding scene adaptation weight, according to the sequence of the to-be-played tracks in the first track group from large to small in scene adaptation weight, extracting the to-be-played tracks with the quantity matched with the first proportion; and if the current track extraction processing is finished for all the first track groups, arranging the tracks to be played extracted from each first track group according to the first sorting to form a second track group corresponding to the current track extraction processing.
8. The apparatus of claim 7, wherein the extracting unit is further configured to, in a case where a current track extraction process is performed, determine a total number of currently non-extracted tracks to be played in a first track group located at a last bit of the first order if it is determined that the first ratio is not satisfied by currently non-extracted tracks to be played in each of the first track groups, and set a second ratio based on the total number; the second proportion is used for indicating that the number of the tracks extracted from the first track group positioned at the last position of the first sequence in the second track group formed correspondingly is the total number, and indicating that the number of the tracks extracted from other first track groups in the second track group formed correspondingly is not less than the total number.
9. The apparatus of claim 8, wherein the extracting unit is further configured to, in a case where a current track extraction process is performed, for each of the first track groups:
if the total number of currently unextracted tracks to be played in the first track group matches the second proportion, extracting the currently unextracted tracks to be played in the first track group;

if the total number of currently unextracted tracks to be played in the first track group is less than the number matched with the second proportion, selecting first tracks from the tracks to be played with the highest scene adaptation weight, in descending order of the scene adaptation weights of the tracks to be played in the first track group, and extracting the first tracks and the currently unextracted tracks to be played; wherein the total number of the first tracks and the currently unextracted tracks to be played together matches the second proportion;

if it is determined that no currently unextracted tracks to be played exist in the first track group, selecting second tracks from the tracks to be played with the highest scene adaptation weight, in descending order of the scene adaptation weights of the tracks to be played in the first track group, and extracting the second tracks; wherein the total number of the second tracks matches the second proportion.
10. The apparatus of claim 7, wherein the extracting unit is further configured to, in a case where a current track extraction process is performed, extract the currently non-extracted tracks in each of the first track groups if it is determined that the number of currently non-extracted tracks in each of the first track groups does not satisfy the first ratio, and arrange the tracks extracted from each of the first track groups according to the first ranking, to form a second track group corresponding to the current track extraction process.
11. The apparatus according to claim 6, wherein the generating unit is specifically configured to, for each second track group: setting a third ranking of each track to be played in the second track group, wherein the third ranking is random ranking or ranking based on user preference weights corresponding to the tracks to be played; and determining the priority order of the tracks to be played based on the second order and the third order corresponding to each second track group, and generating a play list based on the priority order.
12. The apparatus of claim 11, wherein the apparatus further comprises:
a processing module for monitoring a heart rate of the user; if the target to-be-played track which is not matched with the currently monitored heart rate exists in the play list, hiding the target to-be-played track in the play list; wherein the target to-be-played track subjected to hiding treatment cannot be played; if the current heart rate meets the requirement of recovering to play the target to-be-played track, displaying and processing the target to-be-played track in the play list; wherein the target to-be-played track of the display processing can be played.
13. The apparatus of claim 11, wherein the generating unit is further configured to, before generating the playlist, if it is determined that short-term repeated tracks to be played exist in the priority ranking, retain, for each short-term repeated track to be played, only its first occurrence in the priority ranking; wherein a short-term repeated track to be played is a track to be played that appears at least twice in the playlist with the number of interval tracks between any two of its occurrences smaller than a preset number.
14. The apparatus of any one of claims 1-3, 5-13, wherein the determining module comprises:
the second determining unit is used for determining the playing duration corresponding to each play of each target track within a designated time period; determining, based on the playing durations of each target track, the target tracks whose plays meet the requirement as tracks to be played; and determining the play count corresponding to each track to be played; the play count is used for indicating the accumulated number of plays of the corresponding track to be played within the designated time period;

a third determining unit, configured to determine the sum of the play counts of all tracks to be played, and to determine the ratio of the play count of each track to be played to the sum of the play counts as the user preference weight corresponding to each track to be played.
15. A wearable device, characterized in that the wearable device comprises: a playing device and the playlist generating apparatus as claimed in any one of claims 1 to 14;
the playing device is used for playing the track based on the play list generated by the play list generating device.
16. A play list generation method, characterized in that the play list generation method comprises:
determining user preference weight and scene adaptation weight corresponding to each track to be played; the scene adaptation weight is used for indicating the fitting degree of the corresponding to-be-played track and the scene where the user is currently located;
and ordering each to-be-played track based on the user preference weight and the scene adaptation weight corresponding to each to-be-played track, and generating a play list.
17. A computer-readable storage medium, characterized in that the storage medium comprises a stored program, wherein the program, when run, controls a device on which the storage medium resides to execute the playlist generating method according to claim 16.
Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination