WO2023273440A1 - 一种生成多种音效的方法、装置和终端设备 - Google Patents
一种生成多种音效的方法、装置和终端设备 Download PDFInfo
- Publication number
- WO2023273440A1 (PCT/CN2022/083344)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio data
- melody
- music
- application scenarios
- audio
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72448—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
- H04M1/72454—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M19/00—Current supply arrangements for telephone systems
- H04M19/02—Current supply arrangements for telephone systems providing ringing current or supervisory tones, e.g. dialling tone or busy tone
- H04M19/04—Current supply arrangements for telephone systems providing ringing current or supervisory tones, e.g. dialling tone or busy tone the ringing-current being generated at the substations
Definitions
- the present invention relates to the field of audio technology, in particular to a method, device and terminal equipment for generating various sound effects.
- the audio module plays audio signals such as ringtones and prompt tones, which is one of the most common methods for realizing the reminder function. Take the prompt sound as an example.
- the prompt sounds on existing terminal equipment are all set when the terminal equipment leaves the factory, and over time users grow tired of the selected music.
- the embodiments of the present application provide a method, device and terminal equipment for generating various sound effects.
- the audio signals applied in different scenarios are not only personalized, but also delay the point at which the user grows tired of them.
- the present application provides a method for generating multiple sound effects, including: determining first audio data; extracting melody information in the first audio data; receiving a first operation instruction, and determining at least one application scenario based on the first operation instruction; and generating audio data suitable for each application scenario according to a preset audio file, where the audio file includes melody information corresponding to different application scenarios.
- in this way, one or more specific music segments in a piece of music are intercepted, and the melody in each specific music segment is extracted. When a specific music segment is applied to different application scenarios, the melody in the specific music segment is replaced with the melody set for each application scenario, so that the specific music segment can be used as a ringtone in different application scenarios. This improves the personalized design of the terminal in different application scenarios and delays the point at which the user tires of the selected music.
- before the determining of the first audio data, the method includes: receiving a second operation instruction, and selecting original audio data based on the second operation instruction; and intercepting at least one target audio data from the original audio data according to a set rule, where the at least one target audio data includes the first audio data.
- the original audio data can generally be the audio data built into the terminal device, or it can be selected by the user from a third-party application program according to his own preference.
- the audio data selected by the user often takes a long time to play, so it needs to be intercepted; a segment of the required duration, or a music segment the user likes, is intercepted so that it can subsequently be used as audio data that the user is fond of in different application scenarios, thereby delaying the point at which the user tires of the selected music.
- the extracting the melody information in the first audio data includes: calculating at least one spectral peak in the first audio data according to the first audio data; calculating a salience corresponding to the at least one spectral peak according to a position of the at least one spectral peak in the frequency domain; constructing pitch contours according to the at least one spectral peak and the frequency corresponding to the at least one spectral peak; and filtering the pitch contours to select the most salient pitch contour as the melody information of the first audio data.
- the generating audio data suitable for each application scenario according to a preset audio file includes: determining melody information corresponding to each application scenario according to the preset audio file; and replacing the melody information in the first audio data with the melody information corresponding to the respective application scenarios to obtain audio data applicable to the respective application scenarios.
- the set melodies corresponding to different application scenarios are replaced into the selected audio data, so that the selected audio data can be converted into audio data in different application scenarios , increasing the richness of audio data and ease of operation.
- the melody information includes melody type, timbre and rhythm
- generating audio data applicable to each application scenario according to a preset audio file includes: receiving a third operation instruction, and based on the third operation instruction, replacing the melody type, timbre, and rhythm in the first audio data with the melody type, timbre, and rhythm corresponding to the respective application scenarios.
- the melody type, timbre and rhythm of music are the factors most easily perceived by the user, so by changing the melody type, timbre and rhythm in the audio data, the user can perceive the difference in the music more intuitively; the audio data can thus be converted into audio data for different application scenarios in the simplest way.
- the audio file further includes time lengths corresponding to different application scenarios
- the method further includes: adjusting the playing duration of the audio data applicable to each application scenario to the duration corresponding to that application scenario.
- the duration of playing audio signals is different in different application scenarios.
- the prompt tone is generally about 1-2s
- the playback duration of an alarm clock is about tens of seconds, while the intercepted audio data most likely differs from the playback duration required by each application scenario, so the playback duration of the audio signal needs to be adjusted, for example by fast playback or slow playback, to fit the playback duration of the different application scenarios.
- the method further includes: determining second audio data; extracting melody information in the second audio data; and generating audio data suitable for each application scenario according to the audio file.
- the user can choose two or more audio data, which can be used for different audio signals.
- different audio data can be selected for them, so as to further enhance the personalized design of the terminal in different application scenarios and further delay the point at which the user tires of the selected music.
- the embodiment of the present application also provides a device for generating multiple sound effects, including: a processing unit configured to determine first audio data; the processing unit is further configured to extract the melody information in the first audio data; a transceiver unit configured to receive a first operation instruction; the processing unit is further configured to determine at least one application scenario based on the first operation instruction, and to generate, according to a preset audio file, audio data suitable for each application scenario, where the audio file includes melody information corresponding to different application scenarios.
- the transceiver unit is further configured to receive a second operation instruction and select original audio data based on the second operation instruction; the processing unit is further configured to intercept at least one target audio data from the original audio data, where the at least one target audio data includes the first audio data.
- the processing unit is specifically configured to calculate at least one spectral peak in the first audio data according to the first audio data; calculate the salience corresponding to the at least one spectral peak according to the position of the at least one spectral peak in the frequency domain; construct pitch contours according to the at least one spectral peak and the frequency corresponding to the at least one spectral peak; and select the most salient pitch contour as the melody information of the first audio data.
- the processing unit is specifically configured to determine the melody information corresponding to each application scenario according to the preset audio file, and to replace the melody information in the first audio data with the melody information corresponding to each application scenario to obtain audio data suitable for each application scenario.
- the melody information includes melody type, timbre, and rhythm
- the processing unit is specifically configured to receive a third operation instruction and, based on the third operation instruction, replace the melody type, timbre, and rhythm in the first audio data with the melody type, timbre, and rhythm corresponding to each application scenario.
- the audio file further includes time lengths corresponding to different application scenarios
- the processing unit is further configured to adjust the playing duration of the audio data applicable to the respective application scenarios to the duration corresponding to each application scenario.
- the processing unit is further configured to determine second audio data, extract the melody information in the second audio data, and generate, according to the audio file, audio data applicable to each of the above application scenarios.
- the embodiment of the present application also provides a terminal device, including at least one processor, where the processor is configured to execute instructions stored in a memory, so that the terminal device executes the possible implementations of the first aspect.
- the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to execute the possible implementations of the first aspect.
- the embodiment of the present application also provides a computer program product, where the computer program product stores instructions, and when the instructions are executed by a computer, the computer is caused to implement the possible implementations of the first aspect.
- FIG. 1 is a schematic diagram of a hardware structure of a terminal provided in an embodiment of the present application
- Fig. 2 is a schematic diagram of a music card displayed on the display screen provided in the embodiment of the present application;
- FIG. 3 is a schematic diagram of a software structure of a terminal provided in an embodiment of the present application.
- FIG. 4 is a schematic diagram of an interface for displaying and selecting a system ringtone on a display screen provided in an embodiment of the present application;
- Fig. 5 is a schematic interface diagram of passively intercepting music clips provided in the embodiment of the present application;
- FIG. 6 is a schematic diagram of the peak distribution of the analyzed music fragments provided in the embodiment of the present application.
- FIG. 7 is a schematic diagram of the frequency distribution of the analyzed music fragments provided in the embodiment of the present application.
- FIG. 8 is a schematic diagram of a melody extraction process provided in an embodiment of the present application.
- Fig. 9 is a frequency distribution diagram corresponding to pitches when the timbre is piano, provided in the embodiment of the present application;
- FIG. 10 is a schematic diagram of an interface for selecting a melody type when the application scenario provided in the embodiment of the present application is an incoming call ringtone;
- Fig. 11 is a schematic diagram of an interface for constructing a theme by selecting different application scenarios provided in the embodiment of the present application;
- Fig. 12 is a schematic diagram of an interface for constructing different themes provided in the embodiment of the present application.
- FIG. 13 is a schematic flowchart of a method for generating various sound effects provided in the embodiment of the present application.
- FIG. 14 is a schematic structural diagram of an apparatus for generating various sound effects provided in an embodiment of the present application.
- words such as “exemplary”, “for example” or “for example” are used as examples, illustrations or illustrations. Any embodiment or design described as “exemplary”, “for example” or “for example” in the embodiments of the present application shall not be construed as being more preferred or more advantageous than other embodiments or designs. Rather, the use of words such as “exemplary”, “for example” or “for example” is intended to present related concepts in a specific manner.
- the term "and/or" is only an association relationship describing associated objects, indicating that there may be three relationships, for example, A and/or B may indicate: A exists alone, A exists alone There is B, and there are three cases of A and B at the same time.
- the term "plurality" means two or more. For example, multiple systems refer to two or more systems, and multiple terminals refer to two or more terminals.
- first and second are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these features.
- the terms “including”, “comprising”, “having” and variations thereof mean “including but not limited to”, unless specifically stated otherwise.
- FIG. 1 is a schematic diagram of a hardware structure of a terminal provided by an embodiment of the present application.
- the terminal 100 may include a processor 101 , a memory 102 and a transceiver 103 .
- the processor 101 may be a general purpose processor or a special purpose processor.
- the processor 101 may include a central processing unit (central processing unit, CPU) and/or a baseband processor.
- the baseband processor can be used to process communication data
- the CPU can be used to implement corresponding control and processing functions, execute software programs, and process data of the software programs.
- the processor 101 may intercept part of the audio data based on the set rules, and then extract the melody in the part of the audio data, such as mode, rhythm, beat, dynamics, timbre (performance method), etc.
- the processor 101 may also modify the melody in the extracted audio data, such as modifying it to a different rhythm and timbre, so that the extracted audio data produces different sound effects.
- a program (or an instruction or code) may be stored in the memory 102, and the program may be executed by the processor 101, so that the processor 101 executes the method described in this solution.
- data may also be stored in the memory 102 .
- the processor 101 can read the data stored in the memory 102 (for example, audio data, etc.), the data can be stored in the same storage address as the program, and the data can also be stored in a different storage address from the program.
- the processor 101 and the memory 102 can be set separately, or can be integrated together, for example, integrated on a single board or a system on chip (system on chip, SOC).
- the transceiver 103 can realize input (reception) and output (transmission) of signals.
- the transceiver 103 may include a transceiver or a radio frequency chip.
- the transceiver 103 may also include a communication interface.
- the terminal 100 can send audio data generating different sound effects to other modules or other devices through the transceiver 103, such as speakers, stereos, vehicles, etc., and the audio data can be played through the speakers on the terminal 100 or other devices.
- the terminal 100 may also receive audio data and the like from a server through the transceiver 103 .
- the terminal 100 may include a display screen 104 .
- the display screen 104 can display music cards of music played by the terminal 100 .
- the music card displayed on the terminal 100 may be the music card 21 shown in FIG. 2 .
- the display screen 104 can also be used to display an interface of an application program, display a display window of an application program, and the like.
- the terminal 100 may include an audio module 105 .
- the audio module 105 can convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
- the audio module 105 may also encode and decode audio signals.
- the audio module 105 may be set in the processor 101 , or some functional modules of the audio module 105 may be set in the processor 101 .
- the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the terminal 100 .
- the terminal 100 may include more or fewer components than shown in the figure, or combine certain components, or separate certain components, or arrange different components.
- the illustrated components can be realized in hardware, software or a combination of software and hardware.
- FIG. 3 is a schematic diagram of a software structure of a terminal provided by an embodiment of the present application.
- the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
- the Android system is divided into four layers, which are application program layer, application program framework layer, Android runtime (Android runtime) and system library, and kernel layer respectively from top to bottom.
- the application program layer may include a series of application program packages. As shown in Figure 3, applications such as camera, gallery, calendar, call, map, navigation, bluetooth, music, video, and short message can be installed in the application layer.
- the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
- the application framework layer includes some predefined functions. As shown in Figure 3, the application framework layer may include display policy services and display management services. Certainly, the application framework layer may also include an activity manager, a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, etc., which are not limited in this embodiment of the present application.
- a window manager can be used to manage window programs.
- the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
- the window manager may specifically be a window manager service (window manager service, WMS), and the WMS stores the information of each application window displayed on the current screen, for example, the application window displayed on the current screen information such as the quantity.
- Content providers can be used to obtain data and make this data accessible to applications.
- This data can include videos, images, audio, calls made and received, browsing history and bookmarks, phonebook, etc.
- the view system may include visual controls, for example, controls for displaying/inputting text, controls for displaying pictures, controls for displaying videos, and so on.
- the view system can be used to build applications.
- a display interface can consist of one or more views.
- a display interface including music playing may include a view displaying lyrics of the music and a view displaying the music card 21 shown in FIG. 2.
- the phone manager is used to provide a communication function of the terminal 100,
- for example, the management of call status (including connected, hung up, etc.).
- the resource manager provides various resources to the application, such as localized strings, icons, pictures, layout files, video files, and so on.
- the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction.
- the notification manager is used to notify the download completion, message reminder, etc.
- the notification manager can also be a notification that appears on the top status bar of the system in the form of a chart or scroll bar text, such as a notification of an application running in the background, or a notification that appears on the screen in the form of a dialog window. For example, prompting text information in the status bar, emitting a prompt sound, terminal vibration, and indicator light flashing, etc.
- the Android Runtime includes core library and virtual machine. The Android runtime is responsible for the scheduling and management of the Android system.
- the core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
- the application layer and the application framework layer run in virtual machines.
- the virtual machine executes the java files of the application program layer and the application program framework layer as binary files.
- the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
- a system library can include multiple function modules. For example: surface manager (surface manager), media library (media libraries), 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
- the Surface Manager can be used to manage the display subsystem and provide the fusion of 2D and 3D layers for multiple applications.
- the media library can support a variety of commonly used audio and video formats for playback and recording, as well as still image files.
- the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
- the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis, layer processing, etc.
- 2D graphics engine is a drawing engine for 2D drawing.
- the kernel layer is the layer between hardware and software.
- the kernel layer includes at least display driver, camera driver, audio driver, sensor driver and so on.
- the terminal 100 takes a mobile phone as an example to describe the audio processing solution of this solution in detail.
- the terminal 100 is not limited to a mobile phone, but may also be other devices such as a tablet and a notebook computer, which are not limited in this application.
- when the user operates the terminal 100 to enter the "ringtone theme" mode, the user enters the function of editing the terminal 100's ringtones for incoming calls, alarm clocks, messages, and notifications.
- the terminal 100 can automatically push out its own system ringtones, such as Bongo, Arrow, Bell and other ringtones, and can also display virtual buttons for selecting other music.
- the terminal 100 can call up audio data, such as music and recordings, that has been stored in the terminal 100, and display the name of each audio data in a list on the interface so that the user can select the desired audio data.
- the terminal 100 can also invoke third-party music software, such as the built-in Music, NetEase Cloud Music, Kugou Music and other applications (APPs); after entering the third-party music software, the user can search for favorite music, select the favorite music as the music of the ringtone theme, and the terminal 100 downloads and stores the selected music in the memory.
- after the terminal 100 detects the music of the "ringtone theme" selected by the user, it detects the playing duration of that music.
- generally, the playback duration of a piece of music is more than one minute, while the ringtones for incoming calls, alarm clocks, notifications, etc. are relatively short; for example, a notification ringtone is about 1 second and an incoming-call ringtone is about tens of seconds. If the selected music is to be used as the music of the "ringtone theme", it needs to be intercepted to extract segments suitable for the different application scenarios, such as an incoming-call ringtone of tens of seconds, a notification ringtone of 1 second, and so on.
- the user can also according to his personal preferences, such as wishing to use the climax segment of the selected music as the music of the "ringtone theme", and intercept the music segment at the climax part.
- the manner in which the terminal 100 intercepts music may be active interception, that is, the terminal 100 actively intercepts a piece of music according to a set mode.
- the application scenario takes an incoming call as an example.
- after the terminal 100 detects the playback duration of the selected music, according to the application scenario, it intercepts 30 seconds of audio data starting from the playback start time point of the music, as the original audio data that can be applied in various application scenarios for subsequent editing.
- the terminal 100 may not only start intercepting from the playback start time point, but may also start intercepting from any intermediate time point, for example by identifying the climax part of the selected music and starting the interception at the time point where the climax begins; this is not limited in this application.
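- As an illustration of the active-interception step, the following Python sketch (a minimal example that assumes the music has already been decoded to a mono PCM sample array; the function name and the 30-second default are illustrative, not values taken from this application) cuts a fixed-length segment out of the decoded audio:

```python
import numpy as np

def intercept_segment(samples: np.ndarray, sr: int,
                      start_s: float = 0.0, length_s: float = 30.0) -> np.ndarray:
    """Cut a fixed-length clip out of decoded mono PCM audio.

    start_s / length_s mirror the "intercept 30 s from the playback start
    point" example in the text; they are illustrative defaults only.
    """
    start = int(start_s * sr)
    end = min(start + int(length_s * sr), len(samples))
    return samples[start:end]

# Example: take the first 30 s of a one-minute clip sampled at 44.1 kHz.
sr = 44100
music = np.zeros(60 * sr, dtype=np.float32)   # stand-in for decoded music
clip = intercept_segment(music, sr, start_s=0.0, length_s=30.0)
```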
- the method for the terminal 100 to intercept music may be passive interception, that is, the user selects and intercepts a piece of music through an operation.
- the terminal 100 enters "Edit Ringtone Theme"
- the selected music "Music A” enters the music playback mode
- the user can play the music according to personal preferences by sliding the two progress levels on the screen. (that is, the two black vertical lines with dots in Figure 5)
- the terminal 100 automatically saves a piece of music selected by the user as the original audio data that can be applied in various application scenarios for subsequent editing.
- the duration of the intercepted music clip is not necessarily associated with the application scenarios of the ringtone theme, and may be longer or shorter than the duration set by an application scenario; this is not limited in this application either.
- the terminal 100 may perform preprocessing on the music segment.
- the terminal 100 parses out the waveform diagram of the music piece, as shown in FIG. 6 .
- a position with a relatively large peak in the waveform diagram indicates that the music is in a high tone
- a position with a relatively small peak in the waveform diagram indicates that the music is in a low tone.
- after parsing the waveform diagram of the music segment, the terminal 100 marks the time points corresponding to each large peak fluctuation in the waveform diagram, obtaining multiple marked time points, as shown by the black triangles in FIG. 6.
- the terminal 100 then intercepts the music segment again; the re-intercepted music segment is the music segment between the first marked time point and the last marked time point.
- in this way, the most suitable start and end positions of the music segment selected by the user are obtained for trimming and calibration, so that both the start and end positions of the re-intercepted music segment fall on high-pitched positions, ensuring that when the music segment is used as an incoming-call ringtone, alarm ringtone, etc., it can alert the user immediately.
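- A minimal sketch of this peak-marking and re-trimming step is given below, assuming the clip is a mono NumPy array; the envelope frame length and the peak-prominence threshold are illustrative tuning values, not values specified by this application:

```python
import numpy as np
from scipy.signal import find_peaks

def retrim_by_peaks(samples: np.ndarray, sr: int,
                    frame_s: float = 0.05, prominence: float = 0.3) -> np.ndarray:
    """Mark the time points of large peaks in the waveform envelope and
    re-trim the clip between the first and last marked points."""
    hop = int(frame_s * sr)
    n_frames = len(samples) // hop
    # Coarse amplitude envelope: max |x| per frame, normalised to [0, 1].
    env = np.array([np.abs(samples[i * hop:(i + 1) * hop]).max()
                    for i in range(n_frames)])
    env = env / (env.max() + 1e-12)
    peaks, _ = find_peaks(env, prominence=prominence)   # the "black triangles"
    if len(peaks) < 2:
        return samples                                  # nothing to trim against
    start = peaks[0] * hop
    end = (peaks[-1] + 1) * hop
    return samples[start:end]
```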
- the terminal 100 parses out the spectrum change diagram of the piece of music, as shown in FIG. 7 .
- the obvious ups and downs of the music appear on the frequency spectrum as rapid changes in a frequency band; that is, a stressed beat makes the energy of the band rise rapidly and then fall, the next stressed beat makes it rise rapidly again, and so on.
- the most obvious and easily identified frequency band is 20 Hz-200 Hz (the boxed position), the main sounding range of the drum/bass part, which helps determine the beat onsets of the music, that is, the positions where the interception of the audio should start.
- after parsing the frequency spectrum of the music segment, the terminal 100 cuts off the parts whose start and end positions do not fall within the 20 Hz-200 Hz band activity, obtaining the most suitable start and end positions of the music segment selected by the user for trimming and calibration, so that the start and end positions of the re-intercepted music segment fall at positions where the frequency band is easy to identify.
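- The band-energy check can be sketched as follows (a simplified example; the STFT window size and the relative energy threshold are assumptions made for illustration, not values from this application):

```python
import numpy as np
from scipy.signal import stft

def trim_to_low_band_activity(samples: np.ndarray, sr: int,
                              f_lo: float = 20.0, f_hi: float = 200.0,
                              rel_threshold: float = 0.1) -> np.ndarray:
    """Keep only the span between the first and last frames whose energy in
    the 20-200 Hz drum/bass band is clearly present, so the re-trimmed clip
    starts and ends on an easily identified beat."""
    f, t, Z = stft(samples, fs=sr, nperseg=2048, noverlap=1536)
    band = (f >= f_lo) & (f <= f_hi)
    energy = (np.abs(Z[band, :]) ** 2).sum(axis=0)      # per-frame band energy
    active = np.where(energy >= rel_threshold * energy.max())[0]
    if len(active) == 0:
        return samples
    start = int(t[active[0]] * sr)
    end = min(int(t[active[-1]] * sr), len(samples))
    return samples[start:end]
```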
- the ways in which the terminal 100 preprocesses the intercepted music clips are not limited to the above two methods, and there are other ways, such as noise reduction, filtering sounds of specific frequency bands, etc., which are not limited in this application.
- the terminal 100 can extract the music melody in the audio data, so that the audio data can be processed later, which can be applied to different application scenarios such as incoming call ringtones, alarm clock ringtones, and notification ringtones. .
- the extraction of the melody in music is introduced here from the perspective of signal processing. As shown in Figure 8, the specific process of extracting the melody is as follows:
- Equal-loudness filtering is used to enhance the frequencies to which human listeners are perceptually more sensitive and to attenuate less sensitive frequencies. Specifically, equal loudness in audio means boosting the high-frequency and low-frequency components at low volumes so that the loudness ratio of the low, middle, and high parts remains the same as at high volumes. A filter is then used to remove the sound at frequencies to which the human ear is not sensitive and to retain the sound at frequencies to which the human ear is sensitive, thereby enhancing the frequencies that human listeners perceive more readily.
- y(n) = -a_1·y(n-1) - a_2·y(n-2) - … - a_i·y(n-i) + b_0·x(n) + b_1·x(n-1) + b_2·x(n-2) + … + b_i·x(n-i)   (1)
- where n represents the sample index, y(n) represents the filtered output signal, x(n) represents the time series of the input audio signal, and a_i and b_i represent the filter coefficients.
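- Equation (1) is a standard IIR difference equation, so it can be evaluated directly with scipy.signal.lfilter. The sketch below is illustrative only: real equal-loudness filtering uses published coefficient sets, and the Butterworth band-pass here is merely a stand-in so that the example runs.

```python
import numpy as np
from scipy.signal import butter, lfilter

sr = 44100
# Stand-in coefficients; a genuine equal-loudness filter would use its own
# (b, a) coefficient sets rather than this Butterworth band-pass.
b, a = butter(4, [60.0, 8000.0], btype="bandpass", fs=sr)

x = np.random.randn(sr)          # one second of stand-in input audio x(n)
y = lfilter(b, a, x)             # evaluates the recursion of equation (1)
```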
- a short-time Fourier transform (STFT) is then applied to obtain spectral information, including the frequencies and their corresponding amplitudes and phases; the STFT is a Fourier-related transform used to determine the frequency and phase of the sinusoidal components in local regions of a time-varying signal.
- the equal-loudness-filtered input audio signal is subjected to the short-time Fourier transform; the formula is:
- X_l(k) = Σ_{n=0}^{M-1} W_Hann(n) · x(n + l·H) · e^(-j2πkn/N), k = 0, 1, …, N-1
- where l is the frame number, M is the length of the window, N is the length of the STFT, H is the step size of the sliding window, k_i represents the i-th frequency bin (the STFT decomposes the signal into frequency bins spaced fs/N apart), fs represents the sampling frequency, and W_Hann represents the Hann window kernel.
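- Using the symbols just defined, a straightforward framed STFT can be sketched as follows (the numeric defaults for M, N and H are illustrative assumptions, not values fixed by this application):

```python
import numpy as np
from scipy.signal import get_window

def stft_frames(x: np.ndarray, M: int = 2048, N: int = 8192, H: int = 128) -> np.ndarray:
    """Hann-windowed STFT: M window length, N FFT length, H hop size.
    Returns one spectrum (length N//2 + 1) per frame l."""
    w = get_window("hann", M, fftbins=True)          # W_Hann
    n_frames = 1 + (len(x) - M) // H
    X = np.empty((n_frames, N // 2 + 1), dtype=np.complex128)
    for l in range(n_frames):
        frame = x[l * H:l * H + M] * w               # windowed frame l
        X[l] = np.fft.rfft(frame, n=N)               # zero-padded to length N
    return X
```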
- next, the salience value of each audio frame in the audio signal is calculated, from which the average salience of the audio signal's contours is obtained; the specific formula is:
- S(b) = Σ_{h=1}^{Nh} Σ_{i=1}^{I} e(a_i) · g(b, h, f_i) · (a_i)^β
- where β represents the energy compression parameter; e(a_i) represents the energy threshold function, in which γ represents a non-zero threshold below which peak energies are discarded; g(b, h, f_i) is the weight function, with α as its harmonic weighting parameter; f_i is the frequency and a_i the energy of the i-th spectral peak; h indexes the harmonics; and b ranges from 1 to N/2.
- a sound with a frequency of 73.416-1046.5 Hz is generally selected as the pitch recognition interval.
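- A reduced sketch of such a harmonic-summation salience computation for one frame is given below; the bin layout, the number of harmonics and all numeric defaults are assumptions made for the example, not parameters taken from this application:

```python
import numpy as np

def frame_salience(peak_freqs, peak_amps, n_bins=600, f_min=55.0,
                   bins_per_semitone=10, n_harmonics=8,
                   alpha=0.8, beta=1.0, gamma=40.0):
    """Each spectral peak (f_i, a_i) votes, with weight alpha**(h-1) * a_i**beta,
    for the pitch bins b whose h-th harmonic it could be; peaks more than
    gamma dB below the loudest peak are discarded (the e(a_i) threshold)."""
    S = np.zeros(n_bins)
    a_max = max(peak_amps) if len(peak_amps) else 0.0
    for f, a in zip(peak_freqs, peak_amps):
        if a_max <= 0 or 20 * np.log10(a_max / max(a, 1e-12)) > gamma:
            continue                                   # e(a_i) = 0
        for h in range(1, n_harmonics + 1):
            f0 = f / h                                 # candidate fundamental
            if f0 < f_min:
                break
            b = int(round(12 * bins_per_semitone * np.log2(f0 / f_min)))
            if 0 <= b < n_bins:
                S[b] += (alpha ** (h - 1)) * (a ** beta)   # g(b, h, f_i) weight
    return S
```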
- Step 4.1 Calculate the pitch mean value P(t) of each frame as the mean pitch of all contours present in the current frame; for each pair of contours, calculate the distance between their pitch values over their overlapping region and average it over that region; if the average distance is within a certain range, the contours are regarded as an octave-duplicate pair;
- Step 4.2 Smooth P(t) using a 5-second moving average filter with a step size of 1 frame to avoid large jumps
- Step 4.3 Detect octave repeated pairs and delete the contour farthest from P(t);
- Step 4.4 Following steps 4.1-4.2, recalculate P(t) using the remaining contours.
- Step 4.5 Remove pitch outliers by removing contours whose distance from P(t) exceeds one octave;
- Step 4.6 follow steps 4.1-4.2 to recalculate P(t) using the remaining contours;
- Step 4.7 Repeat steps 4.3-4.6 twice;
- Step 4.8 Use the contours remaining after the last iteration as the final melody.
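- Steps 4.1-4.8 can be condensed into the following rough Python sketch; the contour data layout (a list of (frame-index array, pitch array) pairs), the frame-rate argument and the fixed number of passes are assumptions made for illustration:

```python
import numpy as np

def select_melody(contours, n_frames, frames_per_second, passes=3):
    """Iteratively build the mean pitch trend P(t), smooth it with a ~5 s
    moving average (steps 4.1-4.2), and drop contours that lie more than an
    octave away from it (steps 4.3-4.7); the survivors form the melody (4.8)."""
    kept = list(contours)
    win = max(1, int(5 * frames_per_second))           # 5-second window
    kernel = np.ones(win) / win
    for _ in range(passes):
        acc = np.zeros(n_frames)
        cnt = np.zeros(n_frames)
        for idx, pitch in kept:                        # step 4.1: per-frame mean
            acc[idx] += pitch
            cnt[idx] += 1
        P = np.where(cnt > 0, acc / np.maximum(cnt, 1), 0.0)
        P = np.convolve(P, kernel, mode="same")        # step 4.2: smoothing
        survivors = []
        for idx, pitch in kept:
            ref = P[idx]
            valid = ref > 0
            if not valid.any():
                survivors.append((idx, pitch))
                continue
            cents = 1200 * np.abs(np.log2(pitch[valid] / ref[valid]))
            if cents.mean() <= 1200:                   # within one octave of P(t)
                survivors.append((idx, pitch))
        kept = survivors
    return kept                                        # step 4.8: final melody
```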
- the melody of music can be classified according to type, which can be energetic, dynamic, natural, rock, sad, etc. Wherein, the melody is "vitality”, and the music is played to give people a feeling of youthful vitality; the melody is “sad”, and the music is played to give people a feeling of sadness, and so on.
- the melody of music is generally composed of basic elements such as timbre, rhythm, mode, and beat.
- timbre means that different sounds always have distinctive characteristics in their waveforms, because different objects have different vibration characteristics. Timbre can be divided into piano timbre, chromatic percussion timbre, organ timbre, guitar timbre, and so on.
- Rhythm is the organization of a flow of beats into different patterns, integrating frequently repeated parts of different lengths. Rhythm can be divided into triplets, syncopations, and so on. When the music plays, different rhythm types appear on the spectrum as different beat streams.
- the terminal 100 will design different melodies for different application scenarios.
- take changing the melody type, the rhythm of the melody, and the timbre of the melody as an example.
- N types of melody modes are designed according to the melody type, such as “vigorous” mode, “natural” mode, “rock” mode, “sad” mode, etc.
- then, according to the timbre of the melody, M kinds of timbre modes are designed within each melody mode, such as a "piano" mode, a "chromatic percussion" mode, an "organ" mode, etc.; then, according to the rhythm of the melody, K kinds of rhythm patterns are designed within each timbre mode, such as a "triplet" pattern, a "syncopation" pattern, and so on. Therefore, for the application scenario of "incoming call ringtone", N × M × K ringtone patterns with different melodies can be designed.
- N, M, and K are all positive integers greater than 0.
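- The N × M × K combination space can be enumerated directly; in the sketch below the pattern names simply mirror the examples given in the text and are illustrative:

```python
from itertools import product

melody_modes = ["vigorous", "natural", "rock", "sad"]        # N = 4
timbre_modes = ["piano", "chromatic percussion", "organ"]    # M = 3
rhythm_modes = ["triplet", "syncopation"]                    # K = 2

ringtone_patterns = list(product(melody_modes, timbre_modes, rhythm_modes))
assert len(ringtone_patterns) == 4 * 3 * 2                   # N x M x K = 24
```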
- in the process of selecting the pattern for "Incoming Call Ringtone", the user first selects the application scenario to enter the "Incoming Call Ringtone" interface. After entering the interface shown in Figure 10, the user chooses a melody type according to his preference, such as the "Vitality" mode; the interface shown in Figure 10 then jumps to the interface for selecting a timbre, and the user chooses a timbre according to his preference; the interface then jumps to the interface for selecting a rhythm, and the user chooses a rhythm according to his preference; finally, the user clicks the "OK" virtual button on the rhythm-selection interface, the interface switches back to the interface shown in Figure 10, and the user can swipe the screen from right to left to enter the selection for other application scenarios.
- the user only pays attention to the type of melody, and does not mind the timbre and rhythm of the melody.
- the user selects the "Vitality" mode on the interface shown in Figure 10
- he directly clicks the "OK” virtual button, and then slides the screen from right to left to enter the selection of other application scenarios.
- if the timbre of the melody and the rhythm of the melody are not selected, then when the intercepted music segment is played through the selected mode, it is played with the original timbre and rhythm of the music segment's melody.
- in addition, the intercepted music segment may be cyclically superimposed. For example, if the duration of the intercepted music segment is 20 s and the ringtone duration set for the application scenario is 30 s, the first 20 s is the complete intercepted music segment and the last 10 s repeats the first 10 s of the intercepted music segment.
- alternatively, the intercepted music segment can be intercepted again to match the ringtone duration set for the application scenario, or the intercepted music segment can be played faster so that it finishes within the ringtone duration set for the application scenario.
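- Both strategies (cyclic superposition and faster playback) can be sketched as follows; plain resampling is used here as a stand-in for a proper time-stretch and also shifts the pitch, which a real implementation would avoid:

```python
import numpy as np
from scipy.signal import resample

def fit_to_duration(samples: np.ndarray, sr: int, target_s: float,
                    mode: str = "loop") -> np.ndarray:
    """mode="loop": repeat the clip and truncate (a 20 s clip in a 30 s slot
    plays fully, then its first 10 s again); mode="stretch": resample so the
    clip lands exactly on the target duration (faster or slower playback)."""
    target_n = int(target_s * sr)
    if mode == "loop":
        reps = int(np.ceil(target_n / len(samples)))
        return np.tile(samples, reps)[:target_n]
    if mode == "stretch":
        return resample(samples, target_n)
    raise ValueError("mode must be 'loop' or 'stretch'")
```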
- when the user selects the pattern whose application scenario is "message ringtone", the terminal 100 likewise automatically replaces the melody type, timbre and rhythm of the melody in the intercepted music segment with the pattern selected by the user.
- Other application scenarios such as “notification ringtones” and “alarm clock ringtones”, and so on.
- after the terminal 100 generates a ringtone of the corresponding pattern for each application scenario, it enters the interface shown in FIG. 11, and the ringtones currently generated for the application scenarios are taken as one theme. If multiple audio clips have been intercepted, the ringtones of the application scenarios of another theme can also be generated: as shown in Figure 12, the user clicks the virtual button to enter the interfaces shown in Figures 10-11 again and generates the ringtones of the application scenarios of another theme.
- the original music clips for application scenarios such as "incoming call ringtone", "notification ringtone", "message ringtone" and "alarm clock ringtone" can be the same music clip or different music clips.
- after the ringtones of the various application scenarios of a theme are generated, the user can, according to personal preference, click the "application" virtual button on the interface to set them as the ringtones currently used by the terminal 100.
- in this way, one or more specific music segments in a piece of music are intercepted, and the melody in each specific music segment is extracted. When a specific music segment is applied to different application scenarios, the melody in the specific music segment is replaced with the melody set for each application scenario, so that the specific music segment can be used as a ringtone in different application scenarios. This improves the personalized design of the terminal in different application scenarios and delays the point at which the user tires of the selected music.
- Fig. 13 is a schematic flowchart of a method for generating multiple sound effects provided in the embodiment of the present application. As shown in Figure 13, the implementation process of this method is as follows:
- Step S1301 determine first audio data.
- when the user operates the terminal 100 to enter the "ringtone theme" mode, the user enters the function of editing the terminal 100's ringtones for incoming calls, alarm clocks, messages, and notifications.
- after entering the "ringtone theme" mode, as shown in Figure 4, the terminal 100 can automatically push its built-in system ringtones, such as Bongo, Arrow, Bell and other ringtones, and can also display a virtual button for selecting other music.
- after the terminal 100 detects the music of the "ringtone theme" selected by the user, it detects the playing duration of that music.
- generally, the playback duration of a piece of music is more than one minute, while the ringtones for incoming calls, alarm clocks, notifications, etc. are relatively short; for example, a notification ringtone is about 1 second and an incoming-call ringtone is about tens of seconds. If the selected music is to be used as the music of the "ringtone theme", it needs to be intercepted to extract segments suitable for the different application scenarios, such as an incoming-call ringtone of tens of seconds, a notification ringtone of 1 second, and so on.
- the user can also according to his personal preferences, such as wishing to use the climax segment of the selected music as the music of the "ringtone theme", and intercept the music segment at the climax part.
- the manner in which the terminal 100 intercepts music may be active interception, that is, the terminal 100 actively intercepts a piece of music according to a set mode as the first audio data.
- Step S1302 extracting melody information in the first audio data.
- the terminal 100 can extract the music melody in the audio data, so that the audio data can be processed later, which can be applied to different application scenarios such as incoming call ringtones, alarm clock ringtones, and notification ringtones. .
- the extraction of the melody in music is introduced from the perspective of signal processing; the specific process of extracting the melody is shown in FIG. 8 and its corresponding description above, and is not repeated here.
- Step S1303 receiving a first operation instruction, and determining each application scenario based on the first operation instruction.
- Step S1304 generating audio data suitable for each application scenario according to the preset audio file.
- the melody of music can be classified according to type, which can be energetic, dynamic, natural, rock, sad, etc. Wherein, the melody is "vitality”, and the music is played to give people a feeling of youthful vitality; the melody is “sad”, and the music is played to give people a feeling of sadness, and so on.
- the melody of music is generally composed of basic elements such as timbre, rhythm, mode, and beat.
- timbre means that different sounds always have distinctive characteristics in their waveforms, because different objects have different vibration characteristics. Timbre can be divided into piano timbre, chromatic percussion timbre, organ timbre, guitar timbre, and so on.
- Rhythm is the organization of a flow of beats into different patterns, integrating frequently repeated parts of different lengths. Rhythm can be divided into triplets, syncopations, and so on. When the music plays, different rhythm types appear on the spectrum as different beat streams.
- the terminal 100 will design different melodies for different application scenarios.
- take changing the melody type, the rhythm of the melody, and the timbre of the melody as an example.
- N types of melody modes are designed according to the melody type, such as “vigorous” mode, “natural” mode, “rock” mode, “sad” mode, etc.
- then, according to the timbre of the melody, M kinds of timbre modes are designed within each melody mode, such as a "piano" mode, a "chromatic percussion" mode, an "organ" mode, etc.; then, according to the rhythm of the melody, K kinds of rhythm patterns are designed within each timbre mode, such as a "triplet" pattern, a "syncopation" pattern, and so on. Therefore, for the application scenario of "incoming call ringtone", N × M × K ringtone patterns with different melodies can be designed.
- N, M, and K are all positive integers greater than 0.
- in the process of selecting the pattern for "Incoming Call Ringtone", the user first selects the application scenario to enter the "Incoming Call Ringtone" interface. After entering the interface shown in Figure 10, the user chooses a melody type according to his preference, such as the "Vitality" mode; the interface shown in Figure 10 then jumps to the interface for selecting a timbre, and the user chooses a timbre according to his preference; the interface then jumps to the interface for selecting a rhythm, and the user chooses a rhythm according to his preference; finally, the user clicks the "OK" virtual button on the rhythm-selection interface, the interface switches back to the interface shown in Figure 10, and the user can swipe the screen from right to left to enter the selection for other application scenarios.
- when the user selects the pattern whose application scenario is "message ringtone", the terminal 100 likewise automatically replaces the melody type, timbre and rhythm of the melody in the intercepted music segment with the pattern selected by the user.
- Other application scenarios such as “notification ringtones” and “alarm clock ringtones”, and so on.
- after the terminal 100 generates a ringtone of the corresponding pattern for each application scenario, it enters the interface shown in FIG. 11, and the ringtones currently generated for the application scenarios are taken as one theme. If multiple audio clips have been intercepted, the ringtones of the application scenarios of another theme can also be generated: as shown in Figure 12, the user clicks the virtual button to enter the interfaces shown in Figures 10-11 again and generates the ringtones of the application scenarios of another theme.
- after the ringtones of the various application scenarios of a theme are generated, the user can, according to personal preference, click the "application" virtual button on the interface to set them as the ringtones currently used by the terminal 100.
- in this way, one or more specific music segments in a piece of music are intercepted, and the melody in each specific music segment is extracted. When a specific music segment is applied to different application scenarios, the melody in the specific music segment is replaced with the melody set for each application scenario, so that the specific music segment can be used as a ringtone in different application scenarios. This improves the personalized design of the terminal in different application scenarios and delays the point at which the user tires of the selected music.
- FIG. 14 is a schematic structural diagram of an apparatus for generating various sound effects provided in an embodiment of the present application.
- an apparatus 1400 includes a processing unit 1401 and a transceiver unit 1402 .
- the implementation process of device 1400 is as follows:
- the processing unit 1401 is used to determine the first audio data; the processing unit 1401 is also used to extract the melody information in the first audio data; the transceiver unit 1402 is used to receive the first operation instruction; the processing unit 1401 is also used to determine at least one application scenario based on the first operation instruction, and to generate audio data suitable for each application scenario according to a preset audio file, where the audio file includes melody information corresponding to different application scenarios.
- the transceiver unit 1402 is further configured to receive a second operation instruction and select original audio data based on the second operation instruction; the processing unit 1401 is further configured to intercept at least one target audio data from the original audio data, where the at least one target audio data includes the first audio data.
- the processing unit 1401 is specifically configured to calculate at least one spectral peak in the first audio data according to the first audio data; calculate the salience corresponding to the at least one spectral peak according to the position of the at least one spectral peak in the frequency domain; construct pitch contours according to the at least one spectral peak and the frequency corresponding to the at least one spectral peak; and select the most salient pitch contour as the melody information of the first audio data.
- the processing unit 1401 is specifically configured to determine the melody information corresponding to each application scenario according to the preset audio file, and to replace the melody information in the first audio data with the melody information corresponding to each application scenario to obtain audio data suitable for each application scenario.
- the melody information includes melody type, timbre, and rhythm
- the processing unit 1401 is specifically configured to receive a third operation instruction and, based on the third operation instruction, replace the melody type, timbre, and rhythm in the first audio data with the melody type, timbre, and rhythm corresponding to each application scenario.
- the audio file further includes time lengths corresponding to different application scenarios
- the processing unit 1401 is further configured to adjust the playing duration of the audio data applicable to each application scenario to the duration corresponding to each application scenario.
- the processing unit 1401 is also used to determine the second audio data; the processing unit 1401 is also used to extract the melody information in the second audio data; the transceiver unit 1402 is also used to receive a first operation instruction; the processing unit 1401 is further configured to determine a second application scenario based on the first operation instruction, and to generate audio data suitable for the second application scenario according to the audio file.
- the present invention provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed in a computer, the computer is made to execute any one of the methods described in the above-mentioned Figures 1-12 and corresponding descriptions.
- the present invention provides a computer program product, the computer program product stores instructions, and the instructions, when executed by a computer, cause the computer to implement any one of the methods described in the above-mentioned Figures 1-12 and corresponding descriptions.
- computer-readable media may include, but are not limited to: magnetic storage devices (e.g., hard disks, floppy disks, or tapes, etc.), optical disks (e.g., compact discs (compact discs, CDs), digital versatile discs (digital versatile discs, DVDs), etc.), smart cards and flash memory devices (for example, erasable programmable read-only memory (EPROM), card, stick or key drive, etc.).
- various storage media described herein can represent one or more devices and/or other machine-readable media for storing information.
- the term "machine-readable medium” may include, but is not limited to, wireless channels and various other media capable of storing, containing and/or carrying instructions and/or data.
- the apparatus 1400 for generating various sound effects in FIG. 14 may be fully or partially implemented by software, hardware, firmware or any combination thereof.
- When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
- the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, DSL) or wireless (e.g., infrared, radio, microwave, etc.) means.
- the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
- the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a high-density digital video disc (digital video disc, DVD)), or a semiconductor medium (for example, a solid state disk (solid state disk, SSD)), etc.
- the sequence numbers of the above-mentioned processes do not imply the order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
- the disclosed systems, devices and methods may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division. In actual implementation, there may be other division methods.
- multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- if the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- the technical solution of the embodiments of the present application, in essence, or the part that contributes to the prior art, or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or an access network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), a magnetic disk, an optical disc, or other media that can store program code.
Abstract
本申请提供了生成多种音效的方法、装置和终端设备,涉及音频技术领域。其中,所述方法包括:确定第一音频数据;提取第一音频数据中的旋律信息;接收第一操作指令,并基于第一操作指令,确定至少一个应用场景;根据预设的音频文件,生成适用于各个应用场景的音频数据,音频文件包括不同应用场景对应的旋律信息。本申请中,在得到用户确定的音乐后,截取该音乐中一个或多个的特定音乐片段,提取特定音乐片段中的旋律,如果将特定音乐片段应用在不同应用场景时,则将特定音乐片段中的旋律替换成不同应用场景中设定的旋律,使得特定音乐片段可以作为不同应用场景的铃声,提升终端在不同应用场景下的个性化设计,且会延长用户对选定音乐产生厌恶感的时间。
Description
本申请要求于2021年06月30日提交中国国家知识产权局、申请号为202110741096.7、申请名称为“一种生成多种音效的方法、装置和终端设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
本发明涉及音频技术领域,尤其涉及一种生成多种音效的方法、装置和终端设备。
现有的终端设备中,如智能手机、笔记本电脑、平板等等,都具有提醒功能。其中,音频模块播放铃声、提示音等音频信号,是实现提醒功能的最为常见方法之一。以提示音为例,现有终端设备上的提示音都是终端设备出厂时已经设置好的,用户只能从固定的几种提示音中选择一个作为自己终端设备的提示音,不仅缺乏个性化,而且随着时间的推移,用户会对选定的音乐产生厌恶感。
发明内容
为了解决上述的问题,本申请的实施例提供了一种生成多种音效的方法、装置和终端设备,通过改变选定音频信号的旋律,使得应用于不同场景下的音频信号不仅具有个性化,而且延长用户产生厌恶感时间。
为此,本申请的实施例采用如下技术方案:
第一方面,本申请提供一种生成多种音效的方法,包括:确定第一音频数据;提取所述第一音频数据中的旋律信息;接收第一操作指令,并基于所述第一操作指令,确定至少一个应用场景;根据预设的音频文件,生成适用于所述各个应用场景的音频数据,所述音频文件包括不同应用场景对应的旋律信息。
在该实施方式中,在得到用户确定的音乐后,截取该音乐中一个或多个的特定音乐片段,然后提取特定音乐片段中的旋律,如果将特定音乐片段应用在不同应用场景时,则将特定音乐片段中的旋律替换成不同应用场景中设定的旋律,使得特定音乐片段可以作为不同应用场景的铃声,提升终端在不同应用场景下的个性化设计,且会延长用户对选定音乐产生厌恶感的时间。
在一种实施方式中,在所述确定第一音频数据之前,包括:接收第二操作指令,并基于所述第二操作指令,选择出原始音频数据;按照设定规则,截取出所述原始音频数据中所述至少一个目标音频数据,所述至少一个目标音频数据包括所述第一音频数据。
在该实施方式中,原始音频数据一般可以为终端设备中自带的音频数据,也可以是用户根据自己爱好,从第三方应用程序中选择。一般用户选择的音频数据播放时间比较长,所以需要对其进行截取,截取出符合要求的时长或用户喜欢的音乐片段,以便后续作为不同应用场景下的音频数据,是用户最喜欢的,从而延长用户对选定音乐产生厌恶感的时间。
在一种实施方式中,所述提取所述第一音频数据中的旋律信息,包括:根据所述第一音频数据,计算出所述第一音频数据中的至少一个谱峰;根据所述至少一个谱峰在频域上的位置,计算所述至少一个谱峰对应的显著性;根据所述至少一个谱峰和所述至少一个谱峰对应 的频率,构建音高轮廓;通过音高轮廓滤波,选择出第一显著性的音高轮廓作为所述第一音频数据的旋律信息。
在一种实施方式中,所述根据预设的音频文件,生成适用于所述各个应用场景的音频数据,包括:根据所述预设的音频文件,确定所述各个应用场景对应的旋律信息;将所述第一音频数据中的旋律信息替换成所述各个应用场景对应的旋律信息,得到适用于所述各个应用场景的音频数据。
在该实施方式中,通过对音频信号中旋律进行替换,将已经设定的不同应用场景对应的旋律替换到选定音频数据中,使得选定的音频数据可以转换成不同应用场景下的音频数据,增加音频数据的丰富度和操作简便性。
在一种实施方式中,所述旋律信息包括旋律类型、音色和节奏,所述根据预设的音频文件,生成适用于所述各个应用场景的音频数据,包括:接收第三操作指令,并基于所述第三操作指令,将所述第一音频数据中的所述旋律类型、音色和节奏替换成所述各个应用场景对应的所述旋律类型、音色和节奏。
在该实施方式中,一般而言,音乐的旋律类型、音色和节奏是用户最容易感知的不同的因素,所以通过对音频数据中的旋律类型、音色和节奏进行改变,可以更加直观的让用户感受不到音乐的不同,从而以最简单的方式将音频数据转换成不同应用场景的音频数据。
在一种实施方式中,所述音频文件还包括不同应用场景对应的时间长度,所述方法还包括:将所述适用于所述各个应用场景的音频数据的播放时间长度调整成所述各个应用场景对应的时间长度。
在该实施方式中,一般而言,不同的应用场景下播放音频信号的时长是不相同的,如提示音一般在1-2s左右,闹钟播放时长在几十秒左右,而截取的音频数据大概率和每个应用场景的播放时长是不相同,所以需要对音频信号的播放时长进行调节,如快速播放或慢速播放等方式,将音频信号调整到适用于不同应用场景下的播放时长。
在一种实施方式中,所述方法还包括:确定第二音频数据;提取所述第二音频数据中的旋律信息;根据所述音频文件,生成适用于所述各个应用场景的音频数据。
在该实施方式中,针对不同的应用场景,如果采用一个音频信号,也是会容易造成用户会对选定的音乐产生厌恶感,所以用户可以选用两个或两个以上的音频数据,可以对不同的应用场景,选用不同的音频数据,从而进一步提升终端在不同应用场景下的个性化设计,且会延长用户对选定音乐产生厌恶感的时间。
第二方面,本申请实施例还提供了一种生成多种音效的装置,包括:处理单元,用于确定第一音频数据;所述处理单元,还用于提取所述第一音频数据中的旋律信息;收发单元,用于接收第一操作指令;所述处理单元,还用于基于所述第一操作指令,确定至少一个应用场景;以及根据预设的音频文件,生成适用于所述各个应用场景的音频数据,所述音频文件包括不同应用场景对应的旋律信息。
在一种实施方式中,所述收发单元,还用于接收第二操作指令,并基于所述第二操作指令,选择出原始音频数据;所述处理单元,还用于按照设定规则,截取出所述原始音频数据中所述至少一个目标音频数据,所述至少一个目标音频数据包括所述第一音频数据。
在一种实施方式中,所述处理单元,具体用于根据所述第一音频数据,计算出所述第一音频数据中的至少一个谱峰;根据所述至少一个谱峰在频域上的位置,计算所述至少一个谱峰对应的显著性;根据所述至少一个谱峰和所述至少一个谱峰对应的频率,构建音高轮廓;通过音高轮廓滤波,选择出第一显著性的音高轮廓作为所述第一音频数据的旋律信息。
在一种实施方式中,所述处理单元,具体用于根据所述预设的音频文件,确定所述各个应用场景对应的旋律信息;将所述第一音频数据中的旋律信息替换成所述各个应用场景对应的旋律信息,得到适用于所述各个应用场景的音频数据。
在一种实施方式中,所述旋律信息包括旋律类型、音色和节奏,所述处理单元,具体用于接收第三操作指令,并基于所述第三操作指令,将所述第一音频数据中的所述旋律类型、音色和节奏替换成所述各个应用场景对应的所述旋律类型、音色和节奏。
在一种实施方式中,所述音频文件还包括不同应用场景对应的时间长度,所述处理单元,还用于将所述适用于所述各个应用场景的音频数据的播放时间长度调整成所述各个应用场景对应的时间长度。
在一种实施方式中,所述处理单元,还用于确定第二音频数据;所述处理单元,还用于提取所述第二音频数据中的旋律信息;以及根据所述音频文件,生成适用于所述各个应用场景的音频数据。
第三方面，本申请实施例还提供了一种终端设备，包括至少一个处理器，所述处理器用于执行存储器中存储的指令，以使得终端设备执行如第一方面各个可能实现的实施例。
第四方面，本申请实施例还提供了一种计算机可读存储介质，其上存储有计算机程序，当所述计算机程序在计算机中执行时，令计算机执行如第一方面各个可能实现的实施例。
第五方面，本申请实施例还提供了一种计算机程序产品，其特征在于，所述计算机程序产品存储有指令，所述指令在由计算机执行时，使得所述计算机实现如第一方面各个可能实现的实施例。
下面对实施例或现有技术描述中所需使用的附图作简单的介绍。
图1为本申请实施例中提供的一种终端的硬件结构示意图;
图2为本申请实施例中提供的显示屏显示音乐卡片的示意图;
图3为本申请实施例中提供的一种终端的软件结构示意图;
图4为本申请实施例中提供的显示屏显示选择系统铃声的界面示意图;
图5为本申请实施例中提供的显示屏显示如何被动截取音乐片段的界面示意图；
图6为本申请实施例中提供的解析后的音乐片段的峰值分布示意图;
图7为本申请实施例中提供的解析后的音乐片段的频率分布示意图;
图8为本申请实施例中提供的旋律提取流程示意图;
图9为本申请实施例中提供的音色为钢琴的音高对应的频率分布图;
图10为本申请实施例中提供的应用场景为来电铃声时的旋律类型选取界面示意图;
图11为本申请实施例中提供的选定不同应用场景构建一个主题的界面示意图;
图12为本申请实施例中提供的构建不同主题的界面示意图;
图13为本申请实施例中提供的一种生成多种音效的方法的流程示意图;
图14为本申请实施例中提供的一种生成多种音效的装置的结构示意图。
为了使本申请实施例的目的、技术方案和优点更加清楚,下面将结合附图,对本申请实施例中的技术方案进行描述。
在本申请实施例的描述中,“示例性的”、“例如”或者“举例来说”等词用于表示作 例子、例证或说明。本申请实施例中被描述为“示例性的”、“例如”或者“举例来说”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”、“例如”或者“举例来说”等词旨在以具体方式呈现相关概念。
在本申请实施例的描述中,术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,单独存在B,同时存在A和B这三种情况。另外,除非另有说明,术语“多个”的含义是指两个或两个以上。例如,多个系统是指两个或两个以上的系统,多个终端是指两个或两个以上的终端。
此外,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。
图1为本申请实施例提供的一种终端的硬件结构示意图。如图1所示,该终端100可以包括处理器101、存储器102和收发器103。
其中,处理器101可以是通用处理器或者专用处理器。例如,处理器101可以包括中央处理器(central processing unit,CPU)和/或基带处理器。其中,基带处理器可以用于处理通信数据,CPU可以用于实现相应的控制和处理功能,执行软件程序,处理软件程序的数据。示例性的,处理器101可以基于设定规则,从音频数据中截取部分音频数据,然后提取出部分音频数据中的旋律,如调式、节奏、节拍、力度、音色(表演方法方式)等。处理器101也可以修改截取出的音频数据中的旋律,如修改成不同的节奏和音色,使得截取出的音频数据产生不同的音效。
存储器102上可以存有程序(也可以是指令或者代码),程序可被处理器101运行,使得处理器101执行本方案中描述的方法。可选地,存储器102中还可以存储有数据。例如,处理器101可以读取存储器102中存储的数据(例如,音频数据等),该数据可以与程序存储在相同的存储地址,该数据也可以与程序存储在不同的存储地址。本方案中,处理器101和存储器102可以单独设置,也可以集成在一起,例如,集成在单板或者系统级芯片(system on chip,SOC)上。
收发器103可以实现信号的输入(接收)和输出(发送)。例如,收发器103可以包括收发器或射频芯片。收发器103还可以包括通信接口。示例性的,终端100可以通过收发器103将产生不同音效的音频数据发送至其它模块或其它设备,如扬声器、音响、车辆等等,可以通过终端100或其它设备上的扬声器播放该音频数据。此外,终端100也可以通过收发器103从服务器接收音频数据等。
可选地,终端100中可以包括显示屏104。该显示屏104可以显示终端100所播放音乐的音乐卡片。示例性的,终端100上显示的音乐卡片可以为图2中所示的音乐卡片21。在一个例子中,显示屏104还可以用于显示应用程序的界面,显示应用程序的显示窗口等。
可选地,终端100中可以包括音频模块105。该音频模块105可以将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块105还可以对音频信号编码和解码。在一些示例中,音频模块105可以设置于处理器101中,或将音频模块105的部分功能模块设置于处理器101中。
可以理解的是,本申请实施例示意的结构并不构成对终端100的具体限定。在本申请另一些实施例中,终端100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
关于终端100在上述各种可能的设计中执行的操作的详细描述可以参照下文本方案提供的方法的实施例中的描述,在此就不再一一赘述。
图3为本申请实施例提供的一种终端的软件结构示意图。
分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层、应用程序框架层、安卓运行时(Android runtime)和系统库,以及内核层。
其中,应用程序层可以包括一系列应用程序包。如图3所示,应用程序层内可以安装相机、图库、日历、通话、地图、导航、蓝牙、音乐、视频、短信息等应用程序。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。如图3所示,应用程序框架层可以包括显示策略服务和显示管理服务。当然,应用程序框架层中还可以包括活动管理器、窗口管理器、内容提供器、视图系统、电话管理器、资源管理器、通知管理器等,本申请实施例对此不作任何限制。
窗口管理器可以用来管理窗口程序。窗口管理器可以获取显示屏大小、判断是否有状态栏、锁定屏幕、截取屏幕等。在本申请的一些实施例中,窗口管理器可具体为窗口管理服务(window manager service,WMS),该WMS存放有当前屏幕中显示的各个应用窗口的信息,例如,当前屏幕中显示的应用窗口的数量等信息。
内容提供器可以用来获取数据,并使这些数据可以被应用程序访问。这些数据可以包括视频、图像、音频、拨打和接听的电话、浏览历史和书签、电话簿等。
视图系统可以包括可视控件，例如，显示/输入文字的控件、显示图片的控件、显示视频的控件等。视图系统可以用来构建应用程序。显示界面可以由一个或多个视图组成。例如，包括音乐播放的显示界面，可以包括显示音乐中歌词的视图以及显示如图2中所示的音乐卡片21的视图。
电话管理器用于提供终端100的通信功能。例如通话状态的管理（包括接通，挂断等）。
资源管理器为应用程序提供各种资源,比如本地化字符串、图标、图片、布局文件、视频文件等等。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,终端振动,指示灯闪烁等。
Android Runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager)、媒体库(media libraries)、三维图形处理库(例如:OpenGL ES)、2D图形引擎(例如:SGL)等。
表面管理器可以用来对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层 的融合。
媒体库可以支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4、H.264、MP3、AAC、AMR、JPG、PNG等。
三维图形处理库用于实现三维图形绘图、图像渲染、合成、图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动、摄像头驱动、音频驱动、传感器驱动等等。
接下来,基于图1所示的终端的硬件结构和图3所示的终端的软件结构,本申请实施例中,终端100以手机为例,对本方案的音频处理方案进行详细说明。显然,终端100不仅限于手机,还可以为平板、笔记本电脑等其它设备,本申请不作限定。
1、确定音频数据。
当用户操作终端100进入“铃声主题”模式,也即编辑终端100的来电铃声、闹钟铃声、信息铃声、通知铃声等声音的功能。在进入“铃声主题”模式后,如图4所示,终端100可以自动地推送出自带的系统铃声,如Bongo、Arrow、Bell等铃声,也可以显示选择其它音乐的虚拟按钮。可选地,当用户点击图4中的“选择本地音乐”的虚拟按钮后,终端100可以调取已存储的非自带的音乐、录音等音频数据,并将各个音频数据的名称以列表的方式显示在界面上,以便用户选择意向的音频数据。当用户点击图4中的“选择在线音乐”的虚拟按钮后,终端100也可以调动第三方音乐软件,如自带的音乐、网易云音乐、酷狗音乐等应用程序(application,APP),进入第三方音乐软件后,用户可以按照意向,搜索出喜欢的音乐,并将喜欢的音乐选定,作为用户选定的铃声主题,终端100将选定的音乐下载并存储在存储器中。
2、截取音频数据中的一段音乐。
终端100检测到用户选定的“铃声主题”的音乐后,会对该音乐播放时间进行检测。一般来说,一首音乐播放时间都在一分钟以上,而来电铃声、闹钟铃声、通知铃声等等,都是比较短的,如通知铃声在1秒左右,来电铃声在30秒左右,闹钟铃声也就几十秒左右。如果将选定的音乐作为“铃声主题”的音乐,则需要对选定的音乐进行截取,截取出适合各个不同的应用场景的时间段的音乐,如来电铃声需要30秒时长、闹钟铃声需要40秒、通知铃声1秒等等。用户也可以根据自己个人的爱好,如希望将选定的音乐中高潮部分的片段作为“铃声主题”的音乐,截取处于高潮部分的音乐片段。
终端100截取音乐的方式可以为主动截取,也即终端100根据设定的模式,主动截取一段音乐片段。示例性地,应用场景以来电为例。终端100检测到选定的音乐的播放时长后,根据应用场景,从音乐播放时间点开始,截取30秒的音乐播放时长的音频数据,作为后续编辑的可以应用在各个应用场景下的原始音频数据。可选地,终端100不仅可以从开始播放时间点开始截取,还可以从中间任意一个时间点开始截取,如识别选定音乐的高潮部分,然后对进入高潮时间点开始截取,本申请在此不作限定。
终端100截取音乐的方式可以为被动截取,也即由用户进行操作,选择截取一段音乐片段。示例性地,如图5所示,终端100进入“编辑铃声主题”后,选定的音乐“音乐A”进入音乐播放模式,用户可以根据个人的喜好,通过滑动屏幕上播放音乐的两个进度条(也即图5中的两个有圆点的黑色竖线),选择出一段音乐片段,作为后续编辑的可以应用在各个应用 场景下的原始音频数据。可选地,用户可以通过点击“确定”虚拟按键后,终端100自动保存用户选定的一段音乐片段,作为后续编辑的可以应用在各个应用场景下的原始音频数据。
用户滑动屏幕上两个进度条,很难准确截取的想要的音乐片段。可选地,用户通过滑动进度条截取到音乐片段后,可以通过调节屏幕上的向前“+3s”或向后“-1s”的虚拟按钮,准确截取出自己喜欢的音乐片段,作为后续编辑的可以应用在各个应用场景下的原始音频数据。
本申请上述仅举了两种截取音乐片段的方式,可以想到的是,本申请截取音乐片段的方式并不仅限于上述两种方案,还可以为其它方式,本申请在此不作限定。
另外,在用户主动截取音乐片段时,截取音乐片段的时长并不一定与铃声主题应用的应用场景相关联,可以为大于应用场景设定的时长,也可以小于应用场景设定的时长,本申请在此也不作限定。
终端100得到截取的音乐片段后,可以对音乐片段进行预处理。可选地,终端100在截取到一段音乐片段后,解析出该音乐片段的波形图,如图6所示。其中,波形图中峰值比较大的位置表示音乐处于高声调,波形图中峰值比较小的位置表示音乐处于低声调。
终端100解析出该音乐片段的波形图后，标注出该波形图中每次动态峰值较大起伏对应的时间点，得到多个标注时间点，如图6中黑色三角形。终端100再次对该音乐片段进行截取，再次截取的音乐片段为：第一个标注时间点与最后一个标注时间点之间的音乐片段。通过对音乐片段再次截取，对用户所选择音乐片段最适合的开始位置和结束位置进行裁剪校准，使得再次截取的音乐片段的开头位置和结尾位置都处于高音调，保证该音乐片段作为来电铃声、闹钟铃声等等，可以第一时间提醒用户。
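下面给出按波形峰值做二次裁剪的一个示意性Python代码草图：逐帧取峰值幅度、标注出"大峰值"帧，再截取第一个与最后一个标注点之间的区间。其中帧长、阈值比例等参数均为本文为演示引入的假设，并非本申请限定的实现。

```python
# 示意性草图：按波形峰值对截取的音乐片段做二次裁剪（假设 samples 为单声道 PCM 序列）
import numpy as np

def retrim_by_peaks(samples: np.ndarray, sr: int,
                    frame_ms: float = 50.0, ratio: float = 0.6):
    """返回 (起始采样点, 结束采样点)：第一个与最后一个"大峰值"帧之间的区间。"""
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return 0, len(samples)
    # 逐帧取峰值幅度，近似图6中波形的起伏
    peaks = np.array([np.max(np.abs(samples[i * frame_len:(i + 1) * frame_len]))
                      for i in range(n_frames)])
    marked = np.flatnonzero(peaks >= ratio * peaks.max())  # 标注"大峰值"时间点（帧序号）
    if marked.size == 0:
        return 0, len(samples)
    return int(marked[0] * frame_len), int((marked[-1] + 1) * frame_len)
```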
可选地,终端100在截取到一段音乐片段后,解析出该音乐片段的频谱变化图,如图7所示。音乐明显的起伏在频谱上的显示为频段的快速变化,即重音会使频段能量迅速提升,然后衰退,再一次重音又会使能量再次迅速上升如此循环。最明显的易于识别的频段位置是在20Hz-200Hz(方框位置)——鼓/低音声部主要发声位置,依此来辅助确定音乐的正拍即适合用户截取音频开始的位置。
终端100解析出音乐片段的频谱后,将该音乐片段的开始位置和结尾位置不处在20Hz-200Hz的部分音乐片段截掉,得到用户所选择音乐片段最适合的开始位置和结束位置进行裁剪校准,使得再次截取的音乐片段的开头位置和结尾位置都处于频段易于识别的位置。
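以下为利用20Hz-200Hz频段能量辅助对齐裁剪起止位置的一个示意性Python草图，频段范围取自上文描述，帧长、能量阈值等均为演示用假设，并非本申请的具体实现。

```python
# 示意性草图：利用 20Hz-200Hz 频段能量辅助确定裁剪的起止位置
import numpy as np

def lowband_energy(samples: np.ndarray, sr: int, frame_len: int = 2048, hop: int = 512):
    """逐帧计算 20-200Hz（鼓/低音声部主要发声频段）的能量。"""
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    band = (freqs >= 20.0) & (freqs <= 200.0)
    win = np.hanning(frame_len)
    n_frames = max(0, 1 + (len(samples) - frame_len) // hop)
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = samples[i * hop:i * hop + frame_len] * win
        energy[i] = np.sum(np.abs(np.fft.rfft(frame))[band] ** 2)
    return energy

def align_to_lowband(samples: np.ndarray, sr: int, ratio: float = 0.3):
    """把起止位置对齐到低频能量明显的帧，近似正拍位置。"""
    frame_len, hop = 2048, 512
    e = lowband_energy(samples, sr, frame_len, hop)
    if e.size == 0:
        return 0, len(samples)
    strong = np.flatnonzero(e >= ratio * e.max())
    if strong.size == 0:
        return 0, len(samples)
    return int(strong[0] * hop), int(min(len(samples), strong[-1] * hop + frame_len))
```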
本申请中,终端100对截取的音乐片段进行预处理的方式,不仅限于上述两种,还可以有其它方式,可以降噪、过滤特定频段的声音等等,本申请在此不作限定。
3、提取音乐片段中的旋律。
终端100在得到一个或多个音乐片段对应的原始音频数据后，可以提取该音频数据中的音乐旋律，以便后续将该音频数据进行处理，可以适用于来电铃声、闹钟铃声、通知铃声等不同应用场景。示例性地，以信号处理的方式介绍提取音乐中的旋律。如图8所示，提取旋律的具体过程如下：
(1)计算谱峰(用于构建随时间的音高显著性的表示)
a、等响滤波:用于增强人类听众对感知更敏感的频率,并衰减不敏感的频率。具体地,音响中的等响,就是在低音量时提升高频和低频成分的音量,使得低、中、高部分的响度比例保持和在高音量时的响度比例相同。然后,利用滤波器,将等响度音量中人体不敏感的频率对应的声音过滤掉,保留下人体敏感的频率对应的声音,从而增强人类听众对感知更敏感的频率。
示例性地,对输入的音频信号进行滤波处理,采用公式为:
$$y(n)=-a_1\,y(n-1)-a_2\,y(n-2)-\cdots-a_i\,y(n-i)+b_0\,x(n)+b_1\,x(n-1)+b_2\,x(n-2)+\cdots+b_i\,x(n-i)\tag{1}$$
其中，$n$ 表示采样点序号，$y(n)$ 表示滤波后的输出信号，$a_i$ 表示谱峰值，$x(n)$ 表示音频信号的时间序列，$b_i$ 表示滤波器系数。
b、谱变换：将等响滤波后的滤波数据采用短时傅里叶变换（short-time fourier transform，STFT），得到频谱信息（包含频率及其对应的幅值和相位），并通过局部最大值得到峰值。具体地，STFT是与傅里叶变换相关的一种数学变换，用以确定时变信号其局部区域正弦波的频率与相位。通过对滤波后的音频信号进行STFT，得到输入音频信号的频域能量值 $|X_l(k)|$，并从频域能量值 $|X_l(k)|$ 中找出所有能量峰值位置 $p_i$。
示例性地，对等响滤波后的音频信号进行短时傅里叶变换处理，采用公式为：

$$X_l(k)=\sum_{n=0}^{M-1}w(n)\,x(n+lH)\,e^{-j2\pi kn/N},\quad l=0,1,2,\dots;\ k=0,1,\dots,N-1\tag{2}$$

其中，$w(n)$ 是窗函数，$l$ 是帧号，$M$ 是窗的长度，$N$ 是STFT的长度，$H$ 是滑窗的步长。
c、频率/幅度的校正:在由谱变换得到谱相位和谱幅值通过局部最大值获得峰值时,采用相位计算峰的瞬时频率(IF)和振幅。
示例性地,对IF的计算,采用公式为:
对振幅计算,采用公式为:
其中，$W_{\mathrm{Hann}}$ 表示Hann窗核。
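结合步骤(1)的a、b两小节，下面给出"等响滤波＋STFT谱峰检测"的一个简化Python草图（未包含c小节的瞬时频率/幅度校正）；其中等响滤波仅以公式(1)的一般IIR差分方程形式示意，b_coef、a_coef 系数需按等响度曲线另行设计，各参数均为演示用假设，并非本申请限定的实现。

```python
# 示意性草图：步骤(1)中 a（等响滤波）与 b（STFT 谱峰）的一种简化实现
import numpy as np
from scipy.signal import lfilter

def spectral_peaks(samples: np.ndarray, sr: int, b_coef, a_coef,
                   frame_len: int = 2048, hop: int = 256):
    """返回每帧的谱峰列表 [(频率Hz, 幅值), ...]，对应能量峰值位置 p_i。"""
    y = lfilter(b_coef, a_coef, samples)            # 公式(1)：IIR 差分方程形式的等响滤波
    win = np.hanning(frame_len)                     # w(n)：窗函数
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    n_frames = max(0, 1 + (len(y) - frame_len) // hop)
    peaks_per_frame = []
    for l in range(n_frames):                       # l：帧号，hop：滑窗步长 H
        mag = np.abs(np.fft.rfft(y[l * hop:l * hop + frame_len] * win))   # |X_l(k)|
        local_max = (mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:])        # 局部最大值
        idx = np.flatnonzero(local_max) + 1
        peaks_per_frame.append([(float(freqs[k]), float(mag[k])) for k in idx])
    return peaks_per_frame

# 用法示例：b_coef=[1.0]、a_coef=[1.0] 为占位系数（即不滤波），实际应替换为等响滤波器系数
# peaks = spectral_peaks(samples, 44100, b_coef=[1.0], a_coef=[1.0])
```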
（2）计算谱峰的显著性：将公式(1)中计算的谱峰值 $a_i$ 和相应频率 $f_i$，通过频谱能量计算得到显著性特征。
具体地,根据频域能量峰值位置,计算该音频信号中每个音频帧的显著性的显著值,并得到该音频信号轨迹的显著平均值,具体采用公式为:
其中，$\beta$ 表示能量压缩参数，$e(a_i)$ 表示能量阈值函数，$g(b,h,f_i)$ 是权重函数，$f_i$ 为频率，$a_i$ 为能量。
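下面给出一个按谐波叠加方式计算音高显著性的Python示意草图，作为步骤(2)的一种可能实现的演示；其中谐波数、衰减系数 alpha、压缩参数 beta 等均为本文示例性假设，并非上述公式的逐项复现。

```python
# 示意性草图：按谐波叠加方式计算一帧的音高显著性
import numpy as np

def pitch_salience(peaks, fmin=55.0, bins_per_octave=120, n_bins=600,
                   n_harmonics=8, alpha=0.8, beta=1.0):
    """peaks: 一帧的谱峰 [(f_i, a_i), ...]；返回长度为 n_bins 的显著性向量。"""
    sal = np.zeros(n_bins)
    for f_i, a_i in peaks:
        if f_i <= 0.0 or a_i <= 0.0:                 # e(a_i)：此处省略能量阈值，仅剔除非法峰
            continue
        for h in range(1, n_harmonics + 1):
            f0 = f_i / h                             # 该谱峰作为第 h 次谐波时对应的基频
            if f0 < fmin:
                break
            b = int(round(bins_per_octave * np.log2(f0 / fmin)))
            if 0 <= b < n_bins:
                sal[b] += (alpha ** (h - 1)) * (a_i ** beta)   # g(b,h,f_i) 的简化：随谐波次数衰减
    return sal
```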
(3)创建音高轮廓:通过公式(5)-(7)得到的音高显著性特征,通过峰值检测计算显著性特征峰值和对应频率,并利用静态和动态似然函数创建音高轮廓。其中,静态和动态似然函数创建音高轮廓基本原理可以参考现有的《李强,于凤芹.一种改进的基于音高显著性的旋律提取算法.计算机工程与应用,2019,55(3):115-119.》的第2.1节的介绍,本申请在此不再赘述了。
其中,创建音高轮廓之前,需要确定音高识别区间。示例性地,如图9所示,以钢琴为例,一般选取频率在73.416-1046.5Hz的音色作为音高识别区间。
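下面给出一个按帧间频率连续性串联音高轮廓的简化Python草图，仅用于示意步骤(3)中"在音高识别区间内创建轮廓"的过程，并未实现静态/动态似然函数；帧间跳变阈值等均为演示用假设。

```python
# 示意性草图：在音高识别区间内，把逐帧候选音高串成音高轮廓
import numpy as np

FMIN, FMAX = 73.416, 1046.5      # 以钢琴音色为例的音高识别区间（对应图9）

def build_contours(frame_pitches, max_jump_cents: float = 80.0):
    """frame_pitches: 每帧显著性最高的候选音高（Hz），无候选时为 None；
    返回轮廓列表，每个轮廓为 [(帧号, 频率), ...]。"""
    contours, current, prev = [], [], None
    for t, f in enumerate(frame_pitches):
        ok = f is not None and FMIN <= f <= FMAX
        if ok and (prev is None or abs(1200.0 * np.log2(f / prev)) <= max_jump_cents):
            current.append((t, f))
            prev = f
        else:
            if current:
                contours.append(current)
            current, prev = ([(t, f)], f) if ok else ([], None)
    if current:
        contours.append(current)
    return contours
```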
(4)确定旋律。通过音高轮廓滤波将非旋律轮廓滤除,选出显著性和最高的轮廓作为旋律音高。具体实现过程如下:
步骤4.1：计算每帧的音高均值P(t)作为当前帧所有轮廓的间距，以及计算它们重叠区域的每帧间距值之间的距离，并计算该区域上的平均值；如果平均距离在一定范围以内，则等高线被视为倍频程重复对；
步骤4.2:使用步长为1帧的5秒滑动均值滤波器平滑P(t),避免大幅跳跃;
步骤4.3:检测倍频程重复对,删除离P(t)最远的轮廓;
步骤4.4:按照步骤4.1-4.2,使用剩余等高线重新计算P(t)。
步骤4.5：删除与P(t)的间距相差超过一个八度的等高线，以去除间距离群值；
步骤4.6:按照步骤4.1-4.2,使用剩余等高线重新计算P(t);
步骤4.7:重复两次步骤4.3-4.6;
步骤4.8:将最后一次迭代后剩余的轮廓作为最终的旋律。
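以下为对步骤4.1-4.8整体流程的一个大幅简化的Python草图：仅按"逐帧均值P(t)→5秒平滑→删除偏离超过一个八度的轮廓→迭代"的思路演示，未逐条实现倍频程重复对的成对删除等细节，函数与参数均为本文假设。

```python
# 示意性草图：步骤4.1-4.8 的简化版旋律轮廓筛选
import numpy as np

def select_melody(contours, n_frames: int, frames_per_second: float, n_iter: int = 2):
    """contours: 轮廓列表 [[(帧号, 频率), ...], ...]；返回筛选后剩余的轮廓作为最终旋律。"""
    def mean_pitch(cs):
        acc, cnt = np.zeros(n_frames), np.zeros(n_frames)
        for c in cs:
            for t, f in c:
                acc[t] += f
                cnt[t] += 1
        p = np.full(n_frames, np.nan)
        p[cnt > 0] = acc[cnt > 0] / cnt[cnt > 0]     # 步骤4.1：每帧的音高均值 P(t)
        return p

    def smooth(p, win_s: float = 5.0):               # 步骤4.2：5 秒滑动均值平滑
        w = max(1, int(win_s * frames_per_second))
        out = p.copy()
        for t in range(len(p)):
            seg = p[max(0, t - w // 2): t + w // 2 + 1]
            seg = seg[~np.isnan(seg)]
            out[t] = seg.mean() if seg.size else np.nan
        return out

    def octave_dist(c, P):                           # 轮廓与 P(t) 的平均八度距离
        d = [abs(np.log2(f / P[t])) for t, f in c if not np.isnan(P[t])]
        return float(np.mean(d)) if d else 0.0

    kept = list(contours)
    for _ in range(n_iter):                          # 步骤4.6-4.7：迭代
        P = smooth(mean_pitch(kept))
        kept = [c for c in kept if octave_dist(c, P) <= 1.0]   # 步骤4.3-4.5（简化）：删除离群轮廓
    return kept                                      # 步骤4.8：剩余轮廓作为最终旋律
```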
4、设计不同应用场景的旋律。
音乐的旋律可以按照类型进行分类,可以为活力、动感、自然、摇滚、悲伤等。其中,旋律为“活力”,该音乐播放出来给人以青春活力的感觉,旋律为“悲伤”,该音乐播放出来给人以悲伤的感觉,等等。
音乐的旋律一般由音色、节奏、调式、节拍等基本要素有机结合而成。以音色为例,音色是指不同声音表现在波形方面总是有与众不同的特性,不同的物体振动都有不同的特点,音色可以分为钢琴类音色、半音阶打击乐器音色、风琴类音色、吉他类音色等种类。音乐在播放时,谱上不同类型的音色,会以不同乐器演奏的方式播放。以节奏为例,节奏是把一段无序的节拍流组合成不同的模式,对长短不同经常重复的不同部分的整合,节奏可以分为三连音、切分等等。音乐在播放时,谱上不同种类的节奏,会以不同节拍流播放。
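音色的替换在实现上可以落到具体的乐器合成参数上。下面给出一个以 General MIDI 音色编号表示上述音色类别的小示例；该映射只是常见 GM 分组的一种示意，并非本申请限定的对应关系。

```python
# 示意：把音色类别映射到 General MIDI 音色编号（1 起始），以便用不同乐器重新演奏同一旋律
GM_PROGRAM = {
    "钢琴类": 1,           # Acoustic Grand Piano（GM 1-8 为钢琴类）
    "半音阶打击乐器": 12,   # Vibraphone（GM 9-16 为半音阶打击乐器）
    "风琴类": 20,           # Church Organ（GM 17-24 为风琴类）
    "吉他类": 25,           # Acoustic Guitar (nylon)（GM 25-32 为吉他类）
}
```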
终端100会为不同的应用场景设计不同的旋律。示例性地,以改变旋律类型、旋律的节奏和旋律的音色为例。如图10所示,应用场景为“来电铃声”时,根据旋律类型,设计N种旋律模式,如“活力”模式、“自然”模式、“摇滚”模式、“悲伤”模式等等;再根据旋律的音色,在每种旋律模式中设计M种音色模式,如“钢琴类”模式、“半音阶打击乐器”模式、“风琴类”模式等等;再根据旋律的节奏,在每个音色模式中设计K种节奏模式,如“三连音”模式、“切分”模式等等。因此,对于应用场景为“来电铃声”,可以设计出N×M×K个不同旋律的铃声模式。其中,N、M、K均为大于0的正整数。
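下面用一个简短的Python示例说明上述 N×M×K 种铃声模式的枚举方式，其中列出的选项名称取自上文举例，数量仅为演示用假设。

```python
# 示意：用笛卡尔积枚举 N×M×K 种铃声模式
from itertools import product

MELODY_TYPES = ["活力", "自然", "摇滚", "悲伤"]        # N 种旋律类型
TIMBRES = ["钢琴类", "半音阶打击乐器", "风琴类"]        # M 种音色
RHYTHMS = ["三连音", "切分"]                           # K 种节奏

ringtone_modes = [{"旋律类型": m, "音色": t, "节奏": r}
                  for m, t, r in product(MELODY_TYPES, TIMBRES, RHYTHMS)]
print(len(ringtone_modes))    # 4 * 3 * 2 = 24 种"来电铃声"模式
```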
用户在选定“来电铃声”的模式过程,先选定应用场景为“来电铃声”的界面,进入如图10所示的界面后,用户可以根据自己喜欢的旋律类型,选择一个类型,如“活力”模式;然后,图10所示的界面再跳转到选择音色的界面,用户可以根据自己喜欢的音色种类,选择一种音色;接着,界面再跳转到选择节奏的界面,用户可以根据自己喜欢的节奏种类,选择一种节奏;最后,点击选择节奏的界面上的“确定”虚拟按钮后,界面再次切换到图10所示 的界面,用户可以从右向左滑动屏幕,进入其它应用场景的选定。
可选地,如果用户只关注旋律类型,并不介意旋律的音色和旋律的节奏。用户在图10所示的界面上选定“活力”模式后,直接点击“确定”虚拟按钮,然后可以从右向左滑动屏幕,进入其它应用场景的选定。其中,由于旋律的音色和旋律的节奏没有选定,当截取的音乐片段通过该选定的模式播放时,会以该音乐片段的自身旋律的音色和自身旋律的节奏播放。
5、生成不同应用场景的铃声主题。
以应用场景为“来电铃声”,且选定的模式为:旋律类型“活力”(旋律的音色没有选定、旋律的节奏没有选定)为例。用户点击图10中“确定”虚拟按钮后,终端100将截取的音乐片段中的旋律类型替换“活力”,而该音乐片段中的旋律的音色和旋律的节奏不替换。
可选地，如果截取的音乐片段的时间段小于应用场景设定的铃声时间段，可以将截取的音乐片段循环叠加。如截取的音乐片段的时间段为20s，应用场景设定的铃声时间段为30s，在设计铃声时间段时，前20s为完整的截取的音乐片段，后10s为截取的音乐片段的前10s的音乐片段，从而使得截取的音乐片段可以设置为"来电铃声"应用场景的铃声。如果截取的音乐片段的时间段大于应用场景设定的铃声时间段，可以将截取的音乐片段再次截取，以得到应用场景设定的铃声时间段，也可以将截取的音乐片段进行快进处理，让截取的音乐片段可以在应用场景设定的铃声时间段内完成播放。
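以下为把截取片段调整到各应用场景时长的一种示意性Python实现（循环叠加、再次截取或快进三种方式），场景时长取自上文示例，函数与参数均为本文为演示引入的假设。

```python
# 示意性草图：把截取片段调整到应用场景要求的播放时长（samples 为单声道 PCM 序列）
import numpy as np

SCENE_DURATION_S = {"来电铃声": 30, "闹钟铃声": 40, "通知铃声": 1, "信息铃声": 1}

def fit_duration(samples: np.ndarray, sr: int, scene: str, speedup: bool = False):
    target = int(SCENE_DURATION_S[scene] * sr)
    if len(samples) < target:
        # 不足目标时长：循环叠加，如 20s 片段 + 其前 10s 拼成 30s
        reps = int(np.ceil(target / len(samples)))
        return np.tile(samples, reps)[:target]
    if not speedup:
        return samples[:target]        # 超出目标时长：再次截取
    # 超出目标时长：快进处理（简单重采样，会改变音调；如需变速不变调可另用相位声码器等方法）
    idx = np.linspace(0.0, len(samples) - 1.0, target)
    return np.interp(idx, np.arange(len(samples)), samples)
```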
当用户选定了应用场景为“信息铃声”的模式后,终端100也会自动的将截取的音乐片段中的旋律类型、旋律的音色和旋律的节奏替换成用户选定的模式。其它“通知铃声”、“闹钟铃声”等应用场景,以此类推。
当终端100为每个应用场景生成对应模式的铃声后，进入如图11所示的界面，将当前每个应用场景生成的铃声作为一个主题。如果截取的音频片段为多段，还可以再生成一个主题的各个应用场景的铃声，如图12所示，用户通过点击虚拟按键，再次进入图10-图11显示的界面，再次生成一个主题的各个应用场景的铃声。
可选地,当截取的音乐片段为多个时,“来电铃声”、“通知铃声”、“信息铃声”、“闹钟铃声”等应用场景的原始音乐片段,可以为相同的音乐片段,也可以为不同的音乐片段。
如果用户设置有多个主题的各个应用场景的铃声,可以根据个人的意向,通过点击界面上的“应用”虚拟按键,则将该主题的各个应用场景的铃声设置为终端100的当前执行的铃声。
本申请实施例中,在得到用户确定的音乐后,截取该音乐中一个或多个的特定音乐片段,然后提取特定音乐片段中的旋律,如果将特定音乐片段应用在不同应用场景时,则将特定音乐片段中的旋律替换成不同应用场景中设定的旋律,使得特定音乐片段可以作为不同应用场景的铃声,提升终端在不同应用场景下的个性化设计,且会延长用户对选定音乐产生厌恶感的时间。
图13为本申请实施例中提供的一种生成多种音效的方法的流程示意图。如图13所示,该方法实现过程如下:
步骤S1301,确定第一音频数据。
当用户操作终端100进入“铃声主题”模式,也即编辑终端100的来电铃声、闹钟铃声、信息铃声、通知铃声等声音的功能。在进入“铃声主题”模式后,如图4所示,终端100可以 自动地推送出自带的系统铃声,如Bongo、Arrow、Bell等铃声,也可以显示选择其它音乐的虚拟按钮。
终端100检测到用户选定的“铃声主题”的音乐后,会对该音乐播放时间进行检测。一般来说,一首音乐播放时间都在一分钟以上,而来电铃声、闹钟铃声、通知铃声等等,都是比较短的,如通知铃声在1秒左右,来电铃声在30秒左右,闹钟铃声也就几十秒左右。如果将选定的音乐作为“铃声主题”的音乐,则需要对选定的音乐进行截取,截取出适合各个不同的应用场景的时间段的音乐,如来电铃声需要30秒时长、闹钟铃声需要40秒、通知铃声1秒等等。用户也可以根据自己个人的爱好,如希望将选定的音乐中高潮部分的片段作为“铃声主题”的音乐,截取处于高潮部分的音乐片段。其中,终端100截取音乐的方式可以为主动截取,也即终端100根据设定的模式,主动截取一段音乐片段,作为第一音频数据。
步骤S1302,提取第一音频数据中的旋律信息。
终端100在得到一个或多个音乐片段对应的原始音频数据后，可以提取该音频数据中的音乐旋律，以便后续将该音频数据进行处理，可以适用于来电铃声、闹钟铃声、通知铃声等不同应用场景。示例性地，以信号处理的方式介绍提取音乐中的旋律。提取旋律的具体过程见图8及相应描述内容，本申请在此不再赘述。
步骤S1303,接收第一操作指令,并基于第一操作指令,确定各个应用场景。
步骤S1304,根据预设的音频文件,生成适用于各个应用场景的音频数据。
音乐的旋律可以按照类型进行分类,可以为活力、动感、自然、摇滚、悲伤等。其中,旋律为“活力”,该音乐播放出来给人以青春活力的感觉,旋律为“悲伤”,该音乐播放出来给人以悲伤的感觉,等等。
音乐的旋律一般由音色、节奏、调式、节拍等基本要素有机结合而成。以音色为例,音色是指不同声音表现在波形方面总是有与众不同的特性,不同的物体振动都有不同的特点,音色可以分为钢琴类音色、半音阶打击乐器音色、风琴类音色、吉他类音色等种类。音乐在播放时,谱上不同类型的音色,会以不同乐器演奏的方式播放。以节奏为例,节奏是把一段无序的节拍流组合成不同的模式,对长短不同经常重复的不同部分的整合,节奏可以分为三连音、切分等等。音乐在播放时,谱上不同种类的节奏,会以不同节拍流播放。
终端100会为不同的应用场景设计不同的旋律。示例性地,以改变旋律类型、旋律的节奏和旋律的音色为例。如图10所示,应用场景为“来电铃声”时,根据旋律类型,设计N种旋律模式,如“活力”模式、“自然”模式、“摇滚”模式、“悲伤”模式等等;再根据旋律的音色,在每种旋律模式中设计M种音色模式,如“钢琴类”模式、“半音阶打击乐器”模式、“风琴类”模式等等;再根据旋律的节奏,在每个音色模式中设计K种节奏模式,如“三连音”模式、“切分”模式等等。因此,对于应用场景为“来电铃声”,可以设计出N×M×K个不同旋律的铃声模式。其中,N、M、K均为大于0的正整数。
用户在选定“来电铃声”的模式过程,先选定应用场景为“来电铃声”的界面,进入如图10所示的界面后,用户可以根据自己喜欢的旋律类型,选择一个类型,如“活力”模式;然后,图10所示的界面再跳转到选择音色的界面,用户可以根据自己喜欢的音色种类,选择一种音色;接着,界面再跳转到选择节奏的界面,用户可以根据自己喜欢的节奏种类,选择一种节奏;最后,点击选择节奏的界面上的“确定”虚拟按钮后,界面再次切换到图10所示的界面,用户可以从右向左滑动屏幕,进入其它应用场景的选定。
以应用场景为“来电铃声”,且选定的模式为:旋律类型“活力”(旋律的音色没有选定、旋律的节奏没有选定)为例。用户点击图10中“确定”虚拟按钮后,终端100将截取的音 乐片段中的旋律类型替换“活力”,而该音乐片段中的旋律的音色和旋律的节奏不替换。
当用户选定了应用场景为“信息铃声”的模式后,终端100也会自动的将截取的音乐片段中的旋律类型、旋律的音色和旋律的节奏替换成用户选定的模式。其它“通知铃声”、“闹钟铃声”等应用场景,以此类推。
当终端100为每个应用场景生成对应模式的铃声后，进入如图11所示的界面，将当前每个应用场景生成的铃声作为一个主题。如果截取的音频片段为多段，还可以再生成一个主题的各个应用场景的铃声，如图12所示，用户通过点击虚拟按键，再次进入图10-图11显示的界面，再次生成一个主题的各个应用场景的铃声。
如果用户设置有多个主题的各个应用场景的铃声,可以根据个人的意向,通过点击界面上的“应用”虚拟按键,则将该主题的各个应用场景的铃声设置为终端100的当前执行的铃声。
本申请实施例中,在得到用户确定的音乐后,截取该音乐中一个或多个的特定音乐片段,然后提取特定音乐片段中的旋律,如果将特定音乐片段应用在不同应用场景时,则将特定音乐片段中的旋律替换成不同应用场景中设定的旋律,使得特定音乐片段可以作为不同应用场景的铃声,提升终端在不同应用场景下的个性化设计,且会延长用户对选定音乐产生厌恶感的时间。
图14为本申请实施例中提供的一种生成多种音效的装置的结构示意图。如图14所示的装置1400,该装置包括处理单元1401和收发单元1402。其中,装置1400实现过程如下:
处理单元1401用于确定第一音频数据;所述处理单元1401还用于提取所述第一音频数据中的旋律信息;收发单元1402用于接收第一操作指令;所述处理单元1401还用于基于所述第一操作指令,确定至少一个应用场景;以及根据预设的音频文件,生成适用于所述各个应用场景的音频数据,所述音频文件包括不同应用场景对应的旋律信息。
在一种实施方式中,所述收发单元1402还用于接收第二操作指令,并基于所述第二操作指令,选择出原始音频数据;所述处理单元1401还用于按照设定规则,截取出所述原始音频数据中所述至少一个目标音频数据,所述至少一个目标音频数据包括所述第一音频数据。
在一种实施方式中,所述处理单元1401具体用于根据所述第一音频数据,计算出所述第一音频数据中的至少一个谱峰;根据所述至少一个谱峰在频域上的位置,计算所述至少一个谱峰对应的显著性;根据所述至少一个谱峰和所述至少一个谱峰对应的频率,构建音高轮廓;通过音高轮廓滤波,选择出第一显著性的音高轮廓作为所述第一音频数据的旋律信息。
在一种实施方式中,所述处理单元1401具体用于根据所述预设的音频文件,确定所述各个应用场景对应的旋律信息;将所述第一音频数据中的旋律信息替换成所述各个应用场景对应的旋律信息,得到适用于所述各个应用场景的音频数据。
在一种实施方式中,所述旋律信息包括旋律类型、音色和节奏,所述处理单元1401具体用于接收第三操作指令,并基于所述第三操作指令,将所述第一音频数据中的所述旋律类型、音色和节奏替换成所述各个应用场景对应的所述旋律类型、音色和节奏。
在一种实施方式中,所述音频文件还包括不同应用场景对应的时间长度,所述处理单元1401还用于将所述适用于所述各个应用场景的音频数据的播放时间长度调整成所述各个应用场景对应的时间长度。
在一种实施方式中，所述处理单元1401还用于确定第二音频数据；所述处理单元1401还用于提取所述第二音频数据中的旋律信息；所述收发单元1402还用于接收第一操作指令；所述处理单元1401还用于基于所述第一操作指令，确定第二应用场景；以及根据所述音频文件，生成适用于所述第二应用场景的音频数据。
本发明提供一种计算机可读存储介质,其上存储有计算机程序,当所述计算机程序在计算机中执行时,令计算机执行上述图1-图12和相应描述内容中记载的任一项方法。
本发明提供一种计算机程序产品,所述计算机程序产品存储有指令,所述指令在由计算机执行时,使得所述计算机实施上述图1-图12和相应描述内容中记载的任一项方法。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请实施例的范围。
此外,本申请实施例的各个方面或特征可以实现成方法、装置或使用标准编程和/或工程技术的制品。本申请中使用的术语“制品”涵盖可从任何计算机可读器件、载体或介质访问的计算机程序。例如,计算机可读介质可以包括,但不限于:磁存储器件(例如,硬盘、软盘或磁带等),光盘(例如,压缩盘(compact disc,CD)、数字通用盘(digital versatile disc,DVD)等),智能卡和闪存器件(例如,可擦写可编程只读存储器(erasable programmable read-only memory,EPROM)、卡、棒或钥匙驱动器等)。另外,本文描述的各种存储介质可代表用于存储信息的一个或多个设备和/或其它机器可读介质。术语“机器可读介质”可包括但不限于,无线信道和能够存储、包含和/或承载指令和/或数据的各种其它介质。
在上述实施例中,图14中的生成多种音效的装置1400可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,高密度数字视频光盘(digital video disc,DVD))、或者半导体介质(例如,固态硬盘(solid state disk,SSD))等。
应当理解的是,在本申请实施例的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可 以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者接入网设备等)执行本申请实施例各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请实施例的具体实施方式,但本申请实施例的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请实施例揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请实施例的保护范围之内。
Claims (17)
- 一种生成多种音效的方法,其特征在于,包括:确定第一音频数据;提取所述第一音频数据中的旋律信息;接收第一操作指令,并基于所述第一操作指令,确定至少一个应用场景;根据预设的音频文件,生成适用于各个应用场景的音频数据,所述音频文件包括不同应用场景对应的旋律信息。
- 根据权利要求1所述的方法,其特征在于,在所述确定第一音频数据之前,包括:接收第二操作指令,并基于所述第二操作指令,选择出原始音频数据;按照设定规则,截取出所述原始音频数据中所述至少一个目标音频数据,所述至少一个目标音频数据包括所述第一音频数据。
- 根据权利要求1或2所述的方法,其特征在于,所述提取所述第一音频数据中的旋律信息,包括:根据所述第一音频数据,计算出所述第一音频数据中的至少一个谱峰;根据所述至少一个谱峰在频域上的位置,计算所述至少一个谱峰对应的显著性;根据所述至少一个谱峰和所述至少一个谱峰对应的频率,构建音高轮廓;通过音高轮廓滤波,选择出第一显著性的音高轮廓作为所述第一音频数据的旋律信息。
- 根据权利要求1-3任意一项所述的方法,其特征在于,所述根据预设的音频文件,生成适用于各个应用场景的音频数据,包括:根据所述预设的音频文件,确定所述各个应用场景对应的旋律信息;将所述第一音频数据中的旋律信息替换成所述各个应用场景对应的旋律信息,得到适用于所述各个应用场景的音频数据。
- 根据权利要求1-4任意一项所述的方法,其特征在于,所述旋律信息包括旋律类型、音色和节奏,所述根据预设的音频文件,生成适用于各个应用场景的音频数据,包括:接收第三操作指令,并基于所述第三操作指令,将所述第一音频数据中的所述旋律类型、音色和节奏替换成所述各个应用场景对应的所述旋律类型、音色和节奏。
- 根据权利要求1-5任意一项所述的方法,其特征在于,所述音频文件还包括不同应用场景对应的时间长度,所述方法还包括:将所述适用于所述各个应用场景的音频数据的播放时间长度调整成所述各个应用场景对应的时间长度。
- 根据权利要求1-6任意一项所述的方法,其特征在于,所述方法还包括:确定第二音频数据;提取所述第二音频数据中的旋律信息;根据所述音频文件,生成适用于所述各个应用场景的音频数据。
- 一种生成多种音效的装置,其特征在于,包括:处理单元,用于确定第一音频数据;所述处理单元,还用于提取所述第一音频数据中的旋律信息;收发单元,用于接收第一操作指令;所述处理单元,还用于基于所述第一操作指令,确定至少一个应用场景;以及根据预设的音频文件,生成适用于各个应用场景的音频数据,所述音频文件包括不同应用场景对应的旋律信息。
- 根据权利要求8所述的装置,其特征在于,所述收发单元,还用于接收第二操作指令,并基于所述第二操作指令,选择出原始音频数据;所述处理单元,还用于按照设定规则,截取出所述原始音频数据中所述至少一个目标音频数据,所述至少一个目标音频数据包括所述第一音频数据。
- 根据权利要求8或9所述的装置,其特征在于,所述处理单元,具体用于根据所述第一音频数据,计算出所述第一音频数据中的至少一个谱峰;根据所述至少一个谱峰在频域上的位置,计算所述至少一个谱峰对应的显著性;根据所述至少一个谱峰和所述至少一个谱峰对应的频率,构建音高轮廓;通过音高轮廓滤波,选择出第一显著性的音高轮廓作为所述第一音频数据的旋律信息。
- 根据权利要求8-10任意一项所述的装置,其特征在于,所述处理单元,具体用于根据所述预设的音频文件,确定所述各个应用场景对应的旋律信息;将所述第一音频数据中的旋律信息替换成所述各个应用场景对应的旋律信息,得到适用于所述各个应用场景的音频数据。
- 根据权利要求8-11任意一项所述的装置,其特征在于,所述旋律信息包括旋律类型、音色和节奏,所述处理单元,具体用于接收第三操作指令,并基于所述第三操作指令,将所述第一音频数据中的所述旋律类型、音色和节奏替换成所述各个应用场景对应的所述旋律类型、音色和节奏。
- 根据权利要求8-12任意一项所述的装置,其特征在于,所述音频文件还包括不同应用场景对应的时间长度,所述处理单元,还用于将所述适用于所述各个应用场景的音频数据的播放时间长度调整成所述各个应用场景对应的时间长度。
- 根据权利要求8-13任意一项所述的装置,其特征在于,所述处理单元,还用于确定第二音频数据;所述处理单元,还用于提取所述第二音频数据中的旋律信息;以及根据所述音频文件,生成适用于所述各个应用场景的音频数据。
- 一种终端设备,包括至少一个处理器,所述处理器用于执行存储器中存储的指令,以使得终端设备执行如权利要求1-7任一所述的方法。
- 一种计算机可读存储介质,其上存储有计算机程序,当所述计算机程序在计算机中执行时,令计算机执行权利要求1-7中任一项的所述的方法。
- 一种计算机程序产品,其特征在于,所述计算机程序产品存储有指令,所述指令在由计算机执行时,使得所述计算机实施权利要求1-7任意一项所述的方法。
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110741096.7 | 2021-06-30 | ||
CN202110741096.7A CN115550503B (zh) | 2021-06-30 | 2021-06-30 | 一种生成多种音效的方法、装置和终端设备、存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023273440A1 true WO2023273440A1 (zh) | 2023-01-05 |
Family
ID=84691151
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/083344 WO2023273440A1 (zh) | 2021-06-30 | 2022-03-28 | 一种生成多种音效的方法、装置和终端设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115550503B (zh) |
WO (1) | WO2023273440A1 (zh) |
Also Published As
Publication number | Publication date |
---|---|
CN115550503A (zh) | 2022-12-30 |
CN115550503B (zh) | 2024-04-23 |