CN107452361B - Song sentence dividing method and device

Song sentence dividing method and device

Info

Publication number
CN107452361B
Authority
CN
China
Prior art keywords
time difference
playing
note
adjacent
song
Prior art date
Legal status
Active
Application number
CN201710670846.XA
Other languages
Chinese (zh)
Other versions
CN107452361A (en)
Inventor
赵伟峰
Current Assignee
Tencent Music Entertainment Technology Shenzhen Co Ltd
Original Assignee
Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority to CN201710670846.XA
Publication of CN107452361A
Application granted
Publication of CN107452361B

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 - Details of electrophonic musical instruments
    • G10H1/0033 - Recording/reproducing or transmission of music for electrophonic musical instruments
    • G10H1/0041 - Recording/reproducing or transmission of music for electrophonic musical instruments in coded form
    • G10H1/0058 - Transmission between separate instruments or between individual components of a musical system
    • G10H1/0066 - Transmission between separate instruments or between individual components of a musical system using a MIDI interface
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 - Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/061 - Musical analysis for extraction of musical phrases, isolation of musically relevant segments, e.g. musical thumbnail generation, or for temporal structure analysis of a musical piece, e.g. determination of the movement sequence of a musical work
    • G10H2210/066 - Musical analysis for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The embodiment of the invention provides a song sentence dividing method and device. The method comprises the following steps: parsing a musical instrument digital interface (midi) file of a current song to obtain note information in the midi file, the note information comprising the play start time, play duration and pitch value of each note in the current song; calculating the playing time differences between adjacent notes of the current song according to the note information; obtaining a time difference threshold according to the distribution of the playing time differences between adjacent notes; and searching the adjacent notes for pairs whose playing time difference is larger than the time difference threshold, and dividing the current song into sentences at the found adjacent notes. The embodiment of the invention can divide a song into sentences even when no lyrics are available.

Description

Song sentence dividing method and device
Technical Field
The embodiment of the invention relates to the field of audio processing, and in particular to a song sentence dividing method and device.
Background
Generally, a song may be divided into sentences according to its lyrics. However, in some special cases, for example when the lyrics of a song have not yet been released or are missing, the song cannot be divided into sentences in this way. No effective solution to this problem has been proposed.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for sentence splitting of a song, which can implement sentence splitting of a song without lyrics.
The song clause dividing method provided by the embodiment of the invention comprises the following steps:
analyzing a midi file of a musical instrument digital interface of a current song to acquire note information in the midi file, wherein the note information comprises the playing start time, the playing duration and the pitch value of each note in the current song;
calculating playing time difference values between adjacent notes of the current song according to the note information;
acquiring a time difference threshold according to the distribution condition of the playing time difference between each adjacent note;
and searching the adjacent notes for pairs whose playing time difference is larger than the time difference threshold, and dividing the current song into sentences at the found adjacent notes.
The song clause dividing device provided by the embodiment of the invention comprises:
the device comprises an analyzing unit, a judging unit and a judging unit, wherein the analyzing unit is used for analyzing a midi file of a musical instrument digital interface (midi) of a current song to acquire note information in the midi file, and the note information comprises the playing start time, the playing duration and the pitch value of each note in the current song;
the calculating unit is used for calculating the playing time difference value between each adjacent note of the current song according to the note information;
the acquisition unit is used for acquiring a time difference threshold according to the distribution condition of the playing time difference between each two adjacent notes;
and the sentence dividing unit is used for searching the adjacent notes for pairs whose playing time difference is larger than the time difference threshold and dividing the current song into sentences at the found adjacent notes.
In the embodiment of the invention, the midi file of the current song can be parsed to obtain the note information in the midi file; the playing time difference between each pair of adjacent notes of the current song is calculated according to the note information; a time difference threshold is obtained according to the distribution of the playing time differences between adjacent notes; the adjacent notes whose playing time difference is larger than the time difference threshold are found, and the current song is divided into sentences at the found adjacent notes. That is, the embodiment of the invention can divide the current song into sentences by analyzing its midi file, so that the song can be divided into sentences even when no lyrics are available.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic view of an application scenario of a song clause method according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating a song clause dividing method according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a histogram established according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a song clause dividing apparatus according to an embodiment of the present invention.
Fig. 5 is another schematic structural diagram of a song clause dividing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Because the prior art lacks a scheme for dividing a song into sentences when no lyrics are available, the embodiment of the invention provides a song sentence dividing method and device. Taking the example that the song sentence dividing device is integrated in a terminal, referring to fig. 1, the terminal may interact with a server through a network, which may be a mobile communication network, a wide area network, a local area network, or the like. Specifically, in the embodiment of the present invention, the terminal may download a song from the server through the network; after the download is completed, the terminal parses the midi file of the song that needs to be divided into sentences (i.e., the current song) to obtain note information in the midi file, where the note information includes the play start time, play duration, and pitch value of each note in the current song. According to the note information, the playing time difference between each pair of adjacent notes of the current song can be calculated; a time difference threshold is obtained according to the distribution of the playing time differences between adjacent notes; then the adjacent notes whose playing time difference is larger than the time difference threshold are found, and the current song is divided into sentences at the found adjacent notes. That is, the embodiment of the invention can divide the current song into sentences by analyzing its midi file, so that the song can be divided into sentences even when no lyrics are available.
The following detailed description will be made separately, and the description sequence of each embodiment below does not limit the specific implementation sequence.
Example one
As shown in fig. 2, the song clause method of the present embodiment includes the following steps:
step 201, analyzing a midi file of a musical instrument digital interface of a current song to acquire note information in the midi file, wherein the note information comprises the play start time, the play duration and the pitch value of each note in the current song;
a musical instrument digital interface (midi), which is an industry standard electronic communication protocol, defines various musical notes or playing codes for playing devices (such as a synthesizer) of an electronic musical instrument, and allows the electronic musical instrument, a computer, a mobile phone or other stage performance devices to be connected, adjusted and synchronized with each other, so as to exchange playing data in real time. The midi file stores music information in the form of digital information, which is a music melody file of a song.
When the midi file of the current song is parsed, a structure (struct) may be defined programmatically to store the note information, for example as follows:
typedef struct tag_note {
    int start_ms;     /* play start time of the note, in ms */
    int end_ms;       /* play end time of the note, in ms */
    int note_value;   /* pitch value of the note */
} Tnote;

Tnote note;
after the midi file is analyzed by using the above-described structure, note information shown in table 1 can be obtained.
Play start time (ms)    Play duration (ms)    Pitch value
38210                   311                   71
38524                   309                   69
38837                   622                   67
40711                   309                   64
41024                   309                   67
……                      ……                    ……
TABLE 1
Table 1 shows partial note information obtained from the midi file; the excerpt may be any segment of the midi file.
The parsed note information includes the play start time, play duration and pitch value of each note. The pitch value generally lies in the interval [21, 108], and a larger pitch value means a higher pitch; each pitch value corresponds to a note in the scale "do, re, mi……". As shown in Table 1, each pitch value is associated with a play start time and a play duration. For example, the pitch value "71" in the first row of Table 1 corresponds to a play start time of 38210 and a play duration of 311; that is, the note with pitch value 71 starts playing at 38210 ms and lasts 311 ms. The play end time of a note can be obtained from its play start time and play duration: in the first row of Table 1, the note with pitch value 71 is played for 311 ms, so its play end time is 38521 ms.
Step 202, calculating the playing time difference between each adjacent note of the current song according to the note information;
specifically, the play end time of each note may be calculated from its play start time and play duration, and the playing time difference between each pair of adjacent notes may then be calculated from the play end time of one note and the play start time of the next. As shown in the first and second rows of Table 1, the end time of the first note is 38521 ms and the start time of the second note is 38524 ms, so the playing time difference between these two adjacent notes is 3 ms.
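As an illustration only, the following C sketch (not taken from the patent; the Tnote structure repeats the one defined above, and the helper name adjacent_play_diffs is assumed for this example) computes the playing time difference for every adjacent pair as the next note's play start time minus the current note's play end time:

typedef struct tag_note {
    int start_ms;     /* play start time in ms */
    int end_ms;       /* play end time in ms (start + duration) */
    int note_value;   /* pitch value */
} Tnote;

/* Fills diffs_ms[i] with the gap between note i and note i+1.
 * diffs_ms must have room for n_notes - 1 values.
 * Returns the number of adjacent pairs (0 if there are fewer than two notes). */
static int adjacent_play_diffs(const Tnote *notes, int n_notes, int *diffs_ms)
{
    if (n_notes < 2)
        return 0;
    for (int i = 0; i + 1 < n_notes; ++i)
        diffs_ms[i] = notes[i + 1].start_ms - notes[i].end_ms;
    return n_notes - 1;
}

Applied to the first two rows of Table 1, this yields 38524 - 38521 = 3 ms, matching the worked example above.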
Step 203, obtaining a time difference threshold according to the distribution of the playing time differences among the adjacent notes;
there is a certain playing time interval (i.e. playing time difference) between adjacent notes, if the interval is small, the adjacent notes may be in a sentence, if the interval is large, the adjacent notes may be the pause of the preceding and following sentences, so a reasonable time difference threshold is needed for the separation.
The embodiment provides a specific solving method of a time difference threshold, which comprises the following steps:
(1) and establishing a histogram according to the distribution of the playing time difference values between each adjacent note.
For convenience of processing, all playing time differences may first be divided by 1000 to convert them from milliseconds (ms) to seconds (s), and each difference may be converted to an integer by rounding or truncation.
A histogram function hist(x) = y is created, where x represents a playing time difference value and y represents the number of times that value occurs. For example, if the playing time difference 10 appears 5 times among all the calculated playing time differences, then hist(10) = 5. A histogram is then plotted from the histogram function hist. In theory, most of the calculated playing time differences are concentrated in a very small range, and a smaller peak appears at a larger value, so the plotted histogram has two peaks, as shown in fig. 3.
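For illustration, a minimal C sketch of this histogram step is given below. It is not code from the patent: MAX_DIFF_S is an assumed upper bound on a gap in seconds, and rounding to the nearest second is chosen as one of the options mentioned above.

#include <string.h>

#define MAX_DIFF_S 128   /* assumed upper bound on a playing time gap, in seconds */

/* hist[x] counts how many playing time differences equal x seconds. */
static void build_histogram(const int *diffs_ms, int n_diffs, int hist[MAX_DIFF_S])
{
    memset(hist, 0, MAX_DIFF_S * sizeof(int));
    for (int i = 0; i < n_diffs; ++i) {
        int x = (diffs_ms[i] + 500) / 1000;   /* convert ms to the nearest whole second */
        if (x >= 0 && x < MAX_DIFF_S)
            ++hist[x];
    }
}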
(2) Solving a time difference threshold value according to the histogram;
specifically, a preset sliding window may be used to find two peak points in the histogram. For example, a sliding window Q may be preset, and all peak points during the Q-sliding period may be recorded. The sliding window Q may take the values 3, 5, etc. Taking the sliding window Q as 5 for example, the center value of Q is Q (n), the values to the left of Q are Q (n-1) and Q (n-2), the values to the right of Q are Q (n +1) and Q (n +2), and the condition of the peak point may be: while being greater than the values of both sides (left and right sides), or greater than one side and equal to the other side.
After finding two peak points, a valley point can be found between them. The definition of a valley point is opposite to that of a peak point: its value needs to be smaller than the values on both sides, or smaller than one side and equal to the other. The found valley point is illustrated in fig. 3; in this embodiment, the playing time difference corresponding to the found valley point is determined as the time difference threshold.
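A possible C sketch of this peak/valley search is shown below. It assumes the sliding window Q = 5 described above, treats out-of-range neighbours as absent, and, as a simplification, takes the lowest bin between the first two peaks as the valley point; the function names are illustrative, not from the patent.

/* Peak test for window Q = 5: the centre bin must be greater than the two bins
 * on each side, or greater than one side and at least equal to the other. */
static int is_peak(const int *hist, int n, int i)
{
    int l1 = (i >= 1) ? hist[i - 1] : -1;
    int l2 = (i >= 2) ? hist[i - 2] : -1;
    int r1 = (i + 1 < n) ? hist[i + 1] : -1;
    int r2 = (i + 2 < n) ? hist[i + 2] : -1;
    int gt_left  = hist[i] > l1 && hist[i] > l2;
    int ge_left  = hist[i] >= l1 && hist[i] >= l2;
    int gt_right = hist[i] > r1 && hist[i] > r2;
    int ge_right = hist[i] >= r1 && hist[i] >= r2;
    return (gt_left && gt_right) || (gt_left && ge_right) || (ge_left && gt_right);
}

/* Returns the time difference threshold in seconds (the valley bin between the
 * first two peaks), or -1 if no such threshold can be found. */
static int threshold_from_histogram(const int *hist, int n)
{
    int peak1 = -1, peak2 = -1;
    for (int i = 0; i < n; ++i) {
        if (!is_peak(hist, n, i))
            continue;
        if (peak1 < 0)
            peak1 = i;
        else {
            peak2 = i;
            break;
        }
    }
    if (peak2 < 0)
        return -1;
    int valley = -1;
    for (int i = peak1 + 1; i < peak2; ++i)    /* lowest bin between the two peaks */
        if (valley < 0 || hist[i] < hist[valley])
            valley = i;
    return valley;   /* multiply by 1000 to obtain the threshold in ms */
}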
The time difference threshold obtained in this way is in seconds (s); it may be multiplied by 1000 to convert it to milliseconds (ms).
Step 204, searching the adjacent notes for pairs whose playing time difference is larger than the time difference threshold, and dividing the current song into sentences at the found adjacent notes.
Specifically, for example, if the playing time difference between the 6th note and the 7th note is greater than the time difference threshold, the sentence boundary is placed between the 6th note and the 7th note: the 6th note is taken as the last note of the preceding sentence, and the 7th note is taken as the first note of the following sentence.
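The sentence split itself then reduces to scanning the differences, as in this illustrative C sketch (function and parameter names are assumed for the example; boundaries[] receives the index of the first note of each new sentence):

/* Marks a sentence boundary wherever the gap between note i and note i+1
 * exceeds the threshold; note i closes one sentence, note i+1 opens the next.
 * Returns the number of boundaries written (at most max_b). */
static int split_into_sentences(const int *diffs_ms, int n_diffs,
                                int threshold_ms, int *boundaries, int max_b)
{
    int n_b = 0;
    for (int i = 0; i < n_diffs && n_b < max_b; ++i) {
        if (diffs_ms[i] > threshold_ms)
            boundaries[n_b++] = i + 1;   /* note i+1 is the first note of a new sentence */
    }
    return n_b;
}

For instance, with the notes of Table 1, the 3 ms gap between the first two notes would stay well below any reasonable threshold, so no boundary would be placed there.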
After the song has been divided into sentences, lyrics can be filled in according to the sentence division, the melodic similarity between each sentence of the song and the sentences of other songs can be compared, and so on.
In this embodiment, the midi file of the current song may be analyzed to obtain note information in the midi file, calculate a playing time difference between each adjacent note of the current song according to the note information, obtain a time difference threshold according to a distribution of the playing time differences between each adjacent note, find an adjacent note with a playing time difference larger than the time difference threshold from each adjacent note, and perform sentence segmentation on the current song from the found adjacent note; that is, in this embodiment, the sentence division of the current song can be realized by analyzing the midi file of the current song, so that the sentence division of the current song is realized under the condition of no lyrics.
Example two
In order to better implement the above method, the present invention further provides a song sentence dividing apparatus. As shown in fig. 4, the apparatus of this embodiment includes a parsing unit 401, a calculating unit 402, an obtaining unit 403 and a sentence dividing unit 404, which are described as follows:
(1) an analysis unit 401;
the parsing unit 401 is configured to parse the midi file of the musical instrument digital interface of the current song to obtain note information in the midi file, where the note information includes a play start time, a play duration, and a pitch value of each note in the current song.
A musical instrument digital interface (midi), which is an industry standard electronic communication protocol, defines various musical notes or playing codes for playing devices (such as a synthesizer) of an electronic musical instrument, and allows the electronic musical instrument, a computer, a mobile phone or other stage performance devices to be connected, adjusted and synchronized with each other, so as to exchange playing data in real time. The midi file stores music information in the form of digital information, which is a music melody file of a song.
When the midi file of the current song is parsed, a structure (struct) may be defined programmatically to store the note information, for example as follows:
typedef struct tag_note {
    int start_ms;     /* play start time of the note, in ms */
    int end_ms;       /* play end time of the note, in ms */
    int note_value;   /* pitch value of the note */
} Tnote;

Tnote note;
after the midi file is analyzed by using the above-described structure, note information shown in table 1 can be obtained.
The parsed note information includes the play start time, play duration and pitch value of each note. The pitch value generally lies in the interval [21, 108], and a larger pitch value means a higher pitch; each pitch value corresponds to a note in the scale "do, re, mi……". As shown in Table 1, each pitch value is associated with a play start time and a play duration. For example, the pitch value "71" in the first row of Table 1 corresponds to a play start time of 38210 and a play duration of 311; that is, the note with pitch value 71 starts playing at 38210 ms and lasts 311 ms. The play end time of a note can be obtained from its play start time and play duration: in the first row of Table 1, the note with pitch value 71 is played for 311 ms, so its play end time is 38521 ms.
(2) A calculation unit 402;
a calculating unit 402, configured to calculate, according to the note information, a playing time difference between adjacent notes of the current song.
Specifically, the calculating unit 402 may calculate the play end time of each note from its play start time and play duration, and then calculate the playing time difference between each pair of adjacent notes from the play end time of one note and the play start time of the next. As shown in the first and second rows of Table 1, the end time of the first note is 38521 ms and the start time of the second note is 38524 ms, so the playing time difference between these two adjacent notes is 3 ms.
(3) An acquisition unit 403;
an obtaining unit 403, configured to obtain a time difference threshold according to a distribution of the playing time differences between the adjacent notes.
There is a certain playing time interval (i.e. a playing time difference) between adjacent notes. If the interval is small, the two notes are likely to belong to the same sentence; if the interval is large, it is likely to be the pause between a preceding and a following sentence. A reasonable time difference threshold is therefore needed to separate the two cases.
In this embodiment, the obtaining unit 403 is configured to obtain the time difference threshold, and specifically, the obtaining unit 403 may include a building subunit and a solving subunit, as follows:
and the establishing subunit is used for establishing a histogram according to the distribution of the playing time difference values between the adjacent notes.
For ease of processing, the establishing subunit may first divide all the playing time differences by 1000 to convert them from milliseconds (ms) to seconds (s), and convert each difference to an integer by rounding or truncation.
The establishing subunit creates a histogram function hist(x) = y, where x represents a playing time difference value and y represents the number of times that value occurs. For example, if the playing time difference 10 appears 5 times among all the calculated playing time differences, then hist(10) = 5, and a histogram is plotted from the histogram function hist. In theory, most of the calculated playing time differences are concentrated in a very small range, and a smaller peak appears at a larger value, so the plotted histogram has two peaks, as shown in fig. 3.
And the solving subunit can solve the time difference value threshold according to the histogram established by the establishing subunit.
Specifically, the solving subunit may first find two peak points in the histogram using a preset sliding window. For example, a sliding window Q may be preset, and all peak points encountered while sliding Q are recorded. The window size Q may take values such as 3 or 5. Taking Q = 5 as an example, the center value of the window is Q(n), the values to its left are Q(n-1) and Q(n-2), and the values to its right are Q(n+1) and Q(n+2); the condition for a peak point may be that the center value is greater than the values on both sides (left and right), or greater than one side and equal to the other.
After finding the two peak points, the solving subunit may find a valley point between them. The definition of a valley point is opposite to that of a peak point: its value needs to be smaller than the values on both sides, or smaller than one side and equal to the other. The found valley point is illustrated in fig. 3, and the playing time difference corresponding to it is determined as the time difference threshold.
The time difference threshold obtained in this way is in seconds (s); it may be multiplied by 1000 to convert it to milliseconds (ms).
(4) A sentence dividing unit 404;
the sentence dividing unit 404 may find an adjacent note having a playing time difference larger than the time difference threshold from the adjacent notes, and divide the current song from the found adjacent note.
Specifically, for example, if the playing time difference between the 6th and 7th notes is greater than the time difference threshold, the sentence dividing unit places the sentence boundary between the 6th and 7th notes, taking the 6th note as the last note of the preceding sentence and the 7th note as the first note of the following sentence.
After the song has been divided into sentences, lyrics can be filled in according to the sentence division, the melodic similarity between each sentence of the song and the sentences of other songs can be compared, and so on.
It should be noted that, when the song clause device provided in the above embodiment performs song clauses, only the division of the above functional modules is taken as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the above described functions. In addition, the song clause device and the song clause method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
In this embodiment, the parsing unit may parse a midi file of a current song to obtain note information in the midi file, the calculating unit may calculate a playing time difference between each adjacent note of the current song according to the note information, the obtaining unit obtains a time difference threshold according to a distribution of the playing time differences between each adjacent note, and finally the sentence dividing unit finds an adjacent note of which the playing time difference is greater than the time difference threshold from each adjacent note and divides the current song from the found adjacent note; that is, in this embodiment, the sentence division of the current song can be realized by analyzing the midi file of the current song, so that the sentence division of the current song is realized under the condition of no lyrics.
EXAMPLE III
Accordingly, an embodiment of the present invention further provides a song clause apparatus, as shown in fig. 5, the apparatus may include a Radio Frequency (RF) circuit 501, a memory 502 including one or more computer-readable storage media, an input unit 503, a display unit 504, a sensor 505, an audio circuit 506, a wireless fidelity (WiFi) module 507, a processor 508 including one or more processing cores, and a power supply 509. Those skilled in the art will appreciate that the configuration of the device shown in fig. 5 is not intended to be limiting of the device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
the RF circuit 501 may be used for receiving and transmitting signals during information transmission and reception or during a call, and in particular, for receiving downlink information of a base station and then sending the received downlink information to the one or more processors 508 for processing; in addition, data relating to uplink is transmitted to the base station. In general, RF circuit 501 includes, but is not limited to, an antenna, at least one Amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 501 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.
The memory 502 may be used to store software programs and modules, and the processor 508 executes various functional applications and data processing by operating the software programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the device, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 508 and the input unit 503 access to the memory 502.
The input unit 503 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, in one particular embodiment, the input unit 503 may include a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations by a user (e.g., operations by a user on or near the touch-sensitive surface using a finger, a stylus, or any other suitable object or attachment) thereon or nearby, and drive the corresponding connection device according to a predetermined program. Alternatively, the touch sensitive surface may comprise two parts, a touch detection means and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 508, and can receive and execute commands sent by the processor 508. In addition, touch sensitive surfaces may be implemented using various types of resistive, capacitive, infrared, and surface acoustic waves. The input unit 503 may include other input devices in addition to the touch-sensitive surface. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 504 may be used to display information input by or provided to the user and various graphical user interfaces of the terminal, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 504 may include a Display panel, and optionally, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch-sensitive surface may overlay the display panel, and when a touch operation is detected on or near the touch-sensitive surface, the touch operation is transmitted to the processor 508 to determine the type of touch event, and then the processor 508 provides a corresponding visual output on the display panel according to the type of touch event. Although in FIG. 5 the touch-sensitive surface and the display panel are two separate components to implement input and output functions, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement input and output functions.
The device may also include at least one sensor 505, such as light sensors, motion sensors, and other sensors. In particular, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or backlight when the device is moved to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when the mobile phone is stationary, and can be used for applications of recognizing the posture of the mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured in the terminal, detailed description is omitted here.
Audio circuitry 506, a speaker, and a microphone may provide an audio interface between the user and the terminal. The audio circuit 506 may transmit the electrical signal converted from the received audio data to a speaker, and convert the electrical signal into a sound signal for output; on the other hand, the microphone converts the collected sound signal into an electric signal, which is received by the audio circuit 506 and converted into audio data, which is then processed by the audio data output processor 508, and then sent to, for example, another device via the RF circuit 501, or output to the memory 502 for further processing. The audio circuit 506 may also include an earbud jack to provide communication of peripheral headphones with the device.
WiFi belongs to short-distance wireless transmission technology, and the device can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 507, and provides wireless broadband internet access for the user. Although fig. 5 shows the WiFi module 507, it is understood that it does not belong to the essential constitution of the device, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The processor 508 is a control center of the apparatus, connects various parts of the entire apparatus using various interfaces and lines, performs various functions of the terminal and processes data by operating or executing software programs and/or modules stored in the memory 502 and calling data stored in the memory 502, thereby performing overall monitoring of the apparatus. Optionally, processor 508 may include one or more processing cores; preferably, the processor 508 may integrate an application processor, which primarily handles operating systems, user interfaces, application programs, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 508.
The device also includes a power supply 509 (e.g., a battery) for powering the various components, which may preferably be logically connected to the processor 508 via a power management system to manage charging, discharging, and power consumption management functions via the power management system. The power supply 509 may also include any component such as one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
Although not shown, the device may further include a camera, a bluetooth module, etc., which will not be described herein. Specifically, in this embodiment, the processor 508 in the apparatus loads the executable file corresponding to the process of one or more application programs into the memory 502 according to the following instructions, and the processor 508 runs the application programs stored in the memory 502, thereby implementing various functions:
analyzing a midi file of a musical instrument digital interface of a current song to acquire note information in the midi file, wherein the note information comprises the playing start time, the playing duration and the pitch value of each note in the current song;
calculating playing time difference values between adjacent notes of the current song according to the note information;
acquiring a time difference threshold according to the distribution condition of the playing time difference between each adjacent note;
and searching adjacent notes with playing time difference larger than the time difference threshold from the adjacent notes, and separating the current song from the searched adjacent notes.
In some embodiments, in calculating the playing time difference between adjacent notes of the current song based on the note information, processor 508 is configured to perform the following steps:
calculating the playing ending time of each note according to the playing starting time and the playing duration of each note;
and calculating the playing time difference value between each adjacent note according to the playing ending time of each note and the playing starting time of the adjacent note of each note.
In some embodiments, when obtaining the time difference threshold according to the distribution of the playing time differences between the adjacent notes, the processor 508 is configured to perform the following steps:
establishing a histogram according to the distribution condition of the playing time difference values between each adjacent note;
and solving a time difference threshold value according to the histogram.
In some embodiments, when creating the histogram according to the distribution of the playing time difference between the adjacent notes, the processor 508 is configured to perform the following steps:
and representing each playing time difference value by a horizontal axis, representing the occurrence frequency of each playing time difference value by a vertical axis, and establishing the histogram.
In some embodiments, processor 508 is configured to perform the following steps when solving for a time difference threshold from the histogram:
searching two peak points in the histogram by using a preset sliding window;
searching a valley point between the two peak points;
and determining the playing time difference value corresponding to the valley point as the time difference value threshold.
The song clause dividing device of the embodiment can acquire note information in a midi file by analyzing the midi file of a current song, calculate a playing time difference value between each adjacent note of the current song according to the note information, and acquire a time difference threshold value according to the distribution condition of the playing time difference value between each adjacent note; searching adjacent notes with playing time difference larger than the time difference threshold from the adjacent notes, and separating the current song from the searched adjacent notes; that is, the device of this embodiment can implement clause splitting on the current song by analyzing the midi file of the current song, thereby solving the problem of implementing clause splitting on the current song under the condition of no lyrics.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer (which may be a personal computer, an apparatus, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A method for sentence segmentation of a song, comprising:
analyzing a midi file of a musical instrument digital interface of a current song to acquire note information in the midi file, wherein the note information comprises the playing start time, the playing duration and the pitch value of each note in the current song;
calculating playing time difference values between adjacent notes of the current song according to the note information;
acquiring a time difference threshold according to the distribution of the playing time differences among the adjacent notes, wherein the acquiring method comprises the following steps: establishing a histogram according to the distribution condition of the playing time difference values between every two adjacent notes, and solving a time difference value threshold value according to the histogram;
and searching adjacent notes with playing time difference larger than the time difference threshold from the adjacent notes, and separating the current song from the searched adjacent notes.
2. The method of claim 1, wherein calculating the difference in playing time between adjacent notes of the current song based on the note information comprises:
calculating the playing ending time of each note according to the playing starting time and the playing duration of each note;
and calculating the playing time difference value between each adjacent note according to the playing ending time of each note and the playing starting time of the adjacent note of each note.
3. The method according to claim 1, wherein said creating a histogram according to the distribution of the playing time difference between the adjacent notes comprises:
and representing each playing time difference value by a horizontal axis, representing the occurrence frequency of each playing time difference value by a vertical axis, and establishing the histogram.
4. The method of claim 1, wherein solving a time difference threshold from the histogram comprises:
searching two peak points in the histogram by using a preset sliding window;
searching a valley point between the two peak points;
and determining the playing time difference value corresponding to the valley point as the time difference value threshold.
5. A song clause apparatus, comprising:
the device comprises an analyzing unit, a judging unit and a judging unit, wherein the analyzing unit is used for analyzing a midi file of a musical instrument digital interface (midi) of a current song to acquire note information in the midi file, and the note information comprises the playing start time, the playing duration and the pitch value of each note in the current song;
the calculating unit is used for calculating the playing time difference value between each adjacent note of the current song according to the note information;
the acquisition unit is used for acquiring a time difference threshold value according to the distribution condition of the playing time difference between each two adjacent notes, and comprises an establishing subunit and a solving subunit, wherein the establishing subunit is used for establishing a histogram according to the distribution condition of the playing time difference between each two adjacent notes; the solving subunit is configured to solve a time difference threshold according to the histogram;
and the sentence dividing unit is used for searching adjacent notes with playing time difference values larger than the time difference value threshold from the adjacent notes and dividing the current song from the searched adjacent notes.
6. The apparatus of claim 5,
the calculating unit is specifically configured to calculate a playing end time of each note according to the playing start time and the playing duration of each note; and calculating the playing time difference value between each adjacent note according to the playing ending time of each note and the playing starting time of the adjacent note of each note.
7. The apparatus of claim 5,
the establishing subunit is specifically configured to establish the histogram by using a horizontal axis to represent each play time difference value and using a vertical axis to represent the occurrence frequency of each play time difference value.
8. The apparatus of claim 5,
the solving subunit is specifically configured to search two peak points in the histogram with a preset sliding window; searching a valley point between the two peak points; and determining the playing time difference value corresponding to the valley point as the time difference value threshold.
9. A storage medium having stored thereon a computer program, characterized in that, when the computer program is run on a computer, it causes the computer to execute the song clause method according to any one of claims 1 to 4.
CN201710670846.XA 2017-08-08 2017-08-08 Song sentence dividing method and device Active CN107452361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710670846.XA CN107452361B (en) 2017-08-08 2017-08-08 Song sentence dividing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710670846.XA CN107452361B (en) 2017-08-08 2017-08-08 Song sentence dividing method and device

Publications (2)

Publication Number Publication Date
CN107452361A CN107452361A (en) 2017-12-08
CN107452361B true CN107452361B (en) 2020-07-07

Family

ID=60489560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710670846.XA Active CN107452361B (en) 2017-08-08 2017-08-08 Song sentence dividing method and device

Country Status (1)

Country Link
CN (1) CN107452361B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581976B (en) * 2019-09-29 2023-06-27 骅讯电子企业股份有限公司 Singing scoring method and system based on streaming media
CN112735429B (en) * 2020-12-28 2023-11-14 腾讯音乐娱乐科技(深圳)有限公司 Method for determining lyric timestamp information and training method of acoustic model
CN113255348B (en) * 2021-05-26 2023-02-28 腾讯音乐娱乐科技(深圳)有限公司 Lyric segmentation method, device, equipment and medium
CN113377992A (en) * 2021-06-21 2021-09-10 腾讯音乐娱乐科技(深圳)有限公司 Song segmentation method, device and storage medium

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1379898A (en) * 1999-09-16 2002-11-13 汉索尔索弗特有限公司 Method and apparatus for playing musical instruments based on digital music file
CN1703734A (en) * 2002-10-11 2005-11-30 松下电器产业株式会社 Method and apparatus for determining musical notes from sounds
CN101093661A (en) * 2006-06-23 2007-12-26 凌阳科技股份有限公司 Pitch tracking and playing method and system
CN101720006A (en) * 2009-11-20 2010-06-02 张立军 Positioning method suitable for representative frame extracted by video keyframe
CN102682752A (en) * 2011-03-07 2012-09-19 卡西欧计算机株式会社 Musical-score information generating apparatus, musical-score information generating method, music-tone generation controlling apparatus, and music-tone generation controlling method
CN103824555A (en) * 2012-11-19 2014-05-28 腾讯科技(深圳)有限公司 Audio band extraction method and extraction device
CN105280206A (en) * 2014-06-23 2016-01-27 广东小天才科技有限公司 Audio playing method and device
CN105513583A (en) * 2015-11-25 2016-04-20 福建星网视易信息系统有限公司 Display method and system for song rhythm
CN106448630A (en) * 2016-09-09 2017-02-22 腾讯科技(深圳)有限公司 Method and device for generating digital music file of song
CN106653037A (en) * 2015-11-03 2017-05-10 广州酷狗计算机科技有限公司 Audio data processing method and device
CN106652986A (en) * 2016-12-08 2017-05-10 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and device
CN106649644A (en) * 2016-12-08 2017-05-10 腾讯音乐娱乐(深圳)有限公司 Lyric file generation method and device
CN106782460A (en) * 2016-12-26 2017-05-31 广州酷狗计算机科技有限公司 The method and apparatus for generating music score

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8907193B2 (en) * 2007-02-20 2014-12-09 Ubisoft Entertainment Instrument game system and method
KR101504522B1 (en) * 2008-01-07 2015-03-23 삼성전자 주식회사 Apparatus and method and for storing/searching music

Also Published As

Publication number Publication date
CN107452361A (en) 2017-12-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant