CN1833282A

CN1833282A - Method for reproducing audio documents with the aid of an interface comprising document groups and associated reproducing device

Info

Publication number: CN1833282A
Application number: CNA2004800226426A
Authority: CN
Inventors: Louis Chevallier; Izabela Grasland; Jean-Ronan Vigouroux; Jean-Baptiste Henry
Original assignee: Thomson Licensing SAS
Current assignee: InterDigital CE Patent Holdings SAS
Priority date: 2003-08-07
Filing date: 2004-08-05
Publication date: 2006-09-13
Anticipated expiration: 2024-08-05
Also published as: CN1833282B; FR2858712A1

Abstract

The invention relates to a method for reproducing audio documents of a package of documents by means of a reproducing device. The inventive method involves a preliminary stage for partitioning the package of documents into document groups having similar audio parameters which make it possible to determine at least one group-representing document taking into account the audio parameters thereof. Afterwards, the identifier of the group representing document is graphically and/or acoustically reproduced. In this way, a user can recognise a type of music in question and select the group by means of the graphical identifier. He can activate a command for switching from one group to another one, select a group and reproduce the documents thereof. A reproducing device provided with a user interface which enables a reproduction is also disclosed.

Description

Method for reproducing audio files by means of an interface comprising groups of files, and associated reproduction device

Technical field

The present invention relates to a method for reproducing audio files on a reproduction device, and to a reproduction device equipped with a graphical user interface that allows a selection to be made.

Background technology

The storage of a large number of audio files in mass-market devices is known. A reproduction device has an interface that makes it easy to retrieve the file the user wants. Examples of reproduction devices are a personal audio CD player, a personal player comprising a hard disk able to store 300 hours of music (such as the MP3 Lyra model marketed by the applicant), a home player with a display and a remote control, and a personal computer with a screen, hard disk, CD player and keyboard. In all cases, the user must enter the unique identifier of the audio file to be reproduced. In the case of an audio CD, he must specify the number of the CD and the number of the track on that CD. In some cases, the reproduction device has a display for showing the identifier of the audio file currently being reproduced. For example, the Lyra MP3 player has a small LCD screen that shows the selected functions in icon form and the number of the audio track. Home devices have high-capacity hard disks, for example 20 gigabytes, which make it possible to store thousands of audio contents. Their graphical interface has a large screen, which makes it possible to display more information, for example the full title of the track.

Depending on the type of interface, an audio file is selected by its number in a list displayed on the screen or by an identifier. As storage capacity grows, the number of stored files increases, and the user may therefore need some time to find the content that interests him. When information in digital form (called attributes) is associated with the audio files, the reproduction device can build groups. The attributes of an audio file are, for example, the genre (classical, pop, choral, jazz, etc.), the title, the author, the performer, the publisher, etc.

By determining groups that have a certain musical consistency and by displaying these groups by means of identifiers, the user can first select a group and then navigate within it to search for a track. The identifier of the group is therefore an attribute common to its files.

However, some audio content accessible to the user does not automatically have these attributes, for example when the user records pieces of music himself.

In this case, another way of classifying audio files is to analyse the sound signal directly. Signal analysis techniques exist which make it possible to calculate the values of so-called "low-level" parameters for each audio content. These parameters are, for example, tempo, energy, brightness, envelope, etc. The parameters are determined by analysing the signal in its digital form or in its analogue form. The article "Speech and Language Technologies for audio indexing and retrieval", published in August 2000 in volume 88, pages 1338-1353, of the IEEE journal, describes techniques for indexing audio content. That article explains how various contents can be classified by analysing the audio signal. Other articles, incorporated by reference into the present patent application, describe means of calculating the low-level parameters and their possible uses:

B. Feiten and S. Gunzel, "Automatic indexing of a sound database using self-organizing neural networks", Computer Music Journal, 18(3), 1994

Eric Scheirer, "Music Listening Systems", PhD thesis, MIT Media Laboratory, April 2000

Once the low-level parameters have been determined for each audio file of a set, the storage or reproduction device can partition the files into groups as a function of these parameters. Thus classical music contents can form one group and, likewise, jazz tracks another. Patent application PCT/GB01/00681, published on August 23, 2001, describes a user interface comprising graphics displayed on a screen and controlled by an audiovisual receiver. The displayed menu shows icons ("classical", "jazz", "pop charts", "reply", etc.) that the user can select; selecting a group of files activates the reproduction of its audio content. The identifier of a group can be entered by the user as a function of the files contained in the group at a given time. However, when new files are downloaded, the identification of the groups must be able to evolve so as to define the groups better. Moreover, if many files are allocated to one group, it is useful to split it into several groups so as to obtain sets of files of average size. Such an operation forces the user to redefine the identifiers.

Japanese patent JP07-044575 discloses a sound recognition method which makes it possible to process audio files, or sound sources, and to place them in a visual representation. The audio files are represented in a space (the "sound field space") by symbols that can be selected with a mouse. The user moves in this "sound field space" by means of the mouse. The files are grouped according to a hierarchy. During navigation in the sound space, the volume of a file's sound is inversely proportional to the distance between the user and that file in the space. All the sounds associated with the files of a group are therefore emitted, and this superposition of sounds hampers navigation and selection in the sound space.

Summary of the invention

One of the aims of the present invention is to provide the user with an automatic means of partitioning files into groups and of identifying those groups easily. The user can thus navigate between groups and within a group effectively and conveniently.

The subject of the invention is a method of reproducing audio files in a reproduction device, characterized in that it comprises the following steps:

- partitioning the files into groups of files having at least one similar sound characteristic,

- determining at least one audio file representing each group,

- locating a plurality of audio files in a space, the location of an audio file depending on at least one characteristic of that file, the user occupying a position in said space,

- reproducing at least one identifier of a file representing a group, the reproduced identifier or identifiers being located at a distance from the user's position in the space that is less than a predetermined distance.

In this way, the device itself determines the groups of audio files and at least one file representing each group, and the identifier or identifiers of the representative file or files are brought to the user's attention graphically and/or audibly. The user can thus recognize the type of music concerned and select the group, and the elements of the group, in order to reproduce them. According to a first improvement, the user can activate a command for moving from one group to another, the reproduced identifier and file being updated automatically as a function of the current group. According to another improvement, the user can activate a command for reproducing the files of the group whose identifier is being reproduced.

According to another improvement, the method comprises a step of representing the files in a space whose number of dimensions equals the number of audio parameters, each file being associated with a point placed in this space. In this way, the file determined to be the representative of a group depends on the distance between the equibarycentre of the points associated with the files of the group and the point associated with that file. The file whose associated point is closest to the equibarycentre is taken as the representative of the group.

According to another improvement, the method comprises a step of projecting the points associated with the set of files, which have the audio parameters as coordinates, onto a space of predetermined dimension. In this way, the set of files can be displayed by graphically representing the projection space. Moreover, the calculation of the distance between the equibarycentre and each point associated with a file of a group is simpler. According to a variant, the points of the files representing a group are located within a predetermined distance of the equibarycentre. In this way, the group is characterized not by a single file but by several files around the equibarycentre, which let the user grasp the category of the group better while appreciating its variety.

According to another improvement, when the user has selected a group and reproduces its files, the files are reproduced in order, starting with the file whose point is closest to the equibarycentre and continuing with those located farther away.

According to another improvement, the file taken as the representative of a group has low-level parameters whose values are close to the mean values of the files of the group.

According to another improvement, if several files represent a group, each of these files is reproduced in turn for a predetermined period.

According to another improvement, the reproduction device receives the values of the audio parameters. From these values, the device determines the groups and the files representing them.

The subject of the invention is also an audio file reproduction device comprising means for entering commands, characterized in that it further comprises: calculation means for partitioning the files into groups of files having at least one similar sound characteristic; means for determining at least one file representing each group; means for calculating location data associated with each file in a space, said data being determined by at least one characteristic specific to the file, location data also being assigned to the position of the user in said space; means for selecting at least one file representing a group, the selected file or files being located at a distance from the user's position in the space that is less than a predetermined distance; and means for reproducing at least one identifier of at least one file representing a group.

Description of drawings

Other characteristics and advantages of the invention will now become apparent in more detail from the following description of illustrative embodiments, given by way of illustration and with reference to the appended figures, which represent:

Fig. 1 is a block diagram of an illustrative audio file reproduction device for implementing the invention,

Fig. 2 is a table associating with each file of the set the values of its low-level parameters,

Fig. 3 represents the projection onto a two-dimensional space of the points associated with files belonging to three groups,

Fig. 4 depicts a screenshot of an interface presenting a screen background and allowing the selection of various groups of audio files,

Fig. 5 is a block diagram of an illustrative audio file reproduction device according to a second illustrative embodiment,

Fig. 6 depicts a representation of the sound space in which the user moves, according to the second illustrative embodiment of the invention,

Fig. 7 depicts a block diagram of the audio interface according to the second illustrative embodiment of the invention.

Embodiment

We first describe the operation of a multimedia receiver 1 associated with a device 2 for displaying images and reproducing sound. The receiver comprises a central unit 3 linked to a program memory 12, and an interface 5 for communicating with a high-bit-rate local digital bus 6, which makes it possible to receive audio and/or video data at a high bit rate. This network is, for example, an IEEE 1394 network. The receiver can also receive audio and/or video data from a broadcast network through a receiving antenna associated with a demodulator 4; this network can be of the radio or television type. The receiver further comprises an infrared signal receiver 7 for receiving signals from a remote control 8, a memory 9 for storing a database, and audio/video decoding logic 10 for generating the audiovisual signals sent to the television screen 2. The remote control 8 is equipped with the direction keys ↑, ↓, → and ← and with "Confirm", "Group", "Audio file" and "Select" keys, whose functions are described later.

The receiver also comprises a circuit 11 for displaying data on the screen, often called an OSD circuit, OSD standing for "On Screen Display". The OSD circuit 11 is a text and graphics generator which makes it possible to display menus, pictograms or other graphics on the screen, and to display the menus that present the navigation. The OSD circuit is controlled by the central unit 3 and a navigator 12. The navigator 12 is advantageously embodied as a program module recorded in read-only memory. It can also be embodied as a dedicated circuit, for example of the ASIC (application-specific integrated circuit) type.

The digital bus 6 and/or the broadcast network transmit audio content to the receiver in digital or analogue form, and the receiver records it in memory 9. According to a preferred embodiment, the audio content is received in digital form (preferably encoded according to a compression standard such as MP3) and is stored in the same form. According to this preferred embodiment, memory 9 is a high-capacity hard disk, for example 40 gigabytes. One minute of audio content stored in MP3 takes up about 1 megabyte, so such a disk can record about 666 hours of audio. The downloading of audio content is a known technique that need not be explained in the present patent application.

Once a certain number of audio contents have been stored in memory 9, the user will want to reproduce them, preferably without too much manual intervention, and will also want the contents to follow one another with some similarity so as to maintain a musical ambience. To this end, a software module of the navigator analyses each audio content as it is received and extracts low-level parameters from it. As pointed out in the preamble, various signal analysis techniques exist which make it possible to obtain an array of numerical descriptors for these tracks. A descriptor has a few tens of elements.

The table shown in the screen page of Fig. 2 presents the values of the low-level parameters that make up the descriptors of a number of audio files. The first column of the table gives the title of the audio content, each content being numbered. The following columns give the values of the low-level parameters associated with the file, such as mean sound intensity, tempo, energy, zero-crossing rate, brightness, envelope, bandwidth, loudness, cepstral coefficients, etc.

According to an improvement, the low-level parameters can be supplied in digital form together with the audio content. When the content is supplied in compressed form over a digital transmission means, the associated low-level parameters form a field appended to the audio content. This solution is particularly advantageous because the parameters are then calculated by the producer or supplier of the content rather than by the user, and therefore only once.

Whether they are downloaded or calculated locally, the descriptors are stored in memory 9 and then used to build groups of files having a certain similarity. According to a first method, the partitioning of the contents into coherent groups (or clusters) can be carried out by means of a so-called "clustering" algorithm such as the k-means algorithm (MacQueen, "Some methods for classification and analysis of multivariate observations", Proc. Fifth Berkeley Symposium on Math. Stat. and Prob., vol. 1, pp. 281-296, 1967). The table of descriptors of Fig. 2 then has a new column defining the group to which each content belongs. Group calculation techniques are well known, and with the k-means algorithm the number of groups produced can easily be controlled.
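The clustering step can be sketched as follows. This is a minimal illustration of the k-means idea, not the implementation used by the device: the function names, the random initialisation and the fixed iteration limit are assumptions, and each descriptor is taken to be a plain list of numeric low-level parameter values.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def k_means(descriptors, k, iterations=50, seed=0):
    """Partition descriptors into k groups; returns one group index per file.

    Hypothetical helper: initial centres are k descriptors drawn at random.
    """
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(descriptors, k)]
    labels = [0] * len(descriptors)
    for _ in range(iterations):
        # Assignment step: attach each file to its nearest centre.
        new_labels = [
            min(range(k), key=lambda g: squared_distance(p, centres[g]))
            for p in descriptors
        ]
        # Update step: move each centre to the mean of its members.
        for g in range(k):
            members = [p for p, lab in zip(descriptors, new_labels) if lab == g]
            if members:  # an emptied group keeps its previous centre
                centres[g] = mean_point(members)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels
```

The returned list plays the role of the extra "group" column added to the table of Fig. 2, and the parameter k is what lets the number of groups be controlled.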

According to a second method, the groups are determined by choosing categories in advance (for example mood, main instrument, tempo, etc.) and by means of ground truth that helps to define these categories.

Once the files have been classified into the various groups, the program determines one or more representative files for each group.

One way of proceeding consists in locating, in a multidimensional space, a point Pi identifying each file of a group, and in determining the file closest to the equibarycentre of the set of these points. The equibarycentre is the centre of gravity of a set of points of equal mass. The position of the point associated with each file is obtained from the low-level parameters, and the space containing these points has as many dimensions as the files have low-level parameters.

The principle can be explained clearly using a projection onto a two-dimensional space. Fig. 3 represents a two-dimensional space in which are placed the points corresponding to three groups of files, denoted A, B and C. The coordinates (xi, yi) of each point are obtained by projecting the points Pi onto a space of dimension 2. The projection is determined by principal component analysis, or PCA. PCA is described in particular in the work by Saporta, 1990, entitled "Probabilités, Analyse de données et statistiques", Editions Technip. This well-known data analysis algorithm seeks a subsystem of axes, linearly dependent on the original axes, that best "spreads out" the samples; these axes tend to merge correlated original axes. Assuming that the low-level descriptors are perceptually relevant (that is, sounds are perceived as close if and only if the values of their low-level descriptors are close) and that the projection is continuous, the audio files associated with neighbouring points in the space of dimension 2 are similar to one another from an auditory standpoint. The same example can be applied with a projection onto a space of dimension 3.
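The projection by PCA can be illustrated with a short sketch. It is only one possible way of computing the principal axes (power iteration on the covariance matrix, with removal of the axes already found), chosen here so that the example needs no external library; the function name, the start vectors and the number of sweeps are assumptions.

```python
import math


def pca_project(rows, dims=2, sweeps=200):
    """Project descriptor rows onto their first `dims` principal axes."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Covariance matrix of the centred descriptors.
    cov = [[sum(r[a] * r[b] for r in centred) / n for b in range(m)]
           for a in range(m)]
    axes = []
    for d in range(dims):
        # Arbitrary non-zero start vector (an assumption of this sketch).
        v = [1.0 / (j + 1 + d) for j in range(m)]
        for _ in range(sweeps):
            w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
            # Remove the components along the axes already found.
            for u in axes:
                dot = sum(x * y for x, y in zip(w, u))
                w = [x - dot * y for x, y in zip(w, u)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                v = [0.0] * m  # no variance left along any new axis
                break
            v = [x / norm for x in w]
        axes.append(v)
    # Coordinates of each file in the reduced space.
    return [[sum(c * a for c, a in zip(row, axis)) for axis in axes]
            for row in centred]
```

Called with `dims=2`, the function returns for each file a pair playing the role of the coordinates (xi, yi) of Fig. 3; `dims=3` gives the three-dimensional variant.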

The calculation of the equibarycentre applied to the three sets yields three points GA, GB and GC, which lie roughly at the centre of each outline delimiting groups A, B and C, as shown in Fig. 3. According to this illustrative embodiment, the file whose point (xi, yi) is closest to the equibarycentre of a group is taken as the representative of that group.
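The choice of a representative can be sketched as follows. The data layout (a dictionary mapping a title to the coordinates of its point) and the function names are assumptions made for the illustration.

```python
import math


def equibarycentre(points):
    """Centre of gravity of points that all carry the same mass."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def representative(group):
    """Return the title of the file whose point is nearest the equibarycentre.

    `group` is a hypothetical mapping title -> point coordinates.
    """
    centre = equibarycentre(list(group.values()))
    return min(group, key=lambda title: math.dist(group[title], centre))
```

The same two functions work unchanged in the projected space of Fig. 3 and in the full space of low-level parameters, since only the number of coordinates differs.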

The step of projecting the points onto one, two or three dimensions makes it possible to build a graphical representation of the set of files accessible from a device. Moreover, the calculation of the distance between the equibarycentre and each point associated with a file of a group is simpler, because the dimension of the projection space is much smaller than the number of low-level parameters. Depending on whether a file belongs to one group or another, the point associated with it is given a shape (as shown in Fig. 3), a specific outline or any other distinguishing graphical feature. Such a graphical representation, together with a keypad, constitutes a user interface that makes it possible to select any point within a group. To this end, the user can jump from one point to another by indicating the direction of navigation with the direction keys.

However, the step of projecting onto one, two or three dimensions is optional: the equibarycentre of a group can just as well be determined from the points placed in the multidimensional space, and the distance between any point of the group and the equibarycentre can likewise be calculated. In this case, it is difficult to represent the files by points, so the graphical interface presents only the graphical identifiers of the groups. An example of such a graphical interface is shown in Fig. 4.

Fig. 4 depicts a screen background image and a set of graphical identifiers of groups. The graphical identifier of a group is an icon containing a number that ranges from 1 to the number of groups calculated during the group determination step. These identifiers are joined by graphical links that indicate to the user which navigation command must be activated to change group. In the example shown in Fig. 4, group 7 is selected; group 6 is selected by pressing the ↑ direction key and group 8 by pressing the ↓ direction key. The icon of the current group (group 7 in Fig. 4) is emphasized by a bolder outline, by highlighting, by flashing or by a coloured background. If the icons are arranged horizontally, the user changes group with the → and ← direction keys.
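The navigation between group icons can be sketched as a small function. The key names, the behaviour at the first and last group (the selection simply stays where it is) and the mapping of ← to the previous group in a horizontal layout are assumptions; the text only specifies the ↑ and ↓ behaviour around group 7.

```python
def navigate(current, key, group_count, vertical=True):
    """Return the group number selected after a direction key is pressed.

    Groups are numbered from 1 to group_count, as on the icons.
    """
    previous_key, next_key = ("up", "down") if vertical else ("left", "right")
    if key == previous_key:
        return max(1, current - 1)  # assumed: no wrap-around at group 1
    if key == next_key:
        return min(group_count, current + 1)  # assumed: no wrap-around at the end
    return current  # keys that do not apply to this layout are ignored
```

With ten groups, `navigate(7, "up", 10)` gives 6 and `navigate(7, "down", 10)` gives 8, which matches the example of group 7.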

When the user navigates in group mode, the device reproduces the audio file representing the current group. In this way, the user can determine by ear the category of sound or music common to the files of the group. In a variant, a given number of audio files represent the group. According to this variant, these files are reproduced cyclically while the group is selected. The representative files are, for example, those located at a distance from the equibarycentre that is less than a predetermined value. In an improvement of this variant, the user himself determines the number of representative files per group. In this way, the user can start the reproduction of a large number of files having auditory continuity without having to select them manually. The first file selected by the program as representative is the file of the group closest to the equibarycentre, then a second file, then a third, and so on. When the number set by the user is reached, the program selects the first file again.
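The variant with several representative files can be sketched as follows. The function names and the data layout (a mapping from title to point) are assumptions; the sketch keeps the files closer to the equibarycentre than a threshold, orders them from nearest to farthest, limits them to the number chosen by the user and then loops over them.

```python
import itertools
import math


def representatives(points, centre, max_distance, limit):
    """Titles nearer to `centre` than `max_distance`, nearest first, at most `limit`."""
    near = [
        (math.dist(point, centre), title)
        for title, point in points.items()
        if math.dist(point, centre) < max_distance
    ]
    return [title for _, title in sorted(near)][:limit]


def playback_cycle(titles):
    """Endless iterator that returns to the first file after the last one."""
    return itertools.cycle(titles)
```

For example, `itertools.islice(playback_cycle(titles), 5)` yields the first five files to be played while the group stays selected.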

Another improvement consists in reproducing only an extract of each file. The duration of each extract can be defined by the program or, advantageously, set by the user. In this way, the user can quickly get an idea of the category of the audio files in the group.

Once a group is selected, the user presses the "Audio file" key to select the individual files of that group, thereby activating their sound reproduction. He can then use the → and ← direction keys to move from one file to another. If the graphical interface allows it, the title of the audio file is displayed. Advantageously, the titles of the two files located immediately before (selectable with the ← key) and immediately after (selectable with the → key) are also displayed. The user can thus identify the two files that can be reproduced directly from the current file.

The above describes an embodiment applied to a device having display means (2). These means make it possible to reproduce graphically the identifier of a file representing a group of files with similar sound properties. According to another embodiment, the device has no elaborate display means capable of displaying at least the group identifiers.

Such a device is described with reference to Fig. 5, starting with the operation of a player 5.1 for reproducing audio files. This player is portable and self-contained. It has a battery 5.2, a central unit 5.3 (UC) linked to a program memory 5.12, a keypad 5.8 that lets the user enter all the commands needed to reproduce audio content, and an audio interface 5.10 comprising at least one digital-to-analogue converter, at least one preamplifier whose gain can be adjusted by the UC 5.3, and an amplifier for delivering the amplified sound signal to at least two loudspeakers 5.11. The keypad 5.8 has four direction keys and a rotary element (for entering a movement to the left or to the right), as well as the conventional commands for reproducing audio files (play, fast forward, rewind, stop, volume adjustment), a rotary selector and at least one thumbwheel. The loudspeakers 5.11 are connected to the player and can be the earpieces of headphones worn by the user. The audio content is advantageously recorded on a hard disk 5.9, but any other recording medium can be used, in particular removable media (audio CD, DVD, cartridge disk, electronic card, etc.). Audio content can be downloaded to the hard disk 5.9 in the same way as described for Fig. 1. The downloading of audio content is a known technique that need not be explained in this document.

Once a certain number of audio contents have been stored in memory 5.9, the user will want to select and reproduce them. To this end, the program analyses each audio content and extracts low-level parameters from it. The signal analysis techniques are the same as those indicated above for the device described with reference to Fig. 1.

According to this second illustrative embodiment of the invention, the audio files Di accessible from the player can be represented virtually by points Pi placed in a sound space of n dimensions. For simplicity and ease of understanding, this second illustrative embodiment uses a sound space of two dimensions. Fig. 6 illustrates such an arrangement. The position of a point Pi in the sound space, defined by its coordinates (xi, yi), is calculated from the low-level parameters. As in the example of Fig. 3, the point Pi is the identifier representing the audio file Si. Depending on the type of representation chosen, the coordinates (xi, yi) can be obtained by projecting onto a space of dimension 2, 3, etc. the points Pi whose coordinates are the values of the low-level descriptors. The projection from the descriptor space onto this two-dimensional space is determined by principal component analysis, or PCA. PCA is described in particular in the work by Saporta, 1990, entitled "Probabilités, Analyse de données et statistiques", Editions Technip. This data analysis algorithm seeks to determine a subsystem of axes, linearly dependent on the original axes, that best "spreads out" the files; these axes tend to merge correlated original axes. In this way, the program can analyse the audio files themselves and determine the principal dimensions, and the program then chooses the dimension of the sound space. With this technique, the set of files can be represented in a space of more than two dimensions. It is thus possible to build a three-dimensional sound space in which the user moves; in this case, the equipment must be fitted with additional loudspeakers 5.11, arranged above and below, so as to give the user the impression that the sound also comes from above or below. Assuming that the low-level descriptors are perceptually relevant and that the projection is continuous, neighbouring points correspond to sounds that are perceived as close. In general, the coordinates {xi, yi, ..., zi} of a point Pi in the multidimensional space let the user determine the type of the associated audio file. Specifically, the position of a point Pi is calculated as a function of the values of the low-level parameters. If two points are far apart in the representation, the values of the low-level parameters of the two audio files identified by these points are very different, and so the types of audio content differ, for example a piece of classical music and a political speech. Conversely, if two points are close, the types of the associated audio files are also close from an auditory standpoint.

The user selects files in the sound space through the auditory perception produced by the player. To this end, the player places the user at a point Pu with coordinates (xu, yu), at the centre of the sound space, and selects the audio files whose points Pi are closest to (xu, yu) in order to reproduce them. Through his hearing, the user becomes aware of the sound space and, guided by the sound "emitted" by the point Pi associated with a file Di, can steer towards that file by pressing the direction key corresponding to the loudspeaker 5.11 that reproduces it most loudly.

Fig. 7 illustrates the details of the audio interface 5.10. The audio interface 5.10 is made up of two identical parts, one for reproduction on the left earpiece 5.11 and the other for the right earpiece 5.11. The number of files selected by the program must be small, for example five. For each channel, the UC 5.3, together with its program stored in memory 5.12, controls five selectors S1, S2, S3, S4 and S5 whose function is to select a file from the set of audio files in memory 5.9 and to reproduce it. The five audio signals selected by the selectors Si are sent respectively to five preamplifiers A1, A2, A3, A4 and A5, whose gains are controlled by the UC 5.3. The gain of the preamplifier Ai used to reproduce audio file Di is directly proportional to the distance separating the point (xu, yu) of the sound space from the point Pi with coordinates (xi, yi) associated with that file. The gain also depends on the direction in which the point (xi, yi) lies with respect to a straight line that starts from the user's point (xu, yu) and points straight ahead of the user in the sound space. This straight line is represented by an arrow in Fig. 7. The left channel therefore reproduces all the files whose points Pi lie to the left of the user in the sound space, and those on the right are reproduced by the right channel. Moreover, the gain increases with the angle between the segment formed by the points Pi and Pu and the straight line Du indicating the direction ahead of the user. If a file is directly ahead of the user, its point Pi lies on the line Du, and the user therefore hears that audio content equally well on the left and on the right. Finally, the signals delivered by the five preamplifiers are mixed in a summing amplifier and amplified before being delivered to the earphones or loudspeakers 5.11.
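The left/right decision can be illustrated by a short geometric sketch. It covers only the choice of channel, not the gain law; the function name, the representation of the direction ahead of the user as a vector `heading`, and the axis convention (x to the right, y upwards) are assumptions.

```python
def channel_for(user, heading, point):
    """Return "left", "right" or "both" for a file point relative to the user.

    `user` and `point` are (x, y) positions; `heading` is a hypothetical
    vector giving the direction straight ahead of the user (the line Du).
    """
    # z-component of the cross product heading x (point - user):
    # positive means the point lies to the left of the heading direction.
    cross = (heading[0] * (point[1] - user[1])
             - heading[1] * (point[0] - user[0]))
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "both"  # the point lies on the line Du itself
```

A point exactly on the line is sent to both channels, matching the case where the user hears the content equally on the left and on the right.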

The user therefore hears different audio contents in his left and right ears. Depending on the sound signals, he can steer to the left or to the right using the direction keys on keypad 5.8, and orient himself toward the point corresponding to the content Di he wishes to listen to. When point (xu, yu) is at the same position as the point (xi, yi) corresponding to audio file Di, or at most a predetermined distance from it, the file is considered selected and is reproduced in stereo on both earphones 5.11; the other four files are no longer reproduced. If the user presses a direction key and moves away from the file he has just listened to, the program again reproduces the 5 files closest to point (xu, yu), applying the weighting that corresponds to distance and direction.
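The selection rule described above can be sketched as follows. This is a minimal illustration, not the implementation of the player: the function name `files_to_play`, the Euclidean metric, and the `select_radius` value standing in for the "predetermined distance" are assumptions.

```python
import math

def files_to_play(user_pos, file_points, select_radius=0.5, k=5):
    """Decide which files are reproduced for the user's position Pu.

    Returns ("stereo", [name]) when one file is within select_radius of Pu,
    otherwise ("spatial", names) with the k files closest to Pu.
    """
    xu, yu = user_pos
    # Distance from Pu to each point Pi.
    dist = {name: math.hypot(x - xu, y - yu)
            for name, (x, y) in file_points.items()}
    ranked = sorted(dist, key=dist.get)

    # A file at (or near enough to) the user's position counts as selected.
    if ranked and dist[ranked[0]] <= select_radius:
        return "stereo", ranked[:1]
    # Otherwise the k nearest files are reproduced with spatial weighting.
    return "spatial", ranked[:k]
```

With k set to 5, moving away from a selected file brings back the mixed reproduction of the five nearest files.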

One variant implements a "select" key on keypad 5.8 of player 5.1. When the user presses this key, the program selects the audio file whose point is actually closest to the user's point (xu, yu) and reproduces it to the exclusion of any other file. The position (xu, yu) is stored in memory, so that pressing the "select" key a second time returns to the previous state, in which the 5 audio files closest to position (xu, yu) are reproduced.

We now describe improvements that help the user navigate in the acoustic space.

The 5 files closest to the point associated with the user are also close to one another in auditory terms, so it is not easy for the user to determine, for example, an axis of movement as a function of a particular type of music. A first improvement consists in determining groups of audio files that are acoustically related, and in reproducing one or more so-called "representative" files of each group. The groups can be determined as described above, for example by comparing the values contained in the descriptors of the audio files, whether these values are downloaded or computed locally, and by grouping together the files whose values are close.
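Grouping files whose descriptor values are close can be sketched as follows. The description does not name a particular clustering algorithm, so the greedy, threshold-based scheme below is purely an assumption chosen for brevity, as are the names `group_by_closeness` and `max_distance`.

```python
import math

def group_by_closeness(descriptors, max_distance):
    """Greedy grouping: a file joins the first group whose seed is close enough.

    descriptors maps a file name to a tuple of descriptor values.
    The first file placed in a group acts as that group's seed.
    """
    groups = []
    for name, values in descriptors.items():
        for group in groups:
            seed_values = descriptors[group[0]]
            if math.dist(values, seed_values) <= max_distance:
                group.append(name)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append([name])
    return groups
```

The result depends on the order in which files are visited; a production system would more likely use an established clustering method.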

In a particularly simple calculation, the representative of a group is the audio file whose point is closest to the center of the cloud of points of the audio files of that group. Its identifier is an audio content. According to one variant, the representative is a succession of files of the group, or of extracts of those files. The identifier is then the sound content formed by successively reproducing an extract of each file of the group, each extract being reproduced for 10 seconds, for example. The extracts are reproduced in a loop. According to another variant, the program produces a synthetic sound calculated from the mean values of the low-level parameters characterizing the audio files of the group.
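The simple calculation of a representative can be sketched as follows, assuming that the "center" of the cloud of points is the unweighted mean of the points and that distances are Euclidean. The function name `representative` is illustrative.

```python
import math

def representative(group_points):
    """Return the name of the file whose point is closest to the group's centroid.

    group_points maps a file name to its coordinates (a tuple of equal length).
    """
    names = list(group_points)
    count = len(names)
    dims = len(group_points[names[0]])

    # Unweighted mean of the points of the group, one coordinate at a time.
    centroid = tuple(
        sum(group_points[name][d] for name in names) / count
        for d in range(dims)
    )
    return min(names, key=lambda name: math.dist(group_points[name], centroid))
```

For the points (0, 0), (1, 0) and (5, 0), the centroid is (2, 0) and the file at (1, 0) is chosen.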

Files are assigned to the groups thus determined by adding a new column to the descriptor array of Fig. 2; this column contains a number identifying the group to which each file belongs. In Fig. 6, four groups are identified by outlines. When the user wishes to navigate by group, he presses a button of the player called "group". In the example shown in the figure, the four files that best represent each group are then reproduced (these four files are shown with bold outlines in Fig. 6). This navigation mode is stopped by pressing the "group" button again. By first navigating from group to group, the user quickly selects the type of audio content he wants; then, by stopping this mode, he navigates within that group from one nearby file to the next. By actuating a rotary element arranged on keypad 5.8, the user remains at the same point Pu of the acoustic space and changes the direction indicated by the arrow in Fig. 6. Thus, while staying at that point, the user can search for a direction of movement, stop rotating when he perceives the desired type of music in front of him, and then move in that direction.

A variant of the "group" key consists in using the speed of movement as the means of selecting the navigation mode and the way in which groups are computed. The user moves by pressing the four direction keys; when he presses a key for a long time, or repeatedly and rapidly, the program considers that the user wants to increase the speed of movement. A single short press returns to the normal speed of movement. One variant implements a thumbwheel on keypad 5.8, allowing the user to set the speed finely. When the user moves rapidly, the program builds a few large groups. These groups contain many songs, and the representative that the user hears necessarily gives only an approximate idea of the content of the group. If the user slows down, the program builds smaller groups, allowing the user to make a finer selection. In this case it is not necessary to compute groups for the entire song collection, but only those in the user's neighborhood. Because these groups are defined more finely, the representative is more faithful to the content of the group. When the speed is minimal, only the closest files are reproduced, so that navigation from one nearby file to the next is regained.
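The relation between movement speed and grouping granularity can be sketched as follows. The thresholds `slow` and `fast`, the linear scaling of the group radius, and the function name `navigation_plan` are all hypothetical; the description only states that faster movement yields fewer, larger groups and that minimal speed restores file-by-file navigation.

```python
def navigation_plan(speed, slow=1.0, fast=3.0, radius_per_speed=0.2):
    """Map a movement speed to a navigation mode and a grouping scope.

    Returns a dict with:
      mode         - "file" (nearest files only) or "group"
      group_radius - size parameter of the groups (None in file mode)
      scope        - which files are grouped
    """
    if speed <= slow:
        # Minimal speed: no grouping, the closest files are reproduced.
        return {"mode": "file", "group_radius": None, "scope": "nearest files"}

    radius = radius_per_speed * speed  # faster movement, larger groups
    if speed < fast:
        # Moderate speed: small groups, computed around the user only.
        return {"mode": "group", "group_radius": radius, "scope": "neighborhood"}
    # Rapid movement: a few large groups over the whole collection.
    return {"mode": "group", "group_radius": radius, "scope": "whole collection"}
```

A thumbwheel, as in the variant above, would simply supply a continuous `speed` value to such a function.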

Although the present invention has been described with reference to the specific embodiments illustrated, it is not limited to these embodiments, but only by the appended claims. It should be noted that changes or modifications may be made by those skilled in the art.

Claims

1. A method of reproducing audio files in a reproducing device, characterized in that it comprises the following steps:

- determining at least one audio file representing each group,

- positioning a plurality of audio files in a space, the position of an audio file depending on at least one characteristic of said file, the user occupying a position in said space,

2. The reproducing method according to claim 1, characterized in that it comprises a step of entering a command for navigating by group, each command triggering the reproduction of at least one identifier representing the group that is highlighted graphically.

3. The reproducing method according to claim 1 or 2, characterized in that it comprises a step of entering a command for starting the reproduction of the audio files of the group whose identifier is being reproduced, the audio files being reproduced in a predetermined order.

4. The reproducing method according to any one of the preceding claims, characterized in that said determining step comprises a step of representing said files in a space whose dimension equals the number of audio parameters, each file being associated with a point placed in this space, a file of a group being determined as the representative of this group depending on the distance between the isobarycenter of the points associated with the files of said group and the point associated with this file.

5. The reproducing method according to claim 4, characterized in that said representing step comprises a step of projecting the points associated with the files of the group onto a space of predetermined dimension having audio parameters as coordinates, the distance between the isobarycenter of the group and the point associated with a file being calculated in the projected space.

6. The reproducing method according to claim 4 or 5, characterized in that the files representing a group are associated with a plurality of points located at predetermined distance intervals from the isobarycenter of the points of the files of said group.

7. The method according to any one of claims 4 to 6 when dependent on claim 3, characterized in that the predetermined order of reproduction of the files of said group starts with the file whose point is closest to said isobarycenter, followed by the files located farther away.

8. The reproducing method according to any one of claims 1 to 3, characterized in that the file representing a group has low-level parameters whose values are close to the mean values of the files of said group.

9. The method according to any one of the preceding claims, characterized in that, if several files represent one group, each of said files is reproduced in turn for a predetermined period.

10. The method according to any one of the preceding claims, characterized in that it comprises a step in which the reproducing device receives values of audio parameters, these values being used in said partitioning step and in the step of determining the file representing a group.

11. The method according to any one of the preceding claims, characterized in that the reproduced identifier is of a sound nature.

12. The reproducing method according to any one of claims 1 to 10, characterized in that the reproduced identifier is of a graphical nature.

13. An audio file reproducing device (1; 5.1) comprising means (8, 5.8) for entering commands, characterized in that it further comprises:

calculating means (3, 12, 5.3, 5.12) for partitioning the files into groups of files having at least one similar acoustic characteristic;

means (3, 12; 5.3, 5.12) for determining at least one file representing each group;

means (3, 12, 5.3, 5.12) for calculating positioning data associated with each file in a space, said data being determined by at least one characteristic specific to said file, positioning data also being assigned to the position of the user in said space;

means (3, 12, 5.3, 5.12) for selecting at least one file representing a group, the selected file or files being located in the space at a distance from the user's position that is less than a predetermined distance;

means (10, 11; 5.1, 5.11) for reproducing at least one identifier of at least one file representing a group.

14. The audio file reproducing device according to claim 13, characterized in that it further comprises means (8; 5.8) for entering group-mode navigation commands, each command triggering the reproduction of at least one identifier representing the group that is highlighted graphically.

15. The audio file reproducing device according to claim 13 or 14, characterized in that it comprises means (8; 5.8) for entering a command for starting the reproduction, by the reproducing means (10, 11; 5.1, 5.11), of the audio files of the group whose identifier is being reproduced, the audio files being reproduced in a predetermined order.

16. The audio file reproducing device according to any one of claims 13 to 15, characterized in that it comprises means (11) for representing said files in a space whose dimension equals the number of audio parameters, each file being associated with a point placed in this space, the means (3, 12; 5.3, 5.12) for determining at least one file representing a group taking into account the distance between the isobarycenter of the points associated with the files of said group and the point associated with this file.

17. The audio file reproducing device according to claim 16, characterized in that the means (3, 12; 5.3, 5.12) for determining at least one file representing a group select the file or files whose point is closest to the isobarycenter of the points of the files of said group.

18. The audio file reproducing device according to claim 16 when dependent on claim 15, or according to claim 17, characterized in that the reproducing means (10, 11; 5.1, 5.11) start reproducing files with the file whose point is closest to said isobarycenter of the points associated with the files of said group, and then reproduce the files farther from the isobarycenter.

19. The audio file reproducing device according to any one of claims 13 to 18, characterized in that the means (3, 12; 5.3, 5.12) for determining at least one file representing a group select the file or files whose acoustic characteristic values are close to the mean values of the files of said group.

20. The audio file reproducing device according to any one of claims 13 to 19, characterized in that, if the determining means (3, 12; 5.3, 5.12) have selected several files representing one group, the reproducing means (10, 11; 5.1, 5.11) reproduce each identifier of the selected files in turn for a predetermined period.

21. The audio file reproducing device according to any one of claims 13 to 20, characterized in that the reproducing means (10, 11; 5.1, 5.11) reproduce at least one sound identifier.

22. The audio file reproducing device according to any one of claims 13 to 20, characterized in that the reproducing means (10, 11; 5.1, 5.11) reproduce at least one graphical identifier.