CA2467466A1  System and method for compressing and reconstructing audio files  Google Patents
 Publication number
 CA2467466A1
 Authority
 CA
 Canada
 Prior art keywords
 frequency
 value
 compression
 data
 audio signal
 Prior art date
 Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
 Abandoned
Classifications

 H—ELECTRICITY
 H03—BASIC ELECTRONIC CIRCUITRY
 H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
 H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
 H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
 H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
 H03M7/4006—Conversion to or from arithmetic code

 G—PHYSICS
 G10—MUSICAL INSTRUMENTS; ACOUSTICS
 G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
 G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
 G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition

 G—PHYSICS
 G10—MUSICAL INSTRUMENTS; ACOUSTICS
 G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
 G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques

 G—PHYSICS
 G11—INFORMATION STORAGE
 G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
 G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
 G11B20/00007—Time or data compression or expansion

 G—PHYSICS
 G11—INFORMATION STORAGE
 G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
 G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
 G11B20/00007—Time or data compression or expansion
 G11B2020/00014—Time or data compression or expansion the compressed signal being an audio signal

 G—PHYSICS
 G11—INFORMATION STORAGE
 G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
 G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
 G11B20/10—Digital recording or reproducing
 G11B20/10527—Audio or video recording; Data buffering arrangements
 G11B2020/10537—Audio or video recording
 G11B2020/10546—Audio or video recording specifically adapted for audio data
Abstract
Description
SYSTEM AND METHOD FOR COMPRESSING AND RECONSTRUCTING AUDIO FILES
Field of the Invention
This invention relates to file compression. In particular, this invention relates to a system and method for compressing and reconstructing audio files.
Background of the Invention
Compression/decompression (codec) algorithms are used to compress digital files, including text, images, audio and video, for easier storage and faster transmission over network connections. Basic compression involves removing redundant data, leaving enough data to reconstruct the file during decompression with the desired degree of accuracy, or 'tolerance.' If the original uncompressed file has a higher resolution than is required for the end use, much non-redundant data can be eliminated because it is unnecessary in the decompressed file; for example, where a high resolution image is only needed for display on a computer monitor (which typically has relatively low resolution), the image file can lose a lot of data without sacrificing the quality of the final image.
Similarly, in the case of most audio files, some data that is not redundant can nevertheless be eliminated during compression because some of the frequencies represented by the data are either not perceivable or not discernible by the human ear.
The psychoacoustic characteristics of the human ear are such that ultrahigh and ultralow frequencies are beyond its perceptual capabilities, and tones that are very close together in pitch are often not discernible from one another, so the human ear perceives only a single tone anyway. Codecs which take advantage of this phenomenon, including the very popular MP3 codec, are known as "perceptual audio codecs." Such a codec analyzes the source audio, compares it to psychoacoustic models stored within the encoder, and discards data that falls outside the models.
With the widespread use of perceptual audio codecs, the problem of high frequency reconstruction has become of great importance. When digitally encoding an audio signal, the high frequency portion of the signal occupies a disproportionately large part of the encoded bit stream. To faithfully capture the high frequency content in the encoded signal, a very large amount of data would be required to accurately represent the original, uncompressed audio signal.
One known method of increasing the compressibility of the encoded signal is to take advantage of the correlation between the high and low frequency components.
Since these two components are correlated, it is possible to filter out the high frequency component at the encoder, transmit only the low frequency component and reconstruct the high frequency component at the decoder to generate an approximation of the original audio signal. Including additional information that describes the correlation between the low and high frequency components with the transmitted low frequency component enables a more faithful reconstruction of the original audio signal.
Lossless compression in MP3 and other like compression formats uses the Huffman algorithm for frame compression of signal data. These techniques have proved to be very popular, since they are able to achieve significant compression of the original audio signal while retaining the ability to produce a reasonably accurate representation of the original signal.
The number of bits allotted to storing each interval (e.g. each second) of sound sets a 'tolerance' level that determines the fidelity of the decompressed audio file. Techniques that rely on this are known as "lossy" compression techniques, because data is lost in the compression/decompression process and the fidelity of the reconstructed file depends upon how much data was lost.
The earliest examples of successful high frequency reconstruction in lossy audio encoding are the MP3Plus and AACPlus standards. Both techniques are based upon a patented spectral bandwidth replication technique. The problem with this approach is that for highly harmonic signals the high frequency content is not always harmonically correlated to the low frequency content. Thus, special treatment of harmonic signals is required. Tonality control is also missing from this approach.
An alternative method is high-frequency reconstruction by linear interpolation.
One example of this technique is PlusV™, a completely parametric approach by VLSI Solutions. This method reconstructs high frequencies using a two-part harmonic plus noise model. The original audio signal is sent to an encoder. The encoder extracts up to the four most prominent harmonic components, identified as peaks in the short-time magnitude spectrum, and encodes their parameters. The remaining high frequency component in the audio signal is considered to be noise. The high frequency component is encoded by parametrization of its amplitude envelopes in eight frequency bands. The encoded signal consists of only the low frequency component of the original signal and the noise model parameters identified by the encoder. In order to extract a reconstructed signal from the compressed signal, the decoder unpacks the parametric data and reconstructs the high frequencies by generating the corresponding harmonic and noise components of the high frequency signal, without relying on the low frequency component of the audio signal.
Another approach to high frequency reconstruction has been described by Liu, Lee, Hsu, National Chiao Tung University, Taiwan, 200 : "High Frequency Reconstruction by Linear Extrapolation", which is incorporated herein by reference.
Liu et al. suggest using spectral replication, copying filterbank coefficients to generate a "detail spectrum" of the high frequency component of the audio signal, followed by the application of a linearly decaying amplitude envelope, with least-mean-squares estimation of the decay slope from the existing low frequency component.
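The replication-plus-envelope idea attributed to Liu et al. can be illustrated with a minimal sketch. It is not taken from the cited paper: the function names, the use of a dB-domain envelope, the wrap-around copy of low-band coefficients and the clamp that refuses a rising slope are all assumptions made for illustration. The sketch also assumes the copied block follows the fitted line, so that moving a coefficient up by a given number of bins changes its level by the slope times that distance.

```python
import math

def fit_line(ys):
    """Ordinary least-squares fit y = slope * x + intercept over x = 0..len(ys)-1.

    Requires at least two points.
    """
    n = len(ys)
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def extrapolate_high_band(low, n_high, eps=1e-12):
    """Copy low-band coefficients upward under a linearly decaying dB envelope.

    Hypothetical helper: `low` holds the known low-band coefficients and
    `n_high` is the number of high-band coefficients to generate.
    """
    env_db = [20.0 * math.log10(abs(c) + eps) for c in low]
    slope, _ = fit_line(env_db)
    slope = min(slope, 0.0)  # assumption: never let the envelope rise
    n_low = len(low)
    high = []
    for k in range(n_high):
        src = k % n_low            # index of the replicated coefficient
        shift = n_low + k - src    # distance the copy moves up, in bins
        gain = 10.0 ** (slope * shift / 20.0)
        high.append(low[src] * gain)
    return high
```

For a low band that already decays by exactly 1 dB per bin, the generated coefficients continue the same line, which is a convenient sanity check on the slope estimate.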
Problems with this approach include the absence of tonality control, non-harmonicity of the restored audio, possible inadequacy of the replicated spectrum block, and the possibly increasing slope of the amplitude envelope.
Ultimately, however, all these techniques are limited in their ability to compress an audio signal since they do not account for temporal relations in the audio signal. Thus, the compressed signal inevitably retains substantial redundancies which must be stored in order for the algorithm to reproduce a reasonably accurate representation of the original uncompressed file.
There is accordingly a need for a compression and reconstruction scheme that accommodates temporal relations in an audio signal to increase compression ratios and improve the accuracy of reconstructed audio signals. There is a further need for a method of reconstruction that can be used on old archived sound files, to reconstruct the high frequency component of the file that had been lost due to limits in recording technology or storage media existing at the time the file was created.
Summary of the Invention
The present invention provides a system and method for the improved compression of audio signals. The present invention also provides a system and method for the improvement of perceptual sound quality for audio recordings that are missing high frequency content, for example due to limitations in the storage medium or where the recording was compressed by a lossy audio data compression technique.
The method of the invention may thus be used to restore and enhance previously archived audio recordings, where the high frequency component of the original signal has been lost due to limitations in the recording hardware or storage media.
The compression method of the invention is capable of being fully automated and does not require pre-initialization for different types of audio data. It is sufficiently flexible to adjust for different sizes of spectral audio data, permitting it to be used with different spectral transforms. The method of the invention is optimized for integer arithmetic, improving calculation efficiency and extending the range of devices in which the method can be applied, e.g. sound players, dictating machines, cellular phones, wired phones, radio receivers and transmitters, and the sound tracks of television receivers and transmitters.
In the present invention, context modeling is applied to all of the main types of audio data: Modified Discrete Cosine Transform (MDCT) coefficients, scalefactors, and side information. The invention comprises applying context modeling to the data stream and constructed algorithmic models, and the algorithmic optimization of a decoder function. The invention is based upon the use of adaptive arithmetic compression techniques involving increasing the probability of the value coded.
Methods of context modelling are used for choosing the table of probability for arithmetic compression.
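The modelling side of adaptive arithmetic compression can be sketched as follows. This is only an illustration of the general technique, not the invention's coder: the class name, the uniform initial counts, the increment of one per coded value and the `context_of` callback are assumptions. The sketch keeps one frequency table per context, raises the probability of each value after it is coded, and reports the ideal code length that an arithmetic coder driven by these probabilities would approach.

```python
import math

class AdaptiveContextModel:
    """One adaptive frequency table per context (hypothetical illustration)."""

    def __init__(self, n_symbols, n_contexts):
        # Every table starts uniform: each symbol has a count of 1.
        self.counts = [[1] * n_symbols for _ in range(n_contexts)]

    def probability(self, context, symbol):
        table = self.counts[context]
        return table[symbol] / sum(table)

    def update(self, context, symbol, step=1):
        # Increase the probability of the value that was just coded.
        self.counts[context][symbol] += step

def ideal_code_length(model, symbols, context_of):
    """Sum of -log2(p) over the stream, updating the model after each symbol.

    `context_of` maps the previously coded symbol to a table index.
    """
    bits = 0.0
    previous = 0
    for s in symbols:
        ctx = context_of(previous)
        bits += -math.log2(model.probability(ctx, s))
        model.update(ctx, s)
        previous = s
    return bits
```

With four symbols, a fixed table would spend two bits per symbol; on a repetitive stream the adaptive tables spend progressively less, which is the effect the context choice is meant to strengthen.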
In the preferred embodiment of the system and method of the invention, different context models are applied to increase the compression ratio of spectral information, quantization coefficients and other information. The spectral data is divided into five frequency bands (0..31, 32..63, 64..127, 128..255, 256..575), each band corresponding to a different frequency range, and the last ten values for each frequency (statistics) for each band are independently obtained. Compression of spectral data uses the prediction of coefficient values by several preceding frames of audio data by calculating the mean value of the last ten values of MDCT coefficients.
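A minimal sketch of the band split and the ten-frame mean follows. The band limits and the history length of ten come from the text above; the class and function names, the default of 576 coefficients per frame, the use of signed values rather than magnitudes, and the floor division used to stay in integer arithmetic are assumptions.

```python
from collections import deque

# Band limits as listed in the text (inclusive coefficient indices).
BANDS = [(0, 31), (32, 63), (64, 127), (128, 255), (256, 575)]
HISTORY_LENGTH = 10

def band_index(k):
    """Return the index of the band that contains coefficient k."""
    for b, (lo, hi) in enumerate(BANDS):
        if lo <= k <= hi:
            return b
    raise ValueError("coefficient index out of range: %d" % k)

class MeanPredictor:
    """Keeps the last ten values of every coefficient (hypothetical helper)."""

    def __init__(self, n_coeffs=576, history=HISTORY_LENGTH):
        self.history = [deque(maxlen=history) for _ in range(n_coeffs)]

    def predict(self, k):
        """Integer mean of the stored values for coefficient k (0 if none)."""
        past = self.history[k]
        if not past:
            return 0
        # Floor division keeps the arithmetic in integers; it rounds
        # negative means toward minus infinity.
        return sum(past) // len(past)

    def update(self, frame):
        """Record one frame of quantized MDCT coefficients."""
        for k, value in enumerate(frame):
            self.history[k].append(value)
```

Both encoder and decoder can maintain the same history from already decoded frames, so the prediction itself would not need to be transmitted.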
Preferably context models and arithmetic compression are used for final compression. The filtered value of the Nth MDCT coefficient is compared with the largest value of all MDCT coefficients in the band, to which N belongs. The largest value of all MDCT coefficients in the band is obtained from the first iteration. The ratio of those values determines the number of tables used for arithmetic compression.
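One way to read the ratio rule is as a selector that maps the ratio of the filtered value to the band maximum onto a table index. The sketch below follows that reading, which is an interpretation rather than something the text states: the function name, the default of eight tables and the uniform quantization of the ratio are all assumptions. It uses integer arithmetic only, in keeping with the stated optimization.

```python
def select_table(filtered_value, band_peak, n_tables=8):
    """Map |filtered_value| / band_peak onto a table index in 0..n_tables-1.

    Hypothetical mapping: the ratio is quantized uniformly, and a band with
    no energy (band_peak <= 0) always uses table 0.
    """
    if band_peak <= 0:
        return 0
    magnitude = min(abs(filtered_value), band_peak)
    return min(n_tables - 1, magnitude * n_tables // band_peak)
```

For example, a filtered value at half of the band maximum selects table 4 of 8, while a value equal to the maximum selects the last table.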
The invention can be directly applied to spectral data of various characteristics and spectral bands of various frequencies. This includes data obtained by standard algorithms, such as MPEG-1 Layer 3 and MPEG-4 AAC, as well as new compression algorithms.
In the preferred embodiment a rough estimate of the high frequency component is performed by applying a multiband distortion effect, waveshaping, to the low frequency content. This enables the proper harmonic structure, i.e. overtones of the low frequency component, to be recreated in the reconstructed high frequency component. Control of tonality is achieved by means of varying the number of bands within the multiband framework. More bands lead to less intermodulation distortion, and hence greater tonality.
The use of waveshaping functions, such as Chebychev polynomials, ensures that the number of generated harmonics is limited and no aliasing occurs. A filterbank is used that roughly shapes the reconstructed high frequency component according to an estimation of the most probable shape, performed using only the information extracted from the low frequency component without considering additional information.
To ensure accurate reconstruction of the high frequency component, the time-frequency amplitude envelope and degree of tonality parameters are extracted from the low frequency component.
In one aspect the present invention provides a method for compressing an audio signal, comprising the steps of: a. dividing spectral data corresponding to the audio signal into a plurality of frequency bands, each band corresponding to a different frequency range; b. obtaining a plurality of the last Modified Discrete Cosine Transform (MDCT) coefficients corresponding to the spectral data for each frequency for each band; and c. compressing the spectral data using a prediction of coefficient values in a plurality of frames of audio data by calculating a mean value of the plurality of last MDCT coefficients.
In a further aspect the present invention provides a method for increasing a compression ratio in the compression of an audio signal, comprising compressing scalefactors using the MTF-3 method.
In a further aspect the present invention provides a method of reconstructing an audio signal from a set of compressed audio data corresponding to an original audio signal, comprising the steps of: a. time-frequency decomposing the compressed audio data; b. estimating parameters from the audio data, comprising at least an amplitude envelope estimated from a modulus of a first set of corresponding filterbank coefficients and a tonality estimated from a magnitude spectrum of a second set of corresponding filterbank coefficients; and c. synthesizing high frequency components of the audio signal by: i) dividing the audio data into several frequency bands, ii) passing each frequency band through a nonlinear waveshaping distortion effect to generate distorted frequency bands, and iii) summing the distorted frequency bands to form an estimate of the high frequency components.
Brief Description of the Drawings
In drawings which illustrate by way of example only a preferred embodiment of the invention,
Figure 1 is a block diagram showing the MDCT compression scheme.
Figures 2A to 2F are plots illustrating the dependencies of the sum of signs on the number of the series.
Figure 3 is a flow chart showing the sign prediction method used in the invention.
Figure 4 is a flow chart showing the method used to determine the count0 boundary.
Figure 5 is a flow chart showing the method of determining the optimal ESC value.
Figure 6 is a flow chart showing the employment of the general statistic gained at the first iteration in magnitude prediction.
Figure 7 is a flow chart showing the method of coding the general statistic gained at the first iteration.
Figure 8 is a flow chart showing the implementation of scalefactors in the invention.
Figure 9 is a flow chart showing the dispersion calculation.
Figure 10 is a flow chart showing the low frequency filtering by means of a recursive filter.
Detailed Description of the Invention
Some components of the present invention are based upon an extension of known algorithms of arithmetic compression techniques, for example as described in the following US patents, all of which are incorporated herein by reference:
4,122,440 Langdon, Jr.; Glenn George (San Jose, CA); Rissanen; Jorma Johannen (San Jose, CA), Method and means for arithmetic string coding, October 24, 1978;
4,286,256 Langdon, Jr.; Glen G. (San Jose, CA); Rissanen; Jorma J. (Los Gatos, CA), Method and means for arithmetic coding utilizing a reduced number of operations, August 25, 1981;
4,295,125 Langdon, Jr.; Glen G. (San Jose, CA), Method and means for pipeline decoding of the high to low order pairwise combined digits of a decodable set of relatively shifted finite number of strings, October 13, 1981;
4,463,342 Langdon, Jr.; Glen G. (San Jose, CA); Rissanen; Jorma J. (Los Gatos, CA), Method and means for carry-over control in the high order to low order pairwise combining of digits of a decodable set of relatively shifted finite number strings, July 31, 1984;
4,467,317 Langdon, Jr.; Glen G. (San Jose, CA); Rissanen; Jorma J. (Los Gatos, CA), High-speed arithmetic compression coding using concurrent value updating, August 21, 1984;
4,633,490 Goertzel; Gerald (White Plains, NY); Mitchell; Joan L. (Ossining, NY), Symmetrical optimized adaptive data compression/transfer/decompression system, December 30, 1986;
4,652,856 Mohiuddin; Kottappuram M. A. (San Jose, CA); Rissanen; Jorma J. (Los Gatos, CA), Multiplication-free multi-alphabet arithmetic code, March 24, 1987;
4,792,954 Arps; Ronald B. (San Jose, CA); Karnin; Ehud D. (Kiriat-Motzkin, IL), Concurrent detection of errors in arithmetic data compression coding, December 20, 1988;
4,891,643 Mitchell; Joan L. (Ossining, NY); Pennebaker; William B. (Carmel, NY), Arithmetic coding data compression/de-compression by selectively employed, diverse arithmetic coding encoders and decoders, January 2, 1990;
4,901,363 Toyokawa; Kazuharu (Yamato, JP), System for compressing bi-level data, February 13, 1990;
4,905,297 Langdon, Jr.; Glen G. (San Jose, CA); Mitchell; Joan L. (Ossining, NY); Pennebaker; William B. (Carmel, NY); Rissanen; Jorma J. (Los Gatos, CA), Arithmetic coding encoder and decoder system, February 27, 1990;
4,933,883 Pennebaker; William B. (Carmel, NY); Mitchell; Joan L. (Ossining, NY), Probability adaptation for arithmetic coders, June 12, 1990;
4,935,882 Pennebaker; William B. (Carmel, NY); Mitchell; Joan L. (Ossining, NY), Probability adaptation for arithmetic coders, June 19, 1990;
5,045,852 Mitchell; Joan L. (Ossining, NY); Pennebaker; William B. (Carmel, NY); Rissanen; Jorma J. (Los Gatos, CA), Dynamic model selection during data compression, September 3, 1991;
5,099,440 Pennebaker; William B. (Carmel, NY); Mitchell; Joan L. (Ossining, NY), Probability adaptation for arithmetic coders, March 24, 1992;
5,142,283 Chevion; Dan S. (Haifa, IL); Karnin; Ehud D. (Kiryat Motzkin, IL); Walach; Eugeniusz (Kiryat Motzkin, IL), Arithmetic compression coding using interpolation for ambiguous symbols, August 25, 1992;
5,210,536 Furlan; Gilbert (San Jose, CA), Data compression/coding method and device for implementing said method, May 11, 1993;
5,414,423 Pennebaker; William B. (Carmel, NY), Stabilization of probability estimates by conditioning on prior decisions of a given context, May 9, 1995;
5,546,080 Langdon, Jr.; Glen G. (Aptos, CA); Zandi; Ahmad (Cupertino, CA), Order-preserving, fast-decoding arithmetic coding arithmetic coding and compression method and apparatus, August 13, 1996.
The present invention also makes use of data context modeling methods that have recently been developed, the best known application of context modeling being the Context-based Adaptive Binary Arithmetic Coding (CABAC) algorithm, as implemented in the MPEG-4 AVC standard, which is incorporated herein by reference.
Definitions
For purposes of this description the following definitions are provided:
"Sign" is the sign of a Modified Discrete Cosine Transform (MDCT) coefficient.
"Magnitude" is the magnitude of an MDCT coefficient.
"count0" is the region of zero MDCT coefficients (coded by storing only the boundary of this region).
"count0 boundary" is the left boundary of the count0. It is equal to the last nonzero MDCT coefficient position plus one.
"Delta-coding" is storing the difference between a current value and the previous one. A standard implementation has redundancy, concerned with doubling the range of the value to be coded. E.g. if the value to be coded has the range (a...b), the difference between the value to be coded and the value previously coded has the range (a-b...b-a), which is twice as wide.
"Arithmetic compression" is the method of coding based on dividing the unit interval into sections whose lengths are proportional to the probabilities of the values to be coded.
"Accumulated frequency of the value" is a number which indicates how many times the value was previously met.
"Recursive filter" is a filter based on summation of previous values weighted by exponentially decreasing coefficients. To avoid a large amount of summations and multiplications, the recursive filter calculates the current filter value using the linear combination of the previous filter result and the current value to be filtered.
"Scalefactor" is the value needed to rescale the MDCT coefficients. The rescaling is implemented by the following equations for short and long blocks, respectively:
$$\mathrm{MDCT\_rescaled} = \mathrm{sign} \cdot |\mathrm{MDCT}| \cdot 2^{(\mathrm{global\_gain} - 210 - 8\,\mathrm{subblock\_gain})/4} \cdot 2^{-(\mathrm{scalefac\_multiplier} \cdot \mathrm{scalefactor\_s})}$$
$$\mathrm{MDCT\_rescaled} = \mathrm{sign} \cdot |\mathrm{MDCT}| \cdot 2^{(\mathrm{global\_gain} - 210)/4} \cdot 2^{-(\mathrm{scalefac\_multiplier} \cdot (\mathrm{scalefactor\_l} + \mathrm{preflag} \cdot \mathrm{pretab}))}$$
n is the normalizing multiplier for MDCT data.
"Band" is the group of frequency values in one or several frames (for example with indices: 0..31, 32..63, 64..127, 128..255, 256..575).
"Series" is the set of values in different frequencies, but in one frame.
"Columns" is the set of values in one frequency but in different frames.
"ESC-symbol" is the value chosen so that all values larger than it are assumed to have approximately identical probability.
"ESC-sequence" is the sequence of the ESC-symbol and the difference between the current value and the ESC-symbol.
"ESC-sequence coding" is the method of coding values with small probability, which consists of coding the ESC-symbol and the difference between the value to be coded and the ESC-symbol.
"Table" is the table of probabilities of the value to be coded. The probability table can be changed during the coding process. The more symbols that were coded using the table, the larger is the probability of this value.
"Statistics" is the last values stored in a buffer.
"The MTF-3" (Move To Front, 3 last values) is a method of coding by which the last three different values coded are remembered and placed in a stack, then coded with the probability of the location where they are stored plus their own probability, while all other values are coded with their own probabilities.
"Aggressiveness" is the parameter that shows the frequency of table rescaling.
"Symbol" is the value to be coded, e.g. scalefactor, MDCT coefficient etc.
"Binary code" is a code which consists of 0 and 1. This code needs N bits for encoding 2^N different values. The value can be calculated as $\mathrm{value} = \sum_i n_i 2^i$, where $n_i$ is the i-th bit of the code.
"Unary code" is a code which consists of N symbols "1" and one symbol "0" at the end (if the value isn't the largest; for the largest value the last "0" isn't necessary). This code needs from 1 to N bits for encoding N different values. The value can be calculated as $\mathrm{value} = \sum_i n_i$.
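The Delta-coding, Binary code, Unary code and MTF-3 definitions above can be illustrated with a short sketch. All function and class names here are hypothetical, the bit order of the binary code (most significant bit first) is an assumption, and `RecentValues` only shows the stack bookkeeping of MTF-3, not the probability assignment.

```python
def delta_encode(values, start=0):
    """Store each value as its difference from the previous one."""
    previous, deltas = start, []
    for v in values:
        deltas.append(v - previous)
        previous = v
    return deltas


def delta_decode(deltas, start=0):
    current, values = start, []
    for d in deltas:
        current += d
        values.append(current)
    return values


def binary_code(value, bits):
    """Fixed-width binary code, most significant bit first (assumed order)."""
    return [(value >> i) & 1 for i in reversed(range(bits))]


def unary_code(value, count):
    """Unary code for value in 0..count-1; the largest value omits the final 0."""
    ones = [1] * value
    return ones if value == count - 1 else ones + [0]


def unary_value(code):
    return sum(code)  # number of 1 symbols


class RecentValues:
    """Stack of the last three different values, most recent first."""

    def __init__(self, depth=3):
        self.depth = depth
        self.stack = []

    def position(self, value):
        return self.stack.index(value) if value in self.stack else None

    def push(self, value):
        if value in self.stack:
            self.stack.remove(value)
        self.stack.insert(0, value)
        del self.stack[self.depth:]
```

Note how values in the range 0..3 produce deltas in the range -3..3, which is the doubling of range mentioned in the Delta-coding definition.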
"e" is the base of natural logarithm, e=2.718281828.
Entropy Compression
In the preferred embodiment the entropy compression stage of the invention comprises the following components:
1. MDCT data compression:
a. The method scheme;
b. The method description;
c. Sign prediction algorithm;
d. count0 boundary;
e. Magnitude prediction algorithm;
f. ESCsequence employment;
g. First iteration employment.
2. Scalefactors usage and compression.
MDCT Data Scheme
Input data have a complicated structure and consist of five parts. The first type of data is the MDCT coefficients. MDCT coefficients have the following format:
Values in the range of -8207...8207 are grouped into series of 576 values each. The number of series containing these values is limited by the use of 32-bit arithmetic. The algorithm works by the series, that is, the coding of each series is started only after all previous series are coded. Each series is divided into 5 bands as shown in Table 1; each "band" is a subset of data within the series. The division into bands does not depend upon the values, but depends only upon the place of the symbol in the series.
For example, the first band starts at the zero position and ends at the 31st position, composed of a group of values dependent upon their series position. The series positions in each band are shown in Table 1.
Table 1
Band number    First position in band    Last position in band
0              0                         31
1              32                        63
2              64                        127
3              128                       255
4              256                       575
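Because the band depends only on the position within the series, the lookup of Table 1 reduces to a search over the band start positions. The sketch below is illustrative; `band_of` and `band_width` are hypothetical helper names.

```python
from bisect import bisect_right

SERIES_LEN = 576                      # values per series
BAND_FIRST = [0, 32, 64, 128, 256]    # first position of each band (Table 1)
BAND_LAST = [31, 63, 127, 255, 575]   # last position of each band (Table 1)


def band_of(position):
    """Return the band number (0..4) for a position within a series."""
    if not 0 <= position < SERIES_LEN:
        raise ValueError("position outside the series")
    # bisect_right counts how many band starts are <= position
    return bisect_right(BAND_FIRST, position) - 1


def band_width(band):
    return BAND_LAST[band] - BAND_FIRST[band] + 1
```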
The algorithm separately treats magnitudes and signs of values, because there is no correlation between them. In encoding the sign, "0" corresponds to "+" and "1" corresponds to "-". If the magnitude is equal to 0, the sign is not written to the output stream.
if (MDCT<0) then Magnitude=-MDCT else Magnitude=MDCT
if (MDCT<0) then Sign=1 else Sign=0
MDCT Compression Method
The algorithm is based on any suitable arithmetic compression procedure.
Input data for this procedure are the following: the number of possible values for the symbol to be compressed, the table of appearance frequencies (a probability table analogue) and the sum frequency (total weight of the table). The table is generated during the compression process as described below. The coding (compression) of the data is thus reduced to the optimum table fitting for each magnitude or sign to be compressed, by implementation of arithmetical compression. The optimal table is taken in dependence on the ratio of the filtered MDCT coefficient to the maximal MDCT coefficient in the band.
The table refresh frequency is controlled by the "aggressiveness" parameter. When the sum of all accumulated frequencies in the table exceeds the aggressiveness parameter, the entire table is divided by 2 (rescaled). The "aggressiveness" parameter is constant and is fitted for better compression.
The procedure implementing the arithmetic compression calculates the left and the right ranges for the range coder, to be used for further compression. It can be called by the following string:
EncodeONEint(int a, int *Cnt, int step, int Size, int totfreq)
in which:
int a = the value to be encoded;
int *Cnt = pointer to the table of accumulated frequencies;
int step = the value which is added to the appearance frequency of the symbol after coding;
int Size = the table size;
int totfreq = the whole frequency of the table, equal to the sum of all its elements and the size of the table.
The table is pre-initialized by the information gained at the first iteration, and then the appearance frequency of the symbol to be compressed is increased each time the appropriate symbol is coded.
During the coding process the procedure uses a table of accumulated frequencies which differs from the original table by increasing each accumulated frequency by 1. This increment is implemented to avoid the possibility of a zero probability for a symbol.
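The table handling described above (step increment after coding, rescaling by 2 once the aggressiveness limit is exceeded, and the +1 offset that prevents zero probabilities) can be sketched as below. This is a simplified model under stated assumptions: the class `AdaptiveTable` and its method names are hypothetical, and the range coder itself is omitted; only the cumulative interval that a range coder would consume is computed.

```python
class AdaptiveTable:
    """Hypothetical model of one adaptive frequency table."""

    def __init__(self, size, step, aggressiveness):
        self.counts = [0] * size          # accumulated frequencies
        self.step = step                  # added to a symbol after it is coded
        self.aggressiveness = aggressiveness

    def coding_frequencies(self):
        # Every entry is raised by 1 so that no symbol has zero probability.
        return [c + 1 for c in self.counts]

    def interval(self, symbol):
        """Return (low, high, total) cumulative bounds for the symbol."""
        freqs = self.coding_frequencies()
        low = sum(freqs[:symbol])
        return low, low + freqs[symbol], sum(freqs)

    def update(self, symbol):
        self.counts[symbol] += self.step
        # Rescale: halve the whole table once its total exceeds the limit.
        if sum(self.counts) > self.aggressiveness:
            self.counts = [c // 2 for c in self.counts]
```

With this model the total returned by `interval` equals the sum of the table elements plus the table size, which matches the description of the `totfreq` argument.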
There is a restriction for the mechanism of suboptimum table fitting. The algorithm must be reproducible while decoding. The original series is transformed into the series of magnitudes and signs by the rule described above. Tables during the initialization process are filled (up to 11 items per band) by a Gaussian distribution according to the formula $P(x) = A \exp\left(-\frac{(x - x_0)^2}{2\sigma^2}\right)$, where $x_0$ and $\sigma$ are defined parameters and $A$ is the normalization coefficient. Such a distribution approximates accumulation by the file distribution tables, described below (different bands for each file of MDCT data).
Sign Prediction Algorithm
Considering the distribution of the signs of values for different columns of MDCT data, Figures 2A to 2F illustrate the dependencies of the sum of signs ("+" = +1, "-" = -1 to the sum) on the number of the series (designated "row number" in Figures 2A to 2F) for 576 columns. Each plot corresponds to a real melody.
The independent behaviour of the first sign column is clear in these plots. For most melodies the first sign column generally has the same value for all series, so this sign column is coded separately from the other columns using the compression algorithm.
Other sign columns behave chaotically. There is a slight dependence of the signs in these columns on the sequence of previous signs in the same column.
The table number for sign compression depends upon the sequence of previous signs, as shown in Table 2 and Figure 3.
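The dependence of the table number on the previous signs can be sketched as a five-bit context. The assumptions here are that the five most recent signs in the column are used, that they are read oldest first, and that "+" maps to 0 and "-" to 1; the function name is hypothetical.

```python
HISTORY = 5  # number of previous signs forming the context


def sign_table_number(previous_signs):
    """previous_signs: sign bits (0 for '+', 1 for '-'), oldest first."""
    number = 0
    for bit in previous_signs[-HISTORY:]:
        number = (number << 1) | bit   # append each sign as one binary digit
    return number
```

Under these assumptions five "+" signs select table 0 and five "-" signs select table 31, giving 32 sign tables in total.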
Table 2
Sequence of signs in column    Numerical equivalence    Table number
+ + + + +                      00000                    0
+ + + + -                      00001                    1
+ + + - +                      00010                    2
+ + + - -                      00011                    3
+ + - + +                      00100                    4
...                            ...                      ...
- - - - -                      11111                    31
"count0" boundary
The position where the last nonzero MDCT coefficient is located, plus one, is the "count0 boundary". The magnitude distribution through the series has a tendency to include high values at the start of the series and to decrease to low values at the end of the series. Therefore, it is more efficient to point to the location of the last nonzero element than to automatically include the last data points of a series in the compression, since they may all be zero. The count0 boundary is coded as the difference between the current count0 boundary and the count0 boundary in the previous series.
Because of the Delta-coding used for count0 boundary compression it is often more efficient to shift the count0 boundary artificially to the right by some number of positions and to compress all zeros between the last nonzero MDCT coefficient and the artificially shifted count0 boundary. However, storing the precise last nonzero value position can eliminate the asymmetry of such an approach. Also, coding the last nonzero value with a smaller table can eliminate the redundancy of this approach. The last nonzero value cannot be equal to zero, so the zero value probability must be zero.
This approach allows the compression ratio to be increased.
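The count0 boundary and its Delta-coding across series can be sketched as follows. The helper names are hypothetical, and the starting boundary of 0 for the first series is an assumption made for the illustration.

```python
def count0_boundary(series):
    """Position of the last nonzero coefficient plus one (0 if all zero)."""
    for position in range(len(series) - 1, -1, -1):
        if series[position] != 0:
            return position + 1
    return 0


def boundary_deltas(all_series):
    """Code each boundary as its difference from the previous series."""
    previous, deltas = 0, []
    for series in all_series:
        boundary = count0_boundary(series)
        deltas.append(boundary - previous)
        previous = boundary
    return deltas
```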
Magnitude Prediction Method
The prediction algorithm uses the filtered value of the previous MDCT coefficient in the same column (Filtered_value). The filtering is carried out by the low-frequency recursive filter. The table with which the number will be coded is selected depending on the comparison of the Filtered_value with the largest value in the band (MaxBand). The concordance coefficients K = Filtered_value / MaxBand are fixed for each band. The current coefficient distribution is illustrated in Table 3.
Table 3
Band\Table    0    1       2       3       4      5      6      7      8      9      10
Band 0        0    0.01    0.03    0.06    0.1    0.2    0.3    0.6    1.0    2.0    ......
Band 1        0    0.05    0.1     0.2     0.3    0.5    0.7    1.0    3.0    5.0    ......
Band 2        0    0.05    0.1     0.2     0.3    0.5    0.?    1.0    3.0    5.0    ......
Band 3        0    0.01    0.03    0.06    0.1    0.2    0.3    0.6    1.0    ......
The standard recursive filter is used to carry out the low-frequency filtering.
Coefficients of the recursive filter are selected to decrease the value met 7 frames previously by a factor of e:
Filtered_value[t+dt] = Filtered_value[t] * 6/7 + Last_value[t] * 1/7
When using integer values, to reduce the rounding error the Last_value is multiplied by 10:
Filtered_value[t+dt] = (Filtered_value[t] * 6 + Last_value[t] * 10)/7;
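A minimal sketch of the integer recursive filter follows. The function names are hypothetical, and floor division is assumed for the integer divide. Note that with the factor of 10 the filtered value settles near ten times a constant input, and that a weight of 6/7 per frame leaves about 0.34 of a value after 7 frames, close to 1/e.

```python
def filter_step(filtered, last_value):
    """One update of the integer filter: (filtered*6 + last_value*10) / 7."""
    return (filtered * 6 + last_value * 10) // 7


def remaining_weight(frames):
    """Weight left on a value after the given number of frames."""
    return (6 / 7) ** frames
```

For a constant input of 100 the state rises towards 1000 and stops a few units below it because of the floor division.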
The maximal values in each band are calculated during the first iteration.
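The selection of a table from the ratio of the filtered value to the band maximum can be sketched with the Band 0 coefficients of Table 3. How a ratio that falls between two coefficients maps to a table is not spelled out in the text, so the rule used here (the last table whose coefficient does not exceed the ratio) is an assumption, as are the names.

```python
from bisect import bisect_right

# Concordance coefficients for Band 0, tables 0..9, as listed in Table 3.
BAND0_COEFFICIENTS = [0, 0.01, 0.03, 0.06, 0.1, 0.2, 0.3, 0.6, 1.0, 2.0]


def select_table(filtered_value, max_band, coefficients=BAND0_COEFFICIENTS):
    """Pick the table index for K = filtered_value / max_band."""
    if max_band <= 0:
        return 0
    k = filtered_value / max_band
    # Assumed rule: last coefficient that is <= K.
    return bisect_right(coefficients, k) - 1
```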
To code a "high amplitude" frame it is desirable to use the previous value in the same frame. It is filtered by means of a low-frequency recursive filter:
Filtered_value[f+df] = (Filtered_value[f] * 4 + Last_value[f] * 10)/5;
and the filtered value is compared with the filtered value of the MDCT coefficient from the previous frame. It can be compared not only with the filtered value, but with the filtered value plus the square root of dispersion. The dispersion is calculated by the following equation:
Dispersion = <value^2> - <value>^2, where <...> is a low-frequency filtering (by means of the recursive filter).
Value_sq = Value^2;
Filtered_value_sq = Filtered_value_sq*e^(-2f) + Value_sq*(1-e^(-2f));
Filtered_value = Filtered_value*e^(-f) + Value*(1-e^(-f))
Dispersion = Filtered_value_sq - (Filtered_value^2)
After the comparison, in different cases different sets of tables are picked out.
The recursive filtering is shown in Figure 10 and the dispersion calculation is shown in Figure 9.
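The dispersion estimate, the filtered mean square minus the square of the filtered mean, can be sketched as a running computation. For simplicity this sketch uses a single smoothing weight e^(-f) for both averages, which is an assumption; the class and attribute names are hypothetical.

```python
import math


class RunningDispersion:
    """Recursive estimate of <value^2> - <value>^2."""

    def __init__(self, f):
        self.keep = math.exp(-f)   # weight given to the previous filtered value
        self.mean = 0.0
        self.mean_sq = 0.0

    def update(self, value):
        new = 1.0 - self.keep      # weight given to the incoming value
        self.mean_sq = self.mean_sq * self.keep + value * value * new
        self.mean = self.mean * self.keep + value * new
        return self.mean_sq - self.mean ** 2
```

A constant input drives the dispersion towards zero, while an input alternating between +1 and -1 drives it towards one.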
To improve the compression ratio, the mixing of tables can be implemented. If the ratio of the filtered MDCT coefficient to the maximal MDCT coefficient is not exactly the value from Table 3, a linear combination of two tables can be used for encoding. The coefficients of the linear combination are calculated as a simple linear approximation:
W1 = (Filtered_MDCT - Left_boundary)/(Right_boundary - Left_boundary)
W2 = (Right_boundary - Filtered_MDCT)/(Right_boundary - Left_boundary)
Mixed_table = Left_table*W2 + Right_table*W1
(where Left_boundary < Filtered_MDCT < Right_boundary)
To increase the compression ratio for binary data (where the data can be 0 or 1) a simple limitation of the largest and the lowest probability of 1 and 0 can be implemented. To reduce the MDCT coefficient encoding to the binary data encoding, each MDCT coefficient can be converted to binary or unary code. In fact, it is valuable to implement unary code for some small values (for example, for values 0..15).
For some larger values it is preferable to use binary code (for example, for values 16..527). It is preferable to compress the largest values with the equiprobable table (for example, for values 528..8207).
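The linear mixing of two neighbouring tables described above can be sketched in a few lines. The function name and the representation of a table as a list of weights are assumptions for the illustration.

```python
def mix_tables(left_table, right_table, left_boundary, right_boundary, ratio):
    """Blend two tables for a ratio lying between their boundary values."""
    span = right_boundary - left_boundary
    w1 = (ratio - left_boundary) / span     # weight of the right table
    w2 = (right_boundary - ratio) / span    # weight of the left table
    return [l * w2 + r * w1 for l, r in zip(left_table, right_table)]
```

The two weights always sum to one, so a ratio close to the left boundary yields a table close to the left table.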
ESC-Sequence Usage
Sometimes it is necessary to encode a large value with the table, even though there is a small probability of this value occurring. In such cases it can be more efficient to use the ESC-sequence and to write the encoded ESC-value and the difference between that value and the ESC-value to the output stream. The ESC-value is fitted dynamically for each table from the ratio
Price(ESC) + Price(Value - ESC) < Price(Value)
This inequality corresponds to the discontinuous variety of ESC-values. Thus, the selection of the optimal ESC-value is not a single-value problem. The optimal ESC-value is located between the smallest ESC-value that satisfies this inequality and the biggest ESC-value that does not satisfy this inequality. The optimal ESC-value calculation process is shown in Figure 5.
When using the arithmetic coder to code the data, high values have a low probability of occurring (they were in previous data only once, or there are no such values at all). This low probability can be inadvertently reduced to zero due to truncation error. To prevent this error, and to compress such values more effectively, ESC-coding is used. Namely, one of the possible values is taken as the ESC-symbol and all values after it and the ESC-symbol essentially are coded with the same probability. In this algorithm the probabilities of these values are added and are said to be the probability of the ESC-symbol. When a data point has a value that is greater than or equal to the ESC-symbol, the ESC-symbol is coded with the probability of the ESC-symbol and the difference of the value and the ESC-symbol (zero included) is coded by the equiprobable table. (The coding with the equiprobable table is a particular case of arithmetic compression, which uses a probability table in which all probabilities are equal.) The selection of the ESC-symbol is carried out by minimizing the integral of the function -f(x)*log(p(x)), where p(x) is the probability to be coded with, and f(x) is the estimated probability, i.e. the smoothed probability, collected through the process of coding. The smoothing is an ordinary calculation of the mean value of the probabilities of the five nearest values. However, when the highest and nonpossible values are known with high precision, ESC-coding is not necessary.
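The selection of an ESC-symbol by minimizing the expected code length can be sketched as below. This is a simplified reading under several assumptions: the table is given as raw counts, the "five nearest values" are taken as the value itself and up to two neighbours on each side, and an ESC-symbol equal to the table length means that no escape is used. All names are hypothetical.

```python
import math


def smooth(counts):
    """Estimated probability f(x): normalized mean over the nearest values."""
    means = []
    for i in range(len(counts)):
        window = counts[max(0, i - 2):i + 3]
        means.append(sum(window) / len(window))
    total = sum(means)
    return [m / total for m in means]


def expected_bits(counts, esc):
    """Sum of -f(x) * log2(p(x)) for a given ESC-symbol."""
    total = sum(counts)
    tail = sum(counts[esc:])          # pooled frequency of the ESC-symbol
    tail_size = len(counts) - esc     # values sharing the equiprobable table
    bits = 0.0
    for value, f in enumerate(smooth(counts)):
        if value < esc:
            p = counts[value] / total
        else:
            p = tail / total / tail_size
        if f > 0:
            if p == 0:
                return math.inf       # value is expected but cannot be coded
            bits -= f * math.log2(p)
    return bits


def best_esc(counts):
    candidates = range(len(counts) + 1)
    return min(candidates, key=lambda esc: expected_bits(counts, esc))
```

In this model a value whose raw count is zero but whose smoothed estimate is positive makes any larger ESC-symbol infinitely expensive, which mirrors the zero-probability problem that ESC-coding is meant to prevent.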
The First Iteration Employment
The first iteration is used for general statistic collection. This statistic is used for initialization of tables before the second iteration, for maximum detection in each band, and for detection of unused values. The general statistic is collected for each band separately (0..31, 32..63, 64..127, 128..255, 256..575), as shown in Figure 6. The general statistic can be changed during the coding. When the value is coded, the corresponding number in the general statistic (with the same value and in the same band) is decreased by 1.
The general statistic is stored in a compressed file. It contains numbers which indicate how many times each value appeared in each band. The number of series is known, so the sum of all numbers for each band can be calculated as the product of the number of series and the band width (bands have the following widths: first 32, second 32, third 64, fourth 128, fifth 320). As the number of MDCT lines is known, the last zero values in the file do not need to be stored. The 8206th value is not stored even if it is not zero, because it can be reconstructed correctly as the difference between the sum of all values (which is known) and the sum of all values except the 8206th (which were stored before). The table is compressed by arithmetic compression with four different tables for each byte of the 32-bit words of the statistic.
When decoding it is necessary to reconstruct the last zeros and the 8206th value. When storing such a statistic some redundancy is introduced in the output file. To eliminate the redundancy, unused values are excluded from the table when coding the absolute values of the MDCT coefficients.
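The idea of omitting counts that the decoder can rebuild from the known total can be sketched as follows. This is a simplified reading: the final entry of the table plays the role of the value that is not stored, trailing zeros before it are dropped, and the total is the number of series multiplied by the band width. The function names are hypothetical.

```python
def stored_counts(counts):
    """Drop the final entry, then any trailing zeros, before storing."""
    body = list(counts[:-1])
    while body and body[-1] == 0:
        body.pop()
    return body


def restore_counts(stored, table_size, series_count, band_width):
    """Rebuild the full table from the stored part and the known total."""
    total = series_count * band_width
    body = list(stored) + [0] * (table_size - 1 - len(stored))
    return body + [total - sum(body)]
```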
Scalefactors Usage and Compression
There is a redundancy of scalefactor, in that when the scalefactor is known not all MDCT coefficients are possible. For example, when the scalefactor is not the smallest, all MDCT coefficients in the band of this scalefactor cannot be small because in such a case the scalefactor would have to be smaller. So when the last value of the MDCT coefficient is coded the low values from the table can be discarded when all previous values were small. For low bit rates the scalefactor precision is artificially reduced to achieve higher compression. The context model uses not only time correlation but also frequency correlation. Preferably the method is applied in the temporal domain to increase the compression rate of scalefactors.
Frequency Reconstruction
Frequency reconstruction comprises the following components:
1. Analysis of the input signal
2. Synthesis of high frequencies
3. Analysis of generated high frequencies
4. Extrapolation of parameters
Analysis of the Input Signal
The first stage of the reconstruction of the audio file according to the method of the invention is the analysis of the sound file to be improved, or of the input audio stream, passed from the decoder. The analysis comprises two stages: time-frequency decompositions and parameter estimation.
There are two types of time-frequency decompositions that are performed during the analysis stage. The first type is the oversampled windowed Fast Fourier Transform (FFT) filterbank, which is time-frequency aligned with the filterbank used in the reconstruction phase: the size of the window is small enough (around 5 to 10 ms) to provide a sufficient time resolution. This filterbank is used for the estimation of the time-frequency amplitude envelope (described below). The second filterbank is a simple windowed FFT with a longer time window. This filterbank provides fine frequency resolution for the tonality estimation (described below).
The parameters estimated from the input audio are the time-frequency amplitude envelope and the degree of tonality. The amplitude envelope is a modulus of the corresponding filterbank coefficients, obtained from the first filterbank. The tonality is estimated from the magnitude spectrum of the second filterbank. Several tonality estimates are currently elaborated. The estimator that is preferred for use in the invention calculates the ratio of the maximal spectral magnitude value over the specified frequency range to the total energy in the specified frequency range. The higher this ratio, the higher the degree of tonality is. The frequency range used for estimation of the tonality is [F/2, F], where F is the cutoff frequency of the given audio file or of the input audio stream. The magnitude spectrum undergoes a "whitening" modification before calculation of the tonality. The purpose of the whitening modification is to increase the robustness of the estimator in case of a low degree of tonality. The whitening modification comprises multiplication of the spectral magnitude array by sqrt(f), where f is the frequency. This operation converts the pink noise spectrum into a white noise spectrum and lowers the tonality degree for the naturally nontonal pink noise.
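The preferred tonality estimator can be sketched as below. The text compares the maximal spectral magnitude with the total energy; this sketch squares the maximum so that the ratio does not depend on signal level, which is an assumption, as is the function name and the representation of the spectrum as parallel lists.

```python
import math


def tonality(magnitudes, frequencies, cutoff):
    """Peak-to-total energy ratio in [cutoff/2, cutoff] after whitening."""
    # Whitening: multiply each magnitude by sqrt(f), keeping bins in range.
    whitened = [m * math.sqrt(f)
                for m, f in zip(magnitudes, frequencies)
                if cutoff / 2 <= f <= cutoff]
    energy = sum(w * w for w in whitened)
    if energy == 0:
        return 0.0
    return max(whitened) ** 2 / energy
```

A single spectral peak gives a value of one, while a spectrum that is flat after whitening over n bins gives 1/n.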
The output of the analysis block provides the estimates of amplitude envelope and tonality, comprising a 2D time-frequency array of amplitudes and a 1D array of tonality variations in time.
Synthesis of High Frequencies
The synthesis of high frequencies comprises the following steps:
1. The input audio is split into several frequency bands by means of a crossover.
If the cutoff frequency is denoted as F and the desired number of bands is 2N+1, then the crossover bands are assigned the following frequencies: [F/2-dF/2, F/2+dF/2], [F/2+dF/2, F/2+3dF/2], [F/2+3dF/2, F/2+5dF/2], ..., [F-dF/2, F+dF/2]. The crossover comprises bandpass FIR filters designed using a windowed sinc method. The size of the window is around 10 ms. The window used is preferably Kaiser (beta = 9). The filtered full-rate signals comprise the outputs of the crossover.
2. Each of the crossover output signals is passed through a nonlinear waveshaping distortion effect. The distortion effect is provided by the following formula: y(t) = F(x(t)), where F is the nonlinear transformation. The simplest form of F (not used in the invention) is a simple clipping, i.e. F = x(t) when abs(x(t)) <= A, and F = +/-A otherwise, where A is some arbitrary threshold amplitude. This distortion generates an infinite series of harmonics of the input signal, which is undesirable for digital audio processing because higher harmonics may be aliased about the Nyquist frequency and generate undesirable intermodulation distortion. To prevent this distortion, the invention preferably employs a special kind of distortion function in the form of Chebychev polynomials, which allows control over the exact number of generated harmonics. For the present invention the second-order polynomial is suitable, so y(t) = F(x(t)) = x(t)^2 is a useful formula. It generates the second harmonic of the input signals.
3. The resulting distorted bands are summed up to form the first estimate of the reconstructed high frequencies in the [F, 2F] frequency range. The intermodulation distortion products are out of the [F, 2F] range, so they can be filtered out by simply excluding all frequencies above 2F.
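That squaring a band generates exactly its second harmonic follows from cos^2(wt) = 1/2 + cos(2wt)/2, and can be checked numerically. In the sketch below the crossover filters are omitted and a single sinusoid stands in for one band; the helper names, the sample rate and the tone frequency are assumptions for the illustration.

```python
import math


def waveshape(signal):
    """Second-order waveshaper y(t) = x(t)^2."""
    return [s * s for s in signal]


def bin_amplitude(signal, frequency, sample_rate):
    """Amplitude of one frequency component (single-bin DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * frequency * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * frequency * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


SAMPLE_RATE = 8000
TONE = 500  # Hz; 800 samples hold exactly 50 periods
tone = [math.cos(2 * math.pi * TONE * i / SAMPLE_RATE) for i in range(800)]
shaped = waveshape(tone)
```

After shaping, the component at 500 Hz vanishes and a component of amplitude 0.5 appears at 1000 Hz, together with a constant offset of 0.5 that a bandpass stage would remove.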
Analysis of Generated High Frequencies
The generated high frequencies are analyzed in the same way as the input audio signal (step 1, above). Since these two steps of analysis are identical, they can be combined into one analysis step. At the output of this analysis step the estimates of amplitude envelope and tonality of the generated high frequencies are obtained.
Extrapolation of Parameters
The parameters detected in step 1 are extrapolated into the domain of high frequencies. The following extrapolation methods are preferred:
For extrapolation of the amplitude envelope, detect the slope of the amplitude envelope in the frequency range [F/2, ..., F]. The calculation is done as follows.
1. Spectral whitening modification: multiplication of magnitudes by sqrt(f) for each frequency point.
2. Detection of the slope over a wide frequency range: S1 = [equation illegible in source], where K is the number of filterbank frequency bins between F/2 and F, and N is the number of bins used for energy averaging. Here it is assumed to be K/4.
3. Detection of a slope over a narrower frequency range: S2 = [equation illegible in source].
4. The final slope S is calculated from S1 and S2 [expression illegible in source].
5. The slope is linearly extrapolated in the decibel domain to the higher frequencies.
6. The resulting slope is smoothed in time using a recursive lowpass filter:
Xsmoothed[t][f] = α X[t][f] + (1 - α) Xsmoothed[t-1][f].
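The recursive lowpass filter of step 6 can be sketched in Python as follows. The function name and the choice to initialize the filter state with the first frame are assumptions made for illustration; the description does not state how the recursion is started.

```python
def smooth_in_time(frames, alpha):
    """Apply Xsmoothed[t][f] = alpha * X[t][f] + (1 - alpha) * Xsmoothed[t-1][f].

    frames is a list of frames, each a list of per-frequency values.
    Assumption: the first frame is passed through unchanged to seed the recursion.
    """
    smoothed = []
    previous = None
    for frame in frames:
        if previous is None:
            current = list(frame)
        else:
            current = [alpha * x + (1 - alpha) * p for x, p in zip(frame, previous)]
        smoothed.append(current)
        previous = current
    return smoothed
```

A value of alpha close to 1 follows the input closely, while a value close to 0 smooths more heavily.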
For extrapolation of the tonality, a simple zero-order extrapolation is used.
The tonality of the reconstructed high-frequency signal should be equal to the tonality of the [F/2, F] band.
Adjustment of Higher Frequencies
Having obtained the estimated parameters of the reconstructed high-frequency component, the final step is to adjust the estimated parameters to approximate the actual parameters.
The first adjustment to be undertaken is the tonality adjustment. There are two methods for adjustment of tonality. The first is to adjust the number of bands used in the crossover in step 2. The second method is a direct adjustment in the domain of filterbank coefficients. It is performed by means of amplification of peaks in a spectrum. Two possibilities exist for this: either each coefficient is scaled proportionally to its energy, or only peaks in a spectrum are located and amplified.
The peaks are located using the X[f-2] < X[f-1] <= X[f] >= X[f+1] > X[f+2] criterion for magnitudes of adjacent filterbank frequency bins.
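The peak criterion maps directly onto a chained comparison. The sketch below is illustrative only: the function names and the `gain` parameter are hypothetical, and the description does not specify how much a located peak is amplified.

```python
def find_peaks(magnitudes):
    """Return indices f where X[f-2] < X[f-1] <= X[f] >= X[f+1] > X[f+2]."""
    x = magnitudes
    return [
        f
        for f in range(2, len(x) - 2)
        if x[f - 2] < x[f - 1] <= x[f] >= x[f + 1] > x[f + 2]
    ]


def amplify_peaks(magnitudes, gain):
    """Scale only the located peaks by a hypothetical gain factor."""
    peaks = set(find_peaks(magnitudes))
    return [m * gain if f in peaks else m for f, m in enumerate(magnitudes)]
```

The two outermost bins on each side can never be reported as peaks, because the criterion needs two neighbours in both directions.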
The second adjustment is the adjustment of the amplitude envelope. This adjustment is performed in the domain of filterbank coefficients. The difference between the extrapolated envelope and the real estimated envelope is calculated and then smoothed in frequency by means of a simple zero-phase lowpass recursive filter: Xsmoothed[f] = β X[f] + (1 - β) Xsmoothed[f-1] and Xsmoothed[f] = β X[f] + (1 - β) Xsmoothed[f+1]. Then the smoothed correction amplitude envelope is applied to the filterbank coefficients, i.e. the filterbank coefficients are multiplied by the magnitude correction coefficients.
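One way to read the two recursions is as a forward pass over frequency followed by a backward pass over the forward result, which is the usual construction of a zero-phase recursive filter. The sketch below takes that reading as an assumption; the function names and the edge initialization are likewise illustrative.

```python
def smooth_in_frequency(values, beta):
    """Forward-backward one-pole smoothing over frequency bins.

    Assumptions: the backward pass runs on the output of the forward pass,
    and each pass is seeded with the first value it processes.
    """
    if not values:
        return []
    forward = []
    state = values[0]
    for x in values:  # Xs[f] = beta * X[f] + (1 - beta) * Xs[f-1]
        state = beta * x + (1 - beta) * state
        forward.append(state)
    backward = [0.0] * len(forward)
    state = forward[-1]
    for f in range(len(forward) - 1, -1, -1):  # Xs[f] = beta * X[f] + (1 - beta) * Xs[f+1]
        state = beta * forward[f] + (1 - beta) * state
        backward[f] = state
    return backward


def apply_correction(coefficients, correction):
    """Multiply filterbank coefficients by the magnitude correction coefficients."""
    return [c * g for c, g in zip(coefficients, correction)]
```

Running the filter in both directions cancels the frequency shift that a single pass would introduce, so an isolated correction peak stays centered on its original bin.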
Mixing with the Input Audio
The resulting reconstructed high-frequency signal, which contains no energy below F, is mixed with the input audio to form the final output signal of the algorithm.
The process of mixing is simply the addition of the two signals in the time domain.
Optionally, an amplitude coefficient A can be applied to the reconstructed high-frequency signal in order to alter its amplitude according to user demand.
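The mixing step can be sketched as a sample-wise addition with an optional gain. The function and parameter names are illustrative, and the requirement that both signals have the same length is an assumption of this sketch.

```python
def mix(input_audio, reconstructed_high, amplitude=1.0):
    """Add the reconstructed high-frequency signal, scaled by A, to the input audio."""
    if len(input_audio) != len(reconstructed_high):
        raise ValueError("signals must have the same number of samples")
    return [x + amplitude * h for x, h in zip(input_audio, reconstructed_high)]
```

With the default amplitude of 1.0 this is plain addition; other values scale only the reconstructed component and leave the input audio unchanged.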
Various embodiments of the present invention having been thus described in detail by way of example, it will be apparent to those skilled in the art that variations and modifications may be made without departing from the invention. The invention includes all such variations and modifications as fall within the scope of the appended claims.
Claims (11)
a. dividing spectral data corresponding to the audio signal into a plurality of frequency bands, each band corresponding to a different frequency range;
b. obtaining a plurality of the last Modified Discrete Cosine Transform (MDCT) coefficients corresponding to the spectral data for each frequency for each band; and
c. compressing the spectral data using a prediction of coefficient values in a plurality of preceding frames of audio data by calculating a mean value of the plurality of last MDCT coefficients.
d. further compressing the spectral data using one or more content models or arithmetic compression, or both.
e. further compressing the spectral data using an MDCT sign prediction algorithm and ESC-sequence.
f. further compressing the spectral data using a count0 selection method.
coefficients obtained from a first iteration, in the band containing N.
a. time-frequency decomposing the compressed audio data,
b. estimating parameters from the audio data comprising at least an amplitude envelope estimated from a modulus of a first set of corresponding filterbank coefficients and a tonality estimated from a magnitude spectrum of a second set of corresponding filterbank coefficients; and
c. synthesizing high frequency components of the audio signal by:
i) dividing the audio data into several frequency bands,
ii) passing each frequency band through a nonlinear waveshaping distortion effect to generate distorted frequency bands, and
iii) summing the distorted frequency bands to form an estimate of the high frequency components.
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title 

CA 2467466 CA2467466A1 (en)  2004-05-17  2004-05-17  System and method for compressing and reconstructing audio files 
Publications (1)
Publication Number  Publication Date 

CA2467466A1 (en)  2005-11-17 
Family
ID=35452167
Family Applications (1)
Application Number  Status  Priority Date  Filing Date  Title 

CA 2467466 CA2467466A1 (en)  Abandoned  2004-05-17  2004-05-17  System and method for compressing and reconstructing audio files 
Country Status (1)
Country  Link 

CA (1)  CA2467466A1 (en) 

Legal Events
Date  Code  Title  Description 

FZDE  Dead 