Archive for Uncategorized

Digital Audio Compression – a little more insight

For those who don’t work in the field, digital audio coding, and the associated practice of compression, can be mysterious and confusing.  For now I want to avoid the rat’s-hole of high definition and simply assume that most music – whether you think it’s great or awful – is recorded in CD quality.  By the way, CD quality can be pretty darn good, although it often is not.  But that’s the fault of recording, mastering, etc.

The basics of CD quality are that it is:

— 2 channels

— 44,100 samples per second

— 16 bits per sample (meaning 2 to the 16th power shades of musical gray, or 65,536)

Multiply all this out and you have 1.411 Mbps plus overhead in “RAW” format, no compression.
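If you want to check the arithmetic, here is a quick back-of-the-envelope sketch (Python, purely for illustration; the inputs are just the numbers from the list above):

```python
# Redbook CD bit-rate arithmetic, nothing more.
channels = 2
sample_rate = 44_100        # samples per second, per channel
bits_per_sample = 16

bits_per_second = channels * sample_rate * bits_per_sample
print(f"{bits_per_second:,} bps  (~{bits_per_second / 1e6:.3f} Mbps)")
# -> 1,411,200 bps, i.e. ~1.411 Mbps before framing/error-correction overhead

bytes_per_minute = bits_per_second * 60 / 8
print(f"~{bytes_per_minute / 1e6:.1f} MB per minute of uncompressed stereo")   # ~10.6 MB
```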

As audiophiles, many of us have a low opinion of compression. We have heard 128 kbps MP3s on our iPods and pronounced them unacceptable.  That verdict is fair, but it’s also misleading.  Why?

  1. It’s more than 10:1 compression!  That’s a lot.
  2. Most of us hear these on poor-quality internal DACs and amplifiers, via the analog jack.

But let’s get back to compression.  First, there are two kinds of compression, lossless and lossy. Lossless does not change the digital data one “bit”.  A good example is a ZIP file, which makes your Excel spreadsheet (or whatever) smaller but preserves all the data.  This is done with mathematics that eliminates redundancy (like strings of zeros) or otherwise re-codes the data – without removing any of it.  Lossy formats, on the other hand, DO remove information, and therefore musical accuracy. Some algorithms are better than others; MP4 (AAC) is about twice as efficient as MP3, for example.
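If the lossless/lossy distinction seems abstract, here is a tiny sketch of the lossless idea, using Python’s zlib as a stand-in for FLAC or ALAC (zlib is not an audio codec, but the round-trip guarantee is exactly the same):

```python
import zlib

# Pretend this is a chunk of raw 16-bit PCM; any bytes will do for the demo.
pcm = bytes(range(256)) * 1000

compressed = zlib.compress(pcm, level=9)
restored = zlib.decompress(compressed)

print(f"{len(pcm):,} bytes -> {len(compressed):,} bytes")
assert restored == pcm      # bit-for-bit identical: nothing was removed
# A lossy codec, by contrast, throws information away on purpose and can
# never reproduce the original samples exactly.
```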

FLAC (Free Lossless Audio Codec) and ALAC (Apple Lossless Audio Codec) are the dominant lossless systems. Each can compress CD audio “about” 2:1, or to roughly 700 kbps.  It depends on how much redundancy exists in the music and may be larger or smaller – after all, it’s LOSSLESS, not driven to some arbitrary design speed.  When the process is reversed you have CD audio, no more, no less.  It should be sonically transparent. Although some claim to be able to hear it, this is unlikely. Most probably they are hearing something else, or imagining a difference. I cannot hear the difference on a VERY revealing system.
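To see why the ratio floats with the material, here is another toy zlib sketch: the same lossless algorithm gets very different ratios on a repetitive, “quiet” signal versus a noise-like one (again, this illustrates the principle, not FLAC or ALAC themselves; the two fake signals are invented for the demo):

```python
import math, random, zlib

random.seed(0)

# Two fake "recordings": a slow, repetitive waveform and pure noise.
quiet = bytes(int(100 * math.sin(i / 50)) % 256 for i in range(200_000))
noisy = bytes(random.getrandbits(8) for _ in range(200_000))

for name, data in (("repetitive signal", quiet), ("noise-like signal", noisy)):
    ratio = len(data) / len(zlib.compress(data, level=9))
    print(f"{name}: {ratio:.2f}:1")
# The repetitive signal shrinks dramatically; the noise barely shrinks at all.
# Lossless ratios follow the redundancy in the music, not a target bit rate.
```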

AAC (M4A or MP4) and MP3 dominate lossy compression.  Each can operate at many bit rates, from 96 kbps (total, both channels) to 384 kbps, or in special circumstances even more.  MP3, by far the worse of the two, is most often used because it is the “least common denominator” — supported by everything.  We lose.  The important thing to realize is that there is a HUGE difference between 128 kbps MP3 and 384 kbps MP3 in terms of quality.  At 384 it’s only about 2:1 compression beyond what can be achieved with ALAC or FLAC.  And I have heard great recordings in M4A at 384 kbps sound superb – try “Ripple” on for size if you doubt me, but do it on a great digital system (I played it on iTunes on a MacBook Pro, through BitPerfect (a $10 app), over galvanically isolated USB, re-clocked to nanosecond jitter, into a franken-DAC that began life as an MSB Full Nelson, with 96 kHz up-sampling).
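For a sense of scale, the ratios work out roughly like this (a quick sketch, using the ~2:1 lossless figure from above):

```python
cd_rate = 1_411_200              # raw Redbook, bits per second
lossless = cd_rate / 2           # ~2:1 for FLAC/ALAC on typical material

for lossy in (128_000, 256_000, 384_000):
    print(f"{lossy // 1000} kbps: {cd_rate / lossy:4.1f}:1 vs raw CD, "
          f"{lossless / lossy:.1f}:1 vs lossless")
# 128 kbps is ~11:1 overall (5.5:1 beyond lossless);
# 384 kbps is ~3.7:1 overall, i.e. only ~2:1 beyond what FLAC/ALAC already achieve.
```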

I am not arguing that compression is desirable in high end – only that it needs to be understood in a broader context.  In fact, I plan another blog in which I’ll share some findings from when I was working with the JPEG and MPEG standards groups (while employed by Bell Communications Research Inc., aka “Bellcore”) and related projects in the late 1980s and early 1990s – with some really surprising results.

In short, I find the poor recording and mastering practices evident especially in many rock/pop recordings, and more than a few classical recordings, to be far more detrimental and nasty-sounding than relatively mild AAC compression. Ditto the effects of jitter on the digital signal (see my existing blog on the evils of jitter).

Digital is complex. It is frustrating.  And yet it is misunderstood, and very early in its development.  I believe it has huge potential if we clear away the confusion and focus on finding solutions to the real problems.  So rip your stuff lossless. If you have to compress, dig into the expert settings (they are there in everything from iTunes up) and rip at the highest bit-rate setting available. Hard drives are cheap – enjoy the music.

Grant

CEO Sonogy Research, LLC

Jitter, or “why digital audio interfaces are analog”

Confused by the title?  Most people probably are, and that’s the point.  We constantly hear that “digital is perfect” and “there cannot be differences between transports,” etc. We hear this from engineers, computer scientists, and armchair experts. All three are wrong, but the engineers really ought to know better.

Let’s start with some basics.  Most musical instruments, from the human voice to a guitar or piano, are analog. Our ears are analog.  And the sound waves between the two must be analog.  God did it. Don’t argue with God.

Digital is a storage method.  It can only occur in the middle of this chain, with sound converted to digital and then back.  The goal is 100% transparency – or, more accurately, transparency so good we cannot tell the difference.  While that sounds like a cop-out, it’s not. Analog records are also intended to be 100% transparent and fail miserably. CD, DSD, or whatever need only fail less to be an improvement.  My opinion is that, done right, it DOES fail less and is potentially superb. It’s that word “potentially” that trips us up.

While there are many points along the chain where we can lose fidelity, I want to talk about one in particular: jitter.  I want to talk about jitter for two reasons:

  1. It has a huge impact on sound quality in real life systems today.
  2. No one talked about it until recently, and very few understand what it is or why it’s a problem.

To understand jitter, first we need to understand CD playback. I will use the CD example simply because I have to pick one and it is the most common high-end format. CD music is digitized by measuring the music signal at very tiny increments. An analogy would be pixels on your screen, and CDs use a lot of “pixels” — 44,100 samples every second. The height, or “voltage”, of each sample is represented by a number we debate endlessly: the bit depth. CDs use 16 bits, which means 65,536 shades of gray.

[Illustration of the height and spacing of music samples; courtesy: Wikimedia.org]

But there is another characteristic that is equally important to sound quality (in fact, mathematically it is part of the same calculation), and nobody talks about it. That characteristic is the time between samples. Think about height and time like a staircase; each step has a height and a tread depth, and the two together determine the steepness.   Similarly, the analog output of “pulse code modulation” (which CD is) is determined by the height (limited to 2^16, or 65,536, levels) and the time between samples. That time is assumed to be precisely 1/44,100 of a second. But we live in an imperfect world, and that fraction of a second varies a little; the variation, which is random, is called jitter.
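To put a rough number on it, here is a toy NumPy sketch: sample a 10 kHz tone on a perfect clock and on one with about 1 ns of random timing error (both the test tone and the jitter figure are made up purely for illustration), and see how much noise the timing error alone adds:

```python
import numpy as np

fs = 44_100                     # nominal sample rate
f = 10_000                      # full-scale 10 kHz test tone
n = np.arange(200_000)
rng = np.random.default_rng(0)

ideal_t = n / fs                                     # perfectly even sample instants
jittered_t = ideal_t + rng.normal(0, 1e-9, n.size)   # ~1 ns RMS of random timing error

ideal = np.sin(2 * np.pi * f * ideal_t)              # the samples we *should* get
actual = np.sin(2 * np.pi * f * jittered_t)          # what a jittery clock produces

err = actual - ideal
snr_db = 10 * np.log10(np.mean(ideal ** 2) / np.mean(err ** 2))
print(f"SNR limited by ~1 ns of jitter on a 10 kHz tone: about {snr_db:.0f} dB")
# Comes out near 84 dB - already below the ~96-98 dB that 16 bits can deliver.
```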

Because it is random, it is not harmonically related to the music, and is therefore, in musical terms, dissonant (or lousy sounding).  So while bits are in fact bits, there is much more on the interface between the transport and a DAC than bits.  There is also jitter and noise, and noise causes jitter.

Any engineer who tells you that transports and streaming servers cannot impact sound quality has failed to take into consideration fully one-half of the data necessary to re-create the waveform. They have focused only on the bit depth (and its ~96 dB theoretical signal-to-noise ratio!) and ignored the jitter contribution.
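The closed-form version of that comparison is simple enough to write down (these are the standard textbook rules of thumb; the frequencies and jitter values are just example inputs):

```python
import math

def quantization_snr_db(bits: int) -> float:
    # Theoretical limit for a full-scale sine: 6.02*N + 1.76 dB
    return 6.02 * bits + 1.76

def jitter_snr_db(freq_hz: float, jitter_rms_s: float) -> float:
    # SNR ceiling imposed by clock jitter on a full-scale sine at freq_hz
    return -20 * math.log10(2 * math.pi * freq_hz * jitter_rms_s)

print(f"16-bit quantization limit: {quantization_snr_db(16):.0f} dB")      # ~98 dB
for tj in (0.1e-9, 1e-9, 10e-9):
    print(f"{tj * 1e9:g} ns jitter @ 10 kHz: {jitter_snr_db(10_000, tj):.0f} dB")
# ~104 dB, ~84 dB and ~64 dB respectively: past a fraction of a nanosecond,
# the clock - not the bit depth - sets the noise floor.
```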

For those of you who don’t much care, and just want your music to sound good, it gets both better and worse 🙂

There are two ways to send a digital signal between a source and a DAC.  The traditional method is an interface called SPDIF – it’s the little yellow RCA jack on the back of your CD transport. (The newer alternative is asynchronous USB, where the DAC supplies the clock.) The problem with SPDIF is that a) it is synchronous and b) the source is the master clock, and therefore determines the jitter.  So when you do something logical, like buy a fancy DAC to make your cheap CD player sound better, you get the jitter of the cheap CD player and, as we noted above, that’s half the story.

There are more problems – primarily related to electrical noise – but I think they are less severe than jitter, and they are certainly another topic.

I hope this has shown you that sound differences from transports and digital signals are neither snake oil nor mysterious, only annoying.  I will make only one recommendation: make sure that SPDIF connection is a true 75-ohm cable. It need not be a fancy audiophile cable. It can be Amazon Basics. It can be cheap Schiit (I think they call them that).  But it must be 75 Ω.

Now if we could only fix crummy digital mastering, but that’s out of our control.

All the best,

 

Grant

CEO Sonogy Research LLC

High res Digital Music vs. “Redbook CD” – a quick overview

I’m getting whiplash from polarized — and shallow — opinions in the high end world.

In digital music specifically, I’m bothered by the hardened, and often fact-free, opinions of both audiophiles and engineers – the latter of whom ought to know better.

As a design engineer and a true audiophile and music lover, I’m a rare bird, sitting in both camps.   I have learned (I don’t argue with reality) that things don’t always sound as they measure, and furthermore I understand rational reasons for this (beginning with incomplete measurements).  For this post I’ll try to avoid the quagmire of subjective thresholds and simply ask “where are the differences and what is possible?”

I’ll turn up the contrast. At one extreme are many who believe digital is fatally flawed, always has been, and cannot be cured.  At the other end we have engineers who say “all is well, and if the bits are right, it’s perfect.” This is factually (technically) incorrect.  I’ll touch on only one small aspect of why here.

I don’t want to boil the ocean. I only want to address the question of whether the Redbook CD format is good enough for even highly discriminating music lovers and revealing systems, and if so, whether high-res files and recordings can simultaneously sound better. I’ll save related topics for future posts: digital interface signals and their contribution to audio quality, and why SPDIF and DSD, among others, are part analog.

My personal opinion is that, done perfectly (not in this world), Redbook — 16-bit, 44.1 k-sample, linear PCM encoding — is theoretically equal to, and likely superior to, any analog we are likely to experience – $5-10k turntables and all.  The problem is, Redbook digital is rarely (OK, never) done perfectly.   The “flaw” in the Redbook standard, again in my opinion, is that the sampling frequency chosen — for practical/cost reasons — makes it very hard for both studios and consumer playback equipment to perform ideal A-to-D and D-to-A conversion. These are analog processes, folks.

The biggest problem in the standard itself is the 44,100-samples-per-second rate (don’t confuse this with a 44 kHz analog signal). This sampling rate was chosen to be more than 2X the highest audible frequency of about 20,000 Hz.  Per Shannon’s math and Nyquist’s specific theorem, one must sample at **more than** 2X a frequency in order to faithfully reproduce a smooth, distortion-free tone at that frequency – and all frequencies below it.  Really, it can be perfect – but there’s a catch.  If you have ANY — **ANY** — signals above 20 kHz that get into the recording path, they can alias and play havoc with the recording, folding down into the audio band.  Plus, these sorts of non-musically-related distortions are particularly annoying, leading in part to “digital glare”.
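That folding is easy to see with a few lines of arithmetic (a sketch; the ultrasonic frequencies below are arbitrary examples):

```python
fs = 44_100          # Redbook sampling rate

def alias_of(f_in: float, fs: float) -> float:
    """Frequency an out-of-band tone appears at after sampling (first fold)."""
    f = f_in % fs
    return f if f <= fs / 2 else fs - f

for f_ultra in (23_000, 25_000, 30_000):
    print(f"{f_ultra:,} Hz leaking into the ADC shows up at {alias_of(f_ultra, fs):,.0f} Hz")
# 25,000 Hz folds down to 19,100 Hz - squarely in the audible band,
# and harmonically unrelated to anything in the music.
```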

That’s one of the measurement flaws.  All distortions are not created equal, sonically.  Yep, there’s good distortion and bad distortion, or at least bad and worse.  This is well understood in music theory.  A Boesendorfer or Steinway concert grand piano is valued for its consonant harmonic distortions.  So are (some) tubes.  So distortion can be pleasant, or at least not unpleasant.  Digital aliasing is not in that group – it’s just nasty. As is “slewing” distortion – and any odd-order, high-order harmonics.  Back to the sampling frequency: to rid ourselves of aliasing nastiness, we must filter out 100% of that ultrasonic stuff — the stuff above our cut-off frequency of 20 kHz.

OK, but I said it could be done. It can – in theory. The problem is, to keep everything below 20,000 Hz clean, the standard leaves us only about 4,000 Hz of room for the filters to do their work.  And good-sounding, phase-coherent filters typically work by roughly halving the level every OCTAVE, not every quarter of an octave or less, which is roughly what the standard leaves us.  Bottom line #1: the filters used can be nasty.  Bottom line #2: they are not 100% perfect, so we typically get at least some aliasing. Maybe not much, but some. Note this is only ONE problem in the standard.  But rejoice: there are real, workable solutions, and they don’t begin with throwing away CD (16/44.1k Redbook).
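Here is the octave math behind that claim (a quick sketch; the only inputs are a 20 kHz passband and 44.1 kHz sampling):

```python
import math

def octaves(f_lo: float, f_hi: float) -> float:
    return math.log2(f_hi / f_lo)

passband = 20_000                    # we want everything below this untouched
nyquist = 44_100 / 2                 # 22,050 Hz
fold_limit = 44_100 - passband       # 24,100 Hz: energy above this folds below 20 kHz

print(f"20 kHz to Nyquist:    {octaves(passband, nyquist):.2f} octaves")     # ~0.14
print(f"20 kHz to fold limit: {octaves(passband, fold_limit):.2f} octaves")  # ~0.27
# A gentle, phase-friendly roll-off wants an octave or so to work with;
# Redbook gives the filter a quarter of an octave at best.
```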

And this, in my worthless opinion, is why high-res files (24/96 etc.) sound better. The studio has WAY more headroom to work with for the filtering, and our home players have more room too.  Furthermore, with 24 bits, engineers can miss their levels by a few dB and it all works out.  They can edit and make digital copies and still have perfection in the 16 or 18 most significant bits – which is still way better, on paper, than your mondo turntable – or mine (one of my collection is a Logic DM101 with a Syrinx arm and a Grado Reference; the other a Linn triple-play, if you care).
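The headroom argument is easy to put numbers on (a sketch; the 6.02N + 1.76 dB rule of thumb and a 20 kHz passband are the only inputs):

```python
import math

def dynamic_range_db(bits: int) -> float:
    return 6.02 * bits + 1.76            # theoretical full-scale limit

def filter_room_octaves(fs: float, passband: float = 20_000) -> float:
    return math.log2((fs / 2) / passband)

for bits in (16, 24):
    print(f"{bits}-bit: ~{dynamic_range_db(bits):.0f} dB of theoretical dynamic range")
# ~98 dB vs ~146 dB: with 24 bits you can record well below full scale,
# edit, and copy, and the top 16-18 bits are still pristine.

for fs in (44_100, 96_000, 192_000):
    print(f"{fs / 1000:g} kHz sampling: {filter_room_octaves(fs):.2f} octaves "
          f"above 20 kHz for the filters")
# 0.14, 1.26 and 2.26 octaves - the higher rates give the filters real room to breathe.
```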

So we should quit worrying about format errors, and do two things:

1. Encourage studios to do the best job possible.  Think they do? Listen to most rock, then listen to an old Verve two-track recording. ’nuff said.

2. Buy home equipment that gets the signal processing right.  That’s another blog, but by this I mean low noise, low jitter, fast filter and buffer amps, and great power supplies.  Just like I built.  Trust me, it works.

I hope you found this useful.  When I have time to organize a complex subject, I’ll tackle why the digital player and interface can make a difference. After all, bits are bits. It’s true… but that signal isn’t (just) bits. Intrigued?

Grant

CEO Sonogy Research LLC

Applying MANO to Change the Economics of our Industry –
A Promising TMForum Catalyst (Dec 2015)

Appledore Research Group has been outspoken about the importance of automation and optimization in the Telco Cloud. We have outlined its importance, and the mechanisms to minimize both CAPEX and OPEX, in recent research. Our belief is that this kind of optimization depends on three critical technologies:

  1. Analytics to collect data and turn it into useful information
  2. Policy-driven MANO to allow for significant flexibility within well-defined constraints, and
  3. Algorithms capable of identifying the most cost-effective solutions, within the constraints (location, performance, security, etc.) enforced by the policies – a toy sketch of this idea follows below
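To make item 3 concrete, here is a deliberately trivial sketch (Python; the data centers, policy fields, and numbers are all invented for illustration): analytics supply the measurements, policy supplies the constraints, and the algorithm simply picks the cheapest compliant placement.

```python
from dataclasses import dataclass

@dataclass
class DataCenter:
    name: str
    region: str
    latency_ms: float
    certified: bool          # e.g. meets the security policy
    cost_per_hour: float

# Entirely made-up inventory and policy, purely to illustrate the flow.
inventory = [
    DataCenter("dc-east-1", "EU", 12.0, True, 0.042),
    DataCenter("dc-east-2", "EU", 28.0, True, 0.031),
    DataCenter("dc-west-1", "US", 9.0, False, 0.025),
]

policy = {"region": "EU", "max_latency_ms": 30.0, "must_be_certified": True}

def place(vnf_name: str) -> DataCenter:
    # Filter by policy, then optimize for cost among the compliant candidates.
    candidates = [
        dc for dc in inventory
        if dc.region == policy["region"]
        and dc.latency_ms <= policy["max_latency_ms"]
        and (dc.certified or not policy["must_be_certified"])
    ]
    if not candidates:
        raise RuntimeError(f"no placement satisfies policy for {vnf_name}")
    return min(candidates, key=lambda dc: dc.cost_per_hour)

print(place("vFirewall").name)      # -> dc-east-2: compliant and cheapest
```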

Here’s an excerpt from recent ARG research outlining the process:
[Figure: policy flow chart from ARG research]

Until now, we have seen relatively little action and innovation in the industry in pursuit of these goals – but here’s an interesting project that’s right on point. I want to share an exciting TMForum Catalyst – one that investigates the economic power of NFV and asks “how, in practice…?”

That is not a typo. I did say “exciting” “catalyst” and “TMForum” in the same sentence. I realize that standards and management processes are not usually the stuff that makes your heart beat faster; but if you care about our industry’s commercial future (and like innovative thinking), this one’s different.

The premise is simple: the flexibility inherent in the “Telco Cloud” – underpinned by NFV and SDN – makes it possible and feasible to consider economic factors when deciding how to instantiate and allocate resources across data centers. This Catalyst, involving Aria Networks, Ericsson, NTT Group, TATA and Viavi, set out to demonstrate this capability, along with a realistic architecture and contributions back to the TM Forum’s Frameworx construct.

To me, this is exciting. It says we can use the “MANO+” environment to drive down costs, and possibly even, over time, to create a “market” for resources such that high quality, low cost resources flourish while more marginal ones are further marginalized. This goes straight to the economics, competitiveness, and profitability of our industry and deserves serious attention.

This catalyst team appears well balanced in this regard, with each player bringing expertise in one or more of those critical areas, and one of the leading operators driving the cloud transformation guiding the objectives.

Ericsson summed up the challenge and the objective as follows:

“This TM Forum catalyst project intends to bridge the gap between OSS/BSS and the data silos in finance systems and data center automation controls to enable the kind of dynamic optimization analytics needed to achieve business-agile NFV orchestration.” – Ravi Vaidyanathan, Ericsson Project Lead

At the moment the industry is understandably focused on making NFV and MANO work – even simply. We must all walk before we try to run. Yet it’s very rewarding and encouraging to see the industry not only attempt to run, but think about how far it can run. Step #1 in any journey is picking a destination; hats off to this team for picking a worthy one.

By the way, this team won a well-deserved award for the most important contributions to the TM Forum’s standards. They deserve it for really thinking!

Grant Lenahan
Partner and Principal Analyst
Appledore Research Group
grant@appledorerg.com