Digital Audio Compression – a little more insight

For those who don’t work in the field, digital audio coding, and the associated practice of compression, can be mysterious and confusing. For now I want to avoid the rat-hole of high definition and simply assume that most music, whether you think it’s great or awful, is recorded in CD quality. By the way, CD quality can be pretty darn good, although it often is not. But that’s the fault of recording, mastering, etc.

The basics of CD quality are that it is:

— 2 channels

— 44,100 samples per second

— 16 bits per sample (meaning 2 to the 16th power shades of musical gray, or 65,536)

Multiply all this out and you have 1.411 Mbps plus overhead in “RAW” format, no compression.
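For anyone who wants to check the arithmetic, here is a minimal sketch of that multiplication. The function and variable names are mine, made up for illustration; only the three CD parameters come from the list above.

```python
# CD parameters as listed above
CHANNELS = 2
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16


def raw_bitrate_bps(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed PCM bit rate in bits per second (no framing overhead)."""
    return channels * sample_rate_hz * bits_per_sample


cd_bps = raw_bitrate_bps(CHANNELS, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
levels = 2 ** BITS_PER_SAMPLE  # distinct values a 16-bit sample can take

print(f"raw CD bit rate: {cd_bps / 1e6:.3f} Mbps")
print(f"levels per sample: {levels}")
```

This covers only the audio payload; the "plus overhead" part (error correction, framing) is not modeled.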

As audiophiles, many of us have a low opinion of compression. We have heard 128 kbps MP3s on our iPods and pronounced them unacceptable.  That is true, but it’s also misleading.  Why?

  1. It’s more than 10:1 compression! That’s a lot.
  2. Most of us hear these on poor-quality internal DACs and amplifiers, via the analog jack.
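The "more than 10:1" figure is easy to verify. A small sketch, assuming the raw CD rate of 2 x 44,100 x 16 bits per second; the helper name and the list of bit rates are just illustrative choices:

```python
CD_BPS = 2 * 44_100 * 16  # raw CD payload, bits per second


def compression_ratio(target_kbps):
    """How many times smaller a stream at target_kbps is than raw CD audio."""
    return CD_BPS / (target_kbps * 1000)


for kbps in (128, 256, 384):
    print(f"{kbps} kbps is {compression_ratio(kbps):.1f}:1 versus raw CD")
```

At 128 kbps the ratio comes out just over 11:1, which is where the "more than 10:1" comes from.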

But let’s get back to compression. First, there are two kinds of compression: lossless and lossy. Lossless does not change the digital data one “bit”. A good example is a ZIP file, which makes your Excel spreadsheet (or whatever) smaller but preserves all the data. This is done by mathematics that eliminates redundancy (like strings of zeros) or by other methods of coding the data, without removing any information. Lossy formats, on the other hand, DO remove information, and therefore musical accuracy. Some algorithms are better than others; MP4 (AAC) is about twice as efficient as MP3, for example.

FLAC (Free Lossless Audio Codec) and ALAC (Apple Lossless Audio Codec) are the dominant lossless systems. Each can compress CD audio “about” 2:1, or to roughly 700 kbps. The result depends on how much redundancy exists in the music and may be larger or smaller; after all, it is LOSSLESS, not driven to some arbitrary design speed. When the process is reversed you have CD audio, no more, no less. It should be sonically transparent. Although some claim to be able to hear it, this is unlikely. Most probably they are hearing something else, or imagining a difference. I cannot hear the difference on a VERY revealing system.
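To make "no more, no less" concrete, here is a small sketch of a lossless round trip. It uses Python's general-purpose zlib compressor, which is NOT FLAC or ALAC and compresses audio far less cleverly; it stands in only to show the lossless property. The test tone and helper name are made up for the example.

```python
import math
import struct
import zlib


def sine_pcm16(freq_hz=440.0, n_samples=4410, rate=44_100):
    """Build a short mono test tone as little-endian 16-bit PCM bytes."""
    samples = [
        int(round(16383 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_samples)
    ]
    return struct.pack(f"<{n_samples}h", *samples)


original = sine_pcm16()
packed = zlib.compress(original, 9)   # stand-in for a lossless audio codec
restored = zlib.decompress(packed)

print(f"original {len(original)} bytes, packed {len(packed)} bytes")
print("bit-identical after round trip:", restored == original)
```

The achieved ratio depends entirely on the material, which is the same reason FLAC and ALAC file sizes vary from track to track.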

AAC (m4a or MP4) and MP3 dominate lossy compression. Each can operate at many speeds, from 96 kbps (total, both channels) to 384 kbps or, in special circumstances, even more. MP3, by far the worse of the two, is most often used because it is the “least common denominator”, supported by everything. We lose. The important thing to realize is that there is a HUGE difference between 128 kbps MP3 and 384 kbps MP3 in terms of quality. At 384 it’s only about 2:1 compression beyond what can be achieved with ALAC or FLAC. And I have heard great recordings in m4a at 384 kbps sound superb. Try “Ripple” on for size if you doubt me, but do it on a great digital system (I played it on iTunes, MacBook Pro, BitPerfect (a $10 app), USB, galvanically isolated USB, re-clocked to nanosecond jitter, into a franken-DAC that began life as an MSB Full Nelson, with 96 kHz up-sampling).

I am not arguing that compression is desirable in high end – only that it needs to be understood in a broader context. In fact, I plan another blog in which I’ll share some findings from when I was working with the JPEG and MPEG standards groups (when employed by Bell Communications Research Inc., aka “Bellcore”) and related projects in the late 1980s and early 1990s – with some really surprising results.

In short, I find the poor recording and mastering practices evident especially in many rock/pop recordings, and more than a few classical recordings, to be far more detrimental and nasty sounding than relatively mild AAC compression. Ditto the effects of jitter on the digital signal (see my existing blog on the evils of jitter).

Digital is complex. It is frustrating. And yet it is misunderstood, and very early in its development. I believe it has huge potential if we clear away the confusion and focus on finding solutions to the real problems. So rip your stuff lossless. If you have to compress, dig into the expert settings (they are there in everything from iTunes up) and rip at the highest speed setting available. Hard drives are cheap – enjoy the music.


CEO Sonogy Research, LLC

Jitter, or “why digital audio interfaces are analog”

Confused by the title?  Most people probably are, and that’s the point.  We constantly hear that “digital is perfect” and “there cannot be differences between transports” etc. We heard this from engineers, computer scientists and armchair experts. All three are wrong, but the engineers really ought to know better.

Let’s start with some basics. Most musical instruments, from the human voice to a guitar or piano, are analog. Our ears are analog. And the sound waves between the two must be analog. God did it. Don’t argue with God.

Digital is a storage method. It can only occur in the middle of this chain, with sound converted to digital and then back. The goal is 100% transparency – or, more accurately, transparency so good we cannot tell the difference. While that sounds like a cop-out, it’s not. Analog records are also intended to be 100% transparent and fail miserably. CD, DSD, or whatever need only fail less to be an improvement. My opinion is that, done right, it DOES fail less and is potentially superb. It’s that word “potentially” that trips us up.

While there are many points along the chain where we can lose fidelity I want to talk about one in particular:  Jitter.  I want to talk about jitter for two reasons:

  1. It has a huge impact on sound quality in real life systems today.
  2. No one talked about it until recently, and very few understand what it is or why it’s a problem.

To understand jitter, first we need to understand CD playback. I will use the CD example simply because I have to pick one and it is the most common high-end format. CD music is digitized by measuring the music signal at very tiny increments. An analogy would be pixels on your screen, and they use a lot of “pixels”: 44,100 samples every second. The height, or “voltage”, of each sample is represented by a number we debate endlessly: the bit depth. CDs use 16 bits, which means 65,536 shades of gray.

Illustration of the height and spacing of music samples;  courtesy:

But there is another characteristic that is equally important to sound quality; in fact, mathematically it is part of the same calculation, and nobody talks about it. That characteristic is the time between samples. Think about height and time like a staircase: each step has a height and a tread depth, and the two together determine the steepness. Similarly, the analog output of “pulse code modulation” (which CD is) is determined by the height (limited by 2^16 or 65,536 levels) and the time between samples. That time is assumed to be precisely 1/44,100 second. But we live in an imperfect world and that fraction of a second varies some. The variation, which is random, is called jitter.

Because it is random, it is not harmonically related to the music; and therefore in musical terms dissonant (or lousy sounding).  So while bits are in fact bits, there is much more on the interface between the transport and a DAC, than bits.  There is also jitter and noise, and noise causes jitter.
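A rough sketch can put a number on this. It samples a sine wave at ideal instants and at randomly shifted instants, then measures the RMS difference. Everything specific here is an assumption chosen for illustration: Gaussian timing error, 1 ns RMS jitter, a 10 kHz full-scale tone, and the function name itself.

```python
import math
import random


def rms_jitter_error(freq_hz, jitter_rms_s, rate=44_100, n=20_000, seed=1):
    """RMS amplitude error (full scale = 1.0) caused by random timing error.

    Assumes Gaussian jitter; real interfaces can behave differently.
    """
    rng = random.Random(seed)
    total = 0.0
    for i in range(n):
        t_ideal = i / rate
        t_actual = t_ideal + rng.gauss(0.0, jitter_rms_s)
        err = (math.sin(2 * math.pi * freq_hz * t_actual)
               - math.sin(2 * math.pi * freq_hz * t_ideal))
        total += err * err
    return math.sqrt(total / n)


one_lsb_16bit = 1 / 32768  # one 16-bit step relative to full scale
err = rms_jitter_error(freq_hz=10_000, jitter_rms_s=1e-9)
print(f"RMS error with 1 ns jitter: {err:.2e} of full scale")
print(f"one 16-bit step:            {one_lsb_16bit:.2e} of full scale")
```

Under these assumptions the timing error alone is comparable to a 16-bit step, so the bits can be perfectly right while the reconstructed waveform is not. The error scales with signal frequency, so high frequencies suffer most.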

So the engineers telling you that transports or streaming servers, or the digital signal itself, cannot impact sound quality have failed to read Shannon’s (and Nyquist’s) work. It’s that simple, guys.

For those of you who don’t much care, and just want your music to sound good, it gets both better and worse 🙂

There are two ways to send a digital signal between a source and a DAC. The traditional method is by an interface called SPDIF. It’s the little yellow RCA jack on the back of your CD transport. The problem with SPDIF is that a) it is synchronous and b) the source is the master clock, and therefore determines jitter. So when you do something logical like buy a fancy DAC to make your cheap CD player sound better, you get the jitter of the cheap CD player, and as we noted above, that’s half the story.

There are more problems, primarily related to electrical noise, but I think they are less severe than jitter and certainly are another topic.

I hope this has shown you that sound differences from transports and digital signals are neither snake oil nor mysterious, only annoying.  I will make only one recommendation: make sure that SPDIF connection is a true 75 ohm cable. It need not be a fancy audiophile cable. It can be amazon basics. It can be cheap Schiit (I think they call them that).  But it must be 75Ω.

Now if we could only fix crummy digital mastering, but that’s out of our control.

All the best,



CEO Sonogy Research LLC

High res Digital Music vs. “redbook CD” a quick overview

I’m getting whiplash from polarized — and shallow — opinions in the high end world.

In digital music specifically, I’m bothered by the hardened, and often fact-free, opinions of both audiophiles and engineers, the latter of whom ought to know better.

As a design engineer and true audiophile and music lover, I’m a rare bird, sitting in both camps.   I have learned (I don’t argue with reality) that things don’t always sound as they measure, and furthermore understand rational reasons for this (beginning with incomplete measurements).  For this post I’ll try to avoid the quagmire of subjective thresholds and simply ask “where are the differences and what is possible?”

I’ll turn up the contrast. At one extreme are many who believe digital is fatally flawed, always has been, and cannot be cured. At the other end we have engineers who say “all is well, and if the bits are right, it’s perfect”. This is factually (technically) incorrect. I’ll touch on only one small aspect of why here.

I don’t want to boil the ocean. I only want to address the question of whether Redbook CD format is good enough for even highly discriminating music lovers and revealing systems, and if so, whether high res files and recordings can simultaneously sound better. I’ll touch on other topics in future posts: digital interface signals and their contribution to audio quality, why SPDIF and DSD among others are part analog, and other related topics.

My personal opinion is that, done perfectly (not in this world), Redbook (16-bit, 44.1 k-sample, linear PCM encoding) is theoretically equal to and likely superior to any analog we are likely to experience – $5-10k turntables and all. The problem is, Redbook digital is rarely (ok, never) done perfectly. The “flaw” in the Redbook standard, again in my opinion, is that the sampling frequency, chosen for practical/cost reasons, makes it very hard for both studios and consumer playback equipment to perform ideal A→D and D→A conversion. These are analog processes, folks.

The biggest problem in the standard itself is the 44,100-sample-per-second rate (don’t confuse this with 44 kHz analog signals). This sampling rate was chosen to be more than 2X the highest audible frequency of about 20,000 Hz. Per Shannon’s math and Nyquist’s specific theorem, one must sample at **more than** 2X a frequency in order to faithfully reproduce a smooth, distortion-free “X Hz” tone – and all frequencies below it. Really, it can be perfect – but there’s a catch. If ANY (and I mean **ANY**) signals above 20 kHz get into the recording path, they can alias and play havoc with the recording, interfering down into the audio band. Plus, these sorts of non-musically related distortions are particularly annoying, leading in part to “digital glare”.
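Where an ultrasonic tone "lands" after sampling can be worked out directly. The sketch below folds an input frequency back into the band below half the sample rate, then confirms that the two tones produce identical samples. The 25 kHz input is an arbitrary example, and the function name is mine.

```python
import math

FS = 44_100  # CD sample rate, samples per second


def alias_frequency(f_hz, fs=FS):
    """Frequency at which a sampled tone of f_hz appears (0 .. fs/2)."""
    return abs(f_hz - fs * round(f_hz / fs))


ultrasonic = 25_000
landed = alias_frequency(ultrasonic)
print(f"{ultrasonic} Hz sampled at {FS} Hz shows up at {landed} Hz")

# The sampled values of the two cosines are indistinguishable.
worst = max(
    abs(math.cos(2 * math.pi * ultrasonic * n / FS)
        - math.cos(2 * math.pi * landed * n / FS))
    for n in range(1000)
)
print(f"largest sample difference: {worst:.1e}")
```

Once the samples are taken there is no way to tell the two tones apart, which is why the unwanted content has to be filtered out before conversion rather than afterward.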

That’s one of the measurement flaws. All distortions are not created equal, sonically. Yep, there’s good distortion and bad distortion, or at least bad and worse. This is understood well in music theory. A Boesendorfer or Steinway concert grand piano is valued for its consonant harmonic distortions. So are (some) tubes. So distortion can be pleasant, or at least not unpleasant. Digital aliasing is not in that group – it’s just nasty. As is “slewing” distortion – and any odd-order, high-order harmonics. Back to the sampling frequency: to rid ourselves of aliasing nastiness, we must filter out 100% of that ultrasonic stuff, the stuff above our cut-off frequency of 20 kHz.

Ok, but I said it could be done. It can. In theory. The problem is, to get rid of everything above 20,000 Hz the standard only leaves us about 4,000 Hz for filters. And good sounding, phase coherent filters typically work by roughly halving the sound every OCTAVE, not every small fraction of an octave (roughly a seventh to a quarter, depending on how you count), which is what the standard leaves us. Bottom line #1: the filters used can be nasty. Bottom line #2: they are not 100% perfect, so we typically get at least some aliasing. Maybe not much, but some. Note this is only ONE problem in the standard. But rejoice, there are real, workable solutions and they don’t begin with throwing away CD (16/44k Redbook).
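How narrow is that gap in octaves? A quick sketch, measured two ways: from 20 kHz up to half the sample rate (22.05 kHz), and from 20 kHz up to 24.1 kHz, the point above which aliases would fold back below 20 kHz. The 96 dB attenuation target is my assumption (roughly the 16-bit dynamic range), used only to illustrate the slope required.

```python
import math

FS = 44_100
PASSBAND_EDGE = 20_000          # top of the audio band, Hz
NYQUIST = FS / 2                # 22,050 Hz
FOLD_LIMIT = FS - PASSBAND_EDGE  # 24,100 Hz; above this, aliases land below 20 kHz


def octaves(f_low, f_high):
    """Width of a frequency span in octaves."""
    return math.log2(f_high / f_low)


to_nyquist = octaves(PASSBAND_EDGE, NYQUIST)
to_fold = octaves(PASSBAND_EDGE, FOLD_LIMIT)

TARGET_DB = 96.0  # assumed attenuation goal, about 16 bits of dynamic range
print(f"20 kHz to Nyquist:  {to_nyquist:.2f} octave, "
      f"needs about {TARGET_DB / to_nyquist:.0f} dB/octave")
print(f"20 kHz to 24.1 kHz: {to_fold:.2f} octave, "
      f"needs about {TARGET_DB / to_fold:.0f} dB/octave")
```

Compare either figure with a gentle filter that halves the signal per octave (about 6 dB/octave) and the scale of the problem is obvious.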

And this, in my worthless opinion, is why high res files (24/96 etc.) sound better. They have WAY more headroom to work with for the filtering in the studio, and our home CD players have more space too. Furthermore, with 24 bits, engineers can miss their levels by a few dB and it all works out. And they can edit and make digital copies and still have perfection in the 16 or 18 most significant bits – which is still way better, on paper, than your mondo turntable – or mine (one of my collection is a Logic DM101 with a Syrinx arm and a Grado Reference, the other a Linn triple-play, if you care).

So we should quit worrying about format errors, and do two things:

1. Encourage studios to do the best job possible.  Think they do? Listen to most rock, then listen to an old Verve two-track recording. ’nuff said.

2. Buy home equipment that gets the signal processing right. That’s another blog, but by this I mean low noise, low jitter, fast filter and buffer amps, and great power supplies. Just like I built. Trust me, it works.

I hope you found this useful. When I have time to organize a complex subject I’ll tackle why the digital player and interface can make a difference. After all, bits are bits. It’s true… but that signal isn’t (just) bits. Intrigued?


CEO Sonogy Research LLC

“Bitperfect” – huh?

After a rather long hiatus from audio technology, I was re-immersing myself in the technology, especially with regard to the evolution of digital formats and streaming. An odd word kept coming up – “bitperfect” – commonly used, almost never defined. What the heck? Of course bits are perfect. The problems are all analog.

I’ll not go down this rat-hole today, but suffice it to say that digital audio signals have analog characteristics to them that have direct impact on the reconstruction of the analog wave.  More on THAT later.

So what is “bitperfect” and what’s imperfect about much digital (computer) audio?

I’ll oversimplify. Most of this has to do with how volume is controlled in computer audio. One would think that once in the digital domain, manipulation – for example turning down the volume – would be easy and without distortion. In theory it can be, but in reality, one would be wrong. The vast majority of music is coded initially as “RedBook” – CD format with a resolution of 16 bits – or “65,000 shades of gray”, which is pretty darned good – and IMNSHO, NOT where the problems in CD audio lie. But if we simply do volume-control multiplication (like make it half as big) on the 16-bit words, we slowly lose resolution (think through the math, it’s true). If this doesn’t make sense, think about an extreme example: we digitally “turn down” the volume 99.99something% of the way and are left with only three digital levels – zero, one and two. This is two-bit resolution and will sound like absolute crap. That’s a technical term. For a comparison, if you can, turn your monitor to 4 or 8 bits of color and look at the screen. Yuk. ’Nuff said, I hope.
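The resolution loss can be counted directly. This sketch pushes every possible 16-bit sample value through a simple multiply-and-round volume control and counts how many distinct output values survive. The gain values and the function name are illustrative; real players may round or dither differently.

```python
import math


def surviving_levels(gain):
    """Count distinct 16-bit output codes after a multiply-and-round volume control."""
    outputs = {round(code * gain) for code in range(-32768, 32768)}
    return len(outputs)


for gain in (1.0, 0.5, 0.1, 2 ** -14):
    levels = surviving_levels(gain)
    print(f"gain {gain:<10.6g} -> {levels:6d} levels "
          f"(about {math.log2(levels):.1f} bits)")
```

Counting both polarities, the extreme case leaves five codes (minus two through plus two), which matches the zero, one and two described above on the positive side.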

You can see the numbers in a presentation by ESS Technology here:

To get it right we need to do two things:

  1. do our math at higher resolution, for example 24 or 32 bits, so we can maintain full 16 bit resolution,
  2. **AND** directly convert these higher-resolution words to analog (meaning a 32-bit conversion process) so we don’t truncate that resolution.
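A small sketch shows why the wider output word matters. It applies the same gain to every 16-bit sample, then quantizes the result either back to 16 bits or to 24 bits, and reports the worst-case rounding error in units of one 16-bit step. The gain of 0.3 and the function name are arbitrary choices for illustration.

```python
def worst_error_in_16bit_steps(gain, out_bits):
    """Largest rounding error after volume scaling, in units of one 16-bit step."""
    steps_per_lsb16 = 2 ** (out_bits - 16)  # output codes per 16-bit step
    worst = 0.0
    for code in range(-32768, 32768):
        exact = code * gain  # ideal, unrounded result
        quantized = round(exact * steps_per_lsb16) / steps_per_lsb16
        worst = max(worst, abs(quantized - exact))
    return worst


err16 = worst_error_in_16bit_steps(0.3, out_bits=16)
err24 = worst_error_in_16bit_steps(0.3, out_bits=24)
print(f"16-bit output: worst error {err16:.4f} of a 16-bit step")
print(f"24-bit output: worst error {err24:.4f} of a 16-bit step")
```

The benefit only exists if those extra bits actually reach the converter, which is exactly the second requirement in the list above.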

Simply doing the math in 32 bits is pretty simple. Yes, we’d need some code to convert it and do floating point math, and we’d need to temporarily save the much bigger buffered file, but that’s easy for machines that edit in Photoshop. The problem comes next: we need to convert this 32-bit word into a squiggly AC analog music voltage. Problem: we have a 16-bit D/A chip. And then there are all the interfaces (SPDIF, AES/EBU, carrier pigeons).

You can read more about it, but the bottom line is this: in 99% of all cases, and 100% of all PC/Mac/Linux cases, you should never use the digital volume control – that convenient little slider. Just say no. Set the volume to full and send the output to your DAC or networked digital player – and let ALL the bits get there, to be converted to a nice, clean, high-res music signal. Then you can attenuate it with good, old-fashioned resistors.

(note: if you are just playing MP3 files through the sound card to your earbuds, none of this really matters)

So “bitperfect” is a word that we should never have had to invent nor explain. It comes of shortcuts made in commercial music players.

Fortunately, most high res players with real aspirations know this, and take care of it. JRiver, Roon, etc. are all bitperfect. Sorry to leave out many others.

I’ll add that for Macs there is a surprisingly good app that simply takes your iTunes library and hijacks the signal, delivering it without manipulation – in other words, “bitperfect” – and it costs $10. It’s called….. Bitperfect for Mac.




CEO, Sonogy Research, LLC


Please see my blogs on!

My continuing blogs will be on Appledore Research Group Website
ARG is becoming THE acknowledged experts at managing virtualized and hybrid telecom networks and services.

My technology, music, audio and driving related rants will remain here!

— Grant

Applying MANO to Change the Economics of our Industry –
A Promising TMForum Catalyst (Dec 2015)

Appledore Research Group has been outspoken on the importance of automation and optimization in the Telco Cloud. We have outlined its importance, and the mechanism to minimize both CAPEX and OPEX in recent research. Our belief is that this kind of optimization depends on three critical technologies:

  1. Analytics to collect data and turn it into useful information
  2. Policy driven MANO to allow for significant flexibility within well defined constraints, and
  3. Algorithms capable of identifying the most cost effective solutions, within the constraints (location, performance, security, etc.) enforced by the policies

Here’s an excerpt from recent ARG research outlining the process:

Until now, we have seen relatively little action and innovation in the industry to pursue these goals – but here’s an interesting project that’s right on point. I want to share an exciting TMForum Catalyst, one that investigates the economic power of NFV and asks “how, in practice…?”

That is not a typo. I did say “exciting” “catalyst” and “TMForum” in the same sentence. I realize that standards and management processes are not usually the stuff that makes your heart beat faster; but if you care about our industry’s commercial future (and like innovative thinking), this one’s different.

The premise is simple: the flexibility inherent in the “Telco Cloud”, underpinned by NFV and SDN, makes it possible and feasible to consider economic factors when deciding how to instantiate and allocate resources across data-centers. This catalyst, involving Aria-Networks, Ericsson, NTT Group, TATA and Viavi, set out to demonstrate this capability, along with a realistic architecture and contributions back to the TMF’s Frameworks construct.

To me, this is exciting. It says we can use the “MANO+” environment to drive down costs, and possibly even, over time, to create a “market” for resources such that high quality, low cost resources flourish while more marginal ones are further marginalized. This goes straight to the economics, competitiveness, and profitability of our industry and deserves serious attention.

This catalyst team appears well balanced in this regard, with each player bringing expertise in one or more of those critical areas, and one of the leading operators driving the cloud transformation guiding the objectives.

Ericsson summed up the challenge and the objective as follows:

“This TM Forum catalyst project intends to bridge the gap between OSS/BSS and the data silos in finance systems and data center automation controls to enable the kind of dynamic optimization analytics needed to achieve business-agile NFV orchestration.” – Ravi Vaidyanathan, Ericsson Project Lead

At the moment the industry is understandably focused on making NFV and MANO work – even simply. We must all walk before we try to run. Yet it’s very rewarding and encouraging to see the industry not only attempt to run, but to think about how far it can run. Step #1 in any journey is a destination; hats off to this team for picking a worthy one.

By the way, this team won a deserved award for most important contributions to the TM Forum’s standards. They deserve it for really thinking!

Grant Lenahan
Partner and Principal Analyst
Appledore Research Group

The Rise of Policy in Network Management:
Seductive Opportunities Along with Complex Risks

author: Grant Lenahan

The role of policy is about to expand rapidly, projecting a little-understood area, mostly associated with the operation of real-time routers, into the domain of management. It’s a great boon, but will demand re-thinking both what we think policy is, and what we think “OSS and BSS” are. Success will demand a well-defined plan, executed in a series of clearly defined steps.

Policy has been with us since the relatively early days of the Internet, when the IETF defined “Policy Decision Points” and “Policy Enforcement Points” – or PDPs and PEPs. Used only in very specific instances, policy has been limited to AAA/edge routers and, in 3G and 4G mobile networks, to “flow based charging”, where 3GPP defined the derivative “PCRF” and “PCEF” to manage it.

The bottom line is that policy will quickly expand from relatively few use cases, to handling a wide range of network configuration tasks, all based on some key questions:

  • Who is the user, and what priority does that user have?
  • What is the product/service, or plan, and what parameters are demanded, possibly by SLA?
  • What is the network condition? Is it congested? Empty?
  • What are the technical and economic feasibility limits we must work within?

Policy is already being defined to control many attributes in SDN and NFV – scale, reliability, bandwidth, security, and location (geographic or datacenter) among others. Elements of a policy model are being talked about in various industry groups, from ETSI/MANO to the TMForum (Den-ng, ZOOM). But this is the dry “how?”; let’s discuss the exciting “what?”.

The real excitement begins when we understand that policy, combined with analytics and real-time (MANO-style) orchestration, can implement real-time, all-the-time, optimization of networks. While scary, these sorts of feedback loops have long been used in military and commercial guidance systems, in machine control, and in myriad other control systems. In fact, the basic ideas are called, in academia, “control theory”.

Imagine a data-center that approaches congestion, and through analytics driving new policy rules, automatically moves demand to a lightly used datacenter – improving performance and averting capital spend; quite the happy outcome. Or, consider analytics that correlate a set of security breaches with specific parameters, and close the loophole by changing the policies that define those parameters. SDN, SON, NFV, and “3rd Network” based MEF services can all benefit from such dynamic and far-reaching policy.

Discussing each is beyond the scope of this Blog, but I’d like to set the stage for future dives into several elements of policy. In preparation, let’s consider that control-theory flow, from information collection (analytics), to determining the corrective action (optimization) to issuing the revisions (policy control and possibly orchestration). This simple, yet complex concept can fundamentally change the economics, flexibility and operation of networks. In my opinion, it is essential to derive the greatest benefit from virtualized networks.
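As a purely illustrative sketch of that loop, here is a toy version in Python: collect utilization (analytics), work out which demand to move (optimization), and apply the moves (policy control/orchestration). Every name, the 80% utilization ceiling, and the two example data centers are hypothetical; no real MANO or policy API is being modeled.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    capacity: float
    load: float

    @property
    def utilization(self):
        return self.load / self.capacity


def collect(dcs):
    """Analytics step: gather current utilization per data center."""
    return {dc.name: dc.utilization for dc in dcs}


def decide(dcs, max_util=0.8):
    """Optimization step: plan moves that bring every site under max_util."""
    loads = {dc.name: dc.load for dc in dcs}
    moves = []
    for src in dcs:
        excess = loads[src.name] - max_util * src.capacity
        # Try the least-utilized sites first.
        for dst in sorted(dcs, key=lambda d: loads[d.name] / d.capacity):
            if excess <= 0:
                break
            room = max_util * dst.capacity - loads[dst.name]
            if dst is src or room <= 0:
                continue
            amount = min(excess, room)
            moves.append((src.name, dst.name, amount))
            loads[src.name] -= amount
            loads[dst.name] += amount
            excess -= amount
    return moves


def apply_moves(dcs, moves):
    """Policy/orchestration step: enact the planned moves."""
    by_name = {dc.name: dc for dc in dcs}
    for src, dst, amount in moves:
        by_name[src].load -= amount
        by_name[dst].load += amount


sites = [DataCenter("east", 100, 95), DataCenter("west", 100, 30)]
print("before:", collect(sites))
plan = decide(sites)
apply_moves(sites, plan)
print("plan:  ", plan)
print("after: ", collect(sites))
```

A real system would run this continuously and would need damping, since a loop like this can oscillate if two sites keep pushing load back and forth.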

Policy Circle

Before we begin a new hype cycle, though, consider the challenges and risks. This level of automation will be difficult to deploy and tune. Policy conflicts must be managed, autonomous systems must be tested and trusted (“I wouldn’t do that, Dave”), and instability must be controlled (all control systems can oscillate). Success will likely come from a set of incremental steps, each of which adds – and tests – a layer of automation, and will therefore take years to complete. But those that benefit greatly will be those who build, brick by brick, to a well-understood goal or vision.

Stay tuned for future installments touching on specific areas of policy in tomorrow’s network.


Dynamic, Self-Optimizing VNFs:
Overview of an Innovative TMForum Catalyst

author: Grant Lenahan

Here’s a link to a short, to-the-point explanation of how several major industry players, sponsored by AT&T and under the auspices of the TMForum, are looking at how policy, virtualization, analytics and “orchestration” combine to usher in a new world of dynamic optimization. In the not-too-distant future we may have networks smart enough to heal themselves, scale themselves, and optimize costs and resource utilization. Of course, the devil is in the details, and policy is still poorly understood in the world of management (OSS, BSS, etc.) software. While the functions denoted by ‘FCAPS’ remain as important as ever, the methods are changing rapidly. Or maybe I should say, “must change rapidly if we want to succeed”.

Enjoy, and many thanks to RCR and the TMForum for making this possible.


Hybrid, Virtualized Networks Require Modern, Hybrid Management

author: Grant Lenahan

It sounds obvious, but apparently, it’s not.

I’ve been working closely with ETSI/MANO, TMForum and TMF ZOOM and several thought-leading carriers implementing virtualization, and I’m observing a disturbing trend: a focus on the virtualization technology itself that seems to be omitting the broader challenges of management systems and processes that are absolutely critical to making it work.

The business value of Virtualization, although delivered by new technology, is that it enables new flexibility, new business models, and infinite product packages, all at low incremental cost. Let’s set out some examples:

  • Services can be scaled infinitely, matching price-point and capacity to need and willingness to pay
  • Services can be turned on and off for any arbitrary time period
  • Bundles can be easily created, likely based on each buyer’s needs
  • Flows and network functions can be placed where capacity exists or where they can be delivered at the lowest cost consistent with SLA needs
  • … And many more

But these benefits do not occur magically; each is dependent on various management functions, implemented in OSS/BSS.   Poor support; limited value.

No one is explicitly ignoring OSS/BSS of course; it’s subtler. In ETSI the focus is on user stories and the technology that closely surrounds NFVs. This is reasonable – people can’t concentrate on sufficient depth and breadth at the same time and make good progress – it’s just “focus”. The ETSI diagram below illustrates the point: lots of detail in the MANO domain, while OSS/BSS – dozens of important functions – is relegated to the top left corner. In operators the focus tends to fall to segregated teams – again, good for concentrating expertise, but bad for an E2E view.

Unintentionally, this kind of specialization has historically led to one of two outcomes – neither desirable:

  1. Creation of a shiny new stack for the new technology, which inevitably is a new silo that creates messy, hard to manage integration between the “old” and the “new” and stifles agility, and/or
  2. Forcing existing systems, some of which may not be up to the task, to support the new technology – inevitably poorly, and with similar impact of flexibility and agility.

My point is simple, yet the implementation is subtle and complex – but ultimately very worthwhile. OSS and BSS are critical to realizing the benefits of virtualization, and to monetizing this exciting technology. They must support the same transformational operational models that NFV, cloud & SDN do. And they must do so in an environment that continues to support many non-virtualized technologies, especially those in the distribution (access) network(s). We cannot separate them and focus only on the “new technology” (outcome 1), nor can we assume existing systems are up to the challenge (outcome 2); rather, we can only succeed when we manage the end-to-end business process – across domains – efficiently. In general, while many complex systems will remain, this means a re-think of the E2E architecture.

Virtualization is truly transformative but the decisions we make over the next months and years will determine just how extensive and successful that transformation is. As I like to joke (half) – “if you can’t efficiently bill for it, it’s just a hobby.”

Food for thought! Watch this space for function-by-function examples in the near future – Grant