Tracking down calls to AvSetMmThreadCharacteristics

Boring introductory stuff first:

Windows Vista introduced a feature called the “Multimedia Class Scheduler Service” (MMCSS). It is designed to give audio and video threads regular, short bursts of very high priority, so that audio and video can stream without glitching. The bursts are regular so that audio and video packets don’t get delayed, and they are short so that non-multimedia activity doesn’t get starved.

It’s very important to note that apps which call into high-level audio and video APIs (e.g., the <audio> and <video> HTML tags) don’t have to worry about this kind of thing; the implementation of the high-level API takes care of registering the right pieces of its code with MMCSS.

For apps which (for one reason or another) choose not to use the high-level APIs, and instead want to hook into low-level APIs directly, the original design was for the app to call the AvSetMmThreadCharacteristics function from its streaming thread when streaming begins, and AvRevertMmThreadCharacteristics when streaming ends.

In Windows 8.1, the Real-Time Work Queue API was created. This is the preferred approach for apps which want to use low-level APIs.


Now the good stuff. I got an email from a pro audio application developer who said that AvSetMmThreadCharacteristics was giving an ERROR_TOO_MANY_THREADS error in his app… even though he only registered a single thread with MMCSS! Surely one thread is not too many…

At this point you should pause and read the excellent book One Kitten is Not Too Many. I’ll wait.

Welcome back! A bunch of different theories started popping up in my head, like “maybe some other process on the system is consuming all the MMCSS slots”, or “maybe a plugin for the application is registering threads without the application developer’s knowledge”, or “maybe the call to AvRevertMmThreadCharacteristics isn’t happening”, or “maybe the task handle is being overwritten between Set and Revert”.

But then I remembered Raymond Chen’s advice: Theorize if you want, but if the problem is right there in front of you, why not go for the facts?

MMCSS has “Event Tracing for Windows” (ETW) logging. In particular, when a task is created, the Microsoft-Windows-MMCSS provider logs a Thread_Join event. This will shed light on what processes and threads are registering with MMCSS.

Also, ETW supports grabbing a stack at the point of an event being logged! This will shed light on whether the registration is happening from app code directly, or a plugin, or whatever.

So I sent the developer these instructions:

  1. Download mmcss.wprp (right-click the link and save)
  2. Open an elevated Command Prompt or PowerShell window
  3. Run wpr.exe -start mmcss.wprp
  4. Launch the app
  5. Create some audio objects and let them run for one second or so
  6. Close the app
  7. Run wpr.exe -stop mmcss.etl (you can change the output file name if you like)
  8. Inspect the resulting mmcss.etl file

In my local testing (I just used echo ^G from a command prompt) I was able to see a Microsoft-Windows-MMCSS/Thread_Join/ event with this stack (only the bold part is interesting)


How to negotiate an audio format for a Windows Audio Session API (WASAPI) client

The Windows Audio Session API (WASAPI) provides a family of interfaces for playing or recording audio.

Chief among these are the IAudioClient, IAudioClient2, and IAudioClient3 interfaces.

There is a Windows audio session (WASAPI) sample on GitHub, but in this blog post I want to dive into the nitty-gritty of one particular question:

How do I decide what WAVEFORMATEX to pass to IAudioClient::Initialize*?
*Or equivalent

Before I answer this question, let’s take a look at some of the relevant methods on these interfaces.

  1. IAudioClient2::SetClientProperties is a way for you to tell Windows some things about the audio stream before actually creating it (by passing an AudioClientProperties structure.)
    The client properties you specify can affect the answers to some of the questions you ask Windows, so be sure to set this BEFORE calling any of the other methods.
  2. IAudioClient::GetMixFormat gives you the audio format that the audio engine will use for this client (with its given AudioClientProperties) to mix all the similar playback streams together, or to split all the similar recording streams apart.
    This format is guaranteed to work*, but sometimes there is a better format that also works.
    * Unless you use AUDCLNT_SHAREMODE_EXCLUSIVE, or AudioClientProperties.bIsOffload = TRUE.
  3. PKEY_AudioEngine_DeviceFormat gives you the audio format that the audio engine uses after the playback mix to talk to the audio driver for the speaker, or to talk to the audio driver for the microphone before splitting the recording streams apart.
    This format is guaranteed to work with AUDCLNT_SHAREMODE_EXCLUSIVE.
    If the audio device has not been used with AUDCLNT_SHAREMODE_SHARED yet, the format may not have been calculated, and the property will be empty.
    You can force the format to be calculated by calling IAudioClient::GetMixFormat.
  4. IAudioClient::IsFormatSupported lets you ask Windows whether the client (with its given AudioClientProperties) supports a given format in a given share mode.
    In certain cases (e.g., AUDCLNT_SHAREMODE_SHARED), if the client does not support the format in question, Windows may suggest a format which (Windows thinks) is close.
  5. IAudioClient::Initialize considers the previously given AudioClientProperties; takes the WAVEFORMATEX you have decided on; and takes a set of flags, including AUDCLNT_STREAMFLAGS_XXX flags.
    The two interesting flags for format negotiation are AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM and AUDCLNT_STREAMFLAGS_SRC_DEFAULT_QUALITY, which tell Windows that you want the WASAPI audio engine to do any necessary conversions between the client format you are giving it and the playback mix format or recording split format.
    This will work for uncompressed integer PCM and uncompressed floating-point client formats, but will not work for compressed formats like AAC.

OK, with all that background out of the way, let’s try to answer the question. There are several approaches which will work.

  1. Use a higher level audio API instead of WASAPI
    This is the preferred approach. WASAPI is complicated. And even with all that complication, it’s a very underpowered API – for example, it doesn’t even do MP3 decoding.
    No matter what your application is, there is almost always a higher-level audio API which is better suited for you. If you are not sure what it is, send me an email and ask me; I might be able to recommend one for you, or I may be able to put you in touch with someone else who can.
    A few examples of higher-level audio APIs: MediaElement, MediaCapture, AudioGraph, XAudio2.
    If you’ve tried a higher-level API, but you’ve run into some problem or other and now you’re resorting to WASAPI, email me and tell me about the problem; I want to fix it so we can get you back on the right API for you.
  2. If you don’t care what format is used
    use IAudioClient::GetMixFormat // no need to call IsFormatSupported here
  3. If you have a format in hand
    Maybe you’re playing audio from a file, or maybe you need to record from the microphone and hand off to a DSP library that insists on a particular input format.
    If that is the case, use the following pattern:

    if (IAudioClient::IsFormatSupported(formatInHand)) { use that }
    else { use IAudioClient::GetMixFormat, or the suggested closest supported format, and convert between that and formatInHand in the app code }

    Another option which will work, but which is less preferred, is to use this pattern:

    pass formatInHand to IAudioClient::Initialize directly, with the AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM | AUDCLNT_STREAMFLAGS_SRC_DEFAULT_QUALITY flags, and let the audio engine do the conversion

    Note that the automatic conversion only works for uncompressed formats. If formatInHand is compressed (like AAC), then since you’re using WASAPI directly (see point 1 above!) you will need to compress/decompress the audio in app code.

Or in tabular form (because people like tables):

  Any format is fine
    AUDCLNT_SHAREMODE_SHARED: use IAudioClient::GetMixFormat()
    AUDCLNT_SHAREMODE_EXCLUSIVE: use PKEY_AudioEngine_DeviceFormat
  I have a particular format I want to use
    AUDCLNT_SHAREMODE_SHARED: if (IAudioClient::IsFormatSupported(yourFormat)) { use it } else { use the suggested closest-supported-format and convert between it and yourFormat in app code }
    AUDCLNT_SHAREMODE_EXCLUSIVE: if (IAudioClient::IsFormatSupported(yourFormat)) { use it } else { use PKEY_AudioEngine_DeviceFormat and convert between it and yourFormat in app code }

Regardless of which approach you use, you should always have some assurance that the format will work before calling IAudioClient::Initialize.

You could get this assurance in various ways – IAudioClient::GetMixFormat, IAudioClient::IsFormatSupported, or AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM | AUDCLNT_STREAMFLAGS_SRC_DEFAULT_QUALITY.

It is an application bug to call IAudioClient::Initialize blind.

Getting the schema for Windows Performance Recorder profiles

I was having some trouble authoring a Windows Performance Recorder profile by hand-editing the XML, and I wanted to check it against the schema.

This MSDN page says “you can get the WPRControlProfiles.xsd file in %ProgramFiles(x86)%\Windows Kits\8.1\Windows Performance Toolkit”. That appears to no longer be true.

I dug around a little and eventually found it. It’s a resource in %ProgramFiles(x86)%\Windows Kits\8.1\Windows Performance Toolkit\windowsperformancerecordercontrol.dll – the rest of the instructions seem to work fine.


You can extract it by opening the .dll in Visual Studio, expanding the resource tree to the node containing the schema, then right-clicking it and choosing “Export”.

Pulling ClickOnce applications from the client using a script

I have a bunch of internal Microsoft ClickOnce applications that I use all the time.

But every time I clean install my machine I need to go look up all the application entry points and reinstall them.

This got boring, so I wrote an executable that takes a URL on the command line, and then uses the System.Deployment.Application.InPlaceHostingManager API to pull the ClickOnce application to the machine.

Browse source

Download binary

Riffing on Raymond – incrementing the reference count on a smart pointer

Last Friday, Raymond Chen blogged about how to use a function that released one of its inputs with various smart pointer classes.

He made two suggestions, and I offered a third:

  1. (Raymond) tell the smart pointer class to release ownership to the function
  2. (Raymond) use a different function that doesn’t release the input
  3. (Me) take an explicit reference on the function’s behalf

Raymond suggested that I should actually try my suggestion. So I did.

For each of the four smart pointer types, I tried four different ways to add a reference. Here are the results:

| Smart pointer type | .AddRef() | ->AddRef() | Cast to IUnknown * | Get underlying pointer |
|---|---|---|---|---|
| Microsoft::WRL::ComPtr | Compile error | Compile error | Compile error | .Get() |
| ATL::CComPtr | Compile error | Compile error | OK | .p |
| _com_ptr_t | OK | OK | OK | .GetInterfacePtr() |
| std::unique_ptr | Compile error | OK | Compile error | .get() |

Here’s the code I used.
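That code is COM-specific, but the shape of option 3 (taking an explicit reference on the function’s behalf) can be sketched portably with a toy ref-counted type; all the names below are invented for illustration:

```cpp
// Portable sketch of "take an explicit reference on the function's behalf".
// Toy stand-ins for IUnknown-style reference counting; names are invented.
#include <cassert>
#include <memory>

struct Widget {
    int refs = 1;
    int AddRef() { return ++refs; }
    int Release() { int r = --refs; if (0 == r) { delete this; } return r; }
};

// Like the function in Raymond's post: uses its input, then releases it
void UseAndRelease(Widget *w)
{
    // ... use w ...
    w->Release();
}

// Smart pointer that calls Release on destruction, like CComPtr/ComPtr
struct WidgetReleaser {
    void operator()(Widget *w) const { if (w) { w->Release(); } }
};
using WidgetPtr = std::unique_ptr<Widget, WidgetReleaser>;

// Option 3: add a reference on the function's behalf, so the smart
// pointer's own reference survives the function's Release
int DemoExplicitReference()
{
    WidgetPtr widget(new Widget); // refs == 1, owned by widget
    widget->AddRef();             // refs == 2; this one is for UseAndRelease
    UseAndRelease(widget.get());  // refs back down to 1
    return widget->refs;          // widget is still alive; WidgetPtr releases the last ref
}
```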

Playing audio to the earpiece from a Windows Phone 8.1 universal app

Some time ago I blogged about the Windows Phone AudioRoutingManager API which allows you to put a “Speakerphone” and “Bluetooth” button in your Windows Phone app.

A common question that I get now is “I’m trying to play audio to the earpiece from my app, but AudioRoutingManager::SetAudioEndpoint(AudioRoutingEndpoint_Earpiece) is failing.”

It’s an interesting question, because Windows Phone will automatically route your audio to the earpiece if you have everything set up right – and if you don’t have everything set up just right, you can’t route it to the earpiece at all!

So how do you “set things up right?”

There are two things you have to do.

  1. Tag the audio in question as “communications”
  2. Tag your app as either a “voice over IP” app or a “voicemail” app

If you do both of these things, audio will flow to the earpiece automatically – no call to SetAudioEndpoint needed.

(At this point, if you want to get fancy, you can put a “Speakerphone” button in your app and wire it up to a call to SetAudioEndpoint(AudioRoutingEndpoint_Speakerphone), but that’s up to you.)

Let’s look at the two things in a little more detail.

Tag the audio in question as “communications”

How to do this depends on what API you’re using. It could be as simple as <audio src="…" msAudioCategory="communications" msAudioDeviceType="communications" />. Or you might have to call IAudioClient2::SetClientProperties with an AudioClientProperties structure whose eCategory is AudioCategory_Communications.

Tag your app as “voice over IP” or “voicemail”

You’ll need to set either the ID_CAP_VOIP or ID_CAP_VOICEMAIL Phone capability on your app. (The docs reference an ID_CAP_AUDIOROUTING capability, but that doesn’t exist.)

If you’re writing a Silverlight app, you can do that directly in the manifest.

If you’re writing a Windows Phone 8.1 (non-Silverlight) or Universal app, you have to create a WindowsPhoneReservedAppInfo.xml file and add it to your application package. It should look like this.

<?xml version="1.0" encoding="utf-8"?>
<WindowsPhoneReservedAppInfo xmlns="">
  <SoftwareCapability Id="ID_CAP_VOIP" />
</WindowsPhoneReservedAppInfo>
Using StrCmpLogicalW to sort strings the way the shell does

I have various scripts which use the “dir” command to enumerate files and then act on the most recent file that meets certain criteria. I noticed that “dir” and the Windows shell sometimes sort files in different orders.

For example, if I have a folder containing files { track1.mp3, track2.mp3, …, track9.mp3, track10.mp3, track11.mp3 } then the Windows shell will enumerate them in the natural order, but “dir” will enumerate them in the rather strange order { track1.mp3, track10.mp3, track11.mp3, track2.mp3, …}.

Windows provides the StrCmpLogicalW API which is smart enough to realize that track9.mp3 < track10.mp3, so I whipped up a quick shellsort.exe utility that takes its standard input, breaks it up into lines, sorts those lines according to StrCmpLogicalW, and prints the resulting lines on its standard output. For bonus points I gave it a -reverse command-line option to print the output in reverse order (although it would have been more in the “toolbox” spirit to create a separate reverse.exe.)
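I can’t paste StrCmpLogicalW itself here (it lives in shlwapi.dll), but the digit-run-aware comparison it performs can be sketched portably. This is a simplification (the real API also handles locale and leading-zero subtleties), and the function name is mine:

```cpp
// Portable sketch of a StrCmpLogicalW-style "natural" comparison.
// Simplified: treats runs of ASCII digits as numbers; the real API
// also deals with locale and other subtleties.
#include <cctype>
#include <string>

int NaturalCompare(const std::string &a, const std::string &b)
{
    size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (isdigit((unsigned char)a[i]) && isdigit((unsigned char)b[j])) {
            // Compare the full digit runs as numbers, so 9 < 10
            size_t i2 = i, j2 = j;
            while (i2 < a.size() && isdigit((unsigned char)a[i2])) { i2++; }
            while (j2 < b.size() && isdigit((unsigned char)b[j2])) { j2++; }
            unsigned long long na = std::stoull(a.substr(i, i2 - i));
            unsigned long long nb = std::stoull(b.substr(j, j2 - j));
            if (na != nb) { return na < nb ? -1 : 1; }
            i = i2; j = j2;
        } else {
            // Plain character comparison outside digit runs
            if (a[i] != b[j]) { return a[i] < b[j] ? -1 : 1; }
            i++; j++;
        }
    }
    if (i < a.size()) { return 1; }  // a has extra characters
    if (j < b.size()) { return -1; } // b has extra characters
    return 0;
}
```

With this comparison, track9.mp3 sorts before track10.mp3, which sorts before track11.mp3.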

>dir /b

>dir /b | shellsort -reverse

As with my previous exercise to find the longest duplicated substring in a given text, most of the code is taken up in reading the contents of the standard input in an efficient way.

Browse source

Download binary