iOS: encoding to AAC with the Extended Audio File Services gotchas

I recently implemented encoding to AAC from raw PCM in memory (as opposed to from a file) using the extended audio file services, which have their calls prefixed with ExtAudioFile*. ExtAudioFile stuff basically wraps a converter with the standard AudioFile functionality. It was a bit hard to track down the correct documentation, but it’s not that far off from using the regular AudioFile stuff you’d do when writing a WAV file, for instance.

Then, a few days later, AAC encoding suddenly stopped working, without my code changing at all. Debugging led me to find that this only happened on my iPhone 4S, and not my iPod or 3GS, nor the simulator. This problem existed with apps from other developers too, including the helpful Michael Tyson’s TPAAC converter project on github. I was getting a long amount of blocking (10-30 seconds) and then an error from sometimes ExtAudioFileSetProperty on the kExtAudioFileProperty_ClientDataFormat and if not that, the ExtAudioFileWrite call. I have no clue why sometimes the ExtAudioFileSetProperty would go through. In all cases though, the audio on my app was dead after this call failed until I restarted the app. This made me think it was something to do with AudioSessions (as Michael Tyson points out can cause issues), but turns out this wasn’t the case for me. FYI my OSError code was 268451843, which doesn’t even translate to anything in core audio.

After reverting to swearing at the docs that are the notorious API documentation, and exploring various dead ends such as modifying my interrupt handler for the AudioService as well as trying it on the main thread just in case there was some race case issue, I came across this life saving stack overflow post by a mystery user1021430 who I am feeling very fond for (since he has an ambiguous handle I can imagine what I want). His post mentioned that it seems for certain dual core devices like the 4S, the hardware encoder is finicky, and it is possible to switch to a software encoder to fix these issues. I had no clue about this from the docs, and really think this guy that posted the answer probably has the most undervalued reputation (1).

Anyway, here’s my finalized (and unpolished, including progressbar updates that you should remove to use) code that exports to aac from a buffer. The FilterSound and FilterAudioBuffer classes are ommited, but it basically just helps fill out a buffer and shouldn’t be hard to understand.


-(BOOL)saveAACFileWithSound:(FilterSound*)s filename:(NSString*)filename progressView:(UIProgressView*)progView
{
   OSStatus err = noErr;

   NSString *path = [self pathForAudioFileWithFilename:filename];
   NSURL *url = [NSURL fileURLWithPath:path];

   char errCode[5];

   uint64_t doneSamples = 0;
   uint64_t totalSamples = [s approxTotalSamples];   
   
   AudioStreamBasicDescription format;
   ExtAudioFileRef outFile;
   memset(&format, 0, sizeof(format));

   format.mFormatID = kAudioFormatMPEG4AAC;
   //format.mFormatFlags =  kMPEG4Object_AAC_Main;
   //format.mSampleRate = kFilterSampleRate; // the encoder for compressed formats uses a different sample rate (see docs)
   format.mChannelsPerFrame = 2;
   
   err = ExtAudioFileCreateWithURL((CFURLRef)url,
                                kAudioFileM4AType,
                                &format,
                                   NULL,
                                kAudioFileFlags_EraseFile,
                                &outFile);

   if (err != noErr) {
      strncpy(errCode,(const char*) &err, 4);
      errCode[4] = 0;
      
      NSLog(@"AudioFileCreateWithURL path=%@ Error %li = %s", path, err, errCode);

      return NO;
   }

   // sometimes the dual core devices like iPhone 4S will freak out on the next set property call unless you do this.
   // What's more the device will be broken.
   // No idea why this is.
   UInt32 codecManf = kAppleSoftwareAudioCodecManufacturer;
   ExtAudioFileSetProperty(outFile, kExtAudioFileProperty_CodecManufacturer, sizeof(UInt32), &codecManf);

   // setup the client format since we want to convert from PCM
   AudioStreamBasicDescription clientFormat;
   memset(&clientFormat, 0, sizeof(format));

   clientFormat.mFormatID = kAudioFormatLinearPCM;
   // WAV should be little endian
   clientFormat.mFormatFlags =  kLinearPCMFormatFlagIsSignedInteger | 
         kLinearPCMFormatFlagIsPacked;//~kAudioFormatFlagIsBigEndian;
   clientFormat.mSampleRate = kFilterSampleRate;
   clientFormat.mBitsPerChannel = sizeof(AudioSampleType) * 8; // AudioSampleType == 16 bit signed ints
   clientFormat.mChannelsPerFrame = 2;
   clientFormat.mFramesPerPacket = 1;
   clientFormat.mBytesPerFrame = (clientFormat.mBitsPerChannel / 8) * clientFormat.mChannelsPerFrame;
   clientFormat.mBytesPerPacket = clientFormat.mBytesPerFrame * clientFormat.mFramesPerPacket;
   
   err = ExtAudioFileSetProperty(outFile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);
   if (err != noErr){
      NSLog(@"ExtAudioFileSetProperty error %li", err);
      return NO;
   }
   
   size_t outNumPackets = 0;
   size_t numBytes = 0;

   FilterAudioBuffer* audio_buf;
   audio_buf = [[FilterAudioBuffer alloc] initWithChannels:2
                                                    frames:kSaveWavBufferSamples
                                                shouldZero:NO];

   AudioBufferList writeBuffer;
   while (![s shouldBeRemoved]) {
      // we use the same buffer each iteration, so we need to wipe it clean.
      memset(audio_buf->buffer, 0, sizeof(SInt16) * 2 * kSaveWavBufferSamples);
      // have the sound object fill out the next sound
      // for this function the term 'packet' is a frame for uncompressed formats.
      // fortunately writeToBuffer returns the number of frames written
      // (This can be shorter for buffer playback at the end when the sound size doesnt
      // divide evenly into kSaveWavBufferSamples
      outNumPackets = [s writeToBuffer:(SInt16*)audio_buf->buffer samples:kSaveWavBufferSamples];
      numBytes = outNumPackets * sizeof(SInt16) * 2;

      writeBuffer.mNumberBuffers = 1;
      writeBuffer.mBuffers[0].mNumberChannels = clientFormat.mChannelsPerFrame;
      writeBuffer.mBuffers[0].mDataByteSize = numBytes;
      writeBuffer.mBuffers[0].mData = audio_buf->buffer;

      err = ExtAudioFileWrite(outFile, outNumPackets, &writeBuffer);

      doneSamples += outNumPackets;
      float prog;
      prog = doneSamples / ((float) totalSamples);
      [self performSelectorOnMainThread:@selector(updateProgress:)
                             withObject:[NSArray arrayWithObjects:[NSNumber numberWithFloat:prog],
                                                 progView, nil]
                          waitUntilDone:NO];
      if (err != noErr){
         NSLog(@"AudioFileWritePackets error %li", err);
         break;
      }
   }
   [audio_buf release];
   ExtAudioFileDispose(outFile);
   return err == noErr;
}

SoundCloud iOS integration gotchas

SoundCloud's iOS SDK

Last year when I was living in Berlin I had the opportunity to visit the SoundCloud campus where Henrik Lenberg and Eric Wahlforss were nice enough to talk to me about a range of things from Audacity to how SoundCloud functions as a development team. I’ve been to quite a few coorporate campuses and a few startup offices, but SoundCloud’s place really felt different. Located in the middle of the design-savvy Mitte district, with a couple floors of really fashionable looking workspaces and employees, I was kind of surprised to feel myself being a little self-conscious about my clothes at a tech company. SoundCloud’s public image is also of interest to me because a few people have told me soundcloud is ‘open-source’, and while that claim doesn’t hold or really even make sense, I understand the association one might make with these freemium services that are as inviting as soundcloud.

As it were, there was no work I could do with soundcloud at that time. However, I’ve been integrating SoundCloud into an iOS project lately, following their instructions from their one of the their many github pages (CocoaSoundCloudAPI). There are a few other projects like the CocoaAPIWrapper that seems to be a wrapper but I chose not use it because their SDK page points to the other.

The directions in the readme on their github are pretty good, but they didn’t get me up and running in 15 minutes, as would have been possible if I had these extra notes to get around some gotchas from their readme on CocoaSoundCloudAPI that I will share in case it saves other people time:

The “In XCode” section, Step 1 says :

1. Create a Workspace containing all those submodules added above.

This is glossing over a lot of stuff. First there are many ways to create an xcode workspace. Since they used the word ‘create’ I wrongly went to File->New->Workspace… .
The next nit is that they should specify to add just the project files instead of the subdirectories, which the wording sort of implies.
So the correct thing to do here is actually to just drag in to your existing XCode project (the one you want to add SoundCloud integration to) the 5 .xcodeproj files. The first one you add will ask you if you want to convert to a workspace, which is kind of like a Microsoft Visual Studio solution file to manage multiple projects. I just named my new workspace with the same file prefix as my xcode project and saved it in the same place the xcodeproj file was, and from now on I open the workspace instead of the project file.

To be honest I had no idea what an xcode workspace was, or how it interacts with project files in XCode. I recently upgraded to XCode 4, but I don’t remember seeing workspaces in XCode 2 or 3, and I did work with some projects that used multiple projects.

2.To be able to find the Headers, you still need to add ../** (or ./** depending on your setup) to the Header Search Path of the main project.

For their step 2 I had to use ./** in User Search Paths (not Header Search Paths) in the build settings for the project. If you are adding the submodules inside of the top level folder of the project, I don’t see how you would need ../**, since that goes outside all of your projects. But maybe new xcode projects do something weird like stick the project file two levels down, in which case you may need the ../**.

That one only held me up for a little while, but the next item in the “Usage” section tripped me up for much longer:

+ (void)initialize;
{
    [SCSoundCloud  setClientID:@""
                        secret:@""
                   redirectURL:[NSURL URLWithString:@""]];
}

Client ID and secret are provided to you when you register your app on soundcloud. The redirect URL is something you have to input.
At first I thought it was just some support webpage url, and that it was not crucial, so I just entered my webpage.
After finishing all their steps, the SoundCloud login view was appearing, but my sounds would not upload, with the log messages:

[NXOAuth2PostBodyStream open] Stream has been reopened after close
Cancelled

each time I would try to upload a wav track. Nothing would happen to my soundcloud account I was uploading to. So it seems although I could enter my password, the transaction was being cancelled midway.

I spent a long time looking into the first message, but it turns out that was completely unrelated. My app’s soundcloud upload now works, but that “Stream has been reopened after close” message is still there.
Eventually I looked at the redirect URI and realized it is some callback for the iOS app to handle messages from the soundcloud api, via some kind of URL handler method. I just changed the URL from http://company.com to myapp://soundcloud and then everything worked. I did not implement any kind of URL handler for this. This one’s kind of a mystery still for me, but maybe it is some common standard since I don’t deal with web services all that often.

One other item of note that will help newer git users who are already managing their projects with git save a few minutes of googling on how to integrate these submodules. Since the instructions tell you to do the ‘git submodule’ commands within your project folder, this creates something like a weak reference, so that when you commit and push, and then pull from another computer, you will have empty folders where these submodule would be. So you will need to run ‘git submodule init;git submodule update’ to fill them out on the other computer.

So, that’s it. All in all I’d say it took 4 hours for simple integration of a SoundCloud upload without knowing these details. Now that I know them, getting the same basic functionality should be pretty quick next time. <plug> So if you’re looking for someone to do contract work to integrate SoundCloud into your project who you don’t want to pay to learn the SoundCloud API, consider hiring me through Roughsoft.</plug>

Return of the unsigned int gotcha for the unary ‘-‘ operator

I came across another unsigned int issue today.
If you have an unsigned int and you negate it with the unary ‘-‘ operator, you will just get a really large unsigned integer value, and not a negative value.
From my last post about unsigned int, you can’t do

smallInt - unsignedInt

and expect a negative value because this converts the int to unsigned int.

That was for operators that worked on two values. Seems like the lower precedence of unsigned in still prevails with the unary ‘-‘ operator.
According to this stack overflow post the value you get is

UINT_MAX - origUIntValue + 1

That seems to make sense even though I haven’t thought it out. The lesson here is to cast your goddamn unsigned ints when dealing with things that require negative intness.
Here’s my lldb (xcode’s gdb replacement for llvm) output that inspired this post:

(lldb) print -m_labelTex.pixelsWide
(NSUInteger) $3 = 4294967232
(lldb) print m_labelTex.pixelsWide
(NSUInteger) $4 = 64
(lldb) print -((int)m_labelTex.pixelsWide)
(int) $6 = -64
(lldb) print 0 - m_labelTex.pixelsWide
(unsigned int) $7 = 4294967232
(lldb) print ((int) 4) - m_labelTex.pixelsWide
(unsigned int) $8 = 4294967236

The Class Class as an argument in objective C++

When you say the same word over and over it starts to become meaningless. Mix this together with metaclasses and reflection. Then take objective-C and add it to C++. You then have an objective-C @Class construct for forward declaration, the Class object, the NSObject class method, and the C++ class keyword. It was only a matter of time before this would drive somebody crazy and lead me to waste an hour of debugging the wrong thing.

Here is the definition of the class method in objective-C which you can include in objective C++ without warnings or errors:

- (Class)class

‘class’ doesn’t mean anything in regular c or objective-c, but my projects often involve C++ classes, so I’m usually working in .mm files.
This lowercase method name matches the C++ keyword, but is allowed by the compiler in objective-c++, presumably because the compiler looks at context to apply c keywords to.

However this is not allowed:

-(void)giveMeAClass:(Class)class

I spent about an hour and a half trying to figure out how to forward-declare or include the class declaration. Because this was in a class header that also had NSObject, which declares the ‘class’ method above (which uses Class as a return value), this went against my intuition but unfortunately when going into debugging mode, sometimes I (and admit it, you too!) get fixated on one path and make some very silly decisions along the way that end up preventing a breadth first approach that would get the solution faster.

I did learn some interesting things along the way. Such as, Class is a typedef pointer to a struct, which is why it is one of the only (possibly the only) objective-c object that doesn’t use an asterix when passing it around.

But the real problem was not ‘Class’. It was ‘class’. ‘class’ is a c++ keyword, as mentioned earlier, and you cannot use it for parameter names, only for method names. I should have seen this, but the ‘class’ method case through me off.

The correct solution (as NSObject also uses) is just to call your Class variable names aClass.

-(void)giveMeAClass:(Class)aClass

Arg.

std::vector size() method and gotcha with unsigned vs signed

Guess who wins in signed vs unsigned integer conversion? I guessed wrong today, or rather, I’ve been assuming wrong for quite a while.

consider this example:

   std::vector<SomeClass*> myVec;
   int countdown = -1;
   // ... (fill the vector)
   if (countdown < myVec.size()) 
      countdown++;  // this will never get hit

myVec.size() returns size_t, which is an unsigned int and countdown is an int, so to compare these two, the compiler will implicitly cast. You will get warnings when you do this kind of comparison, so listen to them. I assumed in the conversion of primitive values, signed int has precedence. That line casts countdown to an unsigned int, meaning that it has a very high value (0xfffffff), so the conditional is always false.

So the if statment was evaluating to

   if ((unsigned int) -1 < 1) // pretend there is 1 element in the array 
      countdown++;

which is

   if (0xFFFFFFFF < 1) // false (same as  UINT_MAX < 1)
      countdown++; 

The correct code then is:

   if (countdown < (int) myVec.size()) 
      countdown++;  // this will now be run

Here are the rules for implicit type conversion from the C99 standard:

6.3.1.8 Usual arithmetic conversions

If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

I don't understand the statement after the bold section. But it is clear that unsigned types have precedence over same ranked types.