Category: iOS


Been reading http://www.codinghorror.com/blog/2012/07/new-programming-jargon.html, and thinking about how many I’ve dealt with and how many I’ve written.

When a project grows, there is going to be an unavoidable accumulation of tech debt as methods and objects are moved and expanded to handle new requirements.  Eventually you get to the point where trying to change anything results in a plethora of bugs and crashes.  You’ve hit the refactor zone.

The biggest challenge is convincing project management that the refactor is needed, since changes to the “plumbing” don’t advance the revenue-generating features that they want to get out ASAP.  Even worse, the refactor will take time, since you have to retest every bug fix and feature that got the code here in the first place.

Best bet is to follow the Agile Methodology of Constant Refactoring – whenever you see something that doesn’t look right, fix it.  This way net tech debt grows more slowly, or even shrinks, and Product isn’t counting the opportunity cost of your refactor since you’re still adding features.

Also remember, “Clever” is a four-letter word.  Doing something weird and clever that saves 3 machine instructions or lets you write an entire function in one line of code will only bite you later.  Not only won’t anyone else understand it, but when you have to change it in 6 months you’ll wonder what the f— you were thinking at the time.  Trust me: been there, done that, bought the t-shirt at thinkgeek.com.

Hallway Lights

My Hallway Lights, controlled by three MOSFETs and an Arduino

On Teaching

This weekend I attempted to teach iOS programming to a class of 11 people. Although I had specified that people should have a Mac with Snow Leopard and Xcode installed, and familiarity with the C programming language, 8 of the 11 present did not. The writeup for the course said it was great for beginners, no prior knowledge assumed, and I didn’t read over the writeup before it went up.
A true iPhone dev camp costs $1000+, is an entire intense weekend (16 hours+), and is fully staffed. I was charging $75 for 4 hours – I couldn’t possibly offer the same level of education.
I’m going to make it up to the 6 people who did not demand their money back, with an intro to C, followed by a re-presentation of the iPhone material, as Beta version two.

There’s an old engineering proverb that a bad programmer does the same task over and over, a good programmer automates the task, and a great programmer makes it unnecessary.

In the iPhone world, the Bad Programmer overrides -(void)layoutSubviews for every view class to customize the layout the way the client wants it.

The Good Programmer figures out the commonalities and creates a view class with a layout function that does things the way the client wants, and inherits all the program’s views from it.

The Great Programmer figures out how to make UIView’s layout do what the client wants directly, and doesn’t fight Cocoa to make it work.

In last week’s episode we single-stepped through the assembly code generated by the compiler, to see what happens when a function is called.  We learned that selectors are just C strings, how they are stored, and how to find them in the code.

Now we will use otool to explore the program as it’s saved on disk, so we can find the selector strings themselves.  Compile the helloworld example from the previous episode.  Click the disclosure arrow by Product.  Open a terminal window, then type “cd ” (note the space following, it’s important) in the terminal and drag helloworld.app to the terminal to paste in the path.

Enter “otool -o helloworld | less” on the terminal.  What you see is a list of all the Objective-C classes defined in the executable, along with the protocols they adopt, and the ivars and methods they define.  What you’re looking at here is the __OBJC segment of the executable.

Have a look at the Mach-O specification.  Programs on disk are divided into segments.  For simple C programs, there’s just __TEXT and __DATA.  The __TEXT segment contains the machine code generated by the compiler and immutable constants.  You can look at the disassembled code with otool -tV executable_file | less.  If it’s a C++ program, also pipe it through c++filt to demangle the names.  By using less’s search feature or the -p flag to otool you can find the compiled code corresponding to any function in your program.

The __DATA segment contains the mutable data declared in your program, such as initialized globals and statics.  Integer and shorter constants are usually just immediates in the code, e.g. MOV r0, #7, but mutable strings, arrays, and structs are put in the data segment.

Which leads us to sections.  A segment is broken up into one or more sections.  For example, the constant C strings are stored in the __cstring section of the __TEXT segment.  Note that segment names are all caps, and section names are all lower case.
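
To make the segment/section nesting concrete, here is a minimal Python sketch that walks the load commands of a thin, 32-bit, little-endian Mach-O file and lists each segment with its sections.  The field layouts follow the mach_header, segment_command, and section structs from <mach-o/loader.h>.  The function name list_sections is made up for this example, and fat (multi-architecture) and 64-bit binaries are deliberately not handled.

```python
import struct

MH_MAGIC = 0xFEEDFACE      # thin 32-bit Mach-O, read here as little-endian
LC_SEGMENT = 0x1           # load command describing one 32-bit segment

def _name(raw):
    # Segment and section names are fixed 16-byte fields padded with NULs.
    return raw.split(b"\0", 1)[0].decode("ascii")

def list_sections(blob):
    """Return [(segment, section, addr, size), ...] for a thin 32-bit Mach-O."""
    magic, _cpu, _sub, _ftype, ncmds, _sizeofcmds, _flags = struct.unpack_from("<7I", blob, 0)
    if magic != MH_MAGIC:
        raise ValueError("not a thin little-endian 32-bit Mach-O file")
    found = []
    offset = 28  # sizeof(struct mach_header)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", blob, offset)
        if cmd == LC_SEGMENT:
            # After cmd and cmdsize: segname[16], then eight uint32 fields;
            # nsects is the seventh of those eight.
            fields = struct.unpack_from("<16s8I", blob, offset + 8)
            nsects = fields[7]
            sect_off = offset + 56  # sizeof(struct segment_command)
            for _ in range(nsects):
                # struct section starts: sectname[16], segname[16], addr, size
                sectname, segname, addr, size = struct.unpack_from("<16s16s2I", blob, sect_off)
                found.append((_name(segname), _name(sectname), addr, size))
                sect_off += 68  # sizeof(struct section)
        offset += cmdsize  # every load command records its own length
    return found
```

Calling list_sections(open("helloworld", "rb").read()) on a single-architecture build should list pairs such as __TEXT and __cstring; a fat armv6/armv7 file would first need its fat header unwrapped, which this sketch does not attempt.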

Enter otool -s __TEXT __cstring helloworld.  If you compiled for both armv6 and armv7 you’ll notice that each architecture has its own cstring section.  You’ll also notice that the output is in hex, with the load address as determined by the linker on the left, then the string data as 4-byte words.  Add the -v flag and you’ll see the output as strings.
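
What the -v flag does can be sketched in a few lines of Python: walk the raw section bytes, split at each NUL terminator, and print each string next to its load address.  The helper name dump_cstrings is made up for this sketch, and the sample bytes are hand-built rather than read from a real binary.

```python
def dump_cstrings(section_bytes, load_address):
    """Yield (address, text) for each NUL-terminated string in the section."""
    offset = 0
    while offset < len(section_bytes):
        end = section_bytes.find(b"\0", offset)
        if end == -1:              # unterminated tail: take what is left
            end = len(section_bytes)
        if end > offset:           # skip empty strings and padding NULs
            yield load_address + offset, section_bytes[offset:end].decode("utf-8", "replace")
        offset = end + 1

# Hand-built sample: three C strings packed back to back.
sample = b"release\0init\0alloc\0"
for address, text in dump_cstrings(sample, 0x36EC):
    print(f"{address:08x}  {text}")
```

Each address is the previous one plus the string length plus one byte for the NUL, which is why the strings in such a listing sit at uneven intervals.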

The segment and section for the selectors is __TEXT __objc_methname, so if you enter otool -arch armv7 -v -s __TEXT __objc_methname helloworld you’ll see all the selectors that your program calls as strings.

otool -arch armv7 -v -s __TEXT __objc_methname helloworld
helloworld:
Contents of (__TEXT,__objc_methname) section
000036ec  release
000036f4  init
000036f9  alloc
000036ff  window
00003706  viewController
00003715  setWindow:
00003720  setViewController:
00003733  dealloc
…
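
Since the listing is plain text, turning it into a list of selectors for later checking is straightforward.  The Python sketch below assumes the two-column "address, name" layout shown above and skips the header lines.  parse_methnames is a name invented for this example, and the embedded sample is copied from the listing rather than produced by running otool.

```python
import re

# One data line: an 8-digit hex address, whitespace, then the selector text.
LINE = re.compile(r"^([0-9a-fA-F]{8})\s+(\S.*)$")

def parse_methnames(otool_output):
    """Return [(address, selector), ...] from `otool -v -s __TEXT __objc_methname` text."""
    selectors = []
    for line in otool_output.splitlines():
        match = LINE.match(line.strip())
        if match:  # header lines such as "helloworld:" do not match
            selectors.append((int(match.group(1), 16), match.group(2)))
    return selectors

sample = """helloworld:
Contents of (__TEXT,__objc_methname) section
000036ec  release
000036f4  init
00003715  setWindow:
"""
for address, name in parse_methnames(sample):
    print(hex(address), name)
```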

Now that we know what selectors our program calls, the next step is to determine which ones correspond to illegal calls.

Tune in next week for the exciting conclusion!  In the meantime, read up on the Mach-O specification and otool; there is a lot of interesting information you can query out of a program.

Forbidden APIs, part 2/n

Riddle me this…

When last we left our heroes, they were about to start the debugger and probe the mysteries of the Objective-C objc_msgSend function.

Pull down and build the hello world project from GitHub.  You can run it in the simulator if you don’t have an iDevice or developer license.

Put a breakpoint in the first source line of touchesEnded in helloWorldViewController.m.  Make sure the build settings are set for Simulator and Debug.  Click “Build and Debug” and wait for the simulator to come up.  Click in the window on the simulator or device.  The breakpoint should hit, and you’ll see something like this:

Right after the first breakpoint

We’re now going to leave the comfy confines of the source debugger behind, and dive into the world of assembly.  If you don’t see the disassembly pane on the bottom right of your debugger window, select Run→Debugger Display→Source and Disassembly, as shown:

We’ll start with the standard C call to NSLog, to introduce you to the Application Binary Interface, or ABI.  The ABI defines the way that high level languages should organize the compiled code, so any compiled code can be linked together, and external libraries will work with your code.

The first thing we need to do is pass the parameters to the function that will be called.  The first four 32-bit parameters are placed into processor registers for speed: R0, R1, R2, and R3.  Any past the first four are saved to memory in an area called the stack.  The stack starts at the highest address and works its way down in descending addresses.  There are two parameters to this invocation of NSLog: the format string, @"touchesended: %@", and the first argument, event.  The string constant is passed as a pointer to constant memory, and event is loaded from an address relative to the stack pointer, since it’s a parameter of the current function.
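
A toy model of that rule in Python, for illustration only: it treats every argument as one 32-bit word, puts the first four in r0 through r3, and lays the rest out on a descending stack.  The function name place_arguments, the stack address, and the sample values are invented here; real calling conventions also deal with 64-bit values, floats, and alignment, which this sketch ignores.

```python
def place_arguments(args, sp):
    """Model where word-sized arguments go for a call.

    Returns (registers, stack, new_sp): registers maps r0..r3 to the first
    four arguments, stack maps addresses to the remaining ones.
    """
    registers = {f"r{i}": value for i, value in enumerate(args[:4])}
    extra = args[4:]
    # The stack grows downward: make room by lowering sp, 4 bytes per word.
    new_sp = sp - 4 * len(extra)
    # The fifth argument sits at the new sp, the sixth just above it, and so on.
    stack = {new_sp + 4 * i: value for i, value in enumerate(extra)}
    return registers, stack, new_sp

# NSLog(format, event): two arguments, so nothing is spilled to the stack.
regs, stack, sp = place_arguments(["<format string ptr>", "<event ptr>"], 0x2FFFF000)
print(regs, stack, hex(sp))

# A hypothetical six-argument call spills two words.
regs, stack, sp = place_arguments([1, 2, 3, 4, 5, 6], 0x2FFFF000)
print(regs, {hex(a): v for a, v in stack.items()}, hex(sp))
```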

The instruction that actually calls the function is BLX.  It stores the return address in a special register called the Link Register, which avoids a memory access when the return address does not need to be saved.  Note that I switched from the simulator in the above two images to actual device debugging.

Let’s delve into how the constants are located.  They are stored right after the compiled code for each function, in the __TEXT segment, which is the same one the code is in.  They are specified as offsets from the current value of the program counter, since the compiler doesn’t know what address the function will reside at after it’s linked.  All it can do is insert an offset once the function is fully compiled.  Switch to the debug console (Command-Shift-R) and enter the “si” command twice.  The cursor on the assembly side should now be on “mov r0,r3”.  We need to add the PC to the value of R3 for the same reason as we had to load R3 from an offset of the PC.  The linker places a string table in a completely different section from the compiled code, and the offset to the string in that table is what the linker places at the address pc+#648 in the above code.
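
The address arithmetic behind a PC-relative load can be worked through in Python.  This sketch relies on two facts about 32-bit ARM: reading the PC yields the instruction’s own address plus 8 in ARM state or plus 4 in Thumb state, and a literal load rounds that value down to a multiple of 4 before adding the offset.  The instruction address 0x2f00 is a made-up example, not taken from the debugger session; only the #648 offset comes from the text.

```python
def literal_address(instruction_address, offset, thumb=True):
    """Address read by `ldr rX, [pc, #offset]` located at instruction_address."""
    pc = instruction_address + (4 if thumb else 8)  # PC reads ahead of the instruction
    pc &= ~3                                         # word-align, as the literal load does
    return pc + offset

# Hypothetical instruction address; 648 is the offset quoted above.
print(hex(literal_address(0x2F00, 648, thumb=True)))
print(hex(literal_address(0x2F00, 648, thumb=False)))
```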

In the top right pane, where local variables and parameters are displayed, you can scroll to the bottom and see a category labelled “Registers” with a disclosure arrow.  Click that arrow and look for $r3.  That is pointing to the offset to the string in memory.  In this case $r3 had the value 0x4040.

If you select Run→Show→Memory Browsers, you can look at the contents of memory.  Enter 0x4040 into the address field.  Set the word size to 4, since we are looking at an “indirect” memory offset.  Notice that address 0x4040 contains 0x3e2184e0.  Let’s follow that offset, which takes us to the program’s String Table.

That is what an NSString looks like in memory.

Back to function calls and the ABI.  After the BLX instruction, we’re in the function prolog.  This sets things up so the function itself has storage space to run, and can call other functions and ultimately return without messing up the memory of its calling function.  We need to save all the processor registers, including the Link Register, to the stack.  If you compile with optimization, only the registers that are affected by the function will be saved to the stack.  After that, a register called the Frame Pointer is set to the current value of the stack pointer (the frame pointer is always saved before this).  This is so the stack pointer can be restored at the end of the function, in a section called the epilog.  Then the number of bytes necessary for the function’s local variables, including those in subordinate scopes, is subtracted from the stack pointer.  After that, the function itself runs.

The function epilog then undoes what the prolog does, so when control is transferred back to the calling function, everything is restored to what was there before the function call, except for the registers which are known to contain the return value.
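
The prolog and epilog can be mimicked with a small Python model of a descending stack.  It is a simplification: the register set, the addresses, and the names Machine, prolog, and epilog are chosen for this sketch, and it saves only the frame pointer and link register, as an optimized build might.

```python
class Machine:
    """Toy CPU state: a few registers plus word-addressed stack memory."""
    def __init__(self, sp):
        self.regs = {"sp": sp, "fp": 0, "lr": 0}
        self.memory = {}

    def push(self, value):
        self.regs["sp"] -= 4                  # stack grows toward lower addresses
        self.memory[self.regs["sp"]] = value

    def pop(self):
        value = self.memory.pop(self.regs["sp"])
        self.regs["sp"] += 4
        return value

def prolog(m, local_bytes):
    m.push(m.regs["lr"])                      # save the return address
    m.push(m.regs["fp"])                      # save the caller's frame pointer
    m.regs["fp"] = m.regs["sp"]               # anchor this function's frame
    m.regs["sp"] -= local_bytes               # reserve space for locals

def epilog(m):
    m.regs["sp"] = m.regs["fp"]               # discard locals
    m.regs["fp"] = m.pop()                    # restore the caller's frame pointer
    m.regs["lr"] = m.pop()                    # restore the return address

m = Machine(sp=0x2FFFF000)
m.regs.update(fp=0x2FFFF040, lr=0x00002ABC)   # made-up caller state
before = dict(m.regs)
prolog(m, local_bytes=16)
print({k: hex(v) for k, v in m.regs.items()})
epilog(m)
print(m.regs == before)
```

The final comparison checks the point made above: once the epilog has run, the registers the prolog touched hold exactly what the caller left in them.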

So NSLog is called, and since it returns nothing there is no return value to check.  Let’s step ahead to an Objective-C call, so we can see the similarities.  Place a breakpoint at the first Objective-C call, which will be the [[CATransition alloc] init] (the first will be the alloc), and hit Continue.

Let’s have a look at the assembly language here.

Disassembly of objc_msgSend call

We can see that this is a call to objc_msgSend(self, _cmd).  There are two arguments to this function, stored in r0 and r1.  R0 will contain the Objective-C class CATransition, and r1 will contain the selector for the alloc method.  The selector is what we’re looking for, since that is how we’ll put together our scanner for Forbidden APIs.  Let’s step ahead to just before the call itself, and have a look from the debugger console:
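
Conceptually, objc_msgSend takes a receiver and a selector and looks up the code to run.  The Python toy below models only that idea, with a dictionary per class keyed by selector name and a walk up the superclass chain; the real runtime adds caching and much more.  The class names, msg_send, and the method tables are all invented for illustration.

```python
class ToyClass:
    """A stand-in for an Objective-C class: a name, a superclass, a method table."""
    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}   # selector string -> Python callable

def msg_send(receiver, selector, *args):
    """Look up `selector` starting at the receiver and walking up the superclasses."""
    cls = receiver
    while cls is not None:
        implementation = cls.methods.get(selector)
        if implementation is not None:
            # Like the real call, the implementation receives self and _cmd first.
            return implementation(receiver, selector, *args)
        cls = cls.superclass
    raise AttributeError(f"{receiver.name} does not respond to {selector!r}")

root = ToyClass("ToyRoot", methods={"alloc": lambda self, _cmd: f"<new {self.name}>"})
transition = ToyClass("ToyTransition", superclass=root)

# The first argument plays the role of r0, the second of r1.
print(msg_send(transition, "alloc"))
```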

So what we get from this is that $r0 is an object, the CATransition class, and $r1, the selector, is simply an old-style C char*!  This makes things a lot easier for us: all we have to do is find the selectors and basically grep out the text of the ones we want to avoid!
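
That grep can be sketched in Python: read the executable’s bytes and look for each unwanted selector as a complete NUL-terminated string.  The selector names in the blocklist below are placeholders invented for this example, not real private APIs, and find_forbidden is a hypothetical helper.  Searching the whole file rather than one section is a simplification that could produce false positives.

```python
def find_forbidden(binary, blocklist):
    """Return the blocklisted selectors that appear as whole C strings in `binary`."""
    hits = []
    for selector in blocklist:
        needle = selector.encode("ascii") + b"\0"
        start = 0
        while True:
            index = binary.find(needle, start)
            if index == -1:
                break
            # Require a string boundary in front so "init" does not match "reinit".
            if index == 0 or binary[index - 1:index] == b"\0":
                hits.append(selector)
                break
            start = index + 1
    return hits

# Hand-built stand-in for a selector section; the names are placeholders.
fake_section = b"release\0init\0_hypotheticalPrivateCall:\0dealloc\0"
print(find_forbidden(fake_section, ["_hypotheticalPrivateCall:", "_anotherPlaceholder"]))
```

For a real build, the bytes would come from something like open(path, "rb").read() on the executable inside the .app bundle.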

Whew!  That was quite a bit – I hope I didn’t leave too many people behind.  In my next post, which is the first of the New Year, I will talk about the organization of a program on disk, and where those selectors are.

References

Mach-o EABI

iOS ABI

ARM’s EABI

Forbidden APIs, part 1

Anyone who’s submitted an application to Apple’s app store knows that your app will be rejected if it uses any undocumented APIs.  Which is a shame, since there is lots of useful stuff in there.

I recently had to resubmit an app because of a forbidden API, and started wondering if there were a way to detect these things automatically, and save a bunch of heartache.
