
Been reading http://www.codinghorror.com/blog/2012/07/new-programming-jargon.html, and thinking about how many of those terms I’ve dealt with and how many I’ve written.

When a project grows, there is going to be an unavoidable accumulation of tech debt as methods and objects are moved and expanded to handle new requirements.  Eventually you get to the point where trying to change anything results in a plethora of bugs and crashes.  You’ve hit the refactor zone.

The biggest challenge is convincing project management that the refactor is needed, since changes to the “plumbing” don’t advance the revenue-generating features that they want to get out ASAP.  Even worse, the refactor will take time, since you have to retest every bug fix and feature that got the code here in the first place.

Best bet is to follow the Agile Methodology of Constant Refactoring – whenever you see something that doesn’t look right, fix it.  This way net tech debt grows more slowly, or even shrinks, and Product isn’t counting the opportunity cost of your refactor since you’re still adding features.

Also remember, “Clever” is a four-letter word.  Doing something weird and clever that saves 3 machine instructions or lets you write an entire function in one line of code will only bite you later.  Not only will no one else understand it, but when you have to change it in 6 months you’ll wonder what the f*** you were thinking at the time.  Trust me: been there, done that, bought the t-shirt at thinkgeek.com.

Hallway Lights

My Hallway Lights, controlled by three MOSFETs and an Arduino

The Assignment From Hell

Disclaimer: This is not about any specific employer I’ve ever worked for; there have been aspects in all of them.  Any resemblance to past, present, or future employers is purely intentional.

Anyone who’s worked in the industry for more than a year or two has had one.  They’ve had to maintain or enhance a code base that was a heaping stinking pile of, er, sunshine.  The manager is putting on lots of pressure to fix it now; there’s a year’s worth of features and enhancements, and only 6 months to do them in.  The pressure varies between abuse and outright attacks on your ability and work ethic.

Welcome to the Assignment From Hell.  You’ve probably inherited it from the last person who worked on it, who was fired or left in a hurry.  Now you know why.

The program crashes constantly; calling the code spaghetti is an offense to spaghetti; and things which could easily have been calculated or put in a database are hardcoded all over the place.  The procedure to add a common feature involves modifying 20 different data structures, spread across 10 different source files, that hold overlapping but contradictory information.

Your every instinct screams that you need to tear this garbage out and rewrite it, which you could do in half the time it will take to make it work, but there’s never the time or resources to do that, and you can never get buy-in from the stakeholders anyway.

Meantime your self-esteem is sub-basement – simple tasks that should take a day are taking you a week, and this is being pointed out to you starting on day 2.  And what about the backlog that’s now 5 days late?

So what to do?

Step 1.  Breathe.  The unreasonable expectations of the powers that be are how this situation got to this point in the first place.  The previous developer or developers who caused this mess probably weren’t given nearly enough time, and just took on technical debt to get paid at all.  Even that obnoxious manager is likely getting pressure from their bosses to get this thing working, and doesn’t want to admit that their inability to push back in the first place is what got them into this mess.

You’re probably someone they see as competent, so they’re really hoping you can fix it.  The hardest challenge for me throughout my career has been not taking the criticism personally.

Step 2.  Communicate.  Don’t point out that the code base is crap – they know it, and are in a level of denial that would give Freud a headache.  Come up with a plan based on reality, and point it out respectfully – respecting past and present management and engineering.  Be prepared to defend the plan and to push back – hard.  Try to find allies from the Old Times who agree with you and have pull with management.

Step 3.  Breathe.  Again.  Like I said, I take this stuff too personally.

Step 4.  Be honest.  Don’t agree to fix anything in a shorter timeframe than your gut tells you, doubled.  It just won’t happen, and you’ll only look worse when you can’t deliver.  Speaking reasonably and respectfully will get you respected in return.  If you need additional resources, say so, and back it up with hard facts.

Step 5.  Be honest with yourself.  You won’t be able to singlehandedly fix an entire organization that has grown up around this method of doing business.  Do your best to get through it alive.

Happy 2012!

Let’s hope the last year of the Mayan calendar is better than the previous!  And that someone is busy carving another 35,000 year calendar.

It’s been a while

Life has become interesting since we last spoke!

I wrote another iPhone app.  You’ll be able to buy it on the App Store next week; search for Top of the Rock.  I won’t get anything from sales, but it’s a pretty cool app!

Meantime, my responsibilities at AOL have increased.  I’m now working on the Huffington Post iPhone and iPad apps, and the recent updates to those apps are in part due to my work.  I’m going to start looking at the server side, to see what I can improve there.

Which brings me to my next topic set, the Server Side.  This consists of quite a few subtopics which I hope to touch on:

  1. Server Architecture.  There are a lot of wheels out there already invented.  Except for some weird specialty ones, wheels are round, and have an axle.  Most servers have a database full of some sort of content, some business logic which is mostly concerned with categorizing content and authenticating access to it, and a bit more business logic for adding content to the database.  Content may include professionally authored and/or curated data, as well as user-provided comments and questions.
  2. Business Logic.  This consists of the nontrivial business logic once you get past authentication and generation.  A server may have to process a lot of data, and it’s important to ensure that this data moves through as quickly as possible.

It is the business logic that is most fascinating.  Sure, you need to make decisions as to which wheel you want to use, and how you want to spread out your server farm, database replication, etc.  But these are already solved problems.  One from column A, …

I’m working on a system that analyzes and optimizes stock portfolios.  Imagine you have 5 years’ worth of data for the entire S&P 500.  You first generate metrics for the 500 securities that comprise it.  That means iterating over about 1260 sets of statistics per security.  Then you need to compare each of the 500 against its 499 counterparts to come up with comparison metrics.  Taking advantage of the fact that m(a,b) = m(b,a), and the fact that I don’t need to calculate m(a,a), I have to perform 500 × 499 / 2 = 124,750 comparisons to get the final set of metrics.
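
As a rough illustration of where the 124,750 comes from, here is a minimal C sketch that walks only the pairs with a < b and mirrors each result. The names pair_count, fill_symmetric, and example_metric are made up for this example, and the placeholder metric stands in for whatever the real comparison is; this is not the actual portfolio code.

```c
#include <stddef.h>

/* Number of unordered pairs (a, b) with a != b among n items: n(n-1)/2. */
size_t pair_count(size_t n)
{
    return n * (n - 1) / 2;
}

/* Hypothetical comparison metric; any function with m(a,b) == m(b,a) fits. */
typedef double (*metric_fn)(size_t a, size_t b);

/* Placeholder metric used only for illustration: |a - b|. */
double example_metric(size_t a, size_t b)
{
    return (double)(a > b ? a - b : b - a);
}

/* Fills an n x n row-major matrix by evaluating m only for a < b and
 * mirroring the result into [b][a]. The diagonal is left untouched.
 * Returns how many times m was actually called. */
size_t fill_symmetric(double *out, size_t n, metric_fn m)
{
    size_t calls = 0;
    for (size_t a = 0; a < n; ++a) {
        for (size_t b = a + 1; b < n; ++b) {
            double v = m(a, b);
            out[a * n + b] = v;
            out[b * n + a] = v;
            ++calls;
        }
    }
    return calls;
}
```

Calling pair_count(500) gives the comparison count quoted above, and fill_symmetric makes exactly that many metric calls for n = 500.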

My answer to this was to use OpenCL.  This is a system that allows you to utilize the 64-1600 cores provided by the GPU to perform calculations in parallel (or at least better than linear) time.  Xcode provides an OpenCL framework that makes integration a bit easier.

Future posts will document my explorations and discoveries of this platform.

On Teaching

This weekend I attempted to teach iOS programming to a class of 11 people. Although I had specified that people should have a Mac with Snow Leopard and Xcode installed, and familiarity with the C programming language, 8 of the 11 present did not. The writeup for the course said it was great for beginners with no prior knowledge assumed, and I hadn’t read over the writeup before it went up.
A true iPhone dev camp costs $1000+, is an entire intense weekend (16 hours+), and is fully staffed. I was charging $75 for 4 hours, so I couldn’t possibly offer the same level of education.
I’m going to make it up to the 6 people who did not demand their money back, with an intro to C, followed by a re-presentation of the iPhone material, as Beta version two.

There’s an old engineering proverb that a bad programmer does the same task over and over, a good programmer automates the task, and a great programmer makes it unnecessary.

In the iPhone world, the Bad Programmer overrides -(void)layoutSubviews for every view class to customize the layout the way the client wants it.

The Good Programmer figures out the commonalities and creates a view class with a layout function that does things the way the client wants, and inherits all the program’s views from it.

The Great Programmer figures out how to make UIView’s layout do what the client wants directly, and doesn’t fight Cocoa to make it work.

In last week’s episode we single-stepped through the assembly code generated by the compiler, to see what happens when a function is called.  We learned that selectors are just C strings, how they are stored, and how to find them in the code.

Now we will use otool to explore the program as it’s saved on disk, so we can find the selector strings themselves.  Compile the helloworld example from the previous episode.  Click the disclosure arrow by Product.  Open a terminal window, then type "cd " (note the space following, it’s important) in the terminal and drag helloworld.app to the terminal to paste in the path.

Enter "otool -o helloworld | less" in the terminal.  What you see is a list of all the Objective-C classes defined in the executable, along with the protocols they adopt, and the ivars and methods they define.  What you’re looking at here is the __OBJC segment of the executable.

Have a look at the Mach-O specification.  Programs on disk are divided into segments.  For simple C programs, there’s just __TEXT and __DATA.  The __TEXT segment contains the machine code generated by the compiler, along with immutable constants.  You can look at the disassembled code with otool -tV executable_file | less.  If it’s a C++ program, also pipe it through c++filt to demangle the names.  By using less’s search feature or the -p flag to otool you can find the compiled code corresponding to any function in your program.

The __DATA segment contains the writable data declared in your program, such as initialized global and static variables.  Small integer constants are just immediates in the code, e.g. MOV r0, #7, but mutable arrays and structs are put in the data segment.  Constant strings are a special case, as we’ll see next.

Which leads us to sections.  A segment is broken up into one or more sections.  For example, the constant C strings are stored in the __cstring section of the __TEXT segment.  Note that segment names are all caps, and section names are all lower case.

Enter otool -s __TEXT __cstring helloworld.  If you compiled for both armv6 and armv7 you’ll notice that each architecture has its own cstring section.  You’ll also notice that the output is in hex, with the load address as determined by the linker on the left, then the string data as 4-byte words.  Add the -v flag and you’ll see the output as strings.

The segment and section for the selectors is __TEXT __objc_methname, so if you enter otool -arch armv7 -v -s __TEXT __objc_methname helloworld you’ll see all the selectors that your program calls as strings.

otool -arch armv7 -v -s __TEXT __objc_methname helloworld
helloworld:
Contents of (__TEXT,__objc_methname) section
000036ec  release
000036f4  init
000036f9  alloc
000036ff  window
00003706  viewController
00003715  setWindow:
00003720  setViewController:
00003733  dealloc
…

Now that we know what selectors our program calls, the next step is to determine which ones correspond to illegal calls.
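
One way that check might look, as a minimal C sketch rather than the finished scanner: read the otool -v -s __TEXT __objc_methname output line by line, pull the selector off each line, and compare it against a deny list. The function names are invented for this example, and the deny list is something you would have to supply yourself; no real forbidden selectors are listed here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Extracts the selector from one line of
 * `otool -v -s __TEXT __objc_methname` output, e.g. "000036ec  release".
 * Returns 1 on success, or 0 if the line is not a hex address followed by
 * whitespace and a name (header lines, blank lines, names too long for sel). */
int parse_methname_line(const char *line, char *sel, size_t cap)
{
    const char *p = line;
    size_t digits = 0;
    size_t len;

    while (isxdigit((unsigned char)*p)) {
        ++p;
        ++digits;
    }
    if (digits == 0 || !isspace((unsigned char)*p))
        return 0;
    while (*p == ' ' || *p == '\t')
        ++p;
    len = strcspn(p, "\r\n");          /* selector runs to end of line */
    if (len == 0 || len >= cap)
        return 0;
    memcpy(sel, p, len);
    sel[len] = '\0';
    return 1;
}

/* Returns 1 if sel exactly matches an entry in the caller-supplied deny
 * list, else 0. Selectors are plain C strings, so strcmp is enough. */
int is_forbidden(const char *sel, const char *const *deny, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(sel, deny[i]) == 0)
            return 1;
    }
    return 0;
}
```

To use it, feed each line of the otool output (for example via fgets on a pipe or a saved file) to parse_methname_line, then pass every selector it returns to is_forbidden.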

Tune in next week for the exciting conclusion!  In the meantime, read up on the Mach-O specification and otool; there is a lot of interesting information you can query out of a program.

Like this Blog?

The more people my Dashboard shows me are reading this, the more time I will spend crafting interesting articles, and the more likely I am to buy the “no ad” option!

My plan is to blog about things I’ve learned in my 20-odd-year career as a software engineer, as well as my excursions into the Arduino, embedded microcontrollers, and electronics.

Tell your friends, or tell me!

thanks, Rob D

Forbidden APIs, part 2/n

Riddle me this…

When last we left our heroes, they were about to start the debugger and probe the mysteries of the Objective-C objc_msgSend function.

Pull down and build the hello world project from GitHub.  You can run it in the simulator if you don’t have an iDevice or developer license.

Put a breakpoint in the first source line of touchesEnded in helloWorldViewController.m.  Make sure the build settings are set for Simulator and Debug.  Click “Build and Debug” and wait for the simulator to come up.  Click in the window on the simulator or device.  The breakpoint should hit, and you’ll see something like this:

Right after the first breakpoint

We’re now going to leave the comfy confines of the source debugger behind, and dive into the world of assembly.  If you don’t see the disassembly pane on the bottom right of your debugger window, select Run→Debugger Display→Source and Disassembly, as shown:

We’ll start with the standard C call to NSLog, to introduce you to the Application Binary Interface, or ABI.  The ABI defines the way that high level languages should organize the compiled code, so any compiled code can be linked together, and external libraries will work with your code.

The first thing we need to do is pass the parameters to the function that will be called.  The first four 32-bit parameters are placed into processor registers for speed: R0, R1, R2, and R3.  Any past the first four are saved to memory in an area called the stack.  The stack starts at the highest address and works its way down in descending addresses.  There are two parameters to this invocation of NSLog: the format string, @"touchesended: %@", and the first argument, event.  The string constant is passed as a pointer to constant memory, and event is loaded relative to the stack pointer, since it’s a parameter of the current function.
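
The register-versus-stack rule can be written down as a tiny C sketch. It assumes every argument is a plain 32-bit word (no floats, 64-bit values, or structs, which follow extra rules) and that stacked arguments sit at increasing offsets from the stack pointer at the moment of the call. The names arg_register and arg_stack_offset are made up for this illustration.

```c
#include <stddef.h>

#define ARM_ARG_REGISTERS 4u   /* R0..R3 carry the first four word arguments */
#define ARM_WORD_BYTES    4u   /* each remaining argument takes one 32-bit stack slot */

/* Returns 0..3 (meaning R0..R3) for an argument passed in a register,
 * or -1 if the argument at this zero-based index goes on the stack. */
int arg_register(size_t index)
{
    return index < ARM_ARG_REGISTERS ? (int)index : -1;
}

/* Returns the byte offset from SP (at the point of the call) of a
 * stack-passed argument, or -1 if the argument travels in a register. */
int arg_stack_offset(size_t index)
{
    if (index < ARM_ARG_REGISTERS)
        return -1;
    return (int)((index - ARM_ARG_REGISTERS) * ARM_WORD_BYTES);
}
```

For the NSLog call above, the format string is argument 0 and lands in R0, and event is argument 1 and lands in R1; a hypothetical sixth argument (index 5) would sit 4 bytes above the stack pointer.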

The instruction that actually calls the function is BLX, which stores the return address in a special register called the Link Register.  This avoids a memory access when the return address does not need to be saved.  Note that I switched from the simulator in the above two images to actual device debugging.

Let’s delve into how the constants are figured out.  They are stored right after the compiled code for each function, in the __TEXT segment, which is the same one the code is in.  They are specified as offsets from the current value of the program counter, since the compiler doesn’t know what address the function will reside at after it’s linked.  All it can do is insert an offset once the function is fully compiled.  Switch to the debug console (Command-Shift-R) and enter the "si" command twice.  The cursor on the assembly side should now be on "mov r0,r3".  We need to add the PC to the value of R3 for the same reason as we had to load R3 from an offset of the PC.  The linker places a string table in a completely different section from the compiled code; the offset to the string in the table is what the linker places at the address pc+#648 in the above code.

In the top right pane, where local variables and parameters are displayed, you can scroll to the bottom and see a category labelled “Registers” with a disclosure arrow.  Click that arrow and look for $r3.  That is pointing to the offset to the string in memory.  In this case $r3 had the value 0x4040.

If you select Run→Show→Memory Browsers, you can look at the contents of memory.  Enter 0x4040 into the address field.  Set the word size to 4, since we are looking at an “indirect” memory offset.  Notice that the content of address 0x4040 is 0x3e2184e0.  Let’s follow that offset, which takes us to the program’s String Table.

That is what an NSString looks like in memory.

Back to function calls and the ABI.  After the BLX instruction, we’re in the function prolog.  This sets things up so the function itself has storage space to run, and can call other functions and ultimately return without messing up the memory of its calling function.  We need to save all the processor registers, including the Link Register, to the stack.  If you compile with optimization, only the registers that are affected by the function will be saved to the stack.  After that, a register called the Frame Pointer is set to the current value of the stack pointer (the frame pointer is always saved before this).  This is so the stack pointer can be restored at the end of the function, in a section called the epilog.  Then the number of bytes necessary for the function’s local variables, including those in subordinate scopes, is subtracted from the stack pointer.  After that, the function itself runs.

The function epilog then undoes what the prolog does, so when control is transferred back to the calling function, everything is restored to what was there before the function call, except for the registers which are known to contain the return value.
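
To make the bookkeeping concrete, here is a toy C model of the prolog and epilog described above. It is a simulation, not real ARM code: the stack is an array of 32-bit words, sp and fp are array indices rather than addresses, and only the Link Register and Frame Pointer are saved. All of the names (ToyCpu, toy_prolog, and so on) are invented for this sketch, and there is no overflow checking.

```c
#include <stdint.h>
#include <string.h>

enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t mem[STACK_WORDS]; /* toy stack, one slot per 32-bit word */
    uint32_t sp;               /* index of the last pushed word; STACK_WORDS when empty */
    uint32_t fp;               /* frame pointer, also an index into mem */
    uint32_t lr;               /* link register: the return address */
} ToyCpu;

void toy_init(ToyCpu *cpu)
{
    memset(cpu->mem, 0, sizeof cpu->mem);
    cpu->sp = STACK_WORDS;     /* stack starts at the top and grows downward */
    cpu->fp = STACK_WORDS;
    cpu->lr = 0;
}

static void push(ToyCpu *cpu, uint32_t value) { cpu->mem[--cpu->sp] = value; }
static uint32_t pop(ToyCpu *cpu)              { return cpu->mem[cpu->sp++]; }

/* Prolog: save LR and the caller's FP, start a new frame, reserve locals. */
void toy_prolog(ToyCpu *cpu, uint32_t local_words)
{
    push(cpu, cpu->lr);
    push(cpu, cpu->fp);
    cpu->fp = cpu->sp;         /* remember where this frame begins */
    cpu->sp -= local_words;    /* locals live below the frame pointer */
}

/* Epilog: undo the prolog in reverse order. */
void toy_epilog(ToyCpu *cpu)
{
    cpu->sp = cpu->fp;         /* discard the locals */
    cpu->fp = pop(cpu);        /* restore the caller's frame pointer */
    cpu->lr = pop(cpu);        /* restore the return address */
}
```

Running toy_prolog followed by toy_epilog leaves sp, fp, and lr exactly as they were, even if the function body overwrote lr by calling something else in between, which is the whole point of the exercise.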

So NSLog is called, and it returns nothing, so there’s no result to check.  Let’s step ahead to an Objective-C method call, so we can see the similarities.  Place a breakpoint at the first Objective-C call, which will be the [[CATransition alloc] init] (the first will be the alloc), and hit Continue.

Let’s have a look at the assembly language here.

Disassembly of objc_msgSend call

We can see that this is a call to objc_msgSend(self, cmd).  There are two arguments to this function, stored in r0 and r1.  R0 will contain the Objective-C class CATransition, and r1 will contain the selector for the alloc method.  The selector is what we’re looking for, since that is how we’ll put together our scanner for Forbidden APIs.  Let’s step ahead to just before the call itself, and have a look from the debugger console:

So what we get from this is that $r0 is an object, the CATransition class, and $r1, the selector, is simply an old-style C char*!  This makes things a lot easier for us: all we have to do is find the selectors and basically grep out the text of the ones we want to avoid!

Whew!  That was quite a bit – I hope I didn’t leave too many people behind.  In my next post, which is the first of the New Year, I will talk about the organization of a program on disk, and where those selectors are.

References

Mach-o EABI

iOS ABI

ARM’s EABI
