Category Archives: Programming

Milestone-based project fees as an alternative to hourly rates

A conversation with a client (who may be reading this now) prompted me to think over my payment structure.

My current consulting rates are almost all hourly, even though I greatly dislike charging by the hour. First, hourly rates are predicated on the amount of time I’ve worked rather than on what I’ve produced (and I’ve never held myself to a standard under which mere effort without results was acceptable). Second, they penalize me for working quickly. Charging by “lines of code” has similar problems, with conciseness instead of time: I can write one line of Perl that does what might take 10 lines of C++, for instance, but that one line may take as long to write as the 10 did.

I used to charge per-project rates. Unfortunately, aside from the difficulty of estimating an entire project in advance, my past experience has demonstrated why it is a truism never to accept a per-project rate that is less than four times what you think the project is worth: extra features.

It is common for features outside the initial requirements specification to crop up midway through a project. On a fixed budget, you are placed in the difficult position of refusing to implement the features (which annoys your clients), modifying your estimate (which might be difficult or impossible), or performing the extra work for free (which isn’t fair to you).

The model I’m now considering is milestone-based: milestones are set with particular deadlines and fees. From the time a milestone is agreed upon until its completion or deadline, the features leading up to it are frozen. Additional features can be requested, but they become part of the next milestone and factor into its cost estimate. This way, the client gets what he wants, you get compensated for the extra features, and your development time remains predictable. Everyone’s happy.

A nifty little Perl trick: default variable values with ||

I’ve used Perl for about 6 years now and I’m still running across new tricks that make my code more concise. Here’s one I just recently discovered:

It’s well known that you can use the “or” operator (that is, the literal word “or”) to take an action such as “die” when a command fails or returns a false or undefined value. Less well known is that the “||” (logical “or”) operator can give an assignment a default value: if the first operand evaluates to something false or undefined, the second is used instead. For example, we can define variables in the following way:

my $var = $var1 || $var2;

In C++, this expression would yield a boolean value. In Perl, it evaluates $var1 and assigns that value to $var if it’s a “true” value, or assigns the value of $var2 if it isn’t. This is handy, because otherwise we’d be writing my $var = ($var1) ? $var1 : $var2;, which is klutzy and harder to read, to boot.

Note that using this as a boolean value still works: if $var1 is a true value, $var will be that same true value. If $var1 is false and $var2 is true, $var will be the (true) value of $var2. If both are false, $var will be the (false) value of $var2, since || returns the last operand it evaluated; either way, the result is false. This corresponds to the truth table for “or”:

V1 V2 V1 || V2
T T T
T F T
F T T
F F F
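
As a quick illustration, here’s the most common place I see this idiom: defaulting a subroutine argument. (This is a hypothetical snippet, not from any particular project.) One caveat worth knowing: Perl treats 0 and the empty string as false, so || will replace those values with the default too.

use strict;
use warnings;

# Greet someone, falling back to a default name when the caller
# passes nothing (or passes an undefined or otherwise false value).
sub greet {
    my $name = shift || 'stranger';   # || supplies the default
    print "Hello, $name!\n";
}

greet('Alice');   # prints "Hello, Alice!"
greet();          # prints "Hello, stranger!"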

Finding good programmers

I just read an interesting article on Slashdot (made more so by the fact that I have an interview later today with a company looking for the very best developers) about how a company can go about finding superstar programmers. As I expected, the discussion veered quickly into what I now call “the scientist bias”: people began talking about setting up tougher interviews, screening more rigorously, not checking past work, etc. It’s the same sort of closed-mindedness that I lamented when discussing peer review (and still lament, even though 4/4 of my recently submitted papers were accepted). Just as you cannot hope to discover great ideas by being closed-minded, you can’t hope to discover great people by being closed-minded.

The easiest way to find good programmers, I would think, would be to first admit that, like peer reviewers, you don’t know precisely what you’re looking for. Just as one scientist unfamiliar with the work of another can’t objectively judge the full ramifications of the other’s science, one programmer cannot objectively judge the programming ability of another. Pretending that you can introduces all sorts of ascertainment biases into the loop, not the least of which being that some people just don’t interview well. I know – I’m among them.

Once you admit that, look at the candidate’s past work. Run it. Try to break it. If you’re feeling adventurous, sic QA on it and see what they find. See (or ask) how long it took to make, how much maintenance had to go into it, and how efficient it is. Don’t neglect the circumstances under which it was written, either – I personally wrote some very professional code when I was 12 with no formal CS education, which I think is quite an achievement. I couldn’t have used knowledge I did not possess, but I made up for it with sheer ability. Not many people independently rediscover alpha-beta pruning at 14, for instance 🙂

Next, if you want to see how the person can actually code, give him a short assignment and ask him to hand it in the next day. This models a real workplace condition – the skills demonstrated in the assignment are the same ones that will be demonstrated on the job.

By the time you bring someone in for an interview, you should basically have decided to hire already; the final interview then consists of making sure the company and the candidate are a fit for each other. If you ask any sort of technical question, ask it in full and let the candidate work. If you jump in every 5 seconds to change the requirements, you’re guaranteed to throw the candidate off. This happened to me during my final interview at Google and really soured my view of the whole process. For example:

“Write an algorithm to reverse a string”
*Writes about 80% of it, when suddenly…*
“Do it with just one byte of storage”
*Erases and starts over, getting about 50% of the way*
“Now make just one pass over the array”
*Erases and goes again*

etc.

I think that interview actually went well, as I was able to perform everything the interviewer asked without hesitation, but it was a very frustrating process, and prevented me from forming any lasting “big picture” image of the problem.

Anyway, those are just my thoughts on the process. Right now I have to put up with whatever techniques people decide to use, but when I hire people, I’m going to do it my own way.

(Update: And I am completely amazed, but the average programmer apparently codes only 76.7 lines of code per hour. This isn’t a measure of quality, just speed, but now I understand what The Mythical Man-Month means about the best developers being orders of magnitude better than the worst.)

Metasquares is back (sort of)

Some of you may know me as the author of a game client called MetaSquarer (which I’ve maintained for over a decade now! Phew!)

The client was a response to the removal of MetaSquares, the original game upon which it is based, from AOL in 1997. Approximately two years ago, I received an email from Scott Kim, the designer behind the original game (and a very interesting person to talk to), telling me the company had plans to re-create the game. They asked me to transfer the domain “metasquares.com” to them so they could use it for the re-launched game, and, because I felt indebted to them, I did so, moving the existing site to a new domain name (metasquared.com). They wanted me to remain active in the development process and mentioned that I was to become their contact on server-related matters. They outsourced the actual development work to Russia, which always puzzled me a bit, considering that I had already written all of the algorithms (indeed, I wrote the whole client, network play and all, at the ripe old age of 12) and could easily have ported them to any language they wanted. I did make some of these algorithms, such as my O(n) square-finding algorithm, public to let them easily leverage the work.

Anyway, the game just re-launched, sort of. They did it in Javascript using AJAX, and it currently works in Firefox, Safari, and probably some other browsers. It’s meant to be played on the iPhone. Since I don’t have an iPhone, I don’t know exactly how that’s supposed to work, but I assume you simply browse to the site and play. The client is at http://www.metatools.com/iphonemsq.

The client they have up there has some limitations: primarily, it only supports solo play against a rather weak computer opponent. I was told, though, that a version for the Mac that supports multiplayer games (presumably a standalone app, though you could do multiplayer using AJAX too) would be out soon. Still, if you’re looking for an officially endorsed client, there it is.

Once the client matches the features in MetaSquarer, I will take MetaSquarer offline, as there will be no further need for it. However, I will be sure to give plenty of advance notice, so as not to catch anyone by surprise. Maybe I’ll even open-source it.

Another Vista Bug

The beta hasn’t ended just because of the release! 🙂

If you keep refreshing Explorer while a file in the folder you’re viewing is growing, the file size shown at the bottom of the status bar grows much larger and much faster than the actual contents of the folder.

Simplifying the closed form of linear regression

Here’s one that must already exist:

Linear regression is given by the closed-form matrix expression:

θ = (XᵀX)⁻¹XᵀY.

We have a rule we can apply here: (AB)⁻¹ = B⁻¹A⁻¹.
Which gives us: θ = X⁻¹(Xᵀ)⁻¹XᵀY.

But the transpose and its inverse cancel, yielding the identity matrix when multiplied.
This leaves us with:

θ = X⁻¹Y.

Now, surely there must be some reason that it’s not taught this way. Is it because this requires X to be a square matrix? Can we use a pseudoinverse instead to solve this?

Update: Indeed we can. Now I understand why it’s presented that way; the pseudoinverse is given by (XᵀX)⁻¹Xᵀ! Therefore, we can also just say the optimal parameters are given by X⁺Y, where X⁺ is the pseudoinverse.
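
To make the connection explicit, here is the same rule run in reverse (nothing beyond the algebra already used above): when X is square and invertible, the pseudoinverse collapses to the ordinary inverse:

X⁺ = (XᵀX)⁻¹Xᵀ = X⁻¹(Xᵀ)⁻¹Xᵀ = X⁻¹.

When X is not square but has full column rank, XᵀX is still invertible, so X⁺Y remains well-defined even though X⁻¹ by itself does not exist.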

“Reinvention is talent crying out for background”, I suppose.

A Digg Link

Signs you’re a bad programmer and don’t know it:
http://damienkatz.net/2006/05/signs_youre_a_c.html?repeat

9 of the 12 things in this list applied to my undergraduate algorithms professor, who even took it beyond his own bad programming and imposed things like function length restrictions on the class.

Like everything else, the key to being a good developer is to figure out what actually works for you and what is just fluff, then learn what works. Good programmers realize that things like design patterns and languages are just tools, and different tools apply to different jobs. Design patterns in particular do not apply unless there really is a canned solution to the class of problems you’re working on; otherwise you’re imposing a structure on the solution that it may not necessarily have. If you can’t reasonably justify your choice of tools, you are not solving the problem correctly.

Researcher's Golden Rule no. 2

These rules are good research conventions that I’ve adopted based on both their intuitive appeal and the observed consequences of not applying them. The first was “it always needs more study”, which refers both to the perfectionism that can keep people from ever accomplishing anything and to the convention of stating this in papers. I originally intended to stop at one rule, but then realized that there are a number of unstated rules that lead to good research productivity. That said, the second can be given as follows:

“Don’t be sloppy.”

The methodology / algorithm should be clean and easy to understand. So should the way the data is formatted. Make sure that the function of each file is immediately clear and that the entry point to running the experiments is easy to spot (something like run_classification_experiments.m is a good idea). Program generically, as your dataset and analysis will probably change at some point. Don’t program only for yourself, because at some point, someone else is going to need to run your analysis. That person will not think highly of you if you make his life difficult. Don’t program unless you know how to program well; it is a vital skill in computer science research and you should be as proficient in it as a professional programmer would be.

I spent the majority of this weekend wrapping up data from over a thousand different .hdr / .img files into one MATLAB “data” structure. The fields of the structure correspond to properties of the data. For example:

data.Source               % "DVD 1"
data.task                 % "Left Squeeze"
data.subject              % "John Doe"
data.volume               % Raw image data.
data.foregroundIndices    % Indices into volume that represent foreground voxels.
data.wavelets             % Wavelet descriptors of the volume.

etc.

This is neat. Any researcher just joining the project could easily follow what is going on in this structure.

Sometimes I miss being a programmer…

As I sit 30 pages deep into the maze of English, mathematics, and mathematical English (which is a language in its own right) that is my dissertation, I can’t help but reminisce about the days when I just used to code all day. It didn’t matter what I was writing; every project became a labor of love, though it was eked out in a battle for mastery against a mercilessly correct machine and the equally merciless ambiguities of the human mind. Receiving an interview feedback form from Google brought me back for a time, forced me to remember all of my victories – and defeats – as I tried to impart the thoughts that flitted through my mind at the interview.

I’ve spoken of my childhood already: of the early victory that was Metasquarer, of the elation and superlative mastery that breathed life into Final Aegis, and of the zero-sum victory in the PlanetSourceCode contest that firmly embedded a non-competition principle into my code of ethics.

My primary thoughts today did not trace over those paths so much as over my more recent evolution as a programmer: the culmination of my long years of study, the final self-acknowledgment of mastery (I’m always the last one to acknowledge it), and the associated conclusion that programming was no longer a challenge worthy of being my primary activity. The evolution of programming from the desktop to the web simply served to reinforce these concepts; “programmers” these days are more likely to use languages such as Javascript and HTML (which I still consider a markup rather than a programming language) than C++ and Java. Fun as that is, it’s web development, and its practitioners tend not to understand either the elegance of – or need for – a good computer program. “Why compute squares on a board in O(n) when you can do it in O(n⁴) by scanning the whole board for each point?” sums this attitude up. “Computers are getting faster, so who will notice?” (Well, you might, if your program becomes popular and your server goes down in flames as the number of users grows.) I even proposed a new paradigm that built classes bottom-up (by their behavior) instead of top-down (by their structure), which was promptly dismissed, since most people can’t see the point and prefer to work top-down. (A study I can no longer find concluded that despite top-down programming being encouraged and perceived as more efficient, the best programmers tended to work bottom-up; that is true of the way I generally code as well, though I’ve become more amenable to top-down approaches as I’ve grown.)

In the end, I just decided that I should move on from programming, so I chose to study algorithms in grad school.

Well, fast forward through all of the application drama (the righteous indignation still hasn’t faded; it probably never will, since my entire life plan was essentially derailed and had to be rebuilt) and I am now at Temple studying biomedical data mining, and the last people I want to work with are the ones who study algorithms. I’ve never met such an unhappy yet demanding group of people in my life. Instead of focusing my efforts on programming, I am now focusing them on… well, everything, but especially research at the moment. I still code enough to keep my skills sharp, but only in support of my other activities. Coding for the sake of coding has been lost.

It’s something I miss from time to time, but it almost seems as if the world itself has moved past the need when I wasn’t looking – or perhaps I’m now content to describe the solution without expending the effort of implementation, since I know no one will bother with it anyway. Whatever the reason, I sometimes feel orphaned from the first thing I was really really good at.

I’m thinking about taking a job that primarily involves programming when I graduate. I started the doctorate with the notion that I was doing it more for the training than the degree, and I meant it, but I badly misjudged the research community and thus I now spend most of my time writing about concepts that anyone who cared could find in a textbook, just so I can present my new idea while meeting some sort of expected page limit (they call it “scope”) on my dissertation. I don’t know if I want to deal with this for the rest of my life. I love coming up with new ideas, but… there’s so much meaningless work that accompanies it! So much bureaucracy, so much conformity, even some hypocrisy… just to maintain a job that isn’t even particularly rewarding to begin with. I love research, but I can’t stand the way research is practiced, while I also love programming and can at least tolerate the way programming is practiced.

The idea of taking an easy job and doing my research independently looks more and more intriguing…