Friday, November 20, 2009

Programming Tools

Every time I think about programming tools, I get really annoyed. If you've been programming for a while, you probably started off with the BASIC PRINT statement as your debugging tool, way back when microcomputers were too small and underpowered to run anything as sophisticated as a debugger.

When Turbo Pascal 3.0 came out for the PC, it was a revelation, at least for me: a programming environment that compiled at lightning speed. But it, too, was restricted to debugging via print statements --- an integrated debugger only arrived with Turbo Pascal 4.0.

When I got to college and had access to UNIX machines, having a real debugger changed everything. You could single-step through code, print variables, set breakpoints (even conditional breakpoints), and walk up and down the stack; if you recompiled the code, you could restart the program and the debugger would automatically pick up the new binary. I got out of the habit of writing print statements.

As an intern at Geoworks, I became even more spoiled. Geoworks had an in-house debugger called swat, and the basic development environment was a Sun workstation connected to a PC via a serial cable. You would cross-compile on the Sun (using a distributed compiling environment), download the code over the serial cable to the PC, and swat would run on your workstation while talking to a debugging stub on the PC. Swat was ridiculously sophisticated --- to this day, I still have not used a debugger that works as well. (The author, Adam de Boor, like most of the smart people I've ever met, now works at Google.) First, it had an extension language built in (Tcl). Second, the programmers working on GEOS had a very tool-oriented ethic: every time a new data structure was added, they would also write a swat extension that understood how the data structure was laid out in memory. This enabled you to type "heapwalk" at the swat prompt, and the debugger would walk through memory and dump out all the data structures in human-readable, human-formatted form. If you had a linked list, you could tell swat to walk the list and dump every element in it; if it was a list of objects, you could tell it to dump out the actual objects while walking, rather than just the pointers. Even though GEOS was written entirely in assembly (yes, even the applications --- how do you think everything fit into 512KB?), the tooling felt more sophisticated than that of any high-level language except Lisp.
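I haven't seen swat itself since, but the closest modern analogue I know of is gdb's Python API (gdb 7.0 and later), which lets you teach the debugger about your own data structures much the way those Tcl extensions taught swat. A minimal sketch --- the struct node type and its fields here are hypothetical, just to show the shape of the thing:

    # gdb pretty-printer sketch: walk a hypothetical C linked list
    #   struct node { int value; struct node *next; };
    # and dump every element, swat-heapwalk style.
    # Load with `source listprinter.py` inside gdb 7.0+.
    import gdb

    class NodeListPrinter:
        """Walks a `struct node *` chain and prints every element."""
        def __init__(self, val):
            self.val = val  # gdb.Value: pointer to the head node

        def to_string(self):
            elems, node = [], self.val
            while int(node) != 0:                 # follow `next` until NULL
                elems.append(str(node["value"]))  # ptr["field"] acts like ->
                node = node["next"]
            return "node list [%s]" % ", ".join(elems)

    def lookup(val):
        # Only claim values whose static type is `struct node *`.
        if str(val.type) == "struct node *":
            return NodeListPrinter(val)
        return None

    gdb.pretty_printers.append(lookup)

With that loaded, a plain "print head" in gdb dumps the whole list instead of a bare pointer.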

When I graduated and went to work at Pure Software, we took a lot of pains to make sure that Purify would work with debuggers. Stack traces and the like worked with whatever debugger you used, and variable names always remained intact --- despite the incremental linking and other techniques that Purify applied to the binaries under inspection. To this day, no other UNIX vendor or free software tool has deployed an incremental linker.

When I started having to do Windows development again, IDEs such as Visual C++ felt like a step backwards --- lots of pretty visuals, but none of them extensible, so you couldn't teach the debugger about your new data structure or get it to walk a list. Still, I didn't need to write PRINT statements. When I ended up writing VxDs for a living in 1995, I had a much more primitive environment, and it was painful, but I quickly learned to abstract away most of the issues and to avoid rewriting VxDs as much as possible.

Enter the internet server age, and I feel like it's 1986 again and I might as well be programming on a PDP-11 using RSTS/E BASIC. Today, any kind of cloud programming that harnesses multiple machines essentially relies on RPCs. One would think that with everything we learned building the old debuggers, we would be able to single-step from a procedure on one machine into a procedure on a remote machine, and still dump stacks, walk stack traces, and print data structures. The sad truth is, we can't. In many environments you can barely attach a debugger to a remote process, and in some cases if you attach a debugger and then detach it, the process immediately exits.

Symbolic variable names? Thanks to C++ name mangling, I can barely decipher error messages from the compiler, let alone use a symbolic name in a debugger. Combine that with threads, remote systems, and other such setups, and pretty soon you're back to debugging with PRINT statements. You might dress it up and call it "logging" (and I know I've been guilty of that myself), but really, it's debugging via PRINTs, and as someone who calls himself a software engineer, whenever I put in yet another LOG statement I feel ashamed, both for myself and for my profession --- we had such beautiful tools in the 80s and 90s, and they are all wasted in the internet era. Yes, I'm well aware that people have written RPC analyzers --- but those are after-the-fact analysis tools, not nearly as useful as being able to stop the world and examine its state at leisure, which is what swat and its contemporaries could do.
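The name-mangling gripe, at least, has a mechanical workaround: you can pipe symbols through binutils' c++filt. A trivial sketch, assuming c++filt is on your PATH:

    # Demangle a C++ symbol by shelling out to binutils' c++filt.
    import subprocess

    def demangle(symbol):
        """Return the human-readable form of a mangled C++ symbol."""
        out = subprocess.run(["c++filt", symbol],
                             capture_output=True, text=True)
        return out.stdout.strip()

    print(demangle("_ZNSt6vectorIiSaIiEE9push_backERKi"))
    # -> std::vector<int, std::allocator<int> >::push_back(int const&)

That doesn't make the names usable inside a debugger, of course; it just makes the error messages legible.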

What's responsible for this state of affairs? I think the big one is the decline of the market for programming tools. After Borland died, there was no longer an effective tools company with the kind of end-to-end reach needed to build a sophisticated development environment. Microsoft all but stopped evolving its programming tools. Since it was impossible to compete against the free gdb/gcc/g++ toolchain (and now the free Eclipse), it became a case of "if you can't beat them, join them." Without end-to-end control of a development environment, it's hard to build a debugger that does the right thing --- Microsoft could probably do it for its environment, as could Apple, but neither is a powerhouse in client/server/distributed computing. Google and Yahoo could invest in distributed debugging infrastructure, but have chosen to spend their resources elsewhere. The net result: our programming tools have done nothing but go backwards, despite all the progress we've made in other areas.

8 comments:

Unknown said...

Another way I think about this is: yes, tools suck today, but even with better visibility into the runtime, would it really help with today's software stack? I think yes, but not as much as in the old days.

In the old days you usually wrote something for one platform: PC, Mac, or the various UNIX ports. One language, or at most two (C plus low-level assembly), was sufficient. There were fewer dependencies; most of the time the software ran stand-alone. It's easier to step through code and inspect the stack and heap of a program that simply runs on one machine.

Today "the internet is the computer." I can't do email or pay e-bills or anything else without the internet. The client/server interaction is complex. Servers have much more dependencies. Even typical backend servers have lots of dependencies like the Ga** authentication server, Mone** billing, **phil cluster text inference, big ta*** storage, so on and so forth. Each stack typically uses a very different technology, and if any one is not working properly, the client is f***ed.

Even with better debuggers, it's much more difficult to step into a multi-threaded, multi-tier project with X different components and dependencies. There are simply many more moving parts than in the old days. A lot of the emphasis today isn't on making a better debugger, but rather on 1) adding more logging (as you mentioned) and runtime statistics for increased visibility, 2) much heavier unit testing and functional testing, and 3) much more sophisticated monitoring, like b***mon.

So in a sense, yeah, maybe debuggers suck. But in today's highly complex, asynchronous, heterogeneous environments, how much value do they provide?

ovidiu said...

I think Borland's death was due to the rise of Microsoft tools, and later to a more general shift towards Internet applications. I think they missed the latter trend and did not focus on the tools you're talking about.

But maybe even more so, I think the reason we don't have these tools is that it's a hard problem. Most client-server systems (forget the browser for a moment) span different languages and application frameworks: you have clients written in Python, talking to servers written in Java, which in turn connect to servers written in C++.

I'd claim that another reason we don't seem to need debuggers for distributed systems is better software development methodologies. I can write a series of unit tests to cover my core logic, and the only thing that remains to be tested is the client/server integration.

I find myself not really needing a debugger, since PRINTs help me understand what's going on pretty quickly during a "debugging" session. Most problems that show up have to do with scaling: the number of requests, the amount of data you handle, etc. For those, no debugger can help you; your greatest asset is your understanding of the system's design and its possible failure modes.

That's why good software engineers should be paid a lot of money :) Perhaps that's also why the engineers at Borland who could have developed the tools you're talking about left for the Yahoos and Googles of the world.

Piaw Na said...

I disagree --- many situations are still mono-lingual. And while unit tests are great, difficult situations such as deadlocks are far easier to analyze with a debugger than with prints, especially since the way most programmers approach them is to keep adding prints until they understand the situation.
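Concretely: to see a deadlock, you want every thread's stack at the moment of the hang. Something as simple as attaching gdb in batch mode gets you that --- a minimal sketch, assuming gdb is installed and you're allowed to ptrace the target pid:

    # Attach gdb to a hung process and dump every thread's backtrace.
    import subprocess
    import sys

    def dump_all_stacks(pid):
        """Run gdb in batch mode against `pid`; return all thread backtraces."""
        result = subprocess.run(
            ["gdb", "--batch", "-p", str(pid),
             "-ex", "thread apply all bt"],
            capture_output=True, text=True)
        return result.stdout

    if __name__ == "__main__":
        print(dump_all_stacks(int(sys.argv[1])))

Two threads acquiring locks in inverted orders jump right out of that output; getting the same information from prints means guessing where to add them, redeploying, and reproducing the hang.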

lahosken said...

So... what would this thing look like for an internet server program? Is there a stub? Would each server program have such a stub? Uhm, a thread in the program? Listening to a port? And if you queried it, it would tell you stuff about program state? So maybe not a thread in the program... like if the server was in Java, maybe this thing should be part of the virtual machine? I dunno, what are you thinking this thing would look like?

Piaw Na said...

An easy way to do it would be for your debugger to spawn a debugger on the remote machine as well, attached to the remote machine's process. The two debugger processes would act as one as far as stack dumps, variable examination, and so on were concerned.
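You can approximate this today with stock tools: ssh to the remote host, attach gdbserver to the server process, and point a local gdb at it. A rough sketch --- hostname, port, pid, and the binary path are all placeholders, and the local gdb needs a copy of the binary with symbols:

    # Two-halves remote debugging with gdb + gdbserver.
    import subprocess

    REMOTE, PORT, PID = "app-server.example.com", 9999, 4242

    # Remote half: gdbserver attaches to the running process and listens.
    remote = subprocess.Popen(
        ["ssh", REMOTE, "gdbserver --attach :%d %d" % (PORT, PID)])

    # Local half: gdb connects and drives it; breakpoints, stack walks,
    # and variable inspection now operate on the remote process.
    subprocess.run(
        ["gdb", "/path/to/server_binary",
         "-ex", "target remote %s:%d" % (REMOTE, PORT)])

    remote.terminate()

(And to lahosken's question: for Java, the JVM already ships a stub of exactly this sort --- the JDWP agent --- so there the missing piece is mostly the cross-machine orchestration, not the stub.)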

Unknown said...

Sure, I'm not saying mono-lingual, mono-environment development no longer exists. I'm just saying that these days there are more and more abstractions and a divergence of specialized languages and environments, which ultimately *takes away* momentum for creating better debuggers. I bet that if you could magically turn all the Java/Python/C#/Ruby/Go heads (~40%) into C/C++-only programmers (27%), there would be much bigger momentum to create better debuggers (percentages based on http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html).

Maybe this is an opportunity to use your 20% time. What are you waiting for?!

Piaw Na said...

My 20% time is already spoken for: http://code.google.com/p/google-gtags/ (the internal version requires quite a bit of work).

Weren't you on my case a while back about my book? :-)

Spacetime said...

Well, Turbo Pascal is still alive...