Some Neeping About Code


When I came up with the name “stderr” for this blog, I was thinking about the way that the error-stream is unbuffered; that means that things come out in the order that they are printed, sort of like how I speak/write: blah blah blah unfiltered and there’s no “pause” button.

That’s all you need to know about the origin of the blog name, though Crip Dyke generously [pervert justice] led an exploration into that topic.

The rest of the blog’s header is a really sick couple of old-school UNIX code jokes. The fact that nobody has found them funny so far is probably because they’re not funny.

Because I was putting them in a blog header, the lines of code are not complete. I was trying to put the tantalizing bits of Lovecraftian code-horror on the left-hand side, where they belong anyhow. [It is my opinion that a line of code should do its most important thing at the left side of the line; go ahead, argue with me.]

Let me walk through a few of them though:

fprintf("Oh no! System is
fprintf(stderr,"%s", strerror(errno)

What’s going on there? That’s a chunk of code I’ve seen in some commercial products, by the way. It’s why some Microsoft crap will say: “The command failed for the following reason: no error.” Here is how it works: when you use “fprintf( )” on a FILE structure, the FILE * does its own buffering, even if the output FILE * is ‘stderr’ which is supposed to be unbuffered. What happens is that the stream is usually line-buffered and flushes the buffer at a newline if the output device is a terminal. If the output is a different type of file descriptor, it may buffer the data in units of BUFSIZ, which you’d better not assume is 2k, because it has changed and when it changes, things get awkward. So, that code implements a subtle and nasty race condition because sometimes the first line may cause the buffer to flush, which invokes the system call write(2) on the underlying file descriptor attached to the FILE *. When the write(2) completes, if it succeeds, it can leave the global errno reset to 0, indicating there is no error condition. (Current C and POSIX standards say a library call must never zero errno, but a successful call is still allowed to overwrite it with something else, which loses the original error just the same.) When the next line comes along, it uses strerror(errno), which converts the global errno to a string, and if it’s 0, that string is “no error.”

The correct way to code that is to put the entire error processing on the same call to fprintf(), so the errno is put on the stack as it’s invoked, and you’re no longer dereferencing the global errno.

That’s an example of code that “mostly works but sometimes fails” and when I see a programmer writing that, I force a review of all the code they have put into the product.
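Here is a minimal sketch of the single-call fix, assuming nothing beyond standard C. The names report_failure() and open_or_complain() are made up for illustration; the point is only that errno gets copied into a local before any other library call has a chance to touch it, and that the message and the reason go out in one fprintf().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: message and reason are formatted in ONE call,
 * and the error number arrives as an argument, not via the global. */
int report_failure(FILE *out, const char *what, int saved_errno)
{
    return fprintf(out, "%s: %s\n", what, strerror(saved_errno));
}

/* Hypothetical caller: returns 0 on success, or the errno value
 * that fopen() left behind on failure. */
int open_or_complain(const char *path, FILE *log)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int saved_errno = errno;  /* grab it before anything else runs */
        report_failure(log, path, saved_errno);
        return saved_errno;
    }
    fclose(fp);
    return 0;
}
```

Called as open_or_complain(path, stderr), the value it returns is the error from fopen(), no matter what the logging did to errno afterwards.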

A worse example (but too long to put in the blog header) is using memory that was allocated with malloc() after it has been freed with free(). This is a bad example:

for(x = listhead; x != NULL; x = x->next)
        free(x);

That code almost always works unless the memory item being used is small enough that the malloc() memory allocator doesn’t use some of the allocated memory to maintain a free-block list. With some malloc() implementations, like BSD’s binary buddy allocator, there is always spare memory allocated so bad coding practice like that works (or, as I prefer to say: “appears to work”) just fine. But if you put that in a small footprint system with a stingy allocator, x->next may get stepped on when the memory is put on the free block list. Best of all, when that happens the “stepped on” usually entails replacing the memory’s contents with a pointer to the next free block; which means that now the for-loop is walking the free block list, re-freeing it. Blammo. I have seen commercial products do that; suddenly they start consuming a large amount of CPU and eventually die when something hits a pointer that leads out of process memory.
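For contrast, a sketch of the loop done safely: read x->next before the node goes back to the allocator. The struct node layout and the push_front() and free_list() names are assumptions for the example, not anything from a real product.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node, just enough to show the pattern. */
struct node {
    struct node *next;
    int value;
};

/* Prepend a node; returns the new head, or the old head if malloc fails. */
struct node *push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Free the whole list and return how many nodes were released. */
size_t free_list(struct node *listhead)
{
    size_t freed = 0;
    struct node *x = listhead;

    while (x != NULL) {
        struct node *next = x->next;  /* read the link BEFORE free() */
        free(x);                      /* x is dead from here on */
        x = next;
        freed++;
    }
    return freed;
}
```

The only difference from the bad loop is one local variable, which is why this bug is so easy to write and so easy to miss in review.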

The next line reads:

if(exit(1)) {

Heh, that’s ugly. exit() is the library routine (it is not a system call!) that flushes buffers for stdio and closes file descriptors and a few things, then calls the system call _exit(2) that causes the program to terminate. The code there is heinous because exit() should never return. And it certainly never should return an error, as the example indicates. Basically, that line of code says: “if you’re dead, raise your hand.” A beginning programmer might learn that “every function’s error code must be checked” and put in a check like that. If you see a programmer checking an error return from exit() – run.
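The difference between the library routine and the system call is easy to see on a POSIX system. This is a sketch under that assumption; bytes_written_by_child() is a made-up name. A child process puts five bytes into a stdio buffer on a pipe and then leaves either through exit() or through _exit(), and the parent counts what actually arrived.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of bytes the child's buffered
 * output delivered to the pipe, or -1 if the setup failed. */
long bytes_written_by_child(int use_library_exit)
{
    int fds[2];
    char buf[64];
    long total = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;
    fflush(NULL);               /* don't let the child inherit pending output */

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        FILE *out;
        close(fds[0]);
        out = fdopen(fds[1], "w");   /* a pipe is not a terminal: fully buffered */
        if (out == NULL)
            _exit(2);
        fputs("hello", out);         /* sits in the stdio buffer for now */
        if (use_library_exit)
            exit(0);                 /* flushes stdio, then calls _exit() */
        _exit(0);                    /* straight to the kernel, buffer is lost */
    }

    close(fds[1]);
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

With exit() the parent sees the five bytes; with _exit() it sees nothing, because nobody flushed the buffer before the process went away.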

To compound it, I made it worse! Our fictional programmer prints an error message:

fprintf(stderr,"Help, cannot escape!"

and then tries to kill the process using a signal

kill(getpid(), SIGSEGV);

That line of code uses getpid (get the process number) then calls kill() with its own process number. Except, our fictional programmer used the wrong signal – SEGV instead of KILL. SEGV is “segmentation violation” – it means the process is trying to access memory that is not active in its virtual memory tables; basically a straight-up crash. I haven’t seen anything like that in production code, but it’s a horrible possibility.
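One practical reason the choice of signal matters: SIGSEGV can be caught by a handler, while SIGKILL cannot. The sketch below assumes POSIX sigaction(); can_be_caught() and ignore_signal() are hypothetical names. It tries to install a handler and puts the old disposition back afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Do-nothing handler, only used to probe whether installation works. */
static void ignore_signal(int signo)
{
    (void)signo;
}

/* Hypothetical probe: 1 if a handler can be installed for signo,
 * 0 if the system refuses (as it does for SIGKILL and SIGSTOP). */
int can_be_caught(int signo)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = ignore_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(signo, &sa, &old) != 0)
        return 0;
    sigaction(signo, &old, NULL);   /* restore the previous disposition */
    return 1;
}
```

So a process that sends itself SEGV may not even die, if somebody installed a handler for it.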

My old friend Andrew and I used to sometimes noodle about truly appalling code and would tell jokes in bad code. One of my favorites was: (let me see if I can remember how to do this… he said, as he mixed the nitric acid and the glycerine…)

#include <setjmp.h>
#include <signal.h>

jmp_buf env;

void signal_handler(int sig) {
         longjmp(env, 1);
}

int main(void)
{

// setup
setjmp(env);
signal(SIGSEGV, signal_handler);

//run

That’s from fading memory. But here’s what it ostensibly does – it declares a global “jump” context for setjmp/longjmp, which is an old UNIX hack that allows a program to save its place on the stack (the stack pointer and registers, not the stack contents). Setjmp() stores it, longjmp() restores it – you can “jump” right out of a deeply nested call stack, but only if you are one of the worst programmers that has ever lived. Or, the best. Josh Osborne (“stripes”) implemented X-tank as a multi-threaded application using an array of jump contexts to implement internal process threading. That was disgusting and horrible but it worked fine. It was so disgusting and horrible that for a long time, that’s how POSIX threads were implemented under the hood. I looked under that hood, once, and closed it gently, and backed away. That was when I began to agree with Rob Pike, “people who want threads don’t understand fork() and exec()”.

So that code saves a jump context that puts the program back to right after it began running at main(). Then it sets up an error handler so if there’s a segmentation fault, a wild pointer, the error handler just whacks the saved context back in, and keeps running. What is terrifying is that: this works. In fact, I once did some stuff with it, where there was a global file descriptor table and the application would just “recover” by re-establishing the pointers and keep running.

Why would one do this? Because memory allocators cause weird system errors when they cannot allocate more memory. A lot of programmers (myself included) allocate memory and check the return:

if((x = malloc(sizeof(thing))) == (thing *)NULL) {
          fprintf(stderr,"Cannot allocate memory: %s\n",strerror(errno));
          exit(1);
}

If you have a memory allocation fail, what do you do? All you can reasonably do is a clean bomb-out. I worked briefly on a version of OSF/Mach, where the allocator sometimes returned errors for no apparent reason. What a horrible mess that was; the guys who coded that probably worked on the F-35.
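Going back to the jump-out-of-the-signal-handler trick for a moment, here is a sketch that can actually be compiled, with some loudly stated assumptions. It uses sigsetjmp()/siglongjmp() and sigaction(), the POSIX variants that also restore the signal mask, instead of plain setjmp()/longjmp() and signal(). It uses raise(SIGSEGV) as a stand-in for a real wild pointer, because a real one is undefined behavior. The names survive_faults(), on_segv() and recovery_point are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recovery_point;   /* the global "jump" context */

/* On SIGSEGV, throw away whatever was running and jump back. */
static void on_segv(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);
}

/* Hypothetical demo: raises SIGSEGV faults_to_raise times and returns
 * how many times execution came back through the recovery point. */
int survive_faults(int faults_to_raise)
{
    struct sigaction sa, old;
    volatile int recovered = 0;     /* volatile: modified between jumps */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    /* Returns 0 when first called, nonzero when reached via siglongjmp().
     * The second argument asks for the signal mask to be saved too. */
    if (sigsetjmp(recovery_point, 1) != 0)
        recovered++;

    if (recovered < faults_to_raise)
        raise(SIGSEGV);             /* stand-in for the wild pointer */

    sigaction(SIGSEGV, &old, NULL); /* put the old handler back */
    return recovered;
}
```

It keeps running, which is exactly the problem: nothing about the program’s state has been repaired, the stack pointer has just been moved back to a place where things used to be fine.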

------ divider ------

When I was coding full-time, I developed my own version of malloc() which was based on the 4.2BSD version but which added a bunch of error checking. It did double-entry book-keeping on the allocator, and additionally kept a table of what line of code had allocated which blob of memory. There was a stats-keeping module that I wrote, which would dump out detailed reports of what was allocated where and how, so I could see where my programs were doing a lot of memory sucking, and sometimes I’d optimize it if it seemed necessary. It would also do things like tombstoning at both ends of a chunk of memory – so if you allocated char[20] it would allocate char[20 + (2 * sizeof(char *))], put ‘0xdeadbeef’ at [0] and in the last char *, and if those values changed, I knew something had stomped memory. My version of malloc() and free() kept a usage ticker and after 20 passes through either of those functions, it would invoke a check routine that walked all the tombstones and made sure they were intact. It also kept a table of which functions allocated the most memory, and could detect memory leaks by checking for routines that allocated but never freed memory. Developing that version took me a day or two, and it was interesting to see how the allocator worked, anyway. I used to be able to link it into other people’s code and have it spit out a litany of errors. That was good fun and saved me tons of time debugging other people’s code. I had a similar library that tracked file descriptors and routines doing file I/O. Compiling against these libraries had no effect on the code while I was developing it, and when it was time to ship the code, I’d let it link against the system libraries instead of mine.
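The tombstone idea is small enough to sketch. This is not the original library, just an illustration under assumptions: it wraps the system malloc() instead of replacing it, uses a 32-bit 0xdeadbeef marker where the original used a char *-sized one, stores the requested size in a small header, and does no bookkeeping beyond that. It needs C11 for max_align_t. All the guarded_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static const uint32_t TOMBSTONE = 0xdeadbeefu;

/* Header holds the requested size plus the front tombstone, rounded up
 * so the pointer handed to the caller keeps malloc()'s alignment. */
#define GUARD_ALIGN  _Alignof(max_align_t)
#define GUARD_HEADER \
    (((sizeof(size_t) + sizeof(uint32_t) + GUARD_ALIGN - 1) / GUARD_ALIGN) * GUARD_ALIGN)

/* Layout: [size ... front tombstone][user bytes][back tombstone] */
void *guarded_malloc(size_t size)
{
    unsigned char *base = malloc(GUARD_HEADER + size + sizeof TOMBSTONE);
    unsigned char *user;

    if (base == NULL)
        return NULL;
    user = base + GUARD_HEADER;
    memcpy(base, &size, sizeof size);                               /* remember the size */
    memcpy(user - sizeof TOMBSTONE, &TOMBSTONE, sizeof TOMBSTONE);  /* just before user data */
    memcpy(user + size, &TOMBSTONE, sizeof TOMBSTONE);              /* just after user data */
    return user;
}

/* Returns 1 if both tombstones around p are intact, 0 if something stomped one. */
int guarded_intact(const void *p)
{
    const unsigned char *user = p;
    size_t size;
    uint32_t front, back;

    memcpy(&size, user - GUARD_HEADER, sizeof size);
    memcpy(&front, user - sizeof front, sizeof front);
    memcpy(&back, user + size, sizeof back);
    return front == TOMBSTONE && back == TOMBSTONE;
}

void guarded_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - GUARD_HEADER);
}
```

Allocate 20 bytes, write 21, and guarded_intact() reports the damage. A fuller version would keep a list of live blocks so a periodic check routine could walk all of them, and would record the allocating file and line.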

Comments

  1. Allison says

    fprintf(“Oh no! System is …”);
    fprintf(stderr,”%s”, strerror(errno);

    Most of my programming over the past 20 years has been in multithreaded applications, so I am very, very paranoid about global variables (and depending on the implementation, errno might be something more complicated than a variable.) I would write that code something like:

    if ( function_that_might_set_errno() ) {
        int my_errno = errno;
        fprintf(stderr, "Oh no! System is …");
        fprintf(stderr, "%s", strerror(my_errno));
    }

    That is, save the value of errno as soon as possible.

     
     

    “people who want threads don’t understand fork() and exec()“.

    How so? I mean, I’ve written code that used threads and fork() and exec(); I see them as having rather different uses. We use threading in contexts where things would be very difficult and awkward if we had to restrict ourselves to using only fork() and exec(). Or if we tried to code the same thing without threads. (In fact, we converted one library from single- to multi-threaded because the single-threaded version was a lot harder to use and didn’t work as well.)

    What is true is that you need a rather different way of looking at programming than one gets from the average programming 101 class.

  2. sonofrojblake says

    It’s a tricky business, making obscure jokes. On the one hand, you actively want the joke to be obscure for your own satisfaction (look how clever I am, I can construct this obscure joke) and for the satisfaction of your target audience (ooh, I feel clever because I got this obscure joke). On the other hand, though, there’s a part of you that feels the need to explain the joke (no, seriously, look, this joke is REALLY clever, here’s why…), with the obvious danger that (a) you’ll lose the exclusivity and (b) the obvious frog-dissecting accusations.

    It puts me in mind of Paul Kidby’s work illustrating the covers for Terry Pratchett books such as “The Science of Discworld” and “Night Watch”. Those covers were parodies of, respectively, “An Experiment on a Bird in the Air Pump” by Joseph Wright and Rembrandt’s “The Company of Frans Banning Cocq and Willem van Ruytenburch”… but there’s no particular reason why you’d think they were a parody of anything if you weren’t already aware of the originals. I vaguely remember reading Pratchett explaining these references somewhere, saying something along the lines of “what’s the point in doing something clever if people can’t tell.”

    I suspect the reason people haven’t found your jokes funny so far is that fairly few of us even realised they were jokes, or possibly (as in my case) guessed they were jokes but lacked the skills and references to get them. (I’ve read the explanation and I still don’t get them, but that’s absolutely no reason to think I don’t appreciate the effort).

    British standup comedian Richard Herring says that for him, his idea of a perfect joke is one that when told to a decent sized live audience (couple of hundred upwards, say) causes exactly ONE person to laugh out loud. If that’s your criterion, then I have heard that joke and been that person, in Shrewsbury theatre watching Barry Cryer say he’d been to watch a rugby union match, and the crowd were so posh they chanted “here we go, here we gas, here we gat”. Nobody but me laughed, and Saint Barry said “thank YOU sir!”.

  3. cvoinescu says

    Marcus: I did not catch the fact that the first fprintf clears errno, so the second one is pointless. I did chuckle at the fact that the code checks the return value of exit() — and at what it falls back to.

    sonofrojblake: That is funny, and obscure as hell these days. Very well done. It’s a bit like the background jokes in Futurama, most notably the one where Bender (a robot) appears to keep two separate binders in his closet, labelled “P” and “NP”.
