In a comment to my previous post, John Morales asked my opinion of self-modifying code.
Answering the question as asked
That’s an easy one: it’s a maintenance nightmare. Don’t do it.
More generally
Over the years, and I’m old enough to remember punching Hollerith cards[1] and sticking them in sorting machines (wired boards were before my time), I’ve developed a very explicit coding style, principally because a week, a month, or a year from now, I don’t want to be wasting any time trying to puzzle out what the hell I was thinking about. I always use meaningful identifiers (names of things that programmers make up), although I’ll break down and use abbreviations at block scope; and I avoid all the known anti-patterns (“magic numbers” come easily to mind). When a function has more than one big thing to do, it’s time for refactoring. I even prefer Allman-style curly brace placement precisely because it puts more white space in the code and so separates bits of code in ways that are immediately visible.
Back in my newbie days, I made all the newbie mistakes; I even thought that self-modifying code was Really Cool. The only reason that I’m not still a newbie is that I learned from my mistakes (and I hope that I never stop learning).
Just for fun, I’ll put a cute little self-modifying PDP-8 program at the end of this post.
A related issue: machine-generated code
This is a completely different thing and not particularly scary. Programmers know how to write source code, which is just text; and they know how to write text to a file. No big deal. Indeed, the two programs that I wrote to generate simple Amtrak timetables and on-time performance statistics spit out complete Web pages[2]; and my timezone code comes with a couple of utilities that read Zoneinfo data and generate array initializers that get #included in other programs. That’s all really simple stuff.
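To make "source code is just text" concrete, here is a minimal sketch of a generator that writes an array initializer to a file that another program could #include. It is not the author's utility: the function name, the output file name, and the sample data are all hypothetical, and the numbers are made up rather than read from real Zoneinfo files.

```python
def make_array_initializer(c_type, name, values, per_line=8):
    """Return C/C++ source text that defines a const array holding `values`."""
    lines = [f"static const {c_type} {name}[] =", "{"]
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        # One row of the initializer; a trailing comma is legal in C and C++.
        lines.append("    " + ", ".join(str(v) for v in chunk) + ",")
    lines.append("};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical data: UTC offsets in seconds, invented for illustration.
    offsets = [-28800, -25200, -21600, -18000]
    # Hypothetical file name; a C or C++ program would #include this file.
    with open("utc_offsets.inc", "w") as out:
        out.write(make_array_initializer("long", "utc_offsets", offsets))
```

The generator never has to understand the program that consumes its output; it only has to produce text that the compiler accepts.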
It could even be argued that this is what C++ templates are: when you write a class template or function template, you’re telling the compiler how to write a class or a function for you. Yes, really. And if reflection makes its way into C++26, which is highly likely, we’ll have lots more compile-time code generation.[3]
[1] If I might stretch the meaning of “program” a bit beyond the breaking point for a moment, my first program was an 026 drum card. That was back in the Viet Nam era when Sgt. Seymour was Base Fuels Accountant at March Air Force Base. There was a very complicated daily report that got run overnight at 15th Air Force HQ; and it didn’t take me long to figure out that, in order to get the cards punched right the first time, I had to do it myself.
[2] It turns out that all I needed was good old HTML1 with no anchors or scripts…easy.
[3] But this kind of code generation is something that compiler authors do, not something that J. Random Coder like me does. I know some compiler authors, and they’re way smarter than I am. If those folks are the big leagues, I’m like an acceptable AAA baseball player when I’m having a good day.
Here’s a bit of machine-language PDP-8 code (not written by me) that sets all the bits in a memory field, including itself, to zero.
A PDP-8 field had 4096 12-bit words, so all the addresses and data are four octal digits.
ADDR  DATA  MNEMONIC
----  ----  --------
0004  1005  TAD 5
0005  3410  DCA I 10
0006  5004  JMP 4
0007  5404  JMP I 4
0010  0011  (data)
0011  2010  ISZ 10
TAD: Two’s complement add. Add the contents of the referenced location to the accumulator.
DCA: Deposit and clear the accumulator. Store the accumulator in the referenced location and set all the accumulator bits to zero. The “I” in the mnemonic means “indirect”; and when absolute addresses 10₈ through 17₈ are used as indirect addresses, they pre-increment; so the first time the DCA I 10 instruction gets executed, it stores the accumulator in address 12₈, the next time in 13₈, and so on.
JMP: Jump to the referenced location.
ISZ: Increment, skip if zero. Add 1 to the referenced location and skip the next instruction if the sum is zero.
That first three-instruction loop (locations 4 through 6) just sticks 3410 all over memory until the stores finally wrap around and overwrite the loop itself. Once location 6 has been clobbered, execution falls through to location 7, where JMP I 4 jumps to location 3410 (because location 4 now contains 3410), and we start executing the 3410 words stored there, each of which is a DCA I 10 instruction. Since DCA clears the accumulator, at this point we’re storing zeros all over memory. A zeroed word, 0000, is an AND instruction: load the bitwise AND of the referenced location and the accumulator into the accumulator.
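The first phase of that walk-through can be checked with a small simulator. What follows is a rough sketch under stated assumptions, not a faithful PDP-8 emulator: it handles only the memory-reference opcodes 0 through 5, ignores the link bit, treats IOT and operate instructions as no-ops, and starts with the rest of the field zeroed (a real machine would hold whatever was there before). All the function names are made up for this example.

```python
WORDS = 0o10000   # one field: 4096 twelve-bit words
MASK = 0o7777     # keep every value to 12 bits

AND, TAD, ISZ, DCA, JMS, JMP = range(6)   # opcodes live in the top three bits


def load_program():
    """Return a field holding the six words from the listing (rest assumed zero)."""
    mem = [0] * WORDS
    listing = {
        0o0004: 0o1005,   # TAD 5
        0o0005: 0o3410,   # DCA I 10
        0o0006: 0o5004,   # JMP 4
        0o0007: 0o5404,   # JMP I 4
        0o0010: 0o0011,   # (data) auto-index pointer
        0o0011: 0o2010,   # ISZ 10
    }
    for addr, word in listing.items():
        mem[addr] = word
    return mem


def effective_address(mem, pc, word):
    """Decode the operand address of the instruction `word` located at `pc`."""
    offset = word & 0o177
    # Bit 0o200 selects the instruction's own page; otherwise page zero.
    addr = (pc & 0o7600) | offset if word & 0o200 else offset
    if word & 0o400:                      # indirect bit
        if 0o10 <= addr <= 0o17:          # auto-index: pre-increment the pointer
            mem[addr] = (mem[addr] + 1) & MASK
        addr = mem[addr]
    return addr


def step(mem, pc, ac):
    """Execute one instruction and return the new (pc, ac)."""
    word = mem[pc]
    opcode = word >> 9
    next_pc = (pc + 1) & MASK
    if opcode > JMP:
        return next_pc, ac                # simplification: IOT/operate do nothing
    addr = effective_address(mem, pc, word)
    if opcode == AND:
        ac &= mem[addr]
    elif opcode == TAD:
        ac = (ac + mem[addr]) & MASK      # simplification: carry into link dropped
    elif opcode == ISZ:
        mem[addr] = (mem[addr] + 1) & MASK
        if mem[addr] == 0:
            next_pc = (next_pc + 1) & MASK
    elif opcode == DCA:
        mem[addr] = ac
        ac = 0
    elif opcode == JMS:
        mem[addr] = next_pc               # store return address, enter after it
        next_pc = (addr + 1) & MASK
    else:                                 # JMP
        next_pc = addr
    return next_pc, ac


def run_until(mem, pc, ac, stop_pc, max_steps=100_000):
    """Step until the program counter reaches stop_pc or max_steps is used up."""
    steps = 0
    while pc != stop_pc and steps < max_steps:
        pc, ac = step(mem, pc, ac)
        steps += 1
    return pc, ac, steps


if __name__ == "__main__":
    mem = load_program()
    pc, ac, steps = run_until(mem, 0o4, 0, stop_pc=0o3410)
    print(f"pc={pc:04o} after {steps} steps; location 4 holds {mem[0o4]:04o}")
```

Calling step() repeatedly from there and printing mem[0o10] after each call is one way to watch the auto-index pointer move through the second phase.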
I’ve forgotten what that final ISZ instruction is for, and I’m not in the mood to puzzle it out, so I’ll leave that as the dreaded exercise for the reader.
You’ll notice that that program is not an algorithm because it doesn’t halt; it just keeps on executing AND 0 instructions. When all the lights on the front panel stop flashing, you press the STOP switch.
I’ve written a little paper about the PDP-8 if anybody is interested.