Side-effects: a more useful definition

A typical CS-ish definition of whether a function has a side effect is whether the function modifies any globally visible state during its execution.

e.g. this function is side-effect free:

int add(int x, int y) { return x + y; }

And this function is not:

int add(int x, int y) { interlocked_increment(&GlobalAddCounter); return x + y; }

Side effects will be interesting to us soon when we discuss what kinds of actions need to occur to undo partial state changes in failure and exit paths.

I'm going to make up a term here: Innocuous Side Effects.  I thought about this term for about 15 seconds just now so if someone has a better name, feel free to suggest it, but I'll go with the acronym ISE for now.  (Hey, I came from Digital a long time ago, just be glad I'm not calling I1E or something... ;-)

Here's an example of an ISE: [there's a bug in here; addressed later...]

int add(int x, int y) {
if (TraceLogPointer != NULL) fprintf(TraceLogPointer, "Adding %d and %d\n", x, y);
return x + y;
}

Notice that like 99% of the code in the world I did not check the return value from fprintf(), but unlike 90% of that code, it's actually intentional.  This was a trace log; if the trace log is on a disk that's out of space or something the implied contract for diagnosibility tracing is that exogenous factors which make the tracing unable to proceed should not break me.  I would probably have labelled the second add() implementation (which incremented a global counter) as having only ISEs also since it's extremely unlikely that the counter is used to derive any basic correctness guarantees.

I guess we need to have a term for the opposite so I'll call the others Egregious Side Effects.  Here's an ESE.  Yes, I know no real transactional system would use fprintf to write to the log; I'm just trying to be illustrative.

int tx_append_to_data_file(
FILE *fp,
const char *psz
) {
if (fprintf(RollbackLog, "Appending: \"%s\"\n", psz) < 0) return EUNABLETOWRITETOLOGFILE; // bug??
if (fprintf(fp, "%s\n", psz) < 0) return EUNABLETOWRITETOFILE; // bug??
if (fprintf(RollbackLog, "Append completed\n") < 0) return EUNABLETOWRITETOLOGFILE; // bug??
return 0;
}

Why does every line have a // bug?? comment on it?  It's rather egregious to lose the actual problem that occurred when writing the log file and would massively hurt diagnosibility.  Probably the right thing to do in all cases was to return the value in errno.  (I already chose not to return boolean success/failure indicators, leaving things like errno for passing more specific information.  Arbitrary decision but I'm going to stick with it.) Another topic for another day.

Note another possible problem here.  Even the ISE example wasn't pure; it actually calls a function that is an ESE which promotes it to be an ESE itself unless you do extra work.  Let's do the extra work:

int add(int x, int y) {
if (TraceLogPointer != NULL) {
int old_errno = errno;
fprintf(TraceLogPointer, "Adding %d and %d\n", x, y);
errno = old_errno;
}
return x + y;
}

(This inadvertant promotion of an ISE to an ESE due to out-of-band error information is one of the main reasons to avoid this.  I don't know how many times I had to fix code because in a cleanup path, errno or the win32 GetLastError() value was overwritten.)

The main point of this post is to establish that there are two basic kinds of side effects:

  1. ESEs: Ones which are critical to correct execution of the program/system
  2. ISEs: Ones which are for debugging or diagnosibility etc. where the state changes have basically two qualities:
    1. They cannot (meaningfully) fail
    2. Their actions do not cause other "type 1" side effects to change.

You could argue that 2.1 is implied by 2.2 and that 2.2 is the really critical point but I want to make it explicit that propagating an error from an ISE is a fundamental bug.

[mgrier: fixed inconsistently used global variable name for the tracing log file pointer]