Yeah, with vidcolormap static, the compiler knows nothing outside that file can change it unexpectedly, and thus can do such optimizations.
My general rule (for other reasons, actually) is if it doesn't need to be accessible to other files, make it static (whether variable or function). Also, as a related rule: the "extern" keyword should not appear in a .c file (limit its use to .h files). The reason for this is there was a bug (possibly in only QF, but I don't know) where a variable was declared as int in one .c file and short in another .c file. The compiler didn't catch the problem because one of the .c files was using extern, and for some reason the linker didn't complain.
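A minimal sketch of both rules, squeezed into one listing so it compiles on its own. The names (counter.h, counter.c, shared_total, counter_add, counter_calls) are made up for illustration and are not from QF. In a real tree the first part would live in the header and the .c file would #include it, so the compiler sees the declaration and the definition together and rejects an int vs short mismatch. Linkers generally match symbols by name only, which is why a stray extern in a second .c file can slip through.

```c
/* --- counter.h (hypothetical header): the only place "extern" appears --- */
extern int shared_total;        /* declaration shared by every user */
void counter_add (int amount);
int counter_calls (void);

/* --- counter.c (hypothetical): would #include "counter.h" --- */
int shared_total;               /* definition; the compiler checks it against
                                   the declaration above */
static int call_count;          /* file-private: no other file can name it */

/* File-private helper: negative amounts are treated as zero. */
static int
clamp_amount (int amount)
{
    return amount < 0 ? 0 : amount;
}

void
counter_add (int amount)
{
    call_count++;
    shared_total += clamp_amount (amount);
}

int
counter_calls (void)
{
    return call_count;
}
```

Changing the definition to "short shared_total;" here gives a conflicting-types error at compile time, which is the check the extern-in-a-.c-file version never got.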
While places like R_DrawSurfaceBlock8_mip0 are obvious places to optimize (or at least likely places), always follow Abrash's rule: profile, profile, reconsider the algorithm, profile, profile, then optimize. There's little point in optimizing code that has little impact on performance, and there's no point optimizing an O(N**2) algorithm when there's an O(N) (or even the holy grail, O(1)*) algorithm available.
* Yes, for some problem spaces, O(1) algorithms exist, but they're rare. Sure, they're sometimes more expensive for small N, but since they scale so nicely, who cares?

E.g., using a hash table will be slower for small N (hash the string, then compare a string or two, vs. just comparing a few strings in a simple linked list), but get to a few thousand (e.g., compiling frikbot with qcc) and the hash table wins hands down.
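To make the comparison concrete, here is a rough sketch of the two lookups. Everything in it is an assumption for illustration: the entry_t and table_t types, the fixed NUM_BUCKETS, and the simple multiply-by-33 string hash are not what qcc or QF actually use. The list lookup does up to N string compares. The table lookup does one hash plus a compare or two, as long as the buckets stay short.

```c
#include <stddef.h>
#include <string.h>

#define NUM_BUCKETS 1024        /* assumed size; a real table would be tuned */

typedef struct entry_s {
    struct entry_s *next;
    const char     *key;
    int             value;
} entry_t;

typedef struct {
    entry_t *buckets[NUM_BUCKETS];
} table_t;

/* Linked list lookup: walks the chain, up to N string compares. */
entry_t *
list_find (entry_t *head, const char *key)
{
    entry_t *e;

    for (e = head; e; e = e->next)
        if (strcmp (e->key, key) == 0)
            return e;
    return NULL;
}

/* Simple multiplicative string hash (illustrative choice only). */
static unsigned
hash_string (const char *s)
{
    unsigned h = 5381;

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h % NUM_BUCKETS;
}

/* Push the entry onto the front of its bucket's chain. */
void
table_add (table_t *tab, entry_t *e)
{
    unsigned h = hash_string (e->key);

    e->next = tab->buckets[h];
    tab->buckets[h] = e;
}

/* Hash once, then search only the (hopefully short) bucket chain. */
entry_t *
table_find (table_t *tab, const char *key)
{
    return list_find (tab->buckets[hash_string (key)], key);
}
```

For a handful of entries the hash is pure overhead. With a few thousand, each bucket holds only a few entries on average (given a reasonable hash), while the plain list still walks half of everything per lookup.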
Anyway, if the code is fast enough (thus the need for profiling), clean, easily read (and modified!) code is much more important than "fast" code. Also, compiler writers put their efforts into getting the compiler to produce optimal code for clean, easily read code. Hand-optimizing the code can wind up fighting the compiler's optimization algorithms and producing sub-optimal code (though probably still better than the clean code with compiler optimizations turned off).
Now, the best thing to do for "hand optimizing" code is actually to give the compiler hints (and as a fringe benefit, improve the readability of the code and let the compiler help catch errors). Things like "static" and "const" wherever possible can make a big difference. Just making vec3_t params "const" where possible made a noticeable difference!
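A small sketch of what const vec3_t params look like. The typedefs mirror the usual Quake ones but are restated here as an assumption, and dot3 and scale3 are hypothetical names, not the real QF functions. How much speed this buys depends on the compiler and the surrounding code. The error checking is guaranteed either way.

```c
typedef float vec_t;
typedef vec_t vec3_t[3];        /* assumed to match the usual Quake typedef */

/* Both inputs are read-only, so both are const. */
vec_t
dot3 (const vec3_t a, const vec3_t b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Only "out" is written, so only "out" is left non-const. */
void
scale3 (const vec3_t in, vec_t scale, vec3_t out)
{
    out[0] = in[0] * scale;
    out[1] = in[1] * scale;
    out[2] = in[2] * scale;
    /* in[0] = 0;  would be rejected by the compiler: "in" is const */
}
```

The prototype also tells the reader at a glance which vectors a function only reads and which one it fills in.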