Generally, structs should be organized from the largest to the smallest member to minimize the amount of padding the compiler has to insert between and after members.
Quake2World gets a ~5 fps boost on 64-bit clients (from 85 to 90 on my machine) from rearranging the order of the members in r_bsp_surface_t to save about 8 bytes total (I forget the before and after sizes; that info is on a different machine), but 32-bit clients see no difference. As far as I can tell, the FPS gain comes almost completely from fewer cache misses. (It also saved about 500KB of memory total on the worst-case map I had on hand.) If someone has a better explanation for why saving 8 bytes made such a difference, I'd love to hear it.
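
To make the padding point concrete, here is a minimal sketch. The two structs and their member names are made up for illustration; they are NOT the real r_bsp_surface_t. Assuming a typical 64-bit ABI (8-byte pointers, 4-byte ints), the interleaved version gets 4 bytes of padding after each int, while the sorted one gets none. On a typical 32-bit build both come out the same size, which would match seeing no difference there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structs for illustration only; not the real r_bsp_surface_t. */

/* Ints and pointers interleaved: where pointers are 8-byte aligned,
 * the compiler pads after each int. */
typedef struct {
    int flags;
    void *texinfo;
    int num_edges;
    void *lightmap;
} surface_mixed_t;

/* Same members, largest first: the two ints pack together. */
typedef struct {
    void *texinfo;
    void *lightmap;
    int flags;
    int num_edges;
} surface_sorted_t;

/* Bytes saved per element by reordering. This is 0 on ABIs where
 * pointers and ints have the same size and alignment. */
size_t surface_bytes_saved(void) {
    assert(sizeof(surface_sorted_t) <= sizeof(surface_mixed_t));
    return sizeof(surface_mixed_t) - sizeof(surface_sorted_t);
}

/* Print both sizes so the result can be checked on a given compiler/ABI. */
void surface_print_sizes(void) {
    printf("mixed:  %zu bytes\n", sizeof(surface_mixed_t));
    printf("sorted: %zu bytes\n", sizeof(surface_sorted_t));
    printf("saved:  %zu bytes\n", surface_bytes_saved());
}
```

Calling surface_print_sizes() from a 32-bit and a 64-bit build is a quick way to see whether a reorder is worth anything on each target. The exact numbers depend on the compiler and ABI, so treat them as something to measure rather than assume.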

Caveat: Be careful about rearranging all of your structs to minimize memory use. You can do dumb stuff and break code that relies on common members being laid out identically in two structs (node and leaf in the Q2 setting, for example). Of course, if that screws up your code that badly, it might be an indication that you have some really messy code that would benefit from some more love.
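
One way to keep a node/leaf pair safe from this kind of reordering is to pull the shared members into their own struct and embed it first in both. The sketch below uses made-up names (bsp_common_t, bsp_node_t, bsp_leaf_t) rather than the actual Quake2World types, and it assumes the Quake 2 convention that a node's contents is -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the engine's real definitions. */

/* Members shared by nodes and leafs live in one place. */
typedef struct {
    int contents;   /* assumed convention: -1 for nodes, anything else for leafs */
    int vis_frame;
} bsp_common_t;

typedef struct bsp_node_s {
    bsp_common_t common;            /* must stay the first member */
    struct bsp_node_s *children[2];
} bsp_node_t;

typedef struct {
    bsp_common_t common;            /* must stay the first member */
    int cluster;
    int area;
} bsp_leaf_t;

/* Fail the build if someone moves the shared header while reordering. */
_Static_assert(offsetof(bsp_node_t, common) == 0, "common must be first in node");
_Static_assert(offsetof(bsp_leaf_t, common) == 0, "common must be first in leaf");

/* Works on either type through a pointer to the shared header. */
int bsp_is_leaf(const bsp_common_t *c) {
    return c->contents != -1;
}
```

With this layout you can reorder everything after the header freely, and the static assertions catch the one move that would break code treating a leaf as a node.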

@Spike: Any objection to going all the way to uint32_t rather than just unsigned int, for example? I've been switching over to the stdint.h types ever since it was pointed out to me that unsigned int only has a guaranteed minimum size, and that depending on the implementation/platform/compiler it could easily end up 64 or 128 bits wide on me. (Which would be a very nasty surprise in file formats.)

Alternatively, has anyone played with the fast types (e.g. uint_fast32_t) and seen any noticeable difference?
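
For the file format side of this, here is a small sketch of what I mean. The header struct and the read_u32_le helper are hypothetical names, and the little-endian byte order is just an assumption for the example. The point is that uint32_t is exactly 32 bits wherever it exists, while uint_fast32_t is only guaranteed to be at least 32 bits and may be wider, so it is fine for loop counters but not for anything that ends up on disk.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* uint32_t is exactly 32 bits with no padding wherever it is provided. */
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t must be 32 bits");

/* uint_fast32_t is only guaranteed to be at least 32 bits. */
_Static_assert(sizeof(uint_fast32_t) * CHAR_BIT >= 32, "fast type is >= 32 bits");

/* Hypothetical on-disk header: field widths do not change between platforms. */
typedef struct {
    uint32_t ident;
    uint32_t version;
} file_header_t;

/* Decode a 32-bit little-endian value byte by byte, so the result does not
 * depend on the host's endianness or on struct padding. */
uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Fill a header from a raw 8-byte buffer (assumed little-endian on disk). */
file_header_t read_header(const uint8_t *buf) {
    file_header_t h;
    h.ident = read_u32_le(buf);
    h.version = read_u32_le(buf + 4);
    return h;
}
```

Whether the fast types actually buy anything is something I would measure per platform rather than guess at; printing sizeof(uint_fast32_t) on each target is a cheap first check.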