jit/x86-64: fix FPU FLDCW codegen crashes on 64-bit hosts

The x86-64 JIT could crash when compiling FPU rounding-mode changes
(FLDCW with an indexed operand) because of two defects in codegen_x86.cpp:
1. x86_64_rex() dereferenced its 'b' (REX.B) pointer unconditionally and
ignored 'r'/'x'. raw_fldcw_m_indexed() passes its index register in the
'x' (REX.X) slot with b == NULL, so the function read through a NULL
pointer; an index register in r8-r15 would also have been mis-encoded.
x86_64_rex() now null-guards each pointer and emits REX.R/X/B (and W).
2. raw_fldcw_m_indexed() loaded the 64-bit base of the x87 control-word
table into RAX. RAX is allocatable (not in always_used[]) and may hold a
live m68k value mid-block, and a LOWFUNC cannot declare a register
clobber, so the load silently invalidated the allocator's view of RAX.
A later access through that register (e.g. MOVEM.L ...,-(An)) then
treated the table pointer as an m68k address and faulted. The base is
now materialized in a scratch register that is saved and restored with
push/pop and chosen to differ from the index register.
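
For illustration only, a minimal sketch of the two ideas above; it is not
the code in codegen_x86.cpp. rex_prefix() and pick_scratch() are
hypothetical names, the register numbers are the standard x86-64 hardware
encodings, and the use of RCX/RDX as scratch candidates is an assumption:

```cpp
#include <cstdint>

// Hypothetical stand-in for x86_64_rex(). Returns the REX prefix byte, or 0
// when no prefix is needed. Any of r/x/b may be NULL; a non-NULL register in
// r8-r15 is reduced in place to the low three bits used by ModRM/SIB.
// Simplification: a bare 0x40 prefix (needed for SPL/BPL/SIL/DIL) is not
// handled here.
inline uint8_t rex_prefix(bool w, uint32_t *r, uint32_t *x, uint32_t *b)
{
    uint8_t rex = 0x40;                          // fixed 0100 high nibble
    if (w) rex |= 0x08;                          // REX.W: 64-bit operand size
    if (r && *r >= 8) { *r -= 8; rex |= 0x04; }  // REX.R: ModRM.reg extension
    if (x && *x >= 8) { *x -= 8; rex |= 0x02; }  // REX.X: SIB.index extension
    if (b && *b >= 8) { *b -= 8; rex |= 0x01; }  // REX.B: base/rm extension
    return rex == 0x40 ? 0 : rex;
}

// Hypothetical scratch selection: use RCX unless it is the index register,
// in which case fall back to RDX. The caller is assumed to wrap the use of
// the scratch register in PUSH/POP so the allocator never sees it change.
inline uint32_t pick_scratch(uint32_t index)
{
    const uint32_t RCX = 1, RDX = 2;             // hardware encodings
    return index == RCX ? RDX : RCX;
}
```

With an index register of r9 passed in the 'x' slot and b == NULL, this
sketch yields prefix 0x42 (REX.X set) and leaves 1 as the SIB index field.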
Only x86-64 is affected; the 32-bit path uses a direct FLDCW [reg] encoding
and ARM64 uses different codegen.