Appendix E. In-line Assembly
The LynPlexS and LynPlexC compilers both support in-line assembly.
In-line assembly is not available in the LynPlex interpreter.
Why is it desirable to be able to use assembly code within the program?
- To run ASM instructions that are not available to the LynPlexS compiler
- To modify the assembly code generated by the LynPlexS compiler
- To optimize the assembly code generated by the LynPlexS compiler
- To learn assembly language programming
The compilers currently produce code only for Intel 80x86-based machines; in the future, however, they might well be ported to a platform that uses a different instruction set. Asm blocks should therefore be used only when necessary.
The use of Asm code in a program is an advanced feature and should not be used unless the programmer is familiar with Asm programming. The topic of assembly language programming is beyond the scope of this reference manual.
The Asm Block
An Asm block is used to insert specific machine-code instructions in a program in order to perform operations that cannot be carried out using the features of the language or to hand-optimize performance-sensitive sections of code.
asm architecture-dependent instruction
Comments in an Asm block use the usual LynPlex comment syntax, not the “;” that is customary in 80x86 assemblers.
The syntax of the in-line assembler is a simplified form of Intel syntax. Intel syntax is used by the majority of x86 assemblers, such as MASM, TASM, NASM, YASM and FASM.
In general, the destination of an instruction is placed first, followed by the source. Variables and functions defined by a program may be referenced in an Asm block. The assembler used by LynPlexC is GAS, using the .intel_syntax noprefix directive, and Asm blocks are passed through unmodified, except for the substitution of local variable names for stack frame references, and comment removal.
Instruction syntax is mostly the same as FASM. One important difference is that GAS requires size settings to be followed by the word “ptr”.
    // Assuming "blah" is a LynPlex global or local UINTEGER variable
    mov eax,[blah]          // Fine, size obvious
    inc [blah]              // Bad, size not specified
    inc dword [blah]        // All said, but GAS still won't accept this
    inc dword ptr [blah]    // GAS needs "ptr" here
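The size must be stated because the same memory location changes differently depending on the operand width. The following is a minimal Python model of that effect, not LynPlex code: `inc_mem` and `demo` are hypothetical names, and the byte layout assumes little-endian x86 memory.

```python
def inc_mem(mem: bytearray, offset: int, size: int) -> None:
    """Model `inc <size> ptr [mem+offset]` on little-endian memory.

    size is the operand width in bytes: 1 (byte), 2 (word) or 4 (dword).
    """
    value = int.from_bytes(mem[offset:offset + size], "little")
    value = (value + 1) % (1 << (8 * size))  # wrap at the operand width
    mem[offset:offset + size] = value.to_bytes(size, "little")


def demo():
    """Increment the same four bytes once as a byte and once as a dword."""
    as_byte = bytearray([0xFF, 0x00, 0x00, 0x00])
    inc_mem(as_byte, 0, 1)    # only the first byte is touched and wraps to 0
    as_dword = bytearray([0xFF, 0x00, 0x00, 0x00])
    inc_mem(as_dword, 0, 4)   # the carry moves into the second byte
    return bytes(as_byte), bytes(as_dword)
```

In this model the byte-sized increment leaves all four bytes zero, while the dword-sized increment produces the bytes 00 01 00 00, which is why the assembler cannot guess the size on its own.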
The return value of a function may be set by using the Function keyword within brackets as shown in the example below.
    // This is an example for the x86 architecture.
    Function AddFive(ByVal num As Integer) As Integer
        Asm
            mov eax, [num]
            add eax, 5
            mov [Function], eax
        End Asm
    End Function

    Dim ix As Integer = 4
    Print "4 + 5 = ":AddFive(ix)    //4 + 5 = 9
LynPlexC uses AS/GAS, the GCC assembler. As this is an external program, some quirks apply:
The line numbers in errors reported by LynPlexC for Asm blocks do not refer to the LynPlex source file. Because LynPlexC simply displays the errors returned by AS, the line numbers refer to the generated assembly file. To make LynPlexC preserve that file, invoke the compiler with the -R option (“don't delete ASM files”).
Label names are case sensitive inside Asm blocks.
    Function double Pi()
        // example of loading an 80-bit extended precision real
        // and returning it as a DOUBLE
        asm
            fldpi                   ; load pi into st(0)
            jmp end.Pi.fldpi        ; return with pi in st(0)
        end asm
    end function
In-line assembly code can only be used within a declared function.
To have a function return an integer value, place the value in the eax register before returning. Functions whose return type requires more than 4 bytes need to return the address of the value or string in eax.
    function SetBit(bit, x)
        asm
            mov ecx,[ebp+0x08]      ; bit must go in ecx (shift count is taken from cl)
            mov ebx,[ebp+0x0C]      ; put x into ebx
            mov eax,0x01            ; put 1 into eax
            sal eax,cl              ; shift arith left bit times = 2^bit
            or eax,ebx              ; inclusive OR x with result
            jmp end.SetBit.setbit   ; jump to end of function
        end asm
    end function
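For reference, the value that SetBit leaves in eax (1 shifted left by bit, then ORed with x) can be modeled in a few lines of Python. This is a sketch of the arithmetic only, not LynPlex code: `set_bit` and `MASK32` are hypothetical names, and masking the shift count to five bits reflects how 32-bit x86 shift instructions treat the count in cl.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def set_bit(bit: int, x: int) -> int:
    """Return x with the given bit set, as a 32-bit unsigned value."""
    count = bit & 31               # x86 uses only the low 5 bits of cl
    mask = (1 << count) & MASK32   # 1 shifted left count times = 2^count
    return (x & MASK32) | mask     # inclusive OR of x with the mask
```

For example, `set_bit(3, 0)` gives 8 and `set_bit(0, 4)` gives 5.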
The LynPlexS compiler generates assembly code which is compatible with the GoAsm assembler.
An HTML Help manual for GoAsm can be found in the \LynPlexS\manual folder. See GoAsm.chm.
For further resources concerning assembly language programming, see the next section.
80x86 Specific Code
The registers ebx, esi, and edi must generally be preserved under most or all operating systems that run on the x86 CPU. For this reason, these registers are pushed onto the stack when an Asm block is opened and restored when the block is closed. You can therefore use them without explicitly saving them.
You should not change esp and ebp, since they are usually used to address local variables.
The names of the registers for the x86 architecture are written as follows in an Asm block:
4-byte integer registers: eax, ebx, ecx, edx, ebp, esp, edi, esi
2-byte integer registers: ax, bx, cx, dx, bp, sp, di, si (low words of 4-byte e- registers)
1-byte integer registers: al, ah, bl, bh, cl, ch, dl, dh (low and high bytes of 2-byte -x registers)
Floating-point registers: st(0), st(1), st(2), st(3), st(4), st(5), st(6), st(7)
MMX registers (aliased onto floating-point registers): mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7
SSE registers: xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7
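The way the 2-byte and 1-byte integer registers overlap the 4-byte registers in the list above can be modeled with masks and shifts. The following is a minimal Python sketch, not LynPlex code; `sub_registers` and `write_al` are hypothetical helper names used only for illustration.

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit eax value into the ax, al and ah views of it."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF              # ax is the low word of eax
    return {
        "eax": eax,
        "ax": ax,
        "al": ax & 0xFF,           # al is the low byte of ax
        "ah": ax >> 8,             # ah is the high byte of ax
    }


def write_al(eax: int, value: int) -> int:
    """Model `mov al, value`: only the lowest byte of eax changes."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)
```

With eax holding 0x12345678, this model gives ax = 0x5678, al = 0x78 and ah = 0x56, and writing 0xFF to al yields 0x123456FF. The practical consequence is that writing a 1-byte or 2-byte register inside an Asm block also changes the corresponding 4-byte register.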
Note that the LynPlexC compiler produces 32-bit protected-mode code for the x86, which usually runs at an unprivileged user level. This means that privileged and sensitive instructions will assemble but will probably either not work correctly or cause a runtime “General Protection Fault”, “Illegal instruction”, or SIGILL error.
The following are the privileged and sensitive instructions as of the Intel Pentium 4 and Xeon:
mov to/from CRn, DRn, TRn
all SSE2 and higher instructions *2
*1: sensitive to IOPL, fine in DOS
*2: sensitive to permission bits in CR4, see below
The privileged instructions will work “correctly” in DOS when running on a Ring 0 DPMI kernel, such as the (non-default) Ring 0 version of CWSDPMI, WDOSX or D3X. Nevertheless, most of them are not really useful and are dangerous when executed from DPMI code.
RDTSC (Read Time Stamp Counter) has been shown to be allowed by most or all operating systems. However, its usefulness has diminished with the advent of multi-core and hibernating CPUs. SSE2 and higher instructions are disabled “by default” after CPU initialization. Windows and Linux usually enable them; in DOS, this is the business of the DPMI host: HDPMI32 will enable them, CWSDPMI won't.
The INT instruction is usable in the DOS version/target only. Note that it works slightly differently from real-mode DOS.
The segment registers (cs, ds, es, fs, gs) should not be changed from an Asm block, except in certain cases with the DOS port (note that they do NOT work the same way as in real-mode DOS). The operating system or DPMI host is responsible for memory management; the meaning of segments (selectors) in protected mode is very different from real-mode memory addressing.
Note that those “unsafe” instructions are not guaranteed to raise a “visible” crash even when run with insufficient privilege: the OS or DPMI host can decide to “emulate” them, either functionally (reading from some CRx works under HDPMI32) or as a “dummy” (nothing happens; the instruction passes silently, like a NOP).
Assembly Language Programming Resources
- NASM x86 Instruction Reference (Please note that NASM is not the assembler used by LynPlexC, but this page provides a good overview of x86 instructions)