Interfacing DJGPP with Assembly-Language Procedures

Written by Matthew Mastracci

Last Revised: 10/11/1997

---------------------------------------------------

Table of Contents

=---------------------------------------------------

Section

=---------------------------------------------------------------------------

1.0 Introduction

1.1 A few words

1.2 Where to get NASM

2.0 Assembly language with DJGPP

2.1 Inline assembly (AT&T style)

2.2 When and when not to use NASM

3.0 DJGPP and NASM

3.1 Introduction to NASM

3.2 Using NASM with makefiles

3.3 Using NASM with RHIDE

3.4 Getting used to NASM

3.5 A note on memory references

3.6 Returning 64-bit values

3.7 Name-mangling

3.8 Internal data structures

3.9 Exportable data structures

3.10 A note about labels

3.11 Accessing external symbols

4.0 Advanced NASM topics

4.1 Accessing real-mode interrupts

4.2 Direct memory access (to protected-mode memory)

4.3 Direct memory access (to real-mode memory)

4.4 malloc() and NASM

4.5 Hooking interrupts

4.6 _CRT0_FLAG_LOCK_MEMORY

4.7 Real-mode callback functions

4.8 Doubleword-aligned accesses

5.0 Contacting the author

5.1 Closing comments

5.2 Getting DJGPPASM.DOC

=---------------------------------------------------

1.0 - Introduction

=---------------------------------------------------

1.1 - A few words

=---------------------------------------------------

Most programmers have no trouble dealing with assembly language in

real mode. Your code is in one segment, your data is in another and you

have complete, unhindered access to the entire system. Things like

"general protection faults", "segment violations" and "page faults" seem

like problems only Windows programmers have to deal with.

In protected mode, however, things are different. Instead of

segments, we have to use selectors. You aren't allowed to write to any

memory location you please and the absolute addresses you took for granted

in real mode don't work in quite the same way. The purpose of this

tutorial is to ease the transition from real mode to protected mode and to

help the reader grasp the fundamental concepts behind well-behaved

protected-mode assembly language.

1.2 - Where to get NASM

=---------------------------------------------------

The easiest place to get NASM from is Simtel. There is a large list of

Simtel mirrors in the DJGPP FAQ. It's in the "/pub/simtelnet/msdos/asmutl"

directory, with the name NASM???.ZIP (where ??? is the latest version

number).

=---------------------------------------------------

2.0 - Assembly language with DJGPP

=---------------------------------------------------

2.1 - Inline assembly (AT&T style)

=---------------------------------------------------

DJGPP allows you to write full-blown inline assembly with one catch: you

have to use a different operand syntax and set of mnemonics, referred to as

"AT&T syntax." On top of that, Gas, DJGPP's assembler, is designed mainly to

take code straight from GCC, which means that it has very limited

syntax-checking and may not do exactly what you think you told it to.

To begin, let's look at the format of an inline-assembler statement in

a function:

unsigned short int AddFour(unsigned short int x)

{

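unsigned short int y; /* 16-bit, to match the %w (word) operands */

__asm__ __volatile__(

"movw %w1, %%ax\n"

"addw $4, %%ax\n"

"movw %%ax, %w0\n"

: "=r" (y)

: "r" (x)

: "ax", "cc" /* ax and the flags are modified, so list them as clobbered */

);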

return y;

}

This function (as you may have guessed), adds four to a given

parameter and returns the value. You may, however, wonder about the

strange arrangement of registers and variables for the opcodes. Compared to

Intel-syntax asm, they're backwards! In essence, Intel-syntax instructions

"act" from right to left, ie: mov ax, bx loads ax with the value of bx; add

ax, 4 adds 4 to ax. AT&T-syntax instructions, however, act from left to

right. Here are some examples:

Intel              AT&T

mov bx, ax         movw %%ax, %%bx

add ax, 4          addw $4, %%ax

Basically, the format of an inline assembly statement is:

__asm__ [__volatile__](

"opcodes" :

output-vars :

input-vars :

modified-regs

);

- "__asm__" instructs the compiler to treat the parameters of the

statement as pure assembly and to pass them to the assembler (Gas) as

written;

- "__volatile__" is an optional keyword which instructs the compiler not

to delete your opcodes or move them around from where you place them.

This is probably a good idea, as the optimizer may otherwise take some

unwanted liberties with the code;

- output-vars is a list of constraints indicating which variables will be

modified by the routine: use the format "=r" (x), where r is the constraint

(r means "any general register") and x is the output variable to be

modified;

- input-vars is a list of constraints indicating which variables will be

used by the routine: use the format "r" (x), where r is the constraint and

x is the input variable to be referenced in the routine;

- modified-regs is a list of hard registers which are "clobbered" by the

statement, each written in quotes (ie: "ax").

Inside the actual assembly statements, you'll notice that you don't

actually say "movw x, %%ax" or something like that. That's because the

compiler numbers the operands in the order you mention them. That means

that the first output variable will be "%0" and, if it exists, the next

output variable will be "%1"; if it doesn't exist, the first input variable

will be "%1" instead. In essence, the numbering starts at zero with the

first of the output-vars, counts up through the end of output-vars and then

continues through input-vars, incrementing by one each time. The "w" in

"%w0" is a modifier that selects the 16-bit (word) name of the register.

Inline assembly is convenient in DJGPP and quite fast, but its

complexity in programming and general unreliability make it difficult to

use for long programming tasks. In the next section, we will explore an

alternative, Intel-syntax-based method that works incredibly well with

DJGPP.

2.2 - When and when not to use NASM

=---------------------------------------------------

There are a few points to consider when deciding whether NASM is the

best assembler for you to use, or whether a combination of the two

methods would be better.

1. Using the __asm__ directive means that you can inline your assembly

language function, which you can't do with NASM. On top of that, you can

tell the compiler which registers you clobber and it can work around that.

2. GAS was only designed to take input from the compiler, not from the

programmer. There are a few things you need to watch out for, as its

error-checking is fairly minimal.

3. NASM follows the TASM/MASM format for assembly in most cases. If you're

used to these assemblers, the adjustment period is much shorter, whereas

with AT&T syntax, you have to practice a little more to get used to it.

4. NASM is a solid assembler that won't optimize behind your back. Each

instruction you enter is assembled exactly as you enter it. Also, NASM

supports MMX, which (as far as I can tell) isn't supported by GAS.

My personal preference is this: use NASM for your major assembly

routines (ie: the sprite-drawing routines in a graphic library or callback

functions) and use the __asm__ directive for minor functions that you want

to inline at any point in the future. If you're planning on creating a .S

file, you can probably save a lot of headache by just creating it as a .ASM

file and compiling it with NASM instead of GAS.

=---------------------------------------------------

3.0 - DJGPP and NASM

=---------------------------------------------------

3.1 - Introduction to NASM

=---------------------------------------------------

NASM, the Netwide Assembler, is a freeware assembler package.

It is a fully-functional assembler with most of the capabilities

of commercial products such as MASM or TASM, but (in the spirit of DJGPP

and other freeware compilers) costs you nothing. Obtain the latest version

of the package, create a %DJDIR%\NASM directory and unZIP the archive

inside the directory with the -d option. This will create the proper

directories for the program. Finally, either copy the NASM executable to

your %DJDIR%\BIN directory or add the directory to your path.

3.2 - Using NASM with makefiles

=---------------------------------------------------

If you use GNU's make to handle your projects, setting up NASM as a

compiler is fairly easy. For each file, the dependencies/method line

should look like this:

filename.o : filename.asm ; nasm -f coff filename.asm

You could also create a rule for making .o files from .asm files, to

save time:

%.o : %.asm ; nasm -f coff $<

3.3 - Using NASM with RHIDE

=---------------------------------------------------

If you want to include a NASM-compiled file in your project, follow

these steps for each .asm file:

1. Open the project window;

2. Select the .asm file you want to compile with NASM;

3. Hit Ctrl-O for the file's local options;

4. Select "User" for compiler;

5. Enter "nasm -f coff $(SOURCE_FILE)" in the "Compiler" text area; and

6. Set the error-checking to "built-in C-compiler".

This should work for all versions of RHIDE, starting at version 1.2.

In future versions, the author may have implemented a new system for using

external compilers. Check the program updates for more information.

3.4 - Getting used to NASM

=---------------------------------------------------

To begin, let's create a sample assembly language program:

nasmtest.asm:

[BITS 32]

[GLOBAL _AddFour__FUi]

[SECTION .text]

; ---------------------------------------------------------------------------

; Prototype: unsigned int AddFour(unsigned int x);

; Returns: x + 4

; ---------------------------------------------------------------------------

x_AddFour equ 8

_AddFour__FUi:

push ebp

mov ebp, esp

mov eax, [ebp + x_AddFour]

add eax, 4

mov esp, ebp

pop ebp

ret

Let's also create a C++ program to test it:

nasmtest.cc:

#include <stdio.h>

extern unsigned int AddFour(unsigned int);

int main(void)

{

printf("AddFour(4) = %u\n", AddFour(4));

return 0;

}

To make things easier, create a makefile like so:

nasmtest.mak:

nasmtest.exe : nasmtest.cc nasmtest.o ; gcc nasmtest.cc -o nasmtest.exe nasmtest.o -v -Wall

nasmtest.o : nasmtest.asm ; nasm -f coff nasmtest.asm

Type "make -f nasmtest.mak" to create the executable. When you

run it, it should print "AddFour(4) = 8".

Let's go through the assembler source:

[BITS 32]

This line instructs NASM that you want it to prefix 16-bit code and

default to 32-bit code, not the other way around. You must have this in

any functions designed to run in protected mode. If you don't have it (in

a version of NASM prior to 0.91) you'll probably generate a page fault (try

it), because all of your memory references will have the top word truncated

(ie: ebp = 0x47282567, chopped to 0x2567).

[GLOBAL _AddFour__FUi]

This line tells NASM that the label is to be exported and will be

accessible as an external symbol to other modules. There are other things

that can be placed here, but we will learn about those later.

[SECTION .text]

This defines the .text segment, which is the segment placed first in

the assembled file. It contains any executable code (exportable or not)

for the module. There are other types of segments which we will learn

about later.

; ---------------------------------------------------------------------------

; Prototype: unsigned int AddFour(unsigned int x);

; Returns: x + 4

; ---------------------------------------------------------------------------

These lines give the prototype of the function for your C++ function,

for easy reference. They aren't necessary, but help you with remembering

what your function takes as parameters and returns.

x_AddFour equ 8

As NASM has no direct support for parameter structures (as of yet), we

can define our function's parameters with this method. These are used for

referencing parameters placed on the stack by the calling function. The

first parameter is at offset 8 because the saved ebp and the return address

(four bytes each) sit between ebp and the parameters. Each parameter will

take a minimum of four bytes (smaller types are padded out), which helps

you keep your code 32-bit.

_AddFour__FUi:

This label declaration defines a function called "AddFour." There is,

however, a suffix which represents the parameters of the function. This

suffix is unique to C++ (the label would be "_AddFour" in C) to account for

external function overloading. Each suffix begins with "__F" to say that

it is, in fact, a function. Following this, there are characters

representing the data types (some of them):

Character    Data Type

c            char

d            double

f            float

i            int

s            short

v            void

x            long long int

In addition, there are prefixes you can add to the characters to

change them:

Prefix       Meaning

U            unsigned (ie: Uc is 'unsigned char')

P            pointer (ie: Pi is 'int *')

R            reference (ie: RUc is 'unsigned char &')

push ebp

mov ebp, esp

These lines are analogous to the "push bp/mov bp, sp" opcodes in

real-mode. They are used to preserve the 32-bit values of the caller's

stack frame.

mov eax, [ebp + x_AddFour]

add eax, 4

We load eax with the 32-bit unsigned int from the stack and add four

to it. Note that integer and pointer return values are returned in eax

(extended into edx for 64-bit values).

mov esp, ebp

pop ebp

ret

This simply restores the caller's stack frame and returns to the

caller.

3.5 - A note on memory references

=---------------------------------------------------

In NASM, memory references work a little differently than they do in

most other assemblers. To specify the address of a variable (a symbol), you

don't use the "offset" operator. Instead, you just specify the name of the

symbol like so:

mov esi, output_variable ; loads ESI with the *address* of output_variable

mov output_variable, esi ; illegal! trying to load into immediate value

If you want to access the contents of a memory location, put the

variable name in square brackets:

mov esi, [output_variable] ; loads ESI with the *value* of output_variable

mov [output_variable], eax ; loads output_variable with the contents of eax

That's all there is. If you're ever confused, just imagine that the

assembler is just replacing each of the symbols with its absolute constant

value (which it is, as a matter of fact). That way (if output_variable is

at offset 2342), this makes no sense:

mov 2342, esi

But these do make sense:

mov esi, 2342

mov [2342], eax

3.6 - Returning 64-bit values

=---------------------------------------------------

One of the great things about DJGPP is its built-in support for 64-bit

integers (signed and unsigned). How can we use these in NASM? Simple: we

just return the low doubleword in eax as we would return a regular int, but

we return the high doubleword in edx. Consider:

[BITS 32]

[GLOBAL _BigNum__FUiUi]

[SECTION .text]

; ---------------------------------------------------------------------------

; Prototype: unsigned long long int BigNum(unsigned int a, unsigned int b);

; Returns: unsigned 64-bit integer, with a as the high word and b as the low

; word

; ---------------------------------------------------------------------------

a_BigNum equ 8

b_BigNum equ 12

_BigNum__FUiUi:

push ebp

mov ebp, esp

mov edx, [ebp + a_BigNum] ; a is the high doubleword, returned in edx

mov eax, [ebp + b_BigNum] ; b is the low doubleword, returned in eax

mov esp, ebp

pop ebp

ret

Let's create a quick C++ program to test this:

#include <stdio.h>

extern unsigned long long int BigNum(unsigned int a, unsigned int b);

int main(void)

{

unsigned int a = 0x11111111, b = 0x22222222;

printf("The 64-bit integer made from 0x%08x and 0x%08x is 0x%016llx (%llu)\n",

a, b, BigNum(a, b), BigNum(a, b));

return 0;

}

The range of the 64-bit 'unsigned long long int' is from 0 to

18,446,744,073,709,551,615. That's 18 quintillion, or 18x10^18!

3.7 - Name-mangling

=---------------------------------------------------

As mentioned before, C++ function names are "mangled" to account for

the way C++ allows the programmer to overload a function on its parameters.

If you don't want your function names to have annoying trailers like "__Fv"

or "__FUiUicP11__dpmi_regs", there's another way to do it. If you load up

one of DJGPP's supplied include files, you'll notice this near the start:

#ifdef __cplusplus

extern "C" {

#endif

int foo(int bar);

void fx(void);

...

#ifdef __cplusplus

}

#endif

The extern "C" declaration tells the compiler to give the functions C

linkage, as if they had been compiled with a regular C-compiler, not a

fancy C++-compiler that mangles your function names. This way, you can

write your assembly language functions with just a preceding underscore

(like _foo or _fx).

Before you get mad at C++ for mangling your simple names, take a

look at what C++ does to your complicated class names!

3.8 - Internal data structures

=---------------------------------------------------

Many of the functions you will write with the assembler will require

various variables that won't be seen outside the scope of the module. To

do this, let's create a new section below the function AddFour. Put this

at the end of nasmtest.asm:

[SECTION .data]

AddConstant dd 00000004h

Now modify AddFour like so:

push ebp

mov ebp, esp

mov eax, [ebp + x_AddFour]

add eax, [AddConstant]

mov esp, ebp

pop ebp

ret

The code now references data inside the data segment (everything in

.data is placed in the data segment referenced by the selector in ds).

With NASM, you must place the variable's name in square brackets to

indicate a reference to a memory location. If you forget the brackets, the

assembler will treat it as the actual offset of the variable, often leading

to unwanted side-effects.

3.9 - Exportable data structures

=---------------------------------------------------

What if we want a module to make a variable accessible to another module

it gets linked with? Let's add a version string to the module for the

calling program to read. Add this in the .data section, under AddConstant:

_VersionString db "NASMTEST.ASM - Version 0.0a", 00h

Now, at the top of the file, under the first [GLOBAL ...] declaration,

add:

[GLOBAL _VersionString]

This label is now exportable as a variable in the calling program.

Add the following lines to nasmtest.cc, under the declaration of AddFour:

extern char VersionString[];

And under the first printf() statement, add:

printf("VersionString = '%s'\n", VersionString);

Compile this, and you should see the exported string from the assembly

module. Note that the variable can be accessed internally as well.

Consider:

; ---------------------------------------------------------------------------

; Prototype: void NewVersion(char c);

; Returns: nothing

; ---------------------------------------------------------------------------

c_NewVersion equ 8

_NewVersion__Fc:

push ebp

mov ebp, esp

mov al, [ebp + c_NewVersion] ; mov eax, [ebp + c_NewVersion]

; would be just as valid, as the

; stack is padded with zeros

mov byte [_VersionString + 26], al ; 26 is the offset of the 'a'

; in VersionString from position

; zero

mov esp, ebp

pop ebp

ret

Add the following two lines as well:

NewVersion('g');

printf("VersionString = '%s'\n", VersionString);

Don't forget to export the function with [GLOBAL ...] and declare the

external function in your C++ source.

What if we wanted to offer a complicated structure? Add the following

declaration to nasmtest.cc:

extern struct {

unsigned int x, y;

char * VersionStringPointer;

} DataValues;

Now add the following in the .data section:

_DataValues:

dd 00000001 ; unsigned int x

dd 00000002 ; unsigned int y

dd _VersionString ; char * VersionStringPointer

; Loaded with offset of version string,

; effectively a pointer

And export it with:

[GLOBAL _DataValues]

Now you can check that the structure works:

printf("x = %u, y = %u\n", DataValues.x, DataValues.y);

// This line uses a char * instead of a char array like before

printf("VersionString = '%s'\n", DataValues.VersionStringPointer);

Other variable types may be passed back and forth in this method. Just

remember to ensure that the structures on both sides are exactly the same.

3.10 - A note about labels

=---------------------------------------------------

In essence, there are three types of labels within NASM:

1) Exportable references: labels which are declared in a [GLOBAL ...]

statement at the top of a file, accessible by other modules linked with

this one. These labels reference functions, structures and variables, and

begin with an underscore, ie:

_InitMode13h__Fv: ; Exportable function

; function code

_DataTable: ; Exportable structure

db 00h

_XCoord dw 0000h ; Exportable variable

2) Internal references: labels referencing functions, structures and

variables which are private to the module are written without a prefix

character:

SaveRegs: ; Internal function

; function code

DescriptorTable: ; Internal structure

dw 0000h, 0000h

RAMPointer: dd 00000000h ; Internal variable

3) Internal jump targets: labels which will be the target of conditional

and unconditional jumps are written with a period as a prefix:

.LoopStart:

loop .LoopStart

jmp .DoneProc

.DoneProc:

Adhering to these guidelines will ensure your code works properly with

DJGPP and most debuggers.

3.11 - Accessing external symbols

=---------------------------------------------------

Your assembly language module can access public symbols from other

modules it gets linked with as well. Create two strings in the .data

section like so:

; Note that 0ah is used instead of \n, as \n is not interpreted by

; printf, but processed by the compiler instead.

PrintTemplate1 db "VersionString = '%s'", 0ah, 00h

PrintTemplate2 db "unsigned int Exported = 0x%08x", 0ah, 00h

Now create an assembler function to access some external symbols:

; ---------------------------------------------------------------------------

; Prototype: void PrintStrings(void);

; Returns: nothing

; ---------------------------------------------------------------------------

_PrintStrings__Fv:

push ebp

mov ebp, esp

; Remember that C pushes parameters from right to left, not

; left to right like Pascal!

; Note that we push the pointer for a string...

push dword _VersionString

push dword PrintTemplate1

call _printf

; ... but push the actual value of most everything else

push dword [_Exported]

push dword PrintTemplate2

call _printf

mov esp, ebp

pop ebp

ret

We'll have to tell NASM that we have a few external symbols, so it

doesn't give us any errors. Put these by the [GLOBAL ...] definitions:

[GLOBAL _PrintStrings__Fv]

[EXTERN _printf]

[EXTERN _Exported]

Now remove any code previously contained in main() and add the following

lines:

extern void PrintStrings(void);

unsigned int Exported;

int main(void)

{

Exported = 0xdeadbeef;

PrintStrings();

return 0;

}

=---------------------------------------------------

4.0 - Advanced NASM topics

=---------------------------------------------------

4.1 - Accessing real-mode interrupts

=---------------------------------------------------

When you write your C++ code, you usually don't worry about having to

call a real-mode interrupt. Usually, you just define a register structure

and call __dpmi_int or one of the various other wrappers provided by DJGPP.

In assembly language, however, you don't have it quite so easy.

The easiest way is to call the DPMI server's function 0x0300, which

simulates a real-mode interrupt. Create a new file with the following:

inttest.asm:

[BITS 32]

[SECTION .text]

SaveRegs:

mov [SaveEAX], eax

mov [SaveEBX], ebx

mov [SaveECX], ecx

mov [SaveEDX], edx

mov [SaveESI], esi

mov [SaveEDI], edi

mov [SaveEBP], ebp

pushf

pop eax

mov [SaveFlags], ax

ret

RestoreRegs:

pushf

pop eax

mov ax, [SaveFlags]

push eax

popf

mov eax, [SaveEAX]

mov ebx, [SaveEBX]

mov ecx, [SaveECX]

mov edx, [SaveEDX]

mov esi, [SaveESI]

mov edi, [SaveEDI]

mov ebp, [SaveEBP]

ret

[SECTION .data]

; Register set structure for int 31h, function 0300h

RegSet:

SaveEDI dd 00000000

SaveESI dd 00000000

SaveEBP dd 00000000

dd 00000000 ; Note: reserved

SaveEBX dd 00000000

SaveEDX dd 00000000

SaveECX dd 00000000

SaveEAX dd 00000000

SaveFlags dw 0000

SaveES dw 0000

SaveDS dw 0000

SaveFS dw 0000

SaveGS dw 0000

SaveIP dw 0000

SaveCS dw 0000

SaveSP dw 0000

SaveSS dw 0000

SaveRegs and RestoreRegs can now be called from your program to fill

and examine the values stored in the structure RegSet. Now you can simply

wrap an interrupt call like:

int 10h

like so:

; Pass our registers to the interrupt

call SaveRegs

; Function 0300h: Simulate real-mode interrupt

mov ax, 0300h

; Interrupt to simulate: int 10h

mov bx, 0010h

; Copy zero bytes from our stack (you probably won't need otherwise)

mov cx, 0000h

; Load es with ds, so es:edi points to RegSet

push ds

pop es

mov edi, RegSet

; Do it

int 31h

; Get the return values back

call RestoreRegs

As you can see, there is a lot of overhead involved in calling a

real-mode interrupt from protected-mode. The message? If you don't have

to, don't. Use these sparingly, in parts of your code that aren't time

critical.

Let's try this with a real-life example. Add these two export

declarations:

[GLOBAL _InitMode13h__Fv]

[GLOBAL _RestoreTextMode__Fv]

Next, add their respective functions:

; ---------------------------------------------------------------------------

; Prototype: void InitMode13h(void);

; Returns: nothing

; ---------------------------------------------------------------------------

_InitMode13h__Fv:

push ebp

mov ebp, esp

; ah = 00h, Sets video mode with int 10h to mode al

mov ax, 0013h

call SaveRegs

; Simulate real-mode interrupt

mov ax, 0300h

; Of int 10h

mov bx, 0010h

; Copy nothing from the stack

mov cx, 0000h

; Load es with ds, so es:edi points to RegSet

push ds

pop es

mov edi, RegSet

; Do it

int 31h

; Load results back into registers

call RestoreRegs

mov esp, ebp

pop ebp

ret

; ---------------------------------------------------------------------------

; Prototype: void RestoreTextMode(void);

; Returns: nothing

; ---------------------------------------------------------------------------

_RestoreTextMode__Fv:

push ebp

mov ebp, esp

; ah = 00h, Sets video mode with int 10h to mode al

mov ax, 0003h

call SaveRegs

; Simulate real-mode interrupt

mov ax, 0300h

; Of int 10h

mov bx, 0010h

; Copy nothing from the stack

mov cx, 0000h

; Load es with ds, so es:edi points to RegSet

push ds

pop es

mov edi, RegSet

; Do it

int 31h

; Load results back into registers

call RestoreRegs

mov esp, ebp

pop ebp

ret

And now create the test program:

inttest.cc:

#include <conio.h>

#include <go32.h>

#include <sys/farptr.h>

extern void InitMode13h(void);

extern void RestoreTextMode(void);

int main(void)

{

int x, y;

InitMode13h();

_farsetsel(_dos_ds);

for (y = 0; y < 100; y++) {

for (x = 0; x < 100; x++) {

_farnspokeb(0xa0000 + y * 320 + x, ((x * y) >> 2) % 256);

}

}

getch();

RestoreTextMode();

return 0;

}

This should produce a 100x100 square in the top left of your screen,

containing a neat-looking pattern. In a nutshell, that's how you can use

interrupts to perform tasks that are complicated, but aren't called enough

times to warrant optimization. You may have noticed that the routine isn't

quite instant however (if you compiled the program with optimizations off).

In the next section, we will explore ways to increase the speed of direct

memory accesses of all kinds.

4.2 - Direct memory access (to protected-mode memory)

=---------------------------------------------------

As we've seen earlier, data segment pointers can be accessed simply by

using square brackets (ie: mov ax, [DataSegmentWord]). In fact, this

applies to all standard pointers, including those obtained from malloc().

They are all 32-bit near pointers because, hey, in protected-mode,

everything is "near."

This simplifies buffer-to-buffer copying a little. You can copy a

large amount of memory (the count in ecx is a full 32 bits) between two

external malloc()'d pointers by simply writing:

cld ; Forward copy, inc esi/edi

mov esi, [_PointerFrom] ; ds:esi is the source

mov edi, [_PointerTo] ; es:edi is the dest, assumes es = ds

mov ecx, Count ; Count is the actual number of bytes

rep movsd ; divided by 4 (ie: double-words)

Most of the time, you will find es equal to ds while executing your

code. Don't rely on this, however, unless you are absolutely sure. It's a

good idea to set es to ds at the start of the code, to ensure you don't end

up writing to another selector and causing a GPF or worse. If your code

can't afford to waste cycles (and is called repeatedly), consider setting

es to ds in another function and calling it once before the other calls.

Be sure that you aren't calling anything that may change the value of es

under your nose, though.

4.3 - Direct memory access (to real-mode memory)

=---------------------------------------------------

But what if you want to access something from real-mode linear memory?

Here's where it gets a little tricky. You can't simply pull a selector

for your particular segment from thin air and access it that way. DJGPP

provides a convenient way to get a selector for something like this,

however:

int __dpmi_segment_to_descriptor(int _segment);

You simply pass it the real-mode segment you want to access and it

returns a selector (or -1 on error) that you can load into a segment

register to reach it. That's all it takes.

Let's add a function to inttest.asm to do this for us:

; ---------------------------------------------------------------------------

; Prototype: int InitGraphics(void);

; Returns: -1 on error, else zero

; ---------------------------------------------------------------------------

_InitGraphics__Fv:

push ebp

mov ebp, esp

push dword 0a000h ; Push segment for function

call ___dpmi_segment_to_descriptor ; Note the triple underscore

cmp eax, -1 ; Check for error

jne .SelectorOkay ; No error, selector is valid

mov word [VideoRAM], 0ffffh ; Error, make selector invalid

jmp .Done ; End function, will return -1

.SelectorOkay:

mov [VideoRAM], ax ; Load selector from ax

xor eax, eax ; Clear eax to return zero

.Done:

mov esp, ebp

pop ebp

ret

We'll also have to add a variable to the .data section:

VideoRAM dw 0000h

And add some declarations at the beginning:

[GLOBAL _InitGraphics__Fv]

[EXTERN ___dpmi_segment_to_descriptor]

Now that we have a selector that references segment 0xa000, we can

create a PutPixel routine on its way to being much faster than before:

; ---------------------------------------------------------------------------

; Prototype: void PutPixel(unsigned int x, unsigned int y,

; unsigned char Color);

; Returns: nothing

; ---------------------------------------------------------------------------

x_PutPixel equ 8

y_PutPixel equ 12

Color_PutPixel equ 16

_PutPixel__FUiUiUc:

push ebp

mov ebp, esp

mov eax, [ebp + y_PutPixel]

mov edx, eax ; edx is used here, as GCC expects ebx to be preserved

shl eax, 6 ; Note: y shl 6 + y shl 8 = 320 * y

shl edx, 8

add edx, eax

add edx, [ebp + x_PutPixel]

mov fs, [VideoRAM]

mov al, [ebp + Color_PutPixel]

mov [fs:edx], al

mov esp, ebp

pop ebp

ret

Now our main procedure becomes:

int main(void)

{

int x, y;

InitGraphics();

InitMode13h();

for (y = 0; y < 100; y++) {

for (x = 0; x < 100; x++) {

PutPixel(x, y, ((x * y) >> 2) % 256);

}

}

getch();

RestoreTextMode();

return 0;

}

You can remove the #include<...> statements at the start of the file,

with the exception of conio.h. Once you've added the extern declarations,

it should compile and run like before.

One important aspect of __dpmi_segment_to_descriptor is that the

number of selectors available to it is finite. Thus, you should only call

the InitGraphics() function once (along with any others that use this

function), at the beginning of the program. In any case, calling it more

than once is redundant and will put a drain on resources and speed.

4.4 - malloc() and NASM

=---------------------------------------------------

One of the most important functions for managing allocated memory is

malloc(). It gives you a pointer to a contiguous block of memory which

you can access like a structure, array, variable or even as a

function.

Let's look at an example:

alloctst.cc

#include <stdio.h>

#include <stdlib.h> /* malloc(), free() */

extern void FillString(char * StringToFill);

int main(void)

{

char * TestString;

TestString = (char *)malloc(20);

FillString(TestString);

printf("TestString = '%s'\n", TestString);

free(TestString);

return 0;

}

alloctst.asm

[BITS 32]

[GLOBAL _FillString__FPc]

[SECTION .text]

; ---------------------------------------------------------------------------

; Prototype: void FillString(char * StringToFill);

; Returns: nothing

; ---------------------------------------------------------------------------

StringToFill_FillString equ 8

_FillString__FPc:

push ebp

mov ebp, esp

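push esi ; esi and edi must be preserved for the C caller

push edi

mov edi, [ebp + StringToFill_FillString]

mov esi, Filler

mov ecx, 20 ; Length of Filler, including the terminating zero

cld ; Make sure movsb copies forward

rep movsb

pop edi

pop esi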

mov esp, ebp

pop ebp

ret

[SECTION .data]

Filler db "<- Buffer filled ->", 00h

As you may have guessed, the program copies the filler (without any

sort of bounds checking) to the passed string and then prints it. In this

way, malloc() can be used for many different applications in your program.

In graphical applications, you may find these buffers useful for storing

sprites, sound and tile data. With a little bit of effort, you can create

dynamically-loaded drivers that you can load and unload into your program

as needed. This could be useful for creating a standard set of procedures

for sound output with a number of drivers that can be loaded for each

different sound card.
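
Since the assembly routine copies a fixed 20 bytes without any bounds
checking, the caller has to guarantee the buffer is large enough. As a
point of comparison, here is a minimal C sketch of a bounds-checked
variant. fill_string_checked() and filler_text are hypothetical names
introduced for this illustration; they are not part of alloctst.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same text as the Filler string in alloctst.asm. */
const char filler_text[] = "<- Buffer filled ->";

/* Hypothetical bounds-checked variant of FillString(): copies as much of
   the filler as fits into dst (dst_size bytes in total) and always
   zero-terminates.  Returns the number of characters copied. */
size_t fill_string_checked(char *dst, size_t dst_size)
{
    size_t n = strlen(filler_text);

    assert(dst != NULL);
    if (dst_size == 0)
        return 0;                     /* no room even for the terminator */
    if (n > dst_size - 1)
        n = dst_size - 1;             /* truncate to fit */
    memcpy(dst, filler_text, n);
    dst[n] = '\0';
    return n;
}
```

With a 20-byte buffer the whole string fits; with an 8-byte buffer only
the first seven characters are copied.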

4.5 - Hooking interrupts

=---------------------------------------------------

Interrupt-hooking is an integral part of many different types of

programs, from timer-synchronization in games to serial port monitoring in

communication programs. Let's try a quick example. Create a makefile and

enter the following program:

hookint.asm:

[BITS 32]

[GLOBAL _TickHandler__Fv]

[GLOBAL _TickHandler__Size]

[EXTERN _Timer1]

[EXTERN _Timer2]

[EXTERN _Flag1]

[EXTERN _Flag2]

[EXTERN _TickCount]

[SECTION .text]

_TickHandler__Fv:

inc dword [_TickCount]

dec dword [_Timer1]

dec dword [_Timer2]

cmp dword [_Timer1], 0

jz .SetFlag1

jmp .TestFlag2

.SetFlag1:

mov dword [_Flag1], 1

.TestFlag2:

cmp dword [_Timer2], 0

jz .SetFlag2

jmp .Done

.SetFlag2:

mov dword [_Flag2], 1

.Done:

ret

; Calculate size of function by subtracting offsets

_TickHandler__Size dd $-_TickHandler__Fv

[SECTION .data]

hookint.cc:

// Adapted from the DJGPP test program "libc\go32\timer.c"

#include <stdio.h>

#include <pc.h>

#include <dpmi.h>

#include <go32.h>

#define LOCK_VARIABLE(x) _go32_dpmi_lock_data((void *)&x, (long)sizeof(x));

extern void TickHandler(void);

extern int TickHandler__Size;

int Timer1 = 1, Timer2 = 1, Flag1, Flag2, TickCount = 0;

int main()

{

_go32_dpmi_seginfo OldHandler, NewHandler;

printf("Grabbing timer interrupt...\n");

_go32_dpmi_get_protected_mode_interrupt_vector(8, &OldHandler);

_go32_dpmi_lock_code(TickHandler, (long)TickHandler__Size);

LOCK_VARIABLE(Timer1);

LOCK_VARIABLE(Timer2);

LOCK_VARIABLE(Flag1);

LOCK_VARIABLE(Flag2);

LOCK_VARIABLE(TickCount);

NewHandler.pm_offset = (int)TickHandler;

NewHandler.pm_selector = _go32_my_cs();

_go32_dpmi_chain_protected_mode_interrupt_vector(8, &NewHandler);

while (!kbhit())

{

if (Flag1)

{

printf("Timer 1 expired: %i\n", TickCount);

Flag1 = 0;

Timer1 = 5;

}

if (Flag2)

{

printf("Timer 2 expired: %i\n", TickCount);

Flag2 = 0;

Timer2 = 7;

}

}

getkey();

printf("Releasing timer interrupt...\n");

_go32_dpmi_set_protected_mode_interrupt_vector(8, &OldHandler);

return 0;

}

What does this program do? Let's examine it:

_go32_dpmi_get_protected_mode_interrupt_vector(8, &OldHandler);

This line reads the selector and offset of the previous handler for

INT 8 (the timer interrupt) into the structure OldHandler for use later.

_go32_dpmi_lock_code(TickHandler, (long)TickHandler__Size);

LOCK_VARIABLE(Timer1);

LOCK_VARIABLE(Timer2);

LOCK_VARIABLE(Flag1);

LOCK_VARIABLE(Flag2);

LOCK_VARIABLE(TickCount);

Here, we ensure that neither the handler nor the variables it touches

get paged out from under our noses and cause a page fault. As the function

call for locking a variable is fairly complicated to type repeatedly, we

define a macro to save some time. The size of the locked code is calculated

at assembly time (the "$-_TickHandler__Fv" expression in hookint.asm) and

exported as an int variable for us to use.

NewHandler.pm_offset = (int)TickHandler;

NewHandler.pm_selector = _go32_my_cs();

_go32_dpmi_chain_protected_mode_interrupt_vector(8, &NewHandler);

We then fill in the selector/offset pair for our new handler and add it

to the interrupt chain for INT 8. Note that _go32_my_cs() returns the

selector for the program's code. The chaining function creates a wrapper

around our routine so that it executes every time the interrupt fires and

then passes control on to the previous handler. Because of the wrapper,

you don't terminate your routine with iret as you normally would; a plain

ret is enough. You cannot, however, call library functions such as printf,

fopen or fread from inside your interrupt routine, as most of them are

non-reentrant.

_go32_dpmi_set_protected_mode_interrupt_vector(8, &OldHandler);

Once we have finished with it, we can return the handler to its

previous state by passing the old handler's address we saved before.
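
The save, chain and restore sequence can be modelled in portable C with a
toy vector table. This is only a sketch of the idea: vector_table,
chain_vector() and restore_vector() are hypothetical names, and the real
work (wrappers, selectors, stack switching) is done by the _go32_dpmi_
functions shown above.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*handler_fn)(void);

/* Toy "interrupt vector table"; the real one lives inside the DPMI host. */
handler_fn vector_table[256];

int old_calls = 0, new_calls = 0;
handler_fn chained_next = NULL;       /* handler to call after ours */

void old_handler(void) { old_calls++; }

void new_handler(void)
{
    new_calls++;
    if (chained_next != NULL)
        chained_next();               /* pass the tick down the chain */
}

/* Save the current handler and install ours in front of it. */
handler_fn chain_vector(int num, handler_fn fn)
{
    handler_fn previous;

    assert(num >= 0 && num < 256);
    previous = vector_table[num];
    chained_next = previous;
    vector_table[num] = fn;
    return previous;
}

/* Put a previously saved handler back. */
void restore_vector(int num, handler_fn saved)
{
    assert(num >= 0 && num < 256);
    vector_table[num] = saved;
    chained_next = NULL;
}
```

While the new handler is chained in, one "interrupt" runs both handlers;
after restore_vector() only the old one runs.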

Our protected-mode interrupt handler is quite simple. It increments

the variable TickCount for every timer tick and decrements Timer1 and

Timer2. If either of the two timers is equal to zero, it sets the flag

associated with the timer. The flags are latched, meaning that you must

zero them after you have dealt with them. This also means that if you

aren't able to catch one or more timer expiries (because the system is

busy), you'll only have to service it once.
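
For readers more comfortable with C, the handler's logic can be written
out as follows. This is a plain-C model meant for experimenting on any
machine, not a replacement for the assembly routine; tick_handler_model()
is a hypothetical name, and the variables are stand-ins for the ones in
hookint.cc.

```c
#include <assert.h>

/* Stand-ins for the variables shared with the handler; volatile because
   the real ones change behind the main loop's back. */
volatile int Timer1 = 1, Timer2 = 1, Flag1 = 0, Flag2 = 0, TickCount = 0;

/* C rendering of the logic in _TickHandler__Fv. */
void tick_handler_model(void)
{
    TickCount++;
    Timer1--;
    Timer2--;
    if (Timer1 == 0)
        Flag1 = 1;                    /* latched until the main loop clears it */
    if (Timer2 == 0)
        Flag2 = 1;
    assert(TickCount > 0);            /* sanity check for the model only */
}
```

Loading Timer1 with 5 and calling the function five times sets Flag1, and
the flag then stays set until the caller clears it and reloads the timer.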

Interrupt handlers can be quite complicated if necessary. The only

restriction is that you can't call any functions that aren't re-entrant.

This includes most of the system functions and any of yours that fall under

the same category. In most cases, you should stay within your function as

much as possible to prevent possible conflicts.

4.6 - _CRT0_FLAG_LOCK_MEMORY

=---------------------------------------------------

It is possible to lock all of your program's memory using one of the

CRT0 startup flags, _CRT0_FLAG_LOCK_MEMORY. This flag effectively disables

virtual memory (disk swapping). It's a great feature if you want to write

a program that shouldn't be swapped to disk at any time (perhaps a game).

To use it, you simply add the following lines to your program:

#include <crt0.h>

int _CRT0_STARTUP_FLAGS = _CRT0_FLAG_LOCK_MEMORY;

This will ensure that your code, data and allocated memory will all

be locked, without the need to use _go32_dpmi_lock_data(). The amount of

available memory will decrease, however, since the program can then use

only physical memory.

4.7 - Real-mode callback functions

=---------------------------------------------------

Interrupt handlers are great for handling interrupts, but what if

you want a real-mode program to call one of your functions when a certain

event occurs? If you were in real-mode too, you could just get the segment

and offset of your function and pass it on to the other program and be done

with it. In protected mode, there's one more step: a wrapper. Just like

before, there's a great library function that does all the tough stuff for

you:

_go32_dpmi_allocate_real_mode_callback_retf()

How do we use this? It's easy. Let's look at another example:

rmcbtest.cc:

#include <go32.h>

#include <dos.h>

#include <dpmi.h>

#include <conio.h>

#include <stdio.h>

#include <crt0.h>

int _CRT0_STARTUP_FLAGS = _CRT0_FLAG_LOCK_MEMORY;

/* To make sure the name doesn't get mangled */

extern "C" void MouseCallback(_go32_dpmi_registers * r);

extern volatile int MouseButtons;

extern volatile int MouseX;

extern volatile int MouseY;

_go32_dpmi_registers regs;

int main(void)

{

clrscr();

_go32_dpmi_seginfo info;

/* Set up the handler */

info.pm_offset = (int)MouseCallback;

_go32_dpmi_allocate_real_mode_callback_retf(&info, &regs);

__dpmi_regs r;

/* Set the horizontal range valid from 0 to 1000 */

r.x.ax = 0x07;

r.x.cx = 0;

r.x.dx = 1000;

__dpmi_int(0x33, &r);

/* Set the vertical range valid from 0 to 1000 */

r.x.ax = 0x08;

r.x.cx = 0;

r.x.dx = 1000;

__dpmi_int(0x33, &r);

/* Install the real-mode callback routine */

r.x.ax = 0x0c;

r.x.cx = 0x1f; /* 0x1f traps movement plus LMB/RMB presses and releases */

r.x.dx = info.rm_offset;

r.x.es = info.rm_segment;

__dpmi_int(0x33, &r);

while (!MouseButtons) {

printf("(%i, %i)\n", MouseX, MouseY);

delay(250);

}

/* Clean up handler */

r.x.ax = 0x0c;

r.x.cx = 0;

r.x.dx = 0;

r.x.es = 0;

__dpmi_int(0x33, &r);

_go32_dpmi_free_real_mode_callback(&info);

printf("Mouse button pressed.\n");

return 0;

}

rmcbtest.asm:

[BITS 32]

[GLOBAL _MouseCallback]

[GLOBAL _MouseX]

[GLOBAL _MouseY]

[GLOBAL _MouseButtons]

[SECTION .text]

; ---------------------------------------------------------------------------

; Prototype: void MouseCallback(__dpmi_regs * r)

; Returns: nothing

; ---------------------------------------------------------------------------

Pointer_MouseCallback equ 8

_MouseCallback:

push ebp

mov ebp, esp

push esi ; Preserve esi for the caller

mov esi, [ebp + Pointer_MouseCallback]

xor eax, eax

mov ax, [esi + 16] ; offset of bx in __dpmi_regs

mov [_MouseButtons], eax

mov ax, [esi + 24] ; offset of cx in __dpmi_regs

mov [_MouseX], eax

mov ax, [esi + 20] ; offset of dx in __dpmi_regs

mov [_MouseY], eax

pop esi

mov esp, ebp

pop ebp

ret

[SECTION .data]

_MouseX dd 00000000h

_MouseY dd 00000000h

_MouseButtons dd 00000000h

The code is fairly straightforward and similar to the code we

created for handling real-mode interrupts. To set up the callback, we

set the pm_offset member of the info struct and pass it to the callback

allocation function (you know, the one with the excessively long name).

The function creates a wrapper and returns the wrapper's real-mode entry

point in the rm_segment and rm_offset members of the info struct.
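
The magic number 0x1f passed in CX when the callback is installed is a
bit mask of the mouse events to trap. A small C sketch can spell it out.
The MOUSE_EV_ names are made up for this illustration; only the bit
positions come from the INT 33h function 0Ch interface, and you may want
to confirm them against your mouse driver's documentation.

```c
#include <assert.h>

/* Event-mask bits for INT 33h function 0Ch.  The names are hypothetical;
   bits 5 and 6 (middle button) are left out because the example does not
   use them. */
enum {
    MOUSE_EV_MOVE       = 0x01,
    MOUSE_EV_LEFT_DOWN  = 0x02,
    MOUSE_EV_LEFT_UP    = 0x04,
    MOUSE_EV_RIGHT_DOWN = 0x08,
    MOUSE_EV_RIGHT_UP   = 0x10
};

/* Mask equivalent to the literal 0x1f used in rmcbtest.cc. */
unsigned mouse_event_mask(void)
{
    return MOUSE_EV_MOVE
         | MOUSE_EV_LEFT_DOWN  | MOUSE_EV_LEFT_UP
         | MOUSE_EV_RIGHT_DOWN | MOUSE_EV_RIGHT_UP;
}

/* Nonzero if a condition mask (same bit layout) reports a button press. */
int mouse_event_is_press(unsigned condition)
{
    assert((condition & ~0x7fu) == 0);    /* only bits 0-6 are defined */
    return (condition & (MOUSE_EV_LEFT_DOWN | MOUSE_EV_RIGHT_DOWN)) != 0;
}
```

Dropping MOUSE_EV_MOVE from the mask, for instance, would make the driver
call back on button events only.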

You might have noticed that it also requires a global variable (this

is important: you can't give it any other type of variable). The wrapper

fills this variable with the real-mode registers on every callback and

passes a pointer to it to your function. If you write your function

correctly, you can save yourself a lot of trouble by pointing it at a

struct internal to your asm module. Be careful with this, though: it could

change at any time and break your programs.

To access the register struct, just load esi from [ebp + 8] and then

use [esi + n] to access the registers. You'll need to count the offset

manually from DPMI.H, but don't worry too much, there aren't very many

members.
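
Rather than counting by hand, you can let the compiler confirm the
offsets with offsetof(). The sketch below uses a stand-in struct,
regs_x_view, that mirrors the 16-bit view of the register struct with
fixed-width types so that it compiles anywhere. The struct and the
_model names are assumptions made for this illustration; check the member
order against your own DPMI.H before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the 16-bit ("x") view of __dpmi_regs. */
struct regs_x_view {
    uint16_t di, di_hi;
    uint16_t si, si_hi;
    uint16_t bp, bp_hi;
    uint16_t res, res_hi;
    uint16_t bx, bx_hi;
    uint16_t dx, dx_hi;
    uint16_t cx, cx_hi;
    uint16_t ax, ax_hi;
    uint16_t flags, es, ds, fs, gs, ip, cs, sp, ss;
};

int MouseButtonsModel, MouseXModel, MouseYModel;

/* C rendering of what _MouseCallback does with the struct. */
void mouse_callback_model(const struct regs_x_view *r)
{
    assert(r != NULL);
    MouseButtonsModel = r->bx;        /* [esi + 16] */
    MouseXModel       = r->cx;        /* [esi + 24] */
    MouseYModel       = r->dx;        /* [esi + 20] */
}
```

Evaluating offsetof(struct regs_x_view, bx) and friends gives 16, 20 and
24 for bx, dx and cx, matching the constants used in rmcbtest.asm.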

4.8 - Doubleword-aligned accesses

=---------------------------------------------------

In some cases, on 386 machines and up, accessing memory aligned to a

word or double-word boundary is faster than an unaligned access. On higher-

end machines, double-word boundaries offer the greatest benefit. In NASM,

we can tell the assembler that we want to align an entire program segment to

a double-word boundary easily:

[segment .text ALIGN=4]

How can we tell the assembler that we want to align data or functions

to a double-word boundary? It's actually quite simple. Using the assembler

variables "$$" (the start address of the current segment), "$" (the address

of the current opcode) and the "times" directive, we can create a statement

like so:

times ($$ - $) & 3 nop ; Align the next instruction/data to

; a double-word boundary, assuming

; segment is aligned to double-word

It seems fairly lengthy, and indeed, it is. If you have NASM version

0.94 or later, however, the macro facility comes in handy:

%define align times ($$ - $) & 3 nop

Now if you want to align any of your data/procedures, just use the

align keyword as if it were a directive. Make sure that you put it before

the label, or else you'll end up jumping into the nop's and slowing down

your program:

align

_PutPixel__FUiUiUc:
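
If the expression ($$ - $) & 3 looks opaque, the arithmetic can be tried
out in C. This is a sketch under the same assumption the macro makes
(the section itself starts on a double-word boundary); dword_padding()
and is_dword_aligned() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Number of filler bytes needed to bring an offset within the section up
   to the next multiple of 4.  This mirrors NASM's ($$ - $) & 3, where
   $ - $$ is the current offset from the start of the section. */
uint32_t dword_padding(uint32_t offset)
{
    uint32_t pad = (0u - offset) & 3u;

    assert(((offset + pad) & 3u) == 0);   /* result lands on a boundary */
    return pad;
}

/* An address is double-word aligned when its low two bits are clear,
   that is, when its last hex digit is 0, 4, 8 or C. */
int is_dword_aligned(uint32_t address)
{
    return (address & 3u) == 0;
}
```

An offset of 1 needs three nop's, an offset of 3 needs one, and an offset
that is already a multiple of 4 needs none.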

To see this in action, load up the INTTEST.ASM file and add the %define

line listed above to the top of the file and our new "align" keyword before

each of the functions. Also add the keyword before the definition of the

VideoRAM variable:

align

VideoRAM dw 0000h

Add "ALIGN=4" to each of the segment definitions as well:

[segment .text ALIGN=4]

[segment .data ALIGN=4]

Compile the program and then load it into a debugger (FSDB works well

for this). Trace through the program up to the InitGraphics() function call

and step into the function. Go back a few bytes and notice how there are a

few nop's before the actual function. Also look at the starting address of

the function. It'll end in either 0, 4, 8, or C, meaning it's double-word

aligned. If you want, remove the "align" macros from the file and

recompile. Notice how the functions aren't aligned anymore. Neat, huh?

=---------------------------------------------------

5.0 - Contacting the author

=---------------------------------------------------

5.1 - Closing comments

=---------------------------------------------------

This document is still in an unfinished state, so there may be some

errors (glaring or otherwise), omissions or misinformation. If you happen

to stumble across any of these (even typos), feel free to send me email to:

mmastrac@acs.ucalgary.ca

5.2 - Getting DJGPPASM.DOC

=---------------------------------------------------

The latest public version of this document is always available at:

http://www.ucalgary.ca/~mmastrac/djgppasm.doc

The examples created in the document are available in a separate

zip-file at:

http://www.ucalgary.ca/~mmastrac/djnasmex.zip

This manual compiled using MC v1.05 by Matthew Mastracci