Assignment 2

Note: You may work with one other student on this assignment if you wish. Turn in only one solution per group.

The assignment is due 2:40pm, Friday, September 28. Submit your solution electronically via the Web page, www.cburch.com/cs/350. Do not e-mail it, and do not give me a paper copy.

Objectives

Written assignment

  1. Do problem 2.10 of the textbook (p 149).

  2. Do problem 2.35 of the textbook (p 151). Your ``program'' for this problem will be pseudocode. Include a paragraph explaining why your program guarantees that the baboons will not deadlock.

  3. Do problem 2.36 of the textbook (p 152). Again, your ``program'' will just be pseudocode.

Programming assignment

Your job is to develop a Linux assembly program using NASM that echoes its input to the output, except with all lower-case letters converted to their upper-case equivalents.

% ./a.out
I love x86 assembly!
I LOVE X86 ASSEMBLY!
Don't you?
DON'T YOU?
control-D

This program should also accept a single command-line argument naming a file to read in place of the standard input.

% cat tmp
I love x86 assembly!
Don't you?
% ./a.out tmp
I LOVE X86 ASSEMBLY!
DON'T YOU?

Your program should handle errors gracefully. If the user enters a bad command line (one with too many arguments), it should report an error. If the named file isn't found, it should likewise report an error.

Command-line arguments

Before Linux starts an executable, it pushes onto the stack several pieces of information that you'll need for this program. The top item on the stack (what esp points to at the program's beginning) is the number of command-line arguments given by the user, represented in four bytes. (This count includes the command name itself; if the user invoked the command ``a.out tmp,'' this value will be 2.) The next item, at esp + 4, is a four-byte pointer to the first character of the first argument (which is the command name). The item after that, at esp + 8, is a four-byte pointer to the second argument. And so on. After all the argument pointers comes a four-byte NULL pointer, represented with the number 0.
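
As a minimal sketch of this layout (not part of the required solution), the following program reads the argument count from the top of the stack and uses it as the exit status, so you can observe it from the shell. It assumes the 32-bit Linux conventions described in this handout; _start is the default entry symbol that ld looks for.

```nasm
; Sketch: inspect the initial stack at program entry (32-bit Linux).
section .text
global _start
_start:
        mov eax, [esp]          ; argument count (includes the command name)
        mov esi, [esp + 4]      ; pointer to the command name (unused here)
        mov edi, [esp + 8]      ; pointer to the second argument,
                                ;   or 0 (NULL) if the count is 1
        mov ebx, eax            ; arg 1 of exit(): use the count as the status
        mov eax, 1              ; code for exit() system call
        int 0x80
```

After running ``./a.out tmp,'' the shell's exit-status variable ($status in csh, $? in sh and bash) should hold 2.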

System calls

To make a system call in assembly language, place the system call's code in eax and the successive arguments to the call in ebx, ecx, and edx, and then execute the instruction int 0x80. The system call's return value will be placed in eax.

For example, the following writes the first ten bytes pointed to by buffer to the standard output.

mov eax, 4       ; put code for write() system call into eax
mov ebx, 1       ; first arg of C write() function is file descriptor
mov ecx, buffer  ; second arg is pointer to first byte to be written
mov edx, 10      ; third arg is number of bytes to be written
int 0x80         ; now make interrupt

After the execution of this code, eax will contain the return value for write() (which in this case I'm choosing to ignore).
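
The fragment above takes for granted that buffer is defined somewhere. As a sketch of a complete program built around it, the following defines a ten-byte buffer (its contents are an arbitrary choice for illustration), writes it, and then uses the exit() system call (code 1) to end the program cleanly.

```nasm
section .data
buffer: db "123456789", 10      ; nine digits plus a newline: ten bytes

section .text
global _start
_start:
        mov eax, 4              ; code for write() system call
        mov ebx, 1              ; arg 1: file descriptor (standard output)
        mov ecx, buffer         ; arg 2: pointer to first byte to be written
        mov edx, 10             ; arg 3: number of bytes to be written
        int 0x80

        mov eax, 1              ; code for exit() system call
        mov ebx, 0              ; arg 1: exit status
        int 0x80
```

Without the final exit() call, execution would run off the end of the code and the program would most likely crash.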

My solution to this assignment employs five Linux system calls:

exit() (code 1)
read() (code 3)
write() (code 4)
open() (code 5)
close() (code 6)

When you call open(), you'll want to use 0 for the second argument. This is the numeric value of O_RDONLY; the assembler does not know the symbolic name.
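
As a hedged sketch of how open() fits the calling convention above: Linux's int 0x80 interface reports failure by returning a negative error number in eax, so testing the sign of eax is enough to detect a missing file. The label path and its fixed file name are hypothetical, chosen only for this illustration; your program will get the name from the command line instead.

```nasm
section .data
path:   db "tmp", 0             ; hypothetical NUL-terminated file name

section .text
global _start
_start:
        mov eax, 5              ; code for open() system call
        mov ebx, path           ; arg 1: pointer to NUL-terminated path
        mov ecx, 0              ; arg 2: flags (0 is O_RDONLY)
        int 0x80
        test eax, eax           ; set the sign flag from the return value
        js .failed              ; negative value: open() failed

        mov ebx, eax            ; arg 1 of close(): the descriptor just opened
        mov eax, 6              ; code for close() system call
        int 0x80
        mov ebx, 0              ; exit status 0: file was opened
        jmp .done
.failed:
        mov ebx, 1              ; exit status 1: file could not be opened
.done:
        mov eax, 1              ; code for exit() system call
        int 0x80
```

This sketch only signals the outcome through its exit status. A real solution would also print an error message with write().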

Using NASM

Our assembler, NASM, should already be in your path, so you won't have to go through any installation process. (If you have access to another Linux computer, you can also download NASM from nasm.2y.net.)

To run NASM, you'll want to connect to a Linux computer. Our department has two Linux computers you can access remotely: hrothgar and chomas. Log into a Solaris computer, and in a terminal window, type ssh hrothgar to connect to a Linux computer. You can still edit your code on the local Solaris computer, but you will want to use this connection to a Linux computer to assemble and run your program.

If your assembly code is in a file (called perhaps cap.s), you can execute the following commands in Linux to assemble and run it.

% nasm -f elf cap.s
% ld cap.o
% ./a.out

The first line tells the assembler to assemble the code into an ELF-formatted object file cap.o. The second line links this object file into an executable file a.out. And the final line actually executes this file.

Debugging code

Debugging an assembly-language program isn't easy. It helps to have a couple of subroutines on hand that you can call to display the program's current state.

; printregs: a debugging subroutine to print the general registers
; in the following order: edi, esi, ebp, esp, ebx, edx, ecx, eax
printregs:
        pushad
        mov eax, 4      ; make interrupt for write system call
        mov ebx, 2      ;   (arg 1: file descriptor where to write)
        mov ecx, esp    ;   (arg 2: memory location where to find data)
        mov edx, 32     ;   (arg 3: number of characters to print)
        int 0x80
        popad
        ret

; printstack: a debugging subroutine to print the top 64 bytes
;   of the stack to standard error. The top byte of the stack
;   (lowest address) is printed first.
printstack:
        pushad
        mov eax, 4      ; make interrupt for write system call
        mov ebx, 2      ;   (arg 1: file descriptor where to write)
        mov ecx, esp    ;   (arg 2: memory location where to find data)
        add ecx, 36     ;      skip 32 bytes of pushed registers + 4-byte return address
        mov edx, 64     ;   (arg 3: number of characters to print)
        int 0x80
        popad
        ret

Figure 1. Two useful debugging subroutines.

Figure 1 contains two useful subroutines that you might want to use in your program as you debug it. The first subroutine, printregs, prints the values that the registers held when the routine was called. (The one exception is esp: the value shown is taken after the call instruction has pushed the return address, so it is 4 less than esp was just before the call.) The second, printstack, works similarly: It prints the 64 bytes that were on top of the stack just before the subroutine was called.

These subroutines print the raw binary data to file descriptor 2 (standard error). To make sense of their output, you'll want to pipe the output of your program into the od Unix utility, which converts binary output into a human-readable format.

% ./a.out |& od -t x4

Here, we've used |& to pipe both standard output and standard error into the standard input of od. The -t x4 option to od says to display the numbers as four-byte hexadecimal values. (See the man page for od for more information about the options to od.)

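Note that once your program prints characters to standard output, od will likely get off track, since it simply takes its input in groups of four bytes and displays their hexadecimal values. Any text your program writes shifts the debugging data that follows out of alignment.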
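
As a usage sketch, the following small program loads recognizable values into four registers and calls printregs (copied from Figure 1). The register values are arbitrary choices for illustration.

```nasm
section .text
global _start
_start:
        mov eax, 0x0000000a     ; arbitrary marker values, easy to spot
        mov ebx, 0x0000000b
        mov ecx, 0x0000000c
        mov edx, 0x0000000d
        call printregs          ; dump all eight registers to standard error

        mov eax, 1              ; code for exit() system call
        mov ebx, 0              ; arg 1: exit status
        int 0x80

; printregs, as given in Figure 1
printregs:
        pushad
        mov eax, 4      ; make interrupt for write system call
        mov ebx, 2      ;   (arg 1: file descriptor where to write)
        mov ecx, esp    ;   (arg 2: memory location where to find data)
        mov edx, 32     ;   (arg 3: number of characters to print)
        int 0x80
        popad
        ret
```

Run through ``./a.out |& od -t x4,'' the last four words displayed should be 0000000b 0000000d 0000000c 0000000a: that is, ebx, edx, ecx, and eax, in the order listed in the comment atop printregs. The first four words are edi, esi, ebp, and esp, whose values depend on how the system started the program.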

Grading criteria

To grade your program, I will inspect your code and attempt to assemble and execute it. I will assign points as follows.

10 written question 1
10 written question 2
10 written question 3
10 with no command-line arguments, the program echoes all lines typed
10 the program exits gracefully at the file's end
15 the program translates letters correctly
15 the program handles command-line arguments correctly