Hints for A3

This file is likely to be updated, so check back periodically and use your browser's refresh button.

Getting Started

When you reconfigure your kernel for Assignment 3, major changes will be made that will prevent your kernel from compiling. These changes include:

the file dumbvm.c will no longer be compiled and included in the kernel
the file vm/addrspace.c will be compiled and included in the kernel - it was not included in the kernels that you compiled for A1 and A2.
the preprocessor option OPT_DUMBVM will no longer be defined. This means that most of the fields in the addrspace structure defined in include/addrspace.h will no longer be present in that structure.

If you look at addrspace.c you will find empty skeletons for the addrspace functions. They used to be implemented in dumbvm.c, which is no longer being compiled. The idea is that you should put your new implementations of those functions into addrspace.c.

The best way to get started on A3 is to first get back to where you can compile and run your kernel, and then start to implement the requirements for A3. If you don't do this, you will not be able to test your A3 work incrementally as you get it done, and you'll be left with a huge mess to test at the end - almost a guarantee of testing and debugging nightmares.

One way to get started is to simply copy the implementations of the addrspace functions from dumbvm.c to addrspace.c. You will also need initial implementations of the other functions from dumbvm.c, like the fault handler (vm_fault) and the various low-level VM and physical memory functions (such as vm_bootstrap and getppages). For these, you can either create a new source file (e.g., kern/arch/mips/mips/vm.c) or use kern/vm/addrspace.c, and copy their implementations from dumbvm.c into that file. (If you create a new file, don't forget to add it to the kernel configuration and reconfigure your kernel!) Finally, you will need to patch up the addrspace structure in addrspace.h so that the fields that went away when OPT_DUMBVM stopped being defined will be present again. Once you've made these changes, make sure that you can build and run your kernel. Everything that worked after A2 should be working again.

Once you have done this, you can start working on the various parts of A3. Since you have a working kernel, you should be able to test each part of A3 as you build it.

Synchronization

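Remember that when a page fault occurs and the page needs to be loaded, the process that caused the page fault will be blocked while the page loads. While that process is blocked, other processes in the system may run, so any VM data structures that are shared between processes must be left in a consistent state and protected with appropriate synchronization.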

Handling TLB Faults

Make sure that your TLB fault handler (vm_fault) does not do anything that might cause another TLB fault. The result will be a potentially infinite nesting of TLB faults from which your kernel will probably not recover. In particular, your TLB fault handler should avoid anything that involves touching virtual addresses in the application's part of the virtual address space, since those attempts might generate faults, depending on what's in the TLB. Functions like copyin and copyout are examples of functions that touch application virtual addresses.
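As a rough illustration of the point above, here is a minimal sketch of a fault handler that consults only kernel-resident data and never dereferences the faulting address. Everything with a `_sketch` suffix is hypothetical (the page table layout, the error code, and the stand-in for the TLB write routine). The two TLBLO_* values are assumed to match kern/arch/mips/include/tlb.h, so check them against your own tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_FRAME    0xfffff000u  /* mask for the page number bits */
#define TLBLO_VALID   0x00000200u  /* assumed to match tlb.h; verify there */
#define TLBLO_DIRTY   0x00000400u  /* assumed to match tlb.h; verify there */
#define EFAULT_SKETCH 1            /* placeholder error code */

/* Hypothetical page table entry, stored in kernel memory. */
struct pte_sketch {
    uint32_t vpage;     /* page-aligned virtual address */
    uint32_t ppage;     /* page-aligned physical address */
    int      writable;  /* nonzero if stores are allowed */
};

/* Hypothetical stand-in for a TLB write; it just records the entry. */
uint32_t last_entryhi, last_entrylo;

void tlb_install_sketch(uint32_t entryhi, uint32_t entrylo)
{
    last_entryhi = entryhi;
    last_entrylo = entrylo;
}

/*
 * Resolve a fault using only the kernel's own page table array.
 * faultaddress is used as a number and is never dereferenced, so the
 * handler cannot trigger a nested TLB fault on a user address.
 */
int vm_fault_sketch(const struct pte_sketch *pt, size_t npte,
                    uint32_t faultaddress)
{
    uint32_t vpage = faultaddress & PAGE_FRAME;
    size_t i;

    assert(pt != NULL || npte == 0);
    for (i = 0; i < npte; i++) {
        if (pt[i].vpage == vpage) {
            uint32_t lo = pt[i].ppage | TLBLO_VALID;
            if (pt[i].writable) {
                lo |= TLBLO_DIRTY;  /* on this TLB, "dirty" means writable */
            }
            tlb_install_sketch(vpage, lo);
            return 0;
        }
    }
    return EFAULT_SKETCH;  /* no mapping known for this address */
}
```

In a real kernel the lookup structure and the TLB write would differ, but the property to preserve is the same: the handler works from kernel data only.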

as_copy

The as_copy function is needed only by the fork system call. Most A3 testing will not involve fork. Save as_copy for last - work on it only if you have time.

Testing

Here is how we plan to test your A3 code:
- general single-process tests
  These single-process tests just test that your virtual memory system is working and that the statistics output by your VM system on shutdown make sense. Other than the statistical output, they do not test any of the specific new features you are adding to the virtual memory system for A3. We will use the palin, matmult and sort applications to test. For matmult and sort, we will increase the physical memory size by a factor of 10 (to 5242880 bytes) before running the tests, to ensure that these applications can fit into memory. See below for instructions on how to change the physical memory size of the virtual machine.
- TLB management
  To test TLB management, we will use the tlbfaulter application run with 10 times the default memory size.
- read-only memory
  To test protection of read-only memory, we will use romemwrite and the default physical memory size.
- on-demand loading
  To test on-demand loading, we will use the sparse and huge applications. sparse will be run with the default physical memory size, and huge will be run with 10 times the default physical memory size.
- physical memory management (Optional)
  To test physical memory management, we will run the matmult and/or sort programs multiple times in sequence, using 10 times the default physical memory size. With proper physical memory management, these programs should be able to run an arbitrary number of times without causing the kernel to run out of memory.
  In addition, we will run one or more of the tt1, tt2 and/or sy1 kernel tests multiple times in sequence. This tests your physical memory manager's ability to free up space allocated for kernel data structures, such as thread stacks.
- general multi-process tests
  These tests are intended to test the robustness of your virtual memory implementation. We will run two tests: forktest and sty. Both will be run with 10 times the default physical memory size. These are the only tests that require system calls other than console writes and _exit from A2.
For many of these tests, we will be relying on the virtual memory statistics output by your kernel as an indication of whether your kernel is behaving as expected.
With the exception of the programs used for the general multi-process tests, none of these programs make use of command line arguments, and none make use of system calls other than write to the console and exit.
To run many of the test programs you will need to increase the physical memory size of the machine. You can do this by editing root/sys161.conf. Look for this line:
```
31      busctl  ramsize=524288
```
which says that the machine has 524288 bytes of physical memory (128 4096-byte frames). Change 524288 to a larger number. The new number may need to be divisible by 4096 (I'm not sure).
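The arithmetic behind these numbers can be checked with a small sketch. The helper names are hypothetical, and rounding up to a whole number of frames is only a precaution, since it is not confirmed that the simulator requires it.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE 4096u  /* frame size stated in the text above */

/* Number of whole frames that fit in ramsize bytes. */
uint32_t frames_in_ram(uint32_t ramsize)
{
    return ramsize / FRAME_SIZE;
}

/*
 * Scale the default RAM size by a factor and round the result up to a
 * whole number of frames (hypothetical helper; ignores overflow).
 */
uint32_t scaled_ramsize(uint32_t default_ramsize, uint32_t factor)
{
    uint32_t bytes;

    assert(factor > 0);
    bytes = default_ramsize * factor;
    return (bytes + FRAME_SIZE - 1) / FRAME_SIZE * FRAME_SIZE;
}
```

For the default size, frames_in_ram(524288) gives 128, and scaled_ramsize(524288, 10) gives 5242880, which is 1280 frames and is already a multiple of 4096.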
We expect to be able to run these programs using command lines like this:
```
% sys161 kernel "p testbin/matmult;q"
```
as we did for Assignment 2.

I didn't get the A2 system calls working. What can I do?

Most of the testing for A3 will involve only writes to the console and _exit. If you have at least that functionality working from A2, you can work with that, and you can ignore the rest of this section.

If you do not even have console writes and _exit working from A2, you can use the following instructions for a quick-and-dirty implementation of those two system calls. You can either replace your A2 implementation with this code or add it to a fresh kernel. These implementations don't do much (in particular, this _exit simply shuts down the machine) but they are enough to support most of the A3 testing.

First, modify the function mips_syscall(struct trapframe *tf) in kern/arch/mips/mips/syscall.c as shown below. Note that you may need to include some header files for this to compile.

mips_syscall(struct trapframe *tf)

  ...

  switch (callno) {
    case SYS_reboot:
      err = sys_reboot(tf->tf_a0);
      break;

    /* BEGIN NEW CODE ------------------------------------------- */
    /* NEW: Simple code to handle writes to console */
    /*   this *only* works for writing null-terminated */
    /*   strings, which should be sufficient to handle */
    /*   printf()s in the application code */
    case SYS_write:
      /* check that the write is to stdout */
      assert(tf->tf_a0 == STDOUT_FILENO);
      kprintf("%s", (char *) tf->tf_a1);
      retval = strlen((char *) tf->tf_a1);
      break;

    /* NEW: Simple code to handle _exit call */
    case SYS__exit:
      thread_exit(); /* NOTE: need to modify thread_exit(); */
      break;         /* may require #include <thread.h> */
    /* END NEW CODE ------------------------------------------- */

    /* don't forget to comment out or otherwise disable */
    /* any existing implementation of SYS_write and */
    /* SYS__exit that you may have */

    default:
      kprintf("Unknown syscall %d\n", callno);
      err = ENOSYS;
      break;
  }

Next, in kern/thread/thread.c , modify the thread_exit function:

void
thread_exit(void)
{
  /* BEGIN NEW CODE ------------------------------------------- */
  /* NEW: just shut everything down */
  extern int sys_reboot(int code);
  sys_reboot(RB_POWEROFF); /* may require #include <kern/unistd.h> */
  /* END NEW CODE ------------------------------------------- */

  /* leave everything else here */

  ...

Finally, in kern/arch/mips/mips/trap.c, modify kill_curthread so that thread_exit will get called.

void
kill_curthread(u_int32_t epc, unsigned code, u_int32_t vaddr)
{
   /* BEGIN NEW CODE ------------------------------------------- */
   /* New: if the current thread gets killed */
   thread_exit();         /* may require #include <thread.h> */
   /* END NEW CODE ------------------------------------------- */

   ...

Programming the TLB

The code that implements the TLB routines can be found in kern/arch/mips/mips/tlb_mips1.S.

A description of what these functions do, as well as some useful #defines, can be found in kern/arch/mips/include/tlb.h.

You can find out quite a lot of detail about how the R3000 TLB operates in the R3000 manual. See "Memory Management and the TLB, Chapter 6". The chapter starts at page 80 in a PDF viewer, which is page "6-1" in the document's own numbering.

Tracking Virtual Memory Statistics

Please do not preload and/or save and restore TLB contents.
Please only demand load the TLB, even after a TLB invalidation.
This will allow us to more easily compare statistics.

Below we try to answer some frequently asked questions and to clarify what the stats should count.

"TLB Reloads": The number of TLB reloads (TLB faults that did not require a page fault)

This is what some people/systems call a "soft fault".
It is meant to count how many times a TLB fault occurs when the page is actually in memory but there is not a valid mapping in the TLB (so the only action required is to reload/install a valid TLB mapping for the page). So this is a count of TLB faults that do not result in a page fault.
Again, you should not be preloading the TLB or saving and restoring its contents; the TLB should only be demand loaded.

"Page Faults (Zeroed)" : The number of Page Faults that did not require a disk copy.

These are interesting because they are cheaper than page faults that require a copy from disk.
Only count full pages zeroed. If part of the page is copied from disk and the remainder is zeroed, that does not count as a "Page Fault (Zeroed)"; instead it counts as a "Page Fault (Disk)".

"Page Faults (Disk)": The number of Page Faults that required a disk copy (e.g., loading a text/code page)

These are the most expensive faults because they require a copy from disk.
If part of the page is copied from disk and the remainder is zeroed count that as a "Page Fault (Disk)". Do not count it as a Page Fault (Zeroed).
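The three definitions above amount to a small decision rule, sketched below. The enum, the counter array, and the function name are all hypothetical; only the classification logic comes from the descriptions above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Hypothetical counter indices; the names are illustrative only. */
enum vmstat_sketch {
    STAT_TLB_RELOAD,
    STAT_PAGE_FAULT_ZEROED,
    STAT_PAGE_FAULT_DISK,
    STAT_COUNT
};

unsigned int vmstat_counts[STAT_COUNT];

/*
 * Classify one TLB fault and bump the matching counter.
 *   resident:   nonzero if the page was already in memory
 *   file_bytes: bytes of this page that must be read from disk
 *               (0 means the whole page is zero-filled)
 */
enum vmstat_sketch classify_fault(int resident, size_t file_bytes)
{
    enum vmstat_sketch which;

    assert(file_bytes <= PAGE_SIZE);
    if (resident) {
        which = STAT_TLB_RELOAD;         /* only the TLB entry was missing */
    } else if (file_bytes == 0) {
        which = STAT_PAGE_FAULT_ZEROED;  /* entire page zeroed, no disk copy */
    } else {
        which = STAT_PAGE_FAULT_DISK;    /* partial fills count as disk */
    }
    vmstat_counts[which]++;
    return which;
}
```

Note that a page that is partly read from disk and partly zeroed lands in the disk category, matching the rule stated above.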

Losing Output During Booting

Sometimes it can be helpful to buffer more kprintf output before the kprintf subsystem is initialized. To permit more kprintfs to happen before everything is initialized, you can make a modification similar to the one shown below to the file kern/dev/generic/console.c

#if OPT_A3
/* increase the size substantially */
#define DELAYBUFSIZE  10240
#else
#define DELAYBUFSIZE  1024
#endif
static char delayed_outbuf[DELAYBUFSIZE];

Understanding How Physical Memory is Handled (Optional)

Look at getppages in kern/arch/mips/mips/dumbvm.c. Notice how getppages gets more page frames when needed. When is getppages first called and why? Think about how that will have to differ in an implementation that needs to talk to the coremap to find free pages.

Have a look at ram_stealmem in kern/arch/mips/mips/ram.c. Understand how it works and what firstpaddr and lastpaddr are doing.
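As a reading aid, the following is a simplified model of a "steal" style allocator of the kind ram_stealmem is understood to be. It is a sketch and not the actual source: the function name is hypothetical, and the two variables are plain globals here, whereas in the kernel they are set up during boot.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Model state; in the kernel these are initialized at boot time. */
uint32_t firstpaddr;  /* lowest physical address not yet handed out */
uint32_t lastpaddr;   /* end of physical memory */

/*
 * Hand out npages contiguous frames by advancing firstpaddr.
 * Returns the physical address of the first frame, or 0 if there
 * is not enough memory left. Nothing is ever given back.
 */
uint32_t stealmem_sketch(uint32_t npages)
{
    uint32_t size = npages * PAGE_SIZE;
    uint32_t paddr;

    assert(firstpaddr <= lastpaddr);
    if (firstpaddr + size > lastpaddr) {
        return 0;
    }
    paddr = firstpaddr;
    firstpaddr += size;
    return paddr;
}
```

The key observation is that an allocator of this shape can only grow: there is no record of which frames are in use, so freed memory cannot be reused.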

Have a look at kern/lib/kheap.c and kmalloc. Understand how kmalloc figures out which physical page frame to give to the caller (when a new frame is needed) and how the physical address of that frame is turned into a virtual address that the caller/kernel uses.

Be sure you understand how the MIPS translates a kernel virtual address to a physical address and what that means as it relates to allocating free page frames.
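For reference, on MIPS the kernel segment kseg0 (0x80000000 to 0x9fffffff) is direct-mapped: the hardware subtracts 0x80000000 to get the physical address, without consulting the TLB. The sketch below shows that arithmetic with hypothetical function names; the kernel headers are believed to provide a similar macro (PADDR_TO_KVADDR), so check your tree before writing your own.

```c
#include <assert.h>
#include <stdint.h>

#define MIPS_KSEG0 0x80000000u  /* start of the direct-mapped kernel segment */
#define MIPS_KSEG1 0xa0000000u  /* start of the next segment */

/* Physical address -> kernel virtual address in kseg0. */
uint32_t paddr_to_kvaddr_sketch(uint32_t paddr)
{
    assert(paddr < MIPS_KSEG1 - MIPS_KSEG0);  /* kseg0 covers 512 MB */
    return paddr + MIPS_KSEG0;
}

/* Kernel virtual address in kseg0 -> physical address. */
uint32_t kvaddr_to_paddr_sketch(uint32_t kvaddr)
{
    assert(kvaddr >= MIPS_KSEG0 && kvaddr < MIPS_KSEG1);
    return kvaddr - MIPS_KSEG0;
}
```

One consequence for frame allocation is that any free physical frame can be handed to the kernel immediately as a usable virtual address, with no TLB entry or page table entry needed.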

Have a look at kern/arch/mips/mips/ram.c and the function ram_bootstrap(). You should figure out what is in physical memory at this point, where it is in physical memory and why.

Looking at the comments and information in kern/arch/mips/mips/start.S should help quite a bit.
Remember that the kernel is also a MIPS executable (ELF file). You may find that you can better understand what lives in memory after booting the kernel if you examine the contents of its executable file (ELF headers).

# Dump the contents of the ELF file into readelf.out
# Now you can look at the contents of the readelf.out
# file to learn more about the kernel.
cs350-readelf -a kernel-ASST3 > readelf.out

Think about how to mark the frames that are already occupied as used in your coremap. Note that there will be a bit of a chicken-and-egg problem: in order to mark frames as used you will need to allocate a coremap data structure (using kmalloc), but kmalloc may itself need to find a free page frame to allocate.
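One possible way around the circular dependency described above, offered only as a sketch, is to avoid kmalloc for the coremap altogether: place the array directly at the first free physical address and count the frames it occupies as used, along with everything below it. All names here are hypothetical, and a real coremap entry would hold more than a single flag.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical per-frame record. */
struct cm_entry_sketch {
    uint8_t used;
};

/* Hypothetical layout computed once at boot. */
struct cm_layout_sketch {
    uint32_t nframes;     /* total frames in physical memory */
    uint32_t cm_paddr;    /* where the coremap array itself is placed */
    uint32_t first_free;  /* index of the first frame left for allocation */
};

/*
 * firstfree: first physical address not used by the kernel so far
 * lastpaddr: end of physical memory
 * Every frame below first_free (kernel image, early allocations, and
 * the coremap itself) would be marked as used.
 */
struct cm_layout_sketch coremap_layout_sketch(uint32_t firstfree,
                                              uint32_t lastpaddr)
{
    struct cm_layout_sketch l;
    uint32_t cm_bytes, cm_end;

    assert(firstfree % PAGE_SIZE == 0 && firstfree < lastpaddr);
    l.nframes  = lastpaddr / PAGE_SIZE;
    l.cm_paddr = firstfree;
    cm_bytes   = l.nframes * (uint32_t)sizeof(struct cm_entry_sketch);
    cm_end     = firstfree + cm_bytes;
    /* Round up so the coremap occupies whole frames. */
    l.first_free = (cm_end + PAGE_SIZE - 1) / PAGE_SIZE;
    return l;
}
```

With 524288 bytes of memory and a first free address of 0x30000, this yields 128 frames, a coremap placed at 0x30000, and frame 49 as the first one available for allocation. Other designs are possible; this one simply avoids calling kmalloc before the coremap exists.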