Lecture Note Review (Finals OS)

Systems Review

malloc

Recall our memory layout (from high addresses at the top down to low addresses at the bottom):

In this assignment we:

C Compilation and Linking

  1. *.a : archive (static) libraries. Linked at link time w/ the object files.
  2. *.so: dynamic (shared) libraries. Linked at load time w/ the a.out from the linker.

History of UNIX

  1. The i, j, k in "Dijkstra" appear in order. That's how you spell it.
  2. The idea of multiprogramming: keep multiple programs in memory, so that if one fails or finishes, the CPU can keep running another.
  3. Time sharing: you switch programs when one is stuck (blocked) or when a certain amount of time has passed

Processes and Privilege

Most processes spend most of their time in the blocked state.

![[Pasted image 20241004142854.png]]

mechanism vs. policy

There's a distinct difference between a mechanism and a policy:

  • Ex: A PhD program has a policy limiting you to 7 years of staying.
  • The mechanism is granting the degree or kicking people out.
    Most grad schools cut off funding as your years go on, to incentivize you to get out via the wallet.

![[Pasted image 20241004144601.png]]

Context Switching (Threads)

You essentially follow this calling convention:

  1. Before the call: main() puts parameters at a known location.
  2. Call the function: Push ra and jump to foo's address.
  3. Before the body: Set up our stack frame.
  4. Before return: Clean up and leave.
  5. After return: The caller cleans up.

We can use different schedulers like round robin to have different behaviors.

Race Conditions/Critical Sections and Solutions

Race condition

Any situation where the precise interleaving of a sequence of events affects the correctness of the outcome of the whole system.

critical section

A region of a program that cannot safely be interrupted.

For each of these solutions, think about what happens if an interrupt hits right before a thread sets lock = true; — can a second thread get past the test while the lock is still false, so that both end up inside?

Sol'n #1: Busy Waiting


// A
for(;;)
{
	while(turn == 1)
	{
		/* spin */
	}
	// critical 
	turn = 1;
	// noncritical
}

// B
for(;;)
{
	while(turn == 0)
	{
		/* spin */
	}
	// critical 
	turn = 0;
	// noncritical
}

This alternation strategy works, but it's still busy-waiting. Notice that it works because each thread gives away control rather than seizing it. The cost is strict alternation: a thread stuck in its noncritical section keeps the other from re-entering.

Sol'n #2: Peterson's Solution

int turn; 
bool interested[2]; 
void enter(int self)
{
	int other;
	other = 1 - self; // if self is 0 then other is 1, and vice versa
	interested[self] = true;
	turn = self;
	while((turn == self) && interested[other]); // busy-wait for our turn
	// critical section (the other thread blocks)
}

void exit_region(int self)
{
	interested[self] = false;
}

Walk through an example, especially the case where both threads set turn at (nearly) the same time.

  1. Both set their own interest and then both try to set turn
  2. turn ends up holding one of the written values (the last write wins)
  3. The thread whose write to turn was overwritten breaks from the while and thus runs
  4. The thread whose value remained in turn has to wait in the while and thus blocks.

Sol'n #3: Hardware Lock Variables

The problem with these (e.g. test-and-set/TSL lock variables) is that they still busy-wait, burning CPU time while spinning.

Sol'n #4: Non-Busy Waiting

Use semaphores:

semaphores (Dijkstra, 1967)

It is a counter with two atomic operations:

P(s) = DOWN(s):
	if(s == 0)
		sleep();
	else
		s--;

V(s) = UP(s):
	if(exists sleeping proc)
		wake one;
	else
		s++;

We need these to be atomic operations! For the producer consumer problem:

/* Placeholder macros: NOT atomic, and they never sleep or wake --
   a real DOWN/UP must implement the atomic operations above */
#define DOWN(s) ((s)--)
#define UP(s)   ((s)++)

// ...
semaphore_t full = 0; // number of full cells in buffer
semaphore_t empty = N; // number of empty cells in the buffer
semaphore_t mutex = 1; // boolean lock, whether something is using it

producer(){
	for(;;){
		produce();
		
		DOWN(empty);  // wait for an empty cell
		DOWN(mutex); // get exclusive access
		enter_item();
		UP(mutex); // unlock mutex
		UP(full); // signal: one more full cell
	}
}

consumer(){
	for(;;){
		DOWN(full); // we need a full cell, so wait
		DOWN(mutex);
		remove_item();
		UP(mutex); // unlock mutex
		UP(empty); // signal: one more empty cell
	}
}

Scheduling

Criteria:

Process Types

There are two types of processes:

  • IO Bound: characterized by short bursts of computation before blocking IO
    • Might want to give priority because they can get done and go back to sleep (hide IO latency)
    • Also, more likely to be interactive
  • Compute Bound: characterized by long bursts of computation before blocking on IO (or a semaphore).

Scheduling is mandatory in two cases:

  1. When a process exits
  2. When a process blocks

Examples (non-preemptive, non-interrupting):

Deadlock

Coffman's Criteria (1971)

To have deadlock, you must have all:

  1. Mutual exclusion
  2. Hold and wait
  3. No preemption
  4. Circular wait (like shown in the figure)

Did someone say deadlock? OMG hit game from Valve Software set in the TF2 Universe?!?!?!!

Sol'n #1: Dijkstra's (Bankers) Algorithm

Dijkstra's Banker's Algorithm

Model resources like a line of credit.
Processes declare their max need in advance.
Test states for safety: a state is safe if there's some order in which every process can still run to completion.

See Lecture 13 - Disk Drives#Multi-Dimensional Banker's Alg. for an example of its use.

IO Software

  1. Write something to the control register (see the docs for what specifically to write. It's device dependent)
  2. Wait a while (keep checking the status register to see if it's done)
  3. Check status register (again)
  4. Respond appropriately

It's best to have interrupts notify the process when the IO is done. These are handled by ISRs written ahead of time, so that when an interrupt does happen, you know what will run.

Memory Management

  1. Monoprogramming: only allow one program in memory. It's simple, and honestly a lot of cheap options would use this.
  2. Multi-programming/Fixed Partitions: we take our machine, divide it into chunks (of varying size) and load a program into each chunk.
  3. Relative Addressing: All of a program's addresses are relative to the base address given to the program.
    • It'd be painful to keep a linked list of every address that needs relocating and recompute them all each time a program is loaded.
  4. Base Register: For every memory reference, the hardware adds a base register holding that program's base offset.

This scheme, similar to malloc, suffers from fragmentation. What can we do about it? We can, similar to malloc, try to keep track of allocated memory via:

Allocation Strategies

We could do:

  • first fit: look for the first open region that has enough room from the root
  • next fit: start looking from the last freed region.
    • This spreads things around (good) but increases fragmentation
  • best fit: look for the smallest one that we can fit into.
    • This works poorly: it leaves behind fragments so small they're useless, and they build up until too much memory is wasted.
  • worst fit: Carve your allocation unit off the biggest chunk available.
    • Unsurprisingly this shreds your memory too. However, it's interesting to note that allocations of similar size tend to end up near each other.

This brings us to the discussion of...

Virtual Memory

It's essentially caching real-address references, where we are *always* converting from VA to RA (virtual address to real address).

### Page Eviction Strategies
  1. Not Recently Used (NRU)
    • Prefer an unmodified page, since then we don't have to write it back to storage.
    • Divide pages into 4 classes, based on the referenced bit R and modified bit M:

|     | R=0 | R=1 |
|-----|-----|-----|
| M=0 | 0   | 2   |
| M=1 | 1   | 3   |

Essentially, being not referenced takes more precedence than being not modified, so you consult the classes in this order: (R=0, M=0), (R=0, M=1), (R=1, M=0), (R=1, M=1).
  2. Oldest Page Evicted: remove the oldest page with a FIFO (ordered by when the page first entered memory).
  3. Combine the above two: use a FIFO for old pages, and if a page's R bit is set, clear it and send the page back to the end. This is the:
Clock Algorithm

Due to the clockwork structure of the FIFO, the combined strategy above is called the Clock Algorithm.

  4. Least Recently Used (LRU): on every reference, move the page back to the end of the queue.
  5. Not Frequently Used (NFU): Use a counter starting at 0 for every page. Every so often add the R bit to the counter and clear R. On a PAGE_FAULT evict the lowest-count page. You can alternatively shift the counter right and add R as the top bit (aging), so old references decay and pages don't starve other pages (over time a page's count shifts back down to 0).

Paging Policies

When:

You'll also need a medium PAGE_SIZE: too small and your page tables get huge; too large and you waste memory to internal fragmentation.

Common sizes are 4K or even 64K (especially with larger memory sizes now).

File Allocation Table

Using file zone numbers, you have a table of zones that dictates where files (or parts of files) are:

Filesystem Performance

In general, you want to cache file references: if they are referenced again soon (temporal locality), then you are ready. Using buffer caching, we can hold recently used blocks in case we reuse them. With this:

Filesystem Reliability

To check them for consistency:

  1. Create an array of counters for allocated and free blocks:
    ![[Pasted image 20241208140531.png]]
  2. Sweep through the filesystem and the free list and increment the appropriate counter each time a block is found.

If all goes well, each block will be found in exactly one of the lists:
![[Pasted image 20241208140609.png]]
If not, there's a problem:
![[Pasted image 20241208140628.png]]

You may have the following problems:

  1. Do the same thing for reference counts from directories to inodes.