Lecture 6 - Threads

Recall from last time that we liked our processes to have isolation, both in resources and in privilege. For a real process, a context switch is an expensive operation, since copying the entire stack and registers takes a while (it's O(n), where n is the size of the stack). What we want is a lightweight process, known as a thread.

Thread

A thread is a lightweight process: an independent stream of control that runs within a heavyweight process.

Threads cannot share a stack: if one thread went into a function, the others would follow suit! The good thing is that context switches are cheap when these stacks are small. However, threads can break things, since now we have concurrency. For example, consider the operation:

x = x+1;

This is multiple instructions in ASM:

rx = read(addrx);
rx = add(rx, #1);
write(addrx, rx);

But this means we could have a race condition:

Race condition

Any situation where the precise interleaving of a sequence of events affects the correctness of the outcome of the whole system.
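
To see why the interleaving matters, here is a minimal sketch that simulates two threads each running the three steps of x = x+1 (read, add, write) in a chosen order. Nothing here is real threading: `sim_thread`, `step_once`, and `run_schedule` are hypothetical names made up for this illustration, and the "schedule" is just a string saying which thread takes its next step.

```c
#include <assert.h>
#include <stddef.h>

/* One simulated thread executing x = x + 1 as three separate steps. */
struct sim_thread {
    int reg;   /* private register, like rx in the notes */
    int step;  /* 0 = read next, 1 = add next, 2 = write next, 3 = done */
};

/* Execute the next step of thread t against the shared variable *x. */
static void step_once(struct sim_thread *t, int *x) {
    switch (t->step) {
    case 0: t->reg = *x;         break;  /* rx = read(addrx)  */
    case 1: t->reg = t->reg + 1; break;  /* rx = add(rx, 1)   */
    case 2: *x = t->reg;         break;  /* write(addrx, rx)  */
    default: return;                     /* already finished  */
    }
    t->step++;
}

/* Run two threads starting from x = 0. `schedule` is a string of 'A' and
 * 'B' characters naming which thread runs its next step. Returns final x. */
int run_schedule(const char *schedule) {
    struct sim_thread a = {0, 0}, b = {0, 0};
    int x = 0;
    for (size_t i = 0; schedule[i] != '\0'; i++) {
        assert(schedule[i] == 'A' || schedule[i] == 'B');
        step_once(schedule[i] == 'A' ? &a : &b, &x);
    }
    return x;
}
```

With this sketch, `run_schedule("AAABBB")` gives 2, but `run_schedule("ABABAB")` gives 1: both threads read 0 before either one writes, so one of the increments is lost.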

Correctness is the important word here. If you don't care whether your computer gives incorrect answers, then you don't have to worry. We care, so we have to consider the assumptions made between lines of our ASM. For instance:

It's better to yield control at an unsafe time and rectify afterwards than to have that control taken from you (from the perspective of the OS).

LWP: Lightweight Processes

There's a handful of functions to write for assignment 2:

lwp_create(); 
lwp_start(); // the calling thread turns into a lightweight thread, except the stack is copied to the new thread. 
lwp_yield(); // Call the scheduler and do a context switch.
lwp_exit(); // Terminate the thread, but don't dealloc its resources until some other thread calls `lwp_wait()`
lwp_wait();

What constitutes the state of our thread? It's:

All we have to do is swap the registers of the threads, and swap the pointers. This has to be done in ASM since it's machine dependent; the good news is that you're given magic_64.S for it. You'll also need a known starting floating-point state, which you're given as well.

Decide what you want your thread to look like, then set all of that up before calling lwp_start().
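
The same "set the thread up first, then switch to it" idea can be tried out with the `ucontext` routines (`getcontext`, `makecontext`, `swapcontext`). This is a stand-in for experimentation, not the assignment's mechanism: it does not use magic_64.S or any lwp_* function, and it assumes a Linux/glibc system where `<ucontext.h>` is available. The names `worker`, `run_demo`, and `trace` are made up for this sketch.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, worker_ctx;  /* saved state of each "thread" */
static char worker_stack[STACK_SIZE];    /* the worker's private stack */
static char trace[64];                   /* records who ran, in order */

/* Body of the second thread: run, yield once, run again, then finish. */
static void worker(void) {
    strcat(trace, "w1 ");
    swapcontext(&worker_ctx, &main_ctx); /* yield back to the main thread */
    strcat(trace, "w2 ");
}   /* returning from worker resumes uc_link, which is main_ctx */

/* Build the worker's context, then switch back and forth with it. */
const char *run_demo(void) {
    int rc;
    trace[0] = '\0';

    /* Set up everything the new thread needs before it first runs. */
    rc = getcontext(&worker_ctx);
    assert(rc == 0);
    (void)rc;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;      /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    strcat(trace, "m1 ");
    swapcontext(&main_ctx, &worker_ctx); /* run worker until it yields */
    strcat(trace, "m2 ");
    swapcontext(&main_ctx, &worker_ctx); /* resume worker after its yield */
    strcat(trace, "m3");
    return trace;
}
```

Calling `run_demo()` returns the string "m1 w1 m2 w2 m3": each `swapcontext` saves the current registers and stack pointer into its first argument and loads the ones stored in its second.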

State: Registers

We started with A, B, C, D, then AX, BX, CX, ..., and kept extending them. At some point we got good register names:

RAX // general purpose
RBX
RCX
RDX
RSI // essentially a GP register (also carries the second parameter)
RDI // where the first parameter is passed
RSP // stack pointer
RBP // base pointer (for the stack, for locals and parameters)
R8 - R15 // gp registers
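
As a rough picture of what "saving the registers" means, the sketch below models the integer registers listed above as a plain struct and a context switch as two copies. The struct layout and the names `reg_file` and `switch_regs` are hypothetical, chosen for this illustration only. A real switch has to touch the actual hardware registers, which is why it is done in assembly, and it also has to handle the floating-point state, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for one thread's saved integer registers. */
struct reg_file {
    uint64_t rax, rbx, rcx, rdx;
    uint64_t rsi, rdi, rbp, rsp;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
};

/* Model of a context switch: save the running thread's registers from
 * *cpu into *from, then load the next thread's registers from *to. */
void switch_regs(struct reg_file *cpu, struct reg_file *from,
                 const struct reg_file *to) {
    assert(cpu && from && to);
    *from = *cpu;  /* save the outgoing thread */
    *cpu = *to;    /* load the incoming thread */
}
```

Because rsp is part of the saved state, loading another thread's registers also moves execution onto that thread's stack.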

Say I want to write this program:

void foo(int x){
	int t;
	t = x;
}
// ...
foo(5);

Our calling convention:

  1. Before the call: main() puts parameters at a known location.
  2. Call a function: Push the return address (ra) and jump to foo's address.
  3. Before body: Set up our stack frame.
  4. Before return: Clean up and leave.
  5. After return: Clean up.

foo:
	pushq %rbp // save the caller's base pointer on the stack
	movq %rsp, %rbp // point the base pointer at our frame's reference point
	subq $8, %rsp // make space for local `t`, i.e. grow the stack (toward lower addresses)
	// do function stuff
	addq $8, %rsp // give back the stack space we used (leave would also do this)
	leave // does 2 things: 1) copy bp to sp 2) pop the saved bp off the stack
	ret // pop the ra off the stack into the PC

main:
	movq $5, %rdi // put 5 in the first-parameter register
	call foo // push the ra on the stack, set the instruction pointer to `foo`
	

Note that you don't have to write this by hand: run gcc -S foo.c and examine the generated assembly instead.

But the idea is that if lwp_create():