Processes
A process is an executing program. It is an active entity.
- a program, by contrast, is a passive entity: it is just stored (e.g. on disk) and doesn’t do anything
- one program can lead to several processes (e.g. multiple executions by different users, fork)
Process Creation
A system process (parent) creates a user process (child). Each process has a unique Process ID (PID), and a Parent Process ID (PPID).
Tip: We can show the processes running in a Linux machine using ps.
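As a small illustration of PIDs and PPIDs, the sketch below asks the OS for both values using the POSIX calls getpid() and getppid(). The helper name print_ids is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* ask the headers to expose POSIX calls */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* print_ids is a hypothetical helper name used only in this sketch.
 * It prints the PID and PPID of the calling process and returns the PID. */
pid_t print_ids(void)
{
    pid_t pid  = getpid();   /* ID of this process */
    pid_t ppid = getppid();  /* ID of the process that created it */

    assert(ppid != pid);     /* a process is never its own parent */
    printf("PID = %ld, PPID = %ld\n", (long)pid, (long)ppid);
    return pid;
}
```

If print_ids() is called from a program started in a terminal, the PPID it prints is typically the PID of the shell, which can be checked against the output of ps.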
A user process can create another user process using syscalls; in UNIX-based OSs, we use fork().
Using fork() in C
Note: Windows uses different functions.
fork() creates a child process with its own copy of the parent’s address space (the copy is not shared: changes made by one process are not visible to the other)
The value returned from fork() can take on different forms:
- fork() < 0 → an unsuccessful fork
- fork() == 0 → this is the child process
- fork() > 0 → the PID of the child process, viewed from the parent process
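The three cases can be sketched as follows. This is a minimal example, not a definitive pattern: the function name fork_demo and the variable value are made up, and the parent waits for the child with waitpid(), which these notes have not covered. It also shows that a write made by the child does not change the parent’s copy of a variable.

```c
#define _POSIX_C_SOURCE 200809L  /* ask the headers to expose POSIX calls */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* fork_demo is a made-up name for this sketch.
 * It forks once; the child exits with child_code, and the parent returns
 * that exit code (or -1 if fork() or waitpid() failed). */
int fork_demo(int child_code)
{
    int value = 1;   /* after fork(), each process has its own copy */

    fflush(stdout);  /* flush first so buffered output is not duplicated */
    pid_t pid = fork();

    if (pid < 0) {               /* unsuccessful fork: no child exists */
        perror("fork");
        return -1;
    }

    if (pid == 0) {              /* we are the child */
        value = 99;              /* changes only the child's copy */
        printf("child:  PID %ld, value = %d\n", (long)getpid(), value);
        fflush(stdout);
        _exit(child_code);       /* end the child here */
    }

    /* pid > 0: we are the parent, and pid is the child's PID */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    assert(value == 1);          /* the child's write did not reach us */
    printf("parent: child PID %ld, value = %d\n", (long)pid, value);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_demo(7) from main should print one line from each process and return 7 in the parent.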
Memory of a Process
5 main areas of memory:
- Text stores the instructions of the program (so they can be executed)
- Data stores the global variables
- bss is the uninitialised data: global and static variables with no explicit initial value (they start as zero). The name stands for “block started by symbol”, which comes from an old assembler directive
- Stack stores local variables (including parameters) and return addresses
- composed of Stack Frames, which each represent a function call
- more limited memory use than the heap
- Heap is dynamically allocated memory
Note that:
- there’s a block of unused memory between the Stack and the Heap to allow both to grow
- grow the stack by calling functions, and shrink by exiting a function
- grow the heap by allocating memory (in C, malloc/realloc/calloc), and shrink by deallocating (using free)
- all of these areas are addressed using virtual addresses generated by the CPU
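To make the areas more concrete, the sketch below prints one address from each of them. The names (initialised_global, uninitialised_global, show_regions) are made up, and which area each object lands in is what a typical compiler and OS do, not something the C language guarantees. The exact addresses usually differ from run to run.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialised_global = 42;  /* typically placed in Data */
int uninitialised_global;     /* typically placed in bss */

/* show_regions is a hypothetical helper: it prints one address per area
 * and returns 0 on success, -1 if the heap allocation failed. */
int show_regions(void)
{
    int local = 7;                           /* lives in this call's stack frame */
    int *dynamic = malloc(sizeof *dynamic);  /* lives on the heap */
    if (dynamic == NULL)
        return -1;
    *dynamic = local;

    assert(uninitialised_global == 0);       /* bss objects start as zero */

    /* Function pointers are printed via uintptr_t, object pointers via %p. */
    printf("text  0x%" PRIxPTR "\n", (uintptr_t)show_regions);
    printf("data  %p\n", (void *)&initialised_global);
    printf("bss   %p\n", (void *)&uninitialised_global);
    printf("heap  %p\n", (void *)dynamic);
    printf("stack %p\n", (void *)&local);

    free(dynamic);                           /* shrink the heap again */
    return 0;
}
```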
States of a Process
- A process can be new (when it’s newly created)
- When it has been admitted into memory by the OS, it is ready (and waiting to be assigned to and executed on a processor)
- The process can now be scheduled by the OS to be running
- A running process can be interrupted (e.g. when it runs out of time) and go back to being ready
- A running process can wait for a resource (e.g. I/O), and go to the waiting state
- When that resource becomes available, a waiting process goes back to being ready
- On completion, running processes enter the terminated state
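The states and the moves between them can be written down as a small state machine. This is only a sketch: the enum and function names are made up, and the waiting → ready move is taken from the standard five-state model (a waiting process becomes ready again once its resource is available).

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    STATE_NEW,
    STATE_READY,
    STATE_RUNNING,
    STATE_WAITING,
    STATE_TERMINATED
} proc_state;

/* Returns true if a process may move directly from `from` to `to`. */
bool can_transition(proc_state from, proc_state to)
{
    switch (from) {
    case STATE_NEW:        return to == STATE_READY;       /* admitted by the OS */
    case STATE_READY:      return to == STATE_RUNNING;     /* scheduled */
    case STATE_RUNNING:    return to == STATE_READY        /* interrupted / out of time */
                               || to == STATE_WAITING      /* waits for a resource */
                               || to == STATE_TERMINATED;  /* completed */
    case STATE_WAITING:    return to == STATE_READY;       /* resource now available */
    case STATE_TERMINATED: return false;                   /* no way back */
    }
    return false;
}

/* Changes *state, refusing (via assert) any move the model does not allow. */
void set_state(proc_state *state, proc_state to)
{
    assert(can_transition(*state, to));
    *state = to;
}
```

Note that a new process cannot run straight away, and a waiting process cannot go straight back to running: both have to pass through ready first.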
Keeping Track of Processes - Process Control Block (PCB)
The PCB is a data structure that stores data about a process:
- Process State
- Scheduling Info (Priority of the process, pointers to different scheduling queues)
- Memory management info (e.g. how much memory the process may use, and the information needed to translate its virtual addresses into real, physical addresses)
- Accounting Information (e.g. CPU time used, time since start)
- I/O status (list of open files/devices)
Since processes can be paused (e.g. interrupted), the PCB also stores:
- Program Counter (address of next instruction to be executed)
- Contents of CPU registers (the data the process was working with when it was paused)
Often, the PCB also stores the PID.
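A PCB could be laid out roughly like the struct below. This is a simplified sketch, not how any real OS defines it: every field name, the two size constants, and the use of a base/limit pair for the memory management info are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS       16  /* hypothetical number of CPU registers */
#define MAX_OPEN_FILES  8  /* hypothetical per-process limit */

typedef enum {
    STATE_NEW, STATE_READY, STATE_RUNNING, STATE_WAITING, STATE_TERMINATED
} proc_state;

typedef struct pcb {
    int            pid;                         /* process ID */
    proc_state     state;                       /* process state */
    int            priority;                    /* scheduling info */
    struct pcb    *next_in_queue;               /* link into a scheduling queue */
    uintptr_t      mem_base;                    /* memory management info ... */
    uintptr_t      mem_limit;                   /* ... assumed to be base + limit */
    unsigned long  cpu_time_used;               /* accounting information */
    int            open_files[MAX_OPEN_FILES];  /* I/O status; -1 = unused slot */
    uintptr_t      program_counter;             /* next instruction to execute */
    uintptr_t      registers[NUM_REGS];         /* saved CPU register contents */
} pcb;

/* Builds the PCB for a newly created process. */
pcb pcb_init(int pid, int priority)
{
    assert(pid > 0);                 /* PIDs are positive */

    pcb p = {0};                     /* zero every field, pointers become NULL */
    p.pid = pid;
    p.state = STATE_NEW;
    p.priority = priority;
    for (int i = 0; i < MAX_OPEN_FILES; i++)
        p.open_files[i] = -1;        /* nothing is open yet */
    return p;
}
```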
Context-Switching
Since a single processor can only run one process at a time, we may have to context-switch between processes (stop one process, and start another).
There’s some overhead with context-switching since:
- the CPU must save state (Program Counter, Contents of CPU registers) into PCB
- the CPU must later retrieve state when resuming the old process
This overhead takes time, and the CPU does no useful work during it, so we want to:
- reduce context-switching time
- reduce frequency of context switches
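The save and restore steps can be simulated in a few lines. This is a toy model under stated assumptions: the CPU is just a struct holding a program counter and four registers, and the names cpu_state, pcb and context_switch are made up. A real context switch is done by the kernel, partly in assembly.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 4  /* hypothetical, kept small for the example */

typedef struct {
    unsigned long pc;              /* program counter */
    unsigned long regs[NUM_REGS];  /* general-purpose registers */
} cpu_state;

typedef struct {
    int       pid;
    cpu_state saved;               /* state stored while the process is paused */
} pcb;

/* Pauses `old` and resumes `next` on the simulated CPU. */
void context_switch(cpu_state *cpu, pcb *old, const pcb *next)
{
    assert(cpu != NULL && old != NULL && next != NULL);

    old->saved = *cpu;   /* 1. save the running process's state into its PCB */
    *cpu = next->saved;  /* 2. load the state of the process being resumed */
}
```

Both steps are pure copying: while they happen, neither process makes progress, which is the overhead described above.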
Concurrency vs Parallelism
Context-switching supports concurrency: the processor switches between processes quickly enough that they appear to run at the same time.
There is also parallelism, which is running two processes simultaneously (on different processors).
Process Scheduling
The OS maintains several queues:
- Job Queue (for new processes)
- Ready Queue (for ready processes)
- Device Queues (for processes waiting for I/O devices)
Each queue holds the PCBs of the processes in it.
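One common way to build such a queue is a linked list of PCBs with a head and a tail pointer. The sketch below assumes a first-in, first-out order and a PCB cut down to a PID and a link; the names pcb_queue, enqueue and dequeue are made up.

```c
#include <assert.h>
#include <stddef.h>

typedef struct pcb {
    int         pid;
    struct pcb *next;  /* next PCB in the same queue */
} pcb;

typedef struct {
    pcb *head;         /* next PCB to be taken out */
    pcb *tail;         /* most recently added PCB */
} pcb_queue;

/* Adds p to the back of the queue. */
void enqueue(pcb_queue *q, pcb *p)
{
    assert(q != NULL && p != NULL);

    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;  /* append after the current last PCB */
    else
        q->head = p;        /* queue was empty */
    q->tail = p;
}

/* Removes and returns the PCB at the front, or NULL if the queue is empty. */
pcb *dequeue(pcb_queue *q)
{
    assert(q != NULL);

    pcb *p = q->head;
    if (p == NULL)
        return NULL;
    q->head = p->next;
    if (q->head == NULL)
        q->tail = NULL;     /* queue is now empty */
    p->next = NULL;
    return p;
}
```

Moving a process from one queue to another (say, from a device queue to the ready queue) is then a dequeue from the first followed by an enqueue on the second.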
I/O Bound vs CPU Bound Processes
I/O Bound jobs/processes spend more time doing I/O than computation
- Short CPU bursts
CPU Bound jobs/processes spend more time doing computation than I/O
- Long CPU bursts
Schedulers
Short-Term Scheduler (STS) selects next process from the Ready Queue.
- Invoked frequently (at least once every 100ms) to keep up the illusion that processes run at the same time
- Must be fast so that it doesn’t waste CPU time
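To see why the STS runs so often, here is a sketch of one possible policy, round robin, where each process gets at most a fixed time slice before the next one is picked. The policy itself is an assumption (these notes do not say which one the STS uses), and the function name round_robin and its parameters are made up.

```c
#include <assert.h>
#include <stddef.h>

/* remaining[i] is the CPU time process i still needs, in ms.
 * Runs the processes in turn for at most `quantum` ms each until all
 * are finished. Returns how many times a process was picked to run,
 * or -1 if quantum is not positive. */
int round_robin(int remaining[], int n, int quantum)
{
    assert(remaining != NULL);
    if (quantum <= 0)
        return -1;

    int unfinished = 0;
    for (int i = 0; i < n; i++)
        if (remaining[i] > 0)
            unfinished++;

    int dispatches = 0;
    while (unfinished > 0) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] <= 0)
                continue;                 /* already finished */
            int slice = remaining[i] < quantum ? remaining[i] : quantum;
            remaining[i] -= slice;        /* "run" the process for one slice */
            dispatches++;
            if (remaining[i] == 0)
                unfinished--;
        }
    }
    return dispatches;
}
```

With processes needing 250, 100 and 50 ms and a quantum of 100 ms, the scheduler picks a process 5 times: the first process needs three turns, the other two need one each.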
Long-Term Scheduler (LTS) selects processes from the Job Queue to be in the Ready Queue.
- May be minutes between invocations (doesn’t have to be fast)
- controls degree of multiprogramming (# processes in memory)
- Too many processes in memory → slow down system
- Stability: arrival rate of jobs = completion rate of jobs
- Maintain a good mix of I/O Bound and CPU Bound processes
- Ensure devices are utilised
- Most modern desktop OSs (e.g. UNIX, Windows) don’t have an LTS → every new process is handed straight to the STS
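The way an LTS controls the degree of multiprogramming can be sketched with plain counters. This ignores which jobs get picked (so it says nothing about the I/O Bound vs CPU Bound mix), and the function name admit_jobs and its parameters are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Moves jobs from the Job Queue into memory until either the Job Queue
 * is empty or max_in_memory processes are in memory.
 * Returns how many jobs were admitted. */
int admit_jobs(int *jobs_waiting, int *in_memory, int max_in_memory)
{
    assert(jobs_waiting != NULL && in_memory != NULL);

    int admitted = 0;
    while (*jobs_waiting > 0 && *in_memory < max_in_memory) {
        (*jobs_waiting)--;  /* leaves the Job Queue ... */
        (*in_memory)++;     /* ... and joins the Ready Queue */
        admitted++;
    }
    return admitted;
}
```

With 5 jobs waiting, 2 processes in memory and a limit of 4, one call admits 2 jobs and leaves 3 waiting. A second call admits nothing until a process in memory finishes.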