Multiprocessor systems
Tanenbaum's "timesharing" term dates the book.
This is the easiest case in which to get a single system image.
One example is my workstation at NECI (soon to move north).
Shared memory makes a single run queue (ready list) the natural
choice. So scheduling is "trivial": take the uniprocessor code and
add semaphores.
The standard diagram for process states applies. The diagram is
available in postscript (best) and in html (most portable).
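A minimal sketch of the "uniprocessor code plus semaphores" idea in Python (the ReadyList class and the process names are hypothetical; a threading.Lock plays the role of the semaphore):

```python
import threading
from collections import deque

class ReadyList:
    """Single shared run queue; every processor runs the same code."""
    def __init__(self):
        self._queue = deque()
        self._lock = threading.Lock()   # the "added semaphore"

    def add(self, process):
        with self._lock:                # uniprocessor enqueue, now guarded
            self._queue.append(process)

    def take(self):
        with self._lock:                # uniprocessor dequeue, now guarded
            return self._queue.popleft() if self._queue else None

rl = ReadyList()
rl.add("P1")
rl.add("P2")
print(rl.take())  # P1
```

Any processor can call take() concurrently; the lock is the only change from the uniprocessor version.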
What about processor affinity?
Shared memory makes shared I/O buffer cache natural choice.
Gives true (i.e. uniprocessor) semantics for I/O with
little extra work.
Avoiding performance bottlenecks is serious.
Shared memory makes a uniprocessor file system the natural choice.
Summary (Tanenbaum figure 9-13)

                         Network        Distributed    Multiprocessor
                         Operating      Operating      Operating
Item                     System         System         System
--------------------------------------------------------------------
Virtual uniprocessor     No             Yes            Yes
Same OS                  No             Yes            Yes
Copies of OS             N              N              1
Communication            Shared files   Messages       Shared memory
Network protocols        Yes            Yes            No
Single ready list        No             No             Yes
Well-def. file sharing   Rare           Yes            Yes
--------------------------------------------------------------------
Design issues
A big one is TRANSPARENCY, i.e. you don't "see" the multiplicity
of processors. Or you see through them.
(NOT always what is wanted. Might want to create as many (child)
processes as there are processors. Might choose a different
coordination algorithm for 2 processes than for 50)
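A deliberately non-transparent program of the sort described above might look like this in Python (a sketch; square is a made-up worker function):

```python
import os
from multiprocessing import Pool

def square(x):                        # made-up worker function
    return x * x

if __name__ == "__main__":
    n = os.cpu_count() or 1           # ask how many processors there are
    with Pool(processes=n) as pool:   # one child process per processor
        print(pool.map(square, range(4)))  # [0, 1, 4, 9]
```

The point is that the program explicitly queries the processor count instead of having the multiplicity hidden from it.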
Location transparency: you cannot tell where the resources are
located.
So ``pr -Psam'' does not tell you where sam is located (but
cat /etc/printcap does).
I assume Tanenbaum means you need not know where the resources
are located. Presumably there are routing tables somewhere
saying where they are.
Migration transparency: moving a resource does not require a name
change.
Recall our research system. If you move a file from /e stored
on allan to lab, it cannot still be in /e (a property of mount).
Symlinks can hide this slightly.
Replication transparency: you don't know how many copies of a
resource exist.
E.g. multithreaded (or multiprocess) server
Concurrency transparency.
Don't notice other users (except for slowdowns).
That is, same as for uniprocessor.
Can lock resource--but deadlocks possible
Again just like for uniprocessor.
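The locking point above can be sketched in Python (hypothetical resource names; the fixed acquisition order is the standard uniprocessor cure for the deadlock just mentioned):

```python
import threading

lock_a = threading.Lock()   # e.g. the printer
lock_b = threading.Lock()   # e.g. the tape drive

def use_both(name, results):
    # Every process takes lock_a first, then lock_b.  If two
    # processes took them in opposite orders, deadlock is possible.
    with lock_a:
        with lock_b:
            results.append(name)

results = []
threads = [threading.Thread(target=use_both, args=(n, results))
           for n in ("P1", "P2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # ['P1', 'P2']
```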
HOMEWORK 9.11
Parallelism transparency
You write uniprocessor programs and poof it all works hundreds
of times faster on a dist sys with 500 processors.
Far from current state of the art
My wording above is that you don't always want transparency.
If you had parallelism transparency, you would always want it.
Flexibility--really monolithic vs micro kernel
Should the (entire) OS run in supervisor mode?
Tanenbaum is rather one sided here. He is a big microkernel
fan. Amoeba is his group's system.
A monolithic kernel is conventional: all services are supplied
by the kernel/OS. Often derived from a uniprocessor OS.
Microkernel only supplies
IPC
Some mem mgt
low level proc mgt and sched
low level I/O
Micro kernel does not supply
Filesystems
Most system calls
Full process management (deciding which process is highest
priority)
Instead these are supplied by servers
Advantages
Fairly easy to modify while the system is running.
A bug in a server can bring down the system, but the bug
will appear in the server, not in another part of the OS.
Disadvantage
Performance: Tanenbaum says it's minor, but that is not so clear.
Crossing more protection domains and having more transfers
of control hurts cache performance, and this is
increasingly important as the speed ratio between processors
and DRAM grows.
HOMEWORK 9.9
Reliability
It's an improvement if you need just one of many possible
resources in order to work.
HOMEWORK 9.12
It's a negative if you need all the resources. (AND vs OR)
Avail with prob p^n if AND (p = prob one resource is avail)
Avail with prob 1 - (1-p)^n if OR
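The two formulas can be checked with a few lines of Python (a sketch; the function names are made up):

```python
def avail_and(p, n):
    """Need ALL n resources, each up with probability p."""
    return p ** n

def avail_or(p, n):
    """Need ANY one of n resources, each up with probability p."""
    return 1 - (1 - p) ** n

# With p = 0.9 and n = 5, replication hurts the AND case and
# helps the OR case:
print(avail_and(0.9, 5))  # about 0.59
print(avail_or(0.9, 5))   # about 0.99999
```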
Lamport's def of dist sys "One on which I cannot get any work
done because some machine I have never heard of has crashed"
Availability: percentage of time the system is up. So
increased replication increases availability
Consistency: Must keep copies consistent so data not garbled.
Increasing replication makes this worse (more expensive).
Performance
Coarse grained parallelism: Little, infrequent
communication/coordination. This is the easy case to get high
perf for. Sometimes called "embarrassingly parallel".
Fine grained parallelism: Tight coordination and/or much data
communication. Tough. Many msgs.
Scalability
Not trivial! Describes Minitel, where the French phone company is
"currently" (i.e. prior to 1992) installing a terminal in every
home. If successful, "other countries will inevitably adopt
similar systems". It is 1998 and we don't have it yet.
Is the web a dist sys?
Centralized components/tables/algorithms.
If the degree of parallelism is large any centralized
"thing" is a potential bottleneck.
Fault tolerance (single point of failure).
Centralized tables have both fault tolerance and performance
bottleneck problems.
The performance problem can be solved, if the table is
concurrently accessible, by combining requests on the way
to the server holding the centralized table.
It is often too expensive to get the entire (accurate) state
of the system to one computer to act on. Instead, one prefers
decentralized algorithms.
No machine has complete information
Decisions made based on locally available info (obvious?)
Tolerate some machine failures
Don't assume a global clock (exactly synchronized)
HOMEWORK 9.13
---------------- Chapter 10: Communication in Dist Sys ----------------
With no shared memory, communication is very different from that in a
uniprocessor (or a shared memory multiprocessor).
PROTOCOL: An agreement between communicating parties on how
communication is to proceed.
Error correction codes.
Blocksize.
Ack/Nak
LAYERED protocol: The protocol decisions concern very different things
How many volts is a 1 or a 0? How wide is the pulse? (LOW level)
Error correction
Routing
Sequencing
As a result you have many routines that work on the various
aspects. They are called layered.
Layer X of the sender acts as if it is directly communicating with
layer X of the receiver, but in fact it is communicating with
layer X-1 of the sender.
Similarly, layer X of the sender acts, to layer X+1 of the
sender, as a virtual layer X+1 of the receiver.
Famous example is the ISO OSI (International Standards
Organization Open Systems Interconnection).
First let's look at the OSI diagram just as an example of layering.
The diagram is available in
postscript (best) and in
html (most portable).
So for example the network layer sends msgs intended for the other
network layer but in fact sends them to the data link layer
Also the network layer must accept msgs from the transport layer,
which it then sends to the other network layer (really its own
data link layer).
What a layer really does to a msg it receives is that it adds a
header (and maybe a trailer) that is to be interpreted by its
corresponding layer in the receiver.
So the network layer adds a header (in front of the transport
layer's header) and sends to the other network layer (really its
own data link layer, which adds a header in front of the network
layer's header--and a trailer).
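The header-wrapping just described can be sketched as a toy Python example (the layer functions and the "|" separators are made up; only the data link layer adds a trailer):

```python
def transport_send(data):
    return "TH|" + data              # transport header

def network_send(segment):
    return "NH|" + segment           # network header, in front of TH

def data_link_send(packet):
    return "DH|" + packet + "|DT"    # data link header and trailer

# Each layer hands its output to the layer below it:
frame = data_link_send(network_send(transport_send("msg")))
print(frame)  # DH|NH|TH|msg|DT
```

The receiver's layers strip the headers in the opposite order, each interpreting only the header its corresponding layer added.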