Multiprocessor systems

Tanenbaum's "timesharing" term dates the book.

Easiest configuration in which to get a single system image. One example is my workstation at NECI (soon to move north).

Shared memory makes a single run queue (ready list) the natural choice. So scheduling is "trivial": take the uniprocessor code and add semaphores.

The standard diagram for process states applies. The diagram is available in postscript (best) and in html (most portable).
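The "uniprocessor code plus semaphores" idea can be sketched in a few lines. This is a minimal illustration, not an actual kernel scheduler: the `ReadyList` class and the process names are invented for the example, and a Python `Lock` stands in for the semaphore guarding the shared queue.

```python
import threading
from collections import deque

class ReadyList:
    """A single shared run queue: ordinary uniprocessor queue code,
    plus one lock (the added 'semaphore') so any processor may use it."""
    def __init__(self):
        self._queue = deque()
        self._lock = threading.Lock()

    def add(self, process):
        # Mutual exclusion makes the uniprocessor code safe to share.
        with self._lock:
            self._queue.append(process)

    def take(self):
        # Each processor picks the next ready process from the one queue.
        with self._lock:
            return self._queue.popleft() if self._queue else None

rl = ReadyList()
rl.add("P1"); rl.add("P2")
print(rl.take())  # P1
```

Any processor that becomes idle calls `take()`; because there is only one queue, no per-processor load balancing is needed, which is why the text calls the scheduling "trivial".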
What about processor affinity?

Shared memory makes a shared I/O buffer cache the natural choice. Gives true (i.e., uniprocessor) semantics for I/O with little extra work. Avoiding performance bottlenecks is a serious concern.

Shared memory makes a uniprocessor file system the natural choice.

Summary (Tanenbaum figure 9-13)

                           Network      Distributed   Multiprocessor
                           Operating    Operating     Operating
    Item                   System       System        System
    ----------------------------------------------------------------
    Virtual uniprocessor   No           Yes           Yes
    Same OS                No           Yes           Yes
    Copies of OS           N            N             1
    Communication          Shared files Messages      Shared memory
    Network protocols      Yes          Yes           No
    Single ready list      No           No            Yes
    Well-def file sharing  Rare         Yes           Yes
    ----------------------------------------------------------------

Design issues

A big one is TRANSPARENCY, i.e. you don't "see" the multiplicity of processors. Or you see through them. (NOT always what is wanted. Might want to create as many (child) processes as there are processors. Might choose a different coordination algorithm for 2 processes than for 50.)

Location transparency: cannot tell where the resources are located. So ``pr -Psam'' does not tell you where sam is located (but cat /etc/printcap does). I assume Tanenbaum means you need not know where the resources are located. Presumably there are routing tables somewhere saying where they are.

Migration transparency: moving a resource does not require a name change. Recall our research system. If you move a file from /e stored on allan to lab, it cannot stay in /e (a property of mount). Symlinks can hide this slightly.

Replication transparency: don't know how many copies of a resource exist. E.g., a multithreaded (or multiprocess) server.

Concurrency transparency: don't notice other users (except for slowdowns). That is, same as for a uniprocessor. Can lock a resource--but deadlocks are possible. Again, just like for a uniprocessor.
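The "can lock a resource--but deadlocks possible" point can be illustrated with the classic remedy, acquiring locks in a fixed global order. This is a minimal sketch: the lock names, the `transfer` function, and the order-by-id scheme are all invented for the example, not from the text.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def transfer(first, second):
    # Acquire both locks in one fixed global order (here, by object id).
    # This breaks the circular-wait condition, so two concurrent callers
    # passing the locks in opposite orders cannot deadlock each other.
    lo, hi = sorted((first, second), key=id)
    with lo:
        with hi:
            return "done"

print(transfer(lock_b, lock_a))  # done
```

Without the sorting step, one thread calling `transfer(lock_a, lock_b)` while another calls `transfer(lock_b, lock_a)` could each hold one lock and wait forever for the other--exactly the uniprocessor deadlock scenario the text mentions.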
HOMEWORK 9.11

Parallelism transparency: you write uniprocessor programs and, poof, it all works hundreds of times faster on a distributed system with 500 processors. Far from the current state of the art. My wording above was that you don't always want transparency. If you had parallelism transparency, you would always want it.

Flexibility--really monolithic vs. microkernel. Should the (entire) OS run in supervisor mode? Tanenbaum is rather one-sided here. He is a big microkernel fan; Amoeba is his group's system.

A monolithic kernel is conventional: all services are supplied by the kernel, i.e., the OS. Often derived from a uniprocessor OS.

A microkernel supplies only
    IPC
    Some memory management
    Low-level process management and scheduling
    Low-level I/O

The microkernel does not supply
    Filesystems
    Most system calls
    Full process management (deciding which process is highest priority)
Instead these are supplied by servers.

Advantages
    Fairly easy to modify while the system is running.
    A bug in a server can still bring down the system, but the bug will appear in the server, not in another part of the OS.

Disadvantage
    Performance: Tanenbaum says it's minor, but that is not so clear. Crossing more protection domains and having more transfers of control hurt cache performance, and this is increasingly important as the speed ratio between processors and DRAM grows.

HOMEWORK 9.9

Reliability

It's an improvement if you need just one of many possible resources in order to work.

HOMEWORK 9.12

It's a negative if you need all the resources (AND vs OR).
    Available with probability p^n if AND (p = probability one resource is available)
    Available with probability 1 - (1-p)^n if OR

Lamport's definition of a distributed system: "One on which I cannot get any work done because some machine I have never heard of has crashed."

Availability: percentage of time the system is up. So increased replication increases availability.

Consistency: must keep copies consistent so data is not garbled. Increasing replication makes this worse (more expensive).

Performance

Coarse-grained parallelism: little, infrequent communication/coordination.
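The AND vs OR availability formulas above (p^n for needing all n resources, 1-(1-p)^n for needing any one) can be checked numerically. The probability values here are illustrative, not from the text.

```python
# Availability of a system built from n resources, each independently
# up with probability p.

def avail_and(p, n):
    # Need ALL n resources: replication HURTS availability.
    return p ** n

def avail_or(p, n):
    # Need ANY ONE of n resources: replication HELPS availability.
    return 1 - (1 - p) ** n

p, n = 0.9, 5   # illustrative: each resource is up 90% of the time
print(round(avail_and(p, n), 3))  # 0.59
print(round(avail_or(p, n), 5))   # 0.99999
```

With five 90%-available resources, requiring all of them drops availability to about 59%, while requiring any one raises it to five nines--the quantitative version of Lamport's complaint.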
This is the easy case in which to get high performance. Sometimes called "embarrassingly parallel".

Fine-grained parallelism: tight coordination and/or much data communication. Tough. Many messages.

Scalability

Not trivial! Tanenbaum describes Minitel, where the French phone company is "currently" (i.e. prior to 1992) installing a terminal in every home. If successful, "other countries will inevitably adopt similar systems". It is 1998 and we don't have it yet. Is the web a distributed system?

Centralized components/tables/algorithms. If the degree of parallelism is large, any centralized "thing" is a potential bottleneck.

Fault tolerance (single point of failure). Centralized tables have both fault-tolerance and performance-bottleneck problems. The performance problem can be solved, if the table is concurrently accessible, by combining requests on the way to the server of the centralized table.

It is often too expensive to get the entire (accurate) state of the system to one computer to act on. Instead, one prefers decentralized algorithms:
    No machine has complete information.
    Decisions are made based on locally available info (obvious?).
    Tolerate some machine failures.
    Don't assume a global clock (exactly synchronized).

HOMEWORK 9.13

---------------- Chapter 10: Communication in Dist Sys ----------------

With no shared memory, communication is very different from that in a uniprocessor (or a shared-memory multiprocessor).

PROTOCOL: an agreement between communicating parties on how communication is to proceed. Error correction codes. Blocksize. Ack/Nak.

LAYERED protocol: the protocol decisions concern very different things:
    How many volts is a 1 or a 0? How wide is the pulse? (LOW level)
    Error correction
    Routing
    Sequencing
As a result you have many routines that work on the various aspects. They are called layered. Layer X of the sender acts as if it is directly communicating with layer X of the receiver, but in fact it is communicating with layer X-1 of the sender. Similarly, layer X of the sender acts as a virtual layer X+1 of the receiver, from the viewpoint of layer X+1 of the sender.
A famous example is the ISO OSI (International Standards Organization Open Systems Interconnect) model. First let's look at the OSI diagram just as an example of layering. The diagram is available in postscript (best) and in html (most portable).
So, for example, the network layer sends messages intended for the other network layer but in fact sends them to the data link layer. Also the network layer must accept messages from the transport layer, which it then sends to the other network layer (really to its own data link layer).

What a layer really does to a message it receives is add a header (and maybe a trailer) that is to be interpreted by its corresponding layer in the receiver. So the network layer adds a header (in front of the transport layer's header) and sends the result to the other network layer (really to its own data link layer, which adds a header in front of the network layer's--and a trailer).
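The header-wrapping idea can be sketched directly. This is a toy model, not a real protocol stack: the layer names and bracketed header strings are stand-ins, and only the data link layer adds a trailer, matching the description above.

```python
# Each layer wraps the message from the layer above with its own header
# (and, for the data link layer, a trailer), to be interpreted by the
# corresponding peer layer on the receiving side.

def send_down(msg, layers):
    # layers listed top to bottom, e.g. ["transport", "network", "datalink"]
    for name in layers:
        msg = f"[{name}-hdr]{msg}"
        if name == "datalink":
            msg = msg + "[datalink-trl]"   # data link also adds a trailer
    return msg

def receive_up(msg, layers):
    # Each peer layer strips, bottom to top, exactly what its
    # counterpart on the sending side added.
    for name in reversed(layers):
        if name == "datalink":
            msg = msg[: -len("[datalink-trl]")]
        msg = msg[len(f"[{name}-hdr]"):]
    return msg

wire = send_down("DATA", ["transport", "network", "datalink"])
print(wire)  # [datalink-hdr][network-hdr][transport-hdr]DATA[datalink-trl]
print(receive_up(wire, ["transport", "network", "datalink"]))  # DATA
```

Note the nesting: the network header ends up in front of the transport header, and the data link header and trailer surround everything--which is exactly why layer X on the receiver sees only the payload its sending peer produced, never the wrapping added below it.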