Clusters (pools, etc.)

    A bunch of workstations without displays, sitting in a machine
    room, connected by a network.

    Quite popular now.

    Indeed some clusters are packaged by their manufacturer into a
    serious compute engine.

        IBM SP2 sold $1B in 1997.

            VERY fast network

    Used to solve large problems using many processors at one time

        Pluses of a large timesharing system vs small individual
        machines.

        Also the minuses of timesharing

        Can use easy queuing theory to show that one large fast
        server is better in some cases than many slower personal
        machines (see the sketch below).
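
        A minimal sketch of that argument, assuming each machine is
        an M/M/1 queue with mean response time 1/(mu - lam); the
        rates and machine count here are made-up numbers:

            # n slow personal machines vs one shared server n times
            # faster (M/M/1 approximation).
            def mm1_response_time(lam, mu):
                assert lam < mu, "queue must be stable"
                return 1.0 / (mu - lam)

            n, lam, mu = 10, 0.5, 1.0                  # 10 users, each 50% loaded
            t_personal = mm1_response_time(lam, mu)    # 2.0 sec on a personal machine
            t_shared = mm1_response_time(n*lam, n*mu)  # 0.2 sec on the shared server
            # The shared server is n times better in mean response
            # time; the price is losing dedicated cycles.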

    Tanenbaum suggests using X-terminals to access the cluster, but
    X-terminals haven't caught on.

        Personal workstations don't cost much more

Hybrid

    Each user has a workstation and uses the pool for big jobs

    Tanenbaum calls this a possible compromise.

        It is the dominant model for cluster-based machines.

            X-terminals haven't caught on

            The cheapest workstations are already serious enough for
            most interactive work, freeing the cluster for serious
            efforts.

---------------- Processor Allocation ----------------

Decide which processes should run on which processors

Could also be called process allocation

Assume any process can run on any processor

    Often the only difference between processors is

        CPU speed

        CPU speed and max memory

    What if the processors are not homogeneous?
                
        Assume we have binaries for all the different architectures.

    What if not all machines are directly connected?

        Send process via intermediate machines
        
    If all else fails, view the system as multiple subsystems

        If we have only Alpha binaries, restrict to the Alphas

        If we need machines very close together for fast
        communication, restrict to a group of close machines.

Can you move a running process, or are processor allocations done at
process creation time?

    Migratory allocation algorithms vs nonmigratory

What is the figure of merit, i.e. what do we want to optimize?

    Similar to CPU scheduling in OS 1.

    Minimize response time

        We are NOT assuming all machines equally fast.

            Consider two processes: P1 executes 100 million
            instructions, P2 executes 10 million instructions.

            Both processes enter system at t=0

            Consider two machines: A executes 100 MIPS, B executes
            10 MIPS

            If we run P1 on A and P2 on B, each takes 1 second, so
            the avg response time is 1 sec.

            If we run P1 on B and P2 on A, P1 takes 10 seconds and
            P2 takes .1 sec, so the avg response time is 5.05 sec.

            If we run P2 then P1, both on A, they finish at times .1
            and 1.1, so the avg response time is .6 seconds!!
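
            Recomputing the arithmetic above as a quick sketch
            (times are just instructions divided by MIPS):

                p1, p2 = 100e6, 10e6   # instructions
                a, b = 100e6, 10e6     # instructions/sec (100 and 10 MIPS)

                print((p1/a + p2/b) / 2)            # P1 on A, P2 on B: 1.0
                print((p1/b + p2/a) / 2)            # P1 on B, P2 on A: 5.05
                print((p2/a + (p2/a + p1/a)) / 2)   # P2 then P1 on A: 0.6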

        Do not assume machines are ready to run new jobs, i.e. there
        can be backlogs.

    Minimize response ratio.

        Response ratio is the time to run on some machine divided by
        time to run on a standardized (benchmark) machine, assuming
        the benchmark machine is unloaded.

        This takes into account the fact that long jobs should take
        longer.

        Do the P1 P2 A B example with response ratios
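
        A sketch of that computation, assuming (the notes don't say)
        that the fast machine A is the benchmark machine, so the
        unloaded benchmark times are 1 sec for P1 and .1 sec for P2:

            bench_p1, bench_p2 = 1.0, 0.1   # unloaded times on machine A

            def avg_ratio(t1, t2):
                # mean of the two response ratios
                return (t1 / bench_p1 + t2 / bench_p2) / 2

            print(avg_ratio(1.0, 1.0))    # P1 on A, P2 on B: 5.5
            print(avg_ratio(10.0, 0.1))   # P1 on B, P2 on A: 5.5
            print(avg_ratio(1.1, 0.1))    # P2 then P1 on A:  1.05

            # Unlike raw response time, the ratio no longer rewards
            # merely finishing the short job quickly; running both
            # jobs on the fast machine still wins clearly.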

HOMEWORK 12-12.

    Maximize CPU utilization

        NOT my favorite figure of merit.

    Throughput

        Jobs per hour

        Weighted jobs per hour

            If the weighting is CPU time, you get CPU utilization

            This is the way to justify CPU utilization as a figure
            of merit from a user-centric viewpoint
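
        A two-line sketch of why weighting by CPU time gives
        utilization (job times invented):

            jobs_cpu_seconds = [120, 30, 300, 60]         # jobs finished this hour
            weighted_throughput = sum(jobs_cpu_seconds)   # 510 CPU-sec per hour
            utilization = weighted_throughput / 3600.0    # ~0.14 of one CPU
            print(weighted_throughput, utilization)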

Design issues

    Deterministic vs heuristic

        Use deterministic algorithms for embedded applications,
        where all requirements are known a priori

            Patient monitoring in hospital

            Nuclear reactor monitoring

    Centralized vs distributed

        Usual tradeoff of accuracy vs fault tolerance and bottlenecks

    Optimal vs best effort

        Optimal normally requires offline processing.

        Similar requirements as for deterministic algorithms.

        Usual tradeoff of system effort vs result quality

    Transfer policy

        Does a processor decide to shed jobs based just on its own
        load, or does it have (and use) knowledge of other machines'
        loads?

        Called local vs global algorithms by Tanenbaum

        Usual tradeoff of system effort (gathering data) vs result
        quality
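
        A minimal sketch of the local vs global distinction (the
        names and threshold are invented):

            THRESHOLD = 4

            def should_shed_local(my_queue_len):
                # Local: looks only at this machine's own load.
                return my_queue_len > THRESHOLD

            def should_shed_global(my_queue_len, remote_queue_lens):
                # Global: pays to gather other loads, but won't shed
                # work when everyone else is even busier.
                return my_queue_len > min(remote_queue_lens) + 1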

    Location policy
    
        Sender vs receiver initiated 

            Look for help vs look for work

            Both are done

            Tanenbaum asserts that clearly the decision can't be
            local (else we might send to an even more heavily loaded
            machine)

                NOT clear

                The overloaded recipient will then send it again

                    Tradeoff of #sends vs effort to decide

                    Use a random destination, so the load will tend
                    to spread evenly

                    "Better" might be to send probes first, but the
                    tradeoff is cost of msg (one more for probe in
                    normal case) vs cost of bigger msg (bouncing a job
                    around instead of tiny probes) vs likelyhood of
                    overload at target.
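
        A sketch of the two sender-initiated variants just described
        (machine count, probe limit, threshold, and all names are
        invented for illustration):

            import random

            MACHINES = list(range(16))
            PROBE_LIMIT = 3
            THRESHOLD = 4

            def random_destination(me):
                # Blind variant: ship the job to a random machine; an
                # overloaded recipient just forwards it again.
                return random.choice([m for m in MACHINES if m != me])

            def probed_destination(me, load_of):
                # Probe-first variant: pay a few tiny probe messages
                # to avoid bouncing the (much larger) job around.
                others = [m for m in MACHINES if m != me]
                for m in random.sample(others, PROBE_LIMIT):
                    if load_of(m) < THRESHOLD:
                        return m
                return None    # nobody lightly loaded; run locally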