NetworkManager Algorithms for Project Quero



Login Algorithm


The client contacts a MasterBrowser’s NetworkManager, requesting login and providing information the MasterBrowser can analyze to determine whether they can accept the client. Contact is either made using either a specified IP address or using discovery.




The NetworkManager checks to see if it can handle the client and if so, it then asks the QueryManager whether acceptance is ok. The QueryManager will either say no, or send yes along with the RemoteReference to the QueryManager. At this point, the NetworkManager then responds with acceptance and gives the client the RemoteReference as well as other information such as a RemoteReference to its own MasterBrowser if one exists, and adds the client to the client list and considers it logged in. Also, the NetworkManager needs to register a timer event for keep alives.


The client then contacts the QueryManager that login has occurred and hands it the RemoteReference. The QueryManager is then responsible for propogating the file list up to the QueryManager’s of the MasterBrowsers. At this point, the QueryManager of the MasterBrowser can get a RemoteReference to the newly added client’s QueryManager.


The client’s NetworkManager registers a keep alive timer event to determine when the client goes down. When the client does go down, it can then talk to the parent’s parent to try to run an election or find the new MasterBrowser elected in (See the Election Section)




Each MasterBrowser maintains whether it has room for another client (and specifically how much room) as well as whether its children has room as well. Upon sending a rejection msg, the MasterBrowser does one of three things:

a)      Sends a list of all of its children with openings,

b)      Sends its parent if no children have openings,

c)      Sends nothing but the rejection


The client then uses the information to ping the client’s that have room and then try to log into the client with the best ping, or if no such client’s exist try to log into the parent.


Handling MasterBrowser Deaths


There are two house cleaning events that need to occur when a node leaves the network. The first is to ensure that all MasterBrowsers above no to remove the files of the nodes that logged out. The second is to ensure that its place is filled in the network appropriately.


Removing Dead MasterBrowser files:

When a MasterBrowser detects that a client under it is dead, it removes the files that belong to the client. On graceful logouts, the NetworkManager recieves a child nodes logout request, and then notifies the QueryManager to remove all entries belonging to the child node. The NetworkManager then passes the message up to its own MasterBrowser and so on.


Ungraceful logouts are a little more involved. Child nodes maintain login state with the MasterBrowser by periodically sending keep alives. If one of these keep alives are not received in time the MasterBrowser considers the node logged out and cleans up in the same way as a graceful logout. However, such a node logout would not clean up for the children under the node disconnected. One possibility could be to propogate keep alives all the way up the tree. This way stale data will eventually be removed.


The obvious problem with this is that it will increase the number of messages in the network, and increase the burden of processing these messages as well. Our solution is this:


When a node leaves the network, we simply remove its file list from the higher level MasterBrowsers in the same way that graceful logouts handle it. This is acceptable because its children (if there are any) while contact and reconnect under the parent of the dead node after an election process. It is possible that certain children will be lost in the process because they too might have gone down (local area network failures, etc…) or just failed to contact the parent’s parent. However, nodes should connect in the tree again. One further improvement could be just to wait for a node to request file transfer on stale data. When the node requesting file transfer then realizes that the node is stale, it can then notify the master browser that the node could likely be dead. The MasterBrowser could then take appropriate action.



Replacing Dead MasterBrowsers via Election


Shortly after a MasterBrowser goes down an election needs to occur to fill its spot. Under most cases an election is initiated by a child node contacting the parent of the dead node, although such a parent might not exist. First, lets consider the case where the parent’s parent does exist.


The first child node who contacts the parent node is issued by the parent node as head of the election. The other children eventually will contact the parent node to find out who the head is. Each node then sends information about themselves to the head, at which point the head then makes a decision about who the new master browser is. If possible, the children of the elected nodes are assigned to other candidates so as to prevent a recursive election process. However, if this is not possible a recursive election process is held on the elected node’s subtree.


What to do if the dead node has no parent:


In our design, the top level master browsers have links between every other master browser in which case the children of these master browsers could at least maintain links to a small subset of other top level master browsers. This way, when a top level master browser has been detected as down, its children can organize an election between themselves and fill the job accordingly.


However, for implementation reasons, we have chosen to not implement top level MasterBrowser as a special case and instead have each child of the root MasterBrowser store every other child in an ordered list as provided by the root. Then when the root dies, an election process can occur by having the person at the top of the list declare itself head of the election. Other children that are not the top of the list attempt to search for the head by starting at the top and continuing down until they either successfully contact someone or until they have hit their own selves on the list. If the child hits itself on the list, the child declares itself as head of the election since the others with priority appear to be down. The election process then resumes as above.