Lecture 5, 9/20: Lists
Textbook reading: DJW chap. 2
Implementing Lists as Arrays
Two data fields:
- The array holding the elements of the list.
- The number of elements in the list.
A location is an integer index.
Adding/deleting other than at the end involves sliding everything
after the insertion/deletion point down/up.
Note as regards the constructor: In Java generics, if T is a type
variable, you are not allowed to say new T[n] to create an array of
T's. What you have to do is to create an array of Objects, and then
cast that to an array of T's:
T[] A = (T[]) new Object[n];
(here n stands for the desired length of the array).
That is, you create an array of Objects and then you tell the
Java compiler that it can be viewed as an array of T's. The compiler will
give you an "unchecked cast" warning, because it can't verify that this is
type safe.
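A minimal sketch of such a constructor, assuming a class with the two data fields listed above. The class name ArrayListSketch, the field names, and the capacity parameter are illustrative choices, not part of the textbook's code.

```java
// Sketch of an array-backed list; class and field names are illustrative.
class ArrayListSketch<T> {
    private T[] elements;     // the array holding the elements of the list
    private int numElements;  // the number of elements in the list

    // The cast from Object[] to T[] is unchecked, so the compiler warns.
    // The annotation acknowledges that warning for this constructor only.
    @SuppressWarnings("unchecked")
    ArrayListSketch(int capacity) {
        elements = (T[]) new Object[capacity];
        numElements = 0;
    }

    // Add at the end; this sketch assumes the array still has room.
    void addEnd(T item) {
        elements[numElements] = item;
        numElements++;
    }

    // A location is an integer index.
    T nth(int i) {
        return elements[i];
    }

    int getNumElements() {
        return numElements;
    }
}
```

The cast is safe in practice as long as the array never escapes the class, since only values of type T are ever stored in it.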
Methods nth, getNumElements, first, last, addEnd take constant time.
Methods addAt, deleteAt take time proportional to the list length,
in the worst case.
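The sliding that makes addAt and deleteAt linear can be sketched as follows. The class name SlidingArrayList and the error handling are assumptions made for the example; the array here has a fixed capacity.

```java
// Fixed-capacity array list showing why addAt/deleteAt are linear time.
class SlidingArrayList<T> {
    private final T[] elements;
    private int numElements = 0;

    @SuppressWarnings("unchecked")
    SlidingArrayList(int capacity) {
        elements = (T[]) new Object[capacity];
    }

    int getNumElements() {
        return numElements;
    }

    T nth(int i) {
        if (i < 0 || i >= numElements) throw new IndexOutOfBoundsException("index " + i);
        return elements[i];
    }

    // Insert at index i: slide elements i..numElements-1 one slot toward the end.
    void addAt(int i, T item) {
        if (i < 0 || i > numElements) throw new IndexOutOfBoundsException("index " + i);
        if (numElements == elements.length) throw new IllegalStateException("array is full");
        for (int k = numElements; k > i; k--) {
            elements[k] = elements[k - 1];
        }
        elements[i] = item;
        numElements++;
    }

    // Delete at index i: slide elements i+1..numElements-1 one slot toward the front.
    T deleteAt(int i) {
        if (i < 0 || i >= numElements) throw new IndexOutOfBoundsException("index " + i);
        T removed = elements[i];
        for (int k = i; k < numElements - 1; k++) {
            elements[k] = elements[k + 1];
        }
        elements[numElements - 1] = null; // drop the stale reference
        numElements--;
        return removed;
    }
}
```

The worst case is index 0, where every element moves. Adding at index numElements moves nothing, which is why addEnd is constant time.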
Lists as expandable arrays
To fix the problem of wasting space, you can use an "expandable" array. That
is, when the list becomes too large for the array, you double the size of the
array; when the array becomes less than half full, you cut the size of
the array in half. Whenever the size of the
array is changed, the entire contents of the old array have to be copied
into the new array.
If you only add items and never delete, then the total number of copies performed
is at most double the total number of items in the list. This is known as
an "amortized" analysis; you do an expensive operation from time to time,
but you make up for that by a lot of cheap operations.
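To see the bound concretely: starting from an array of size 1, the copies performed are 1 + 2 + 4 + ... + 2^k, where 2^k is the last array size that filled up, and that sum is less than 2n for n items. The sketch below counts the copies; the class name GrowOnlyArray, the copies counter, and the starting size of 1 are assumptions for illustration.

```java
// Add-only expandable array that counts element copies made during resizing.
class GrowOnlyArray {
    private int[] data = new int[1]; // assumed starting capacity of 1
    private int size = 0;
    long copies = 0;                 // total elements copied by all resizes

    void addEnd(int x) {
        if (size == data.length) {
            // Array is full: double it and copy the old contents across.
            int[] bigger = new int[2 * data.length];
            for (int i = 0; i < size; i++) {
                bigger[i] = data[i];
                copies++;
            }
            data = bigger;
        }
        data[size++] = x;
    }

    int getNumElements() {
        return size;
    }
}
```

For example, 1000 calls to addEnd trigger resizes at sizes 1, 2, 4, ..., 512, for 1023 copies in total, which is under 2 * 1000.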
This implementation is susceptible to thrashing, if a sequence of adds and
deletes takes it back and forth between just overfull and just less than
half full. To avoid this, it is better to have a gap between the amount
of expansion and the threshold for contraction. E.g. double when full, but
only halve when less than 1/3 full.
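A sketch of the "double when full, halve when less than 1/3 full" policy. The class name ExpandableArray, the minimum capacity of 4, and the deleteEnd method are assumptions made for the example; this is not how any particular library implements it.

```java
// Expandable array with a gap between the growth and shrink thresholds.
class ExpandableArray<T> {
    private static final int MIN_CAPACITY = 4; // assumed floor, never shrink below it
    private T[] data;
    private int size = 0;

    @SuppressWarnings("unchecked")
    ExpandableArray() {
        data = (T[]) new Object[MIN_CAPACITY];
    }

    int getNumElements() { return size; }

    int capacity() { return data.length; }

    T nth(int i) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
        return data[i];
    }

    void addEnd(T x) {
        if (size == data.length) resize(2 * data.length); // double when full
        data[size++] = x;
    }

    T deleteEnd() {
        if (size == 0) throw new IllegalStateException("empty list");
        T x = data[--size];
        data[size] = null;
        // Halve only when less than 1/3 full, so the new array is under 2/3 full.
        if (data.length > MIN_CAPACITY && 3 * size < data.length) {
            resize(data.length / 2);
        }
        return x;
    }

    // Copy the entire contents into a new array of the given capacity.
    @SuppressWarnings("unchecked")
    private void resize(int newCapacity) {
        T[] fresh = (T[]) new Object[newCapacity];
        for (int i = 0; i < size; i++) fresh[i] = data[i];
        data = fresh;
    }
}
```

Right after a resize in either direction the array is between 1/3 and 2/3 full, so a number of operations proportional to the list length must happen before the next resize.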
The Java library class ArrayList works more or less this way.
Many different design choices for a linked list:
- Singly linked (each node points to the next item) or doubly linked
(each node points both to the next and to the previous)?
- Dummy node (header) at beginning and end, only at beginning, or
not at all?
- Pointers to the beginning or both to the beginning and end?
- Field for the length or not?
- Separate classes Node and List, or only one class?
- Can different lists share structure?
Not all combinations are possible, or make any sense. For instance, if
you have a doubly linked list, and you don't have separate Node and List
classes, then different lists can't share structure, since each node
determines all its predecessors and all its successors.
General observation: When a data structure includes redundant information:
- The advantage is that some operations can be much faster.
- The disadvantage is that there are more integrity constraints to maintain.
E.g. in a doubly linked list, if p.next == q
then you must have q.prev == p. So it is easier, if you make
a mistake, to end up with an inconsistent data structure, and
inconsistencies tend to lead quickly to disaster.
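The constraint can be sketched in code: an insertion has to update up to four pointers to keep next and prev in agreement, and a checker can walk the chain to verify it. The names DNode, insertAfter, and isConsistent are hypothetical, and the checker assumes the chain is not circular.

```java
// Doubly linked node with an insertion that maintains the invariant
// "if p.next == q then q.prev == p", plus a checker for that invariant.
class DNode<T> {
    T value;
    DNode<T> prev, next;

    DNode(T value) {
        this.value = value;
    }

    // Insert node q immediately after node p.
    static <T> void insertAfter(DNode<T> p, DNode<T> q) {
        q.prev = p;
        q.next = p.next;
        if (p.next != null) {
            p.next.prev = q; // the old successor must now point back to q
        }
        p.next = q;
    }

    // Walk forward from start and check that every next link has a matching
    // prev link. Assumes the chain is not circular.
    static <T> boolean isConsistent(DNode<T> start) {
        for (DNode<T> p = start; p.next != null; p = p.next) {
            if (p.next.prev != p) return false;
        }
        return true;
    }
}
```

Forgetting any one of the four assignments in insertAfter leaves a chain that reads correctly in one direction and incorrectly in the other.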
In a singly linked list, the only integrity constraint is that you don't have
a circular list structure.
Singly linked list with header
Singly linked list; header node; no pointer to the end; no length
field; no separate List class; list may share tails.
Lists L and M share the last two nodes.
- To delete an item, you need to locate the previous node.
- Methods addAfter, deleteAfter, first, addFirst are
constant time operations. Methods locate, locateBefore,
last, addLast take time proportional to the length of the list,
in the worst case.
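A sketch of this design, assuming a list is represented simply by its header node. The class name SNode and the emptyList helper are illustrative; the method names follow the ones listed above, but their exact signatures are guesses.

```java
// Singly linked list with a header node and no separate List class.
// A list is identified with its header node, whose value is unused.
class SNode<T> {
    T value;
    SNode<T> next;

    SNode(T value, SNode<T> next) {
        this.value = value;
        this.next = next;
    }

    // An empty list is just a header node.
    static <T> SNode<T> emptyList() {
        return new SNode<>(null, null);
    }

    // Constant time operations.
    void addAfter(T item) { next = new SNode<>(item, next); }

    void deleteAfter() { if (next != null) next = next.next; }

    SNode<T> first() { return next; }       // call on the header

    void addFirst(T item) { addAfter(item); } // call on the header

    // Linear time: find the node before the first node holding item, or null.
    SNode<T> locateBefore(T item) {
        for (SNode<T> p = this; p.next != null; p = p.next) {
            if (java.util.Objects.equals(p.next.value, item)) return p;
        }
        return null;
    }

    // Linear time: the last node (the header itself if the list is empty).
    SNode<T> last() {
        SNode<T> p = this;
        while (p.next != null) p = p.next;
        return p;
    }

    void addLast(T item) { last().addAfter(item); }
}
```

To delete an item, call locateBefore and then deleteAfter on the result. Two lists share a tail when a node of one points into the chain of the other, e.g. by setting M.next = L.first().next before adding to the front of M. A delete through L then rewires only L's nodes, so M still sees the old tail.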
More complex linked list structure
Doubly linked list; no header nodes; pointer to the end; no length
field; separate List and Node classes; list may share tails.
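A rough sketch of this design, with hypothetical names DList and Node. For simplicity it assumes each node belongs to exactly one list, so it does not attempt to share structure between lists.

```java
// Doubly linked list with separate List and Node classes, no header
// nodes, a pointer to the end, and no length field.
class DList<T> {
    static class Node<T> {
        T value;
        Node<T> prev, next;

        Node(T value) { this.value = value; }
    }

    private Node<T> first; // null when the list is empty
    private Node<T> last;  // pointer to the end makes last/addLast constant time

    Node<T> first() { return first; }

    Node<T> last() { return last; }

    // Constant time, thanks to the pointer to the end.
    Node<T> addLast(T item) {
        Node<T> n = new Node<>(item);
        n.prev = last;
        if (last == null) {
            first = n;       // list was empty
        } else {
            last.next = n;
        }
        last = n;
        return n;
    }

    // Constant time: the prev pointer removes the need to locate the
    // previous node. Without header nodes, both ends are special cases.
    void delete(Node<T> n) {
        if (n.prev == null) first = n.next; else n.prev.next = n.next;
        if (n.next == null) last = n.prev; else n.next.prev = n.prev;
        n.prev = null;
        n.next = null;
    }
}
```

Compared with the singly linked design, last, addLast, and deletion of a given node become constant time, at the cost of more pointers to keep consistent.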