Open Source on IBM i

Lists: Arraylist vs Linked List

Hi,

I am just moving my service program Linked List to the OSSILE project and I am again wondering if people know why there are two list implementations and when to use which implemenation or when to use a list at all.

I am doing most of my development in the Java world and there lists are really cheap. You will encounter them in Java everywhere. In the RPG world it is exactly the other way around. Arrays are really cheap and there is not even a list data
structure natively in RPG.

Note: And by data structure I don't mean a DS or dcl-ds declared variable but something more general as explained on this page.

So using a list is not as cheap as using an array from a performance point of view. But it definitely has its pros. If you don't know the size of the result you want to return from a procedure you are at a loss when using an array because no matter how hard you try you cannot do it correctly. The number of returned elements will almost never match the size of your array. The array will either be too big or may be even too small (which would be rather fatal).

And this was my main motiviation of implementing a list. Because with a list the number of elements doesn't matter. The list starts with the number of elements you declare (probably none) and grows as you add elements to it. Depending on the available API it may also have some very convenient procedures like rotate all elements in the list, remove a range of elements, split a string of characters by one separator to a list, last index of element, swap elements, ... .

Arraylist vs Linked List

The main difference of these lists is the way the data is stored and accessed.

Arraylist

The Arraylist service program uses one big block of data to store the pointers to each element. By doing that it can really quickly access elements by index. Adding elements to the end of the list is performing normally but adding to the beginning of the list is a bad idea performance wise because all pointers to the other elements must be shifted down by one position. Also adding elements could result in the need to expand this block of memory where the pointers are stored. This operation may have a bigger impact on performance. Iterating through the nodes is really easy as it can jump to the next node by just one arthimetic expression.

Linked List

The Linked List service program is implemented as a two-way linked list where each node has a pointer to the previous node and to the next node. Because of this inserting a node at any place takes the same time no matter how large the list grows. Iterating through the list is also very cheap as you go from node to node. But accessing the list by index is not a good idea because the list has to start at the beginning of the list and then count till it reaches the node at the specified index.

Not good at

Both lists are not good at doing one thing: locating an element by value. So if you want to check if a value is already in the list then you can do it. Both lists support it by the contains procedure but both lists have to start from the beginning of the list and check every element if it has the same value as the passed one. For this kind of operation these two lists are the wrong data structure or rather the wrong implementation to use. You would need something like a sorted list where all entries are sorted at any time. Or use a tree data structure.

Wrap it up

  • If you know the size of the content you should use an array.
  • If you don't know the size of an array use a list.
  • If you need to access the list by index use the Arraylist.
  • If you have many insert and remove operations on the list use the Linked List.

I hope that gave you a good idea about when to use a list and which implementation to use.

Happy coding!

Mihael

Tags : List