The Array-Backed List

Agenda

  1. The List Abstract Data Type (ADT)
  2. A List Data Structure
  3. The List API
  4. Getting started: how to store our data?
  5. Built-in list as array
  6. The ArrayList data structure

1. The List Abstract Data Type (ADT)

An abstract data type (ADT) defines a conceptual model for how data may be stored and accessed.

A list ADT is a data container where:

  • values are ordered in a sequence
  • each value has at most one preceding and one succeeding value
  • a given value may appear more than once in a list
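
As a quick illustration, Python's built-in list already behaves like the list ADT described above. The sketch below is only meant to make the three properties concrete; the variable names are illustrative and not part of the course code.

```python
# A built-in Python list used to illustrate the list ADT properties.
values = ['a', 'b', 'a', 'c']

# Values are ordered: position 0 comes before position 1, and so on.
first, second = values[0], values[1]

# An interior value has exactly one predecessor and one successor.
position_of_b = values.index('b')
predecessor_of_b = values[position_of_b - 1]
successor_of_b = values[position_of_b + 1]

# The same value may appear more than once.
occurrences_of_a = values.count('a')
```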

Other common ADTs (some of which we'll explore later) include:

  • Stacks
  • Queues
  • Priority Queues
  • Maps
  • Graphs

2. A List Data Structure

A list data structure is a concrete implementation of the list ADT in some programming language. In addition to adhering to the basic premises of the ADT, it will typically support operations that:

  • access values in the list by their position (index)
  • append and insert new values into the list
  • remove values from the list
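
The three kinds of operations listed above can be seen on Python's built-in list, which is one such concrete implementation. This is a small sketch for orientation only; the name `lst` and the values are arbitrary.

```python
lst = [10, 20, 30]

second = lst[1]      # access a value by its position (index)

lst.append(40)       # append a new value at the end
lst.insert(1, 15)    # insert a new value before index 1

lst.remove(30)       # remove the first occurrence of a value
del lst[0]           # remove the value at a given position
```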

The implementation of any data structure will generally rely on simpler, constituent data types (e.g., "primitive" types offered by the language), and the choice of these types may affect the runtime complexities of its operations.

Reference reading: http://interactivepython.org/courselib/static/pythonds/Introduction/WhyStudyDataStructuresandAbstractDataTypes.html

3. The List API

The operations we'll be building into our list data structures will be based on the common sequence operations and mutable sequence operations defined by the Python standard library.

In [33]:
class List:        
    ### subscript-based access ###
    
    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        pass

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        pass

    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        pass
    
    ### stringification ###
            
    def __repr__(self):
        """Supports inspection"""
        return '[]'
    
    def __str__(self):
        """Implements `str(self)`"""
        return '[]'

    ### single-element manipulation ###
    
    def append(self, value):
        pass
    
    def insert(self, idx, value):
        pass
    
    def pop(self, idx=-1):
        pass
    
    def remove(self, value):
        pass
    
    ### predicates (T/F queries) ###
    
    def __eq__(self, other):
        """Implements `self == other`"""
        return True

    def __contains__(self, value):
        """Implements `val in self`"""
        return True
    
    ### queries ###
    
    def __len__(self):
        """Implements `len(self)`"""
        return len(self.data)
    
    def min(self):
        pass
    
    def max(self):
        pass
    
    def index(self, value, i=0, j=None):
        pass
    
    def count(self, value):
        pass

    ### bulk operations ###

    def __add__(self, other):
        """Implements `self + other_array_list`"""
        return self
    
    def clear(self):
        pass
    
    def copy(self):
        pass

    def extend(self, other):
        pass

    ### iteration ###
    
    def __iter__(self):
        """Supports iteration (via `iter(self)`)"""
        pass
In [34]:
l = List()
In [35]:
l.append('x')
In [36]:
l[0]
Out[36]:
100
In [32]:
len(l)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-32-e75269d816bd> in <module>()
----> 1 len(l)

<ipython-input-28-2d5b866961ca> in __len__(self)
     52     def __len__(self):
     53         """Implements `len(self)`"""
---> 54         return len(self.data)
     55 
     56     def min(self):

AttributeError: 'List' object has no attribute 'data'
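
To make the link between Python syntax and the special methods in the skeleton more concrete, here is a small sketch. The `Traced` class is a hypothetical helper invented for this illustration (it is not part of the list API); it simply records which special method each expression triggers.

```python
class Traced:
    """Hypothetical helper: records which special method each expression calls."""

    def __init__(self):
        self.calls = []

    def __getitem__(self, idx):
        self.calls.append(('__getitem__', idx))

    def __setitem__(self, idx, value):
        self.calls.append(('__setitem__', idx, value))

    def __delitem__(self, idx):
        self.calls.append(('__delitem__', idx))

    def __len__(self):
        self.calls.append(('__len__',))
        return 0  # len() requires a non-negative integer result

    def __contains__(self, value):
        self.calls.append(('__contains__', value))
        return False


t = Traced()
t[0]          # calls t.__getitem__(0)
t[1] = 'x'    # calls t.__setitem__(1, 'x')
del t[2]      # calls t.__delitem__(2)
len(t)        # calls t.__len__()
'x' in t      # calls t.__contains__('x')
```

After running this, `t.calls` lists the five method calls in the order the expressions were evaluated.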

4. Getting started: how to store our data?

In [37]:
class List:
    def __init__(self):
        self.data = 0
    
    def append(self, value):
        self.data = value
    
    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        return self.data

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        self.data = value
    
    def __repr__(self):
        """Supports inspection"""
        return '[' + str(self.data) + ']'
In [38]:
l = List()
In [39]:
l.append(10)
In [40]:
l
Out[40]:
[10]
In [41]:
l.append('x')
In [42]:
l
Out[42]:
[x]
In [43]:
l[0]
Out[43]:
'x'
In [44]:
l[0] = 20
In [45]:
l
Out[45]:
[20]
In [46]:
len(l)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-46-e75269d816bd> in <module>()
----> 1 len(l)

TypeError: object of type 'List' has no len()

5. Built-in list as array

To use the built-in list as though it were a primitive array, we will constrain ourselves to just the following APIs on a given list lst:

  1. lst[i] for getting and setting values at an existing, non-negative index i
  2. len(lst) to obtain the number of slots
  3. lst.append(None) to grow the list by one slot at a time
  4. del lst[len(lst)-1] to delete the last slot in a list
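
As a rough sketch of working within these constraints, the four permitted operations can be wrapped as follows. The helper names `grow` and `shrink` and the variable `slots` are hypothetical, introduced only for this example.

```python
def grow(lst):
    """Add one empty slot at the end (permitted operation 3)."""
    lst.append(None)


def shrink(lst):
    """Remove the last slot (permitted operation 4)."""
    del lst[len(lst) - 1]


slots = []
for value in (10, 20, 30):
    grow(slots)                    # make room first...
    slots[len(slots) - 1] = value  # ...then fill the new slot by index

shrink(slots)                      # drops the slot holding 30
```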
In [47]:
l = []
In [ ]:
l[0] = 10  # not permitted: raises IndexError because slot 0 does not exist yet
In [48]:
l.append(None)

l[0] = 10

l[0]
Out[48]:
10
In [49]:
l  # not permitted under our constraints; shown here only for inspection
Out[49]:
[10]
In [50]:
l.append(None)

l[1] = 20

l[0], l[1]
Out[50]:
(10, 20)
In [ ]:
del l[len(l)-1]  # delete the last slot; repeat to remove slots one at a time from the end

6. The ArrayList data structure

In [83]:
class ArrayList:
    def __init__(self):
        self.data = []
    
    def append(self, value):
        pass
    
    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        assert(isinstance(idx, int))
        pass
    
    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        assert(isinstance(idx, int))
        pass
    
    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        assert(isinstance(idx, int))
        pass
    
    def __len__(self):
        """Implements `len(self)`"""
        pass
    
    def __repr__(self):
        """Supports inspection"""
        pass
In [84]:
class ArrayList:
    def __init__(self):
        self.data = []

    def append(self, value):
        self.data.append(None)
        self.data[len(self.data)-1] = value

    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        assert(isinstance(idx, int))
        return self.data[idx]

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        assert(isinstance(idx, int))
        self.data[idx] = value

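    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        assert(isinstance(idx, int))
        if idx < 0:
            idx += len(self)  # translate a negative index, as built-in lists do
        if not 0 <= idx < len(self):
            raise IndexError('list index out of range')
        for i in range(idx, len(self)-1):  # len() and __getitem__() are already implemented
            self[i] = self[i+1]  # shift each later element one slot toward the front
        del self.data[len(self.data)-1]  # drop the now-duplicated last slot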
    
    def __len__(self):
        """Implements `len(self)`"""
        return len(self.data)
    
    def __repr__(self):
        """Supports inspection"""
        pass
In [85]:
l = ArrayList()
    
In [86]:
for i in range(10):
    l.append(i)
In [87]:
l[0], l[5], l[9]
Out[87]:
(0, 5, 9)
In [88]:
l
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/anaconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    398                         if cls is not object \
    399                                 and callable(cls.__dict__.get('__repr__')):
--> 400                             return _repr_pprint(obj, self, cycle)
    401 
    402             return _default_pprint(obj, self, cycle)

/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    693     """A pprint that just redirects to the normal repr function."""
    694     # Find newlines and replace them with p.break_()
--> 695     output = repr(obj)
    696     for idx,output_line in enumerate(output.splitlines()):
    697         if idx:

TypeError: __repr__ returned non-string (type NoneType)
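
The `TypeError` above arises because `__repr__` still ends in `pass`, so it returns `None` instead of a string. One possible way to build the string is sketched below as a standalone function; `array_repr` is a hypothetical name, and this is an illustration under the same indexing constraints, not necessarily the intended solution. It reads the backing list only through `data[i]` and `len(data)`, and collects the pieces in an auxiliary list.

```python
def array_repr(data):
    """Hypothetical helper: build a list-style string from data[i] and len(data)."""
    parts = []
    for i in range(len(data)):
        parts.append(repr(data[i]))  # repr() keeps quotes around strings
    return '[' + ', '.join(parts) + ']'
```

Inside `ArrayList`, the body of `__repr__` could then be `return array_repr(self.data)`.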
In [89]:
l.data
Out[89]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [90]:
del l[5]
In [91]:
l.data
Out[91]:
[0, 1, 2, 3, 4, 6, 7, 8, 9]
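
Insertion can be approached as the mirror image of the deletion shown above: grow the backing list by one slot, then shift elements toward the end, working from back to front. The following is a minimal sketch, assuming a hypothetical free function `insert_at` that operates directly on a backing list and accepts only indices in the range 0 to `len(data)`.

```python
def insert_at(data, idx, value):
    """Hypothetical helper: insert value before position idx (0 <= idx <= len(data))."""
    if not 0 <= idx <= len(data):
        raise IndexError('list index out of range')
    data.append(None)                        # grow by one slot
    for i in range(len(data) - 1, idx, -1):  # walk from the last slot down to idx+1
        data[i] = data[i - 1]                # shift each element one slot toward the end
    data[idx] = value


backing = [0, 1, 2, 3, 4, 6, 7, 8, 9]
insert_at(backing, 5, 5)   # put the deleted 5 back in place
```

Shifting from the back is what keeps each element from being overwritten before it has been copied.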